Llama-cpp-python binding missing symbol in llama.cpp libllama docker image

Just got my Altra system up and running, and I'm aiming to deploy it for lower-power-consumption inference, using Rocky Linux 9.5.

I wanted to try out the llama.cpp docker image, so I used AmpereComputingAI/llama.cpp as the basis of a Docker image that also serves out Jupyter notebooks.

I pip-installed llama-cpp-python (0.3.8) into the image, but I can't import the Llama class.

The error message when it gets to the line

from llama_cpp import Llama

is:

RuntimeError: Failed to load shared library '/usr/local/lib/python3.10/dist-packages/llama_cpp/lib/libllama.so': /usr/local/lib/python3.10/dist-packages/llama_cpp/lib/libllama.so: undefined symbol: ggml_backend_reg_count

It seems to be looking for a symbol in libllama.so that isn't there, so is the version of llama.cpp in the docker image a bit older? I tried with llama-cpp-python 0.3.6 just in case, but no joy.
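One quick way to confirm this kind of mismatch is to check whether a given shared library actually exports the symbol the loader is complaining about. A minimal sketch using ctypes (the libllama.so path is the one from the error message above; substitute whichever copy you want to inspect):

```python
import ctypes

def has_symbol(lib_path: str, symbol: str) -> bool:
    """Return True if lib_path can be loaded and exports `symbol`."""
    try:
        lib = ctypes.CDLL(lib_path)
    except OSError:
        # Library missing, wrong architecture, or unresolved dependencies
        return False
    # ctypes looks the symbol up with dlsym; missing symbols raise AttributeError
    return hasattr(lib, symbol)

# Example (path from the error message):
# has_symbol("/usr/local/lib/python3.10/dist-packages/llama_cpp/lib/libllama.so",
#            "ggml_backend_reg_count")
```

If this returns False for the libllama.so that the binding loads, the library really does predate the `ggml_backend_reg_count` API.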

Update:
So installing llama-cpp-python involves compiling a new copy of libllama.so (the one at /usr/local/lib/python3.10/dist-packages/llama_cpp/lib/libllama.so), while the Ampere-optimised llama.cpp in the docker image (in /llm) is then, presumably, ignored. I'm not sure whether the compilation in the docker environment captures the Ampere optimisations. Also, given that llama-cpp-python points at the version of libllama.so created during its own installation, I'm guessing the issue is primarily with llama-cpp-python? I'm not sure whether I should be passing some parameters to CMake before the pip install to get a correct compilation in this docker image.
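For what it's worth, llama-cpp-python picks up CMake flags from the CMAKE_ARGS environment variable at pip-install time, so a from-source rebuild with custom flags can be scripted. A hedged sketch (the `-DGGML_NATIVE=ON` flag is just an illustrative example, not the verified Ampere configuration):

```python
import os
import subprocess

def reinstall_cmd(cmake_args: str) -> tuple[list, dict]:
    """Build the pip command and environment for a from-source reinstall
    of llama-cpp-python with CMake flags forwarded via CMAKE_ARGS."""
    env = dict(os.environ, CMAKE_ARGS=cmake_args)
    cmd = ["python3", "-m", "pip", "install",
           "--force-reinstall", "--no-cache-dir", "llama-cpp-python"]
    return cmd, env

cmd, env = reinstall_cmd("-DGGML_NATIVE=ON")   # example flag only
# subprocess.run(cmd, env=env, check=True)     # uncomment to actually run
```

`--no-cache-dir` matters here: without it, pip may reuse a previously built wheel and silently skip the recompile.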


Hi @pjmnoble! This is a question that maybe @kkrysa can help answer. I'm not sure whether pip-installing a module on top of the Ampere base image would cause problems.

Dave.


Popping this back to the top of the stack - anyone familiar with the llama.cpp image from AmpereComputing? Tagging @trigoni for awareness - Tony, do you know who might be able to help @pjmnoble ?


I'm happy to help bring in the right people with the required technical expertise. I'll share a link to this thread with our AI engineering team.

Already heard back: "[The Ampere AI engineering team is] already working on updating our fork of llama-cpp-python." Also, sorry for such a late reply - I was away at a conference and have been sick since coming back.


Sounds great - I’ll keep an eye out for releases

We've published a new version of llama-cpp-python and of our docker image: AmpereComputingAI/llama.cpp:2.2.1. Our build is based on a slightly older upstream release of llama-cpp-python (v0.3.2) for compatibility with the AmpereComputingAI/llama.cpp docker image. We're currently working on bringing our optimized llama.cpp docker image up to a newer version, and we'll release a newer llama-cpp-python then.

To install it, just download the .whl file and run python3 -m pip install llama_cpp_python-0.3.2-cp312-cp312-linux_aarch64.whl inside the docker container.
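Since the wheel above is tagged cp312, it will only install under CPython 3.12. A small sketch for sanity-checking a wheel's tag against the running interpreter before installing (the filename is the one given above; the check is a simplification of full wheel-tag matching):

```python
import sys

def wheel_matches_interpreter(wheel_name: str) -> bool:
    """True if the wheel filename's CPython tag (e.g. cp312) matches
    the interpreter currently running this script."""
    tag = f"cp{sys.version_info.major}{sys.version_info.minor}"
    return f"-{tag}-" in wheel_name

print(wheel_matches_interpreter(
    "llama_cpp_python-0.3.2-cp312-cp312-linux_aarch64.whl"))
```

If this prints False, pip will refuse the wheel with a "not a supported wheel on this platform" error rather than anything about missing symbols.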
