Llama-cpp-python binding missing symbol in llama.cpp libllama docker image

Just got my Altra system up and running, and I'm aiming to deploy it for lower-power-consumption inference, using Rocky Linux 9.5.

I wanted to try out the llama.cpp docker image, so I used AmpereComputingAI/llama.cpp as the basis of a Docker image that also serves out Jupyter notebooks.

I pip-installed llama-cpp-python (0.3.8) into the image, but I can't import the Llama class.

The error message when it gets to the line

from llama_cpp import Llama

is:

RuntimeError: Failed to load shared library '/usr/local/lib/python3.10/dist-packages/llama_cpp/lib/libllama.so': /usr/local/lib/python3.10/dist-packages/llama_cpp/lib/libllama.so: undefined symbol: ggml_backend_reg_count

It seems to be looking for a symbol in libllama.so that isn't there, so is the version of llama.cpp in the docker image a bit older? I tried with llama-cpp-python 0.3.6 just in case, but no joy.
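One quick way to confirm this kind of mismatch is to check whether a given shared library actually exports the symbol the loader is complaining about. A minimal sketch using ctypes (the libllama.so path is the one from the error message above; substitute whichever copy you want to inspect):

```python
import ctypes

def has_symbol(lib_path: str, symbol: str) -> bool:
    """Return True if lib_path can be loaded and exports `symbol`."""
    try:
        lib = ctypes.CDLL(lib_path)
    except OSError:
        # Library missing, wrong architecture, or unresolved dependencies
        return False
    # ctypes looks the symbol up with dlsym; missing symbols raise AttributeError
    return hasattr(lib, symbol)

# Example (path from the error message):
# has_symbol("/usr/local/lib/python3.10/dist-packages/llama_cpp/lib/libllama.so",
#            "ggml_backend_reg_count")
```

If this returns False for the libllama.so that the binding loads, the library really does predate the `ggml_backend_reg_count` API.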

Update:
So installing llama-cpp-python involves compiling a new copy of libllama.so (the one at /usr/local/lib/python3.10/dist-packages/llama_cpp/lib/libllama.so), while the Ampere-optimised llama.cpp in the docker image (in /llm) is then, presumably, ignored. I'm not sure whether the compilation in the docker environment captures the Ampere optimisations. Also, given that llama-cpp-python points at the version of libllama.so created during its own installation, I'm guessing the issue is primarily with llama-cpp-python? I'm not sure whether I should be passing some parameters to CMake before the pip install to get a correct compilation in this docker image.
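For what it's worth, llama-cpp-python picks up CMake flags from the CMAKE_ARGS environment variable at pip-install time, so a from-source rebuild with custom flags can be scripted. A hedged sketch (the `-DGGML_NATIVE=ON` flag is just an illustrative example, not the verified Ampere configuration):

```python
import os
import subprocess

def reinstall_cmd(cmake_args: str) -> tuple[list, dict]:
    """Build the pip command and environment for a from-source reinstall
    of llama-cpp-python with CMake flags forwarded via CMAKE_ARGS."""
    env = dict(os.environ, CMAKE_ARGS=cmake_args)
    cmd = ["python3", "-m", "pip", "install",
           "--force-reinstall", "--no-cache-dir", "llama-cpp-python"]
    return cmd, env

cmd, env = reinstall_cmd("-DGGML_NATIVE=ON")   # example flag only
# subprocess.run(cmd, env=env, check=True)     # uncomment to actually run
```

`--no-cache-dir` matters here: without it, pip may reuse a previously built wheel and silently skip the recompile.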


Hi @pjmnoble! This is a question that maybe @kkrysa can help answer. I'm not sure whether pip-installing a module on top of the Ampere base image would cause problems.

Dave.


Popping this back to the top of the stack - anyone familiar with the llama.cpp image from AmpereComputing? Tagging @trigoni for awareness - Tony, do you know who might be able to help @pjmnoble ?


I'm happy to help bring in the right people with the required technical expertise. I'll share a link to this thread with our AI engineering team.

Already heard back: "[The Ampere AI engineering team is] already working on updating our fork of llama-cpp-python." Also, sorry for such a late reply - I was away at a conference and have been sick since coming back.


Sounds great - I’ll keep an eye out for releases

We've published a new version of llama-cpp-python and of our docker image: AmpereComputingAI/llama.cpp:2.2.1. Our build is based on a slightly older upstream release of llama-cpp-python (v0.3.2) for compatibility with the AmpereComputingAI/llama.cpp docker image. We're currently working on bringing our optimized llama.cpp docker image up to a newer version, and we'll release a newer llama-cpp-python then.

To install it, just download the .whl file and run python3 -m pip install llama_cpp_python-0.3.2-cp312-cp312-linux_aarch64.whl inside the docker container.
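Since the wheel above is tagged cp312, it will only install under CPython 3.12. A small sketch for sanity-checking a wheel's tag against the running interpreter before installing (the filename is the one given above; the check is a simplification of full wheel-tag matching):

```python
import sys

def wheel_matches_interpreter(wheel_name: str) -> bool:
    """True if the wheel filename's CPython tag (e.g. cp312) matches
    the interpreter currently running this script."""
    tag = f"cp{sys.version_info.major}{sys.version_info.minor}"
    return f"-{tag}-" in wheel_name

print(wheel_matches_interpreter(
    "llama_cpp_python-0.3.2-cp312-cp312-linux_aarch64.whl"))
```

If this prints False, pip will refuse the wheel with a "not a supported wheel on this platform" error rather than anything about missing symbols.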
