It seems to be looking for a symbol in libllama.so which is not there, so is the version of llama.cpp in the docker image a bit older? I tried with llama-cpp-python 0.3.6 just in case, but no joy.
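For reference, this is roughly how to check which libllama.so the wheel actually loads and what it exports. The paths and the symbol name below are stand-ins; use whatever your traceback names:

```bash
# Where is the llama_cpp package (and its bundled libllama.so) installed?
python3 -c "import llama_cpp, os; print(os.path.dirname(llama_cpp.__file__))"

# Does the bundled library export the symbol from the error?
# (llama_token_get_text is a placeholder; grep for the symbol your error mentions.)
nm -D /usr/local/lib/python3.10/dist-packages/llama_cpp/lib/libllama.so \
  | grep llama_token_get_text
```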
Update:
So installing llama-cpp-python involves compiling a new version of libllama.so (the one at /usr/local/lib/python3.10/dist-packages/llama_cpp/lib/libllama.so), while the Ampere-optimised llama.cpp in the docker image (in /llm) is then presumably ignored. I'm not sure whether compiling inside the docker environment picks up the Ampere optimisations. Also, given that llama-cpp-python points at the libllama.so created during its own installation, I'm guessing the issue is primarily with llama-cpp-python? I'm not sure whether I should be passing some parameters to CMake before the pip install to get a correct build in this docker image (see the sketch below).
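In case it helps, this is the kind of thing I mean, using llama-cpp-python's CMAKE_ARGS mechanism for passing flags to the build. The specific flag is just a guess; I don't know which options the Ampere-optimised libllama.so was actually configured with:

```bash
# Force a source build of llama-cpp-python and forward flags to CMake.
# GGML_NATIVE=ON is illustrative only, not a known-correct setting here.
CMAKE_ARGS="-DGGML_NATIVE=ON" \
  python3 -m pip install --no-cache-dir --force-reinstall --no-binary :all: \
  llama-cpp-python==0.3.6

# Some versions of the bindings also honour an environment variable
# (LLAMA_CPP_LIB / LLAMA_CPP_LIB_PATH, depending on version) that points
# them at an existing libllama.so instead of the one they built -- unverified.
```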
Hi @pjmnoble! This is a question that maybe @kkrysa can help answer. I am not sure whether pip installing a module on top of the Ampere base image would cause problems.
Popping this back to the top of the stack - anyone familiar with the llama.cpp image from AmpereComputing? Tagging @trigoni for awareness - Tony, do you know who might be able to help @pjmnoble?
Already heard back: “[The Ampere AI engineering team is] already working on updating our fork of llama-cpp-python”. Also, sorry for such a late reply; I was away at a conference and have been sick since coming back.
We’ve published a new version of llama-cpp-python and of our docker image: AmpereComputingAI/llama.cpp:2.2.1. Our fork of llama-cpp-python is based on a slightly older upstream release (v0.3.2) for compatibility with the AmpereComputingAI/llama.cpp docker image. We’re currently working on bringing our optimized llama.cpp docker image up to a newer llama.cpp version, and we’ll release a newer llama-cpp-python then.
To install it, just download the .whl file and run `python3 -m pip install llama_cpp_python-0.3.2-cp312-cp312-linux_aarch64.whl` inside the docker container. For example:
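End to end it looks something like this. The `docker run` flags are only an example (mounting the current directory is illustrative, and docker expects the image name lowercased):

```bash
# Start a shell in the Ampere llama.cpp image (example flags only).
docker run -it --rm -v "$PWD:/work" -w /work \
  amperecomputingai/llama.cpp:2.2.1 bash

# Inside the container: install the downloaded wheel and sanity-check it.
python3 -m pip install llama_cpp_python-0.3.2-cp312-cp312-linux_aarch64.whl
python3 -c "import llama_cpp; print(llama_cpp.__version__)"
```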