It can be done. If you are in an Ampere chassis, they have GPU hardware that Ampere has qualified. Putting in a regular PCIe card (a GeForce 2080 Ti, say) can also be done, but may be tough because space is very tight. You also need cables to adapt from the Ampere-provided cables to ones that fit a GPU like a 2080 Ti. HTH. -Steve
NVIDIA GPUs I’ve seen people use in their Ampere systems:
L4, L40, L40S, A10, A16, A40, A100, H100 with NVLink
RTX A400, 2000 Ada SFF, 4000 Ada, 4000 Ada SFF, 5000 Ada, 6000 Ada, A4500 with NVLink, 4070, 4080, 4080 Ti, 4090, 3070
GTX 750 Ti, 1050, 1060, 1070
There may be a few others. NVIDIA and Ampere have worked together for many years; NVIDIA has used Ampere processors in some of their own products, including Android cloud gaming and the Arm HPC Dev Kit. NVIDIA has lots of Ampere gear for driver dev/test. And NVIDIA is “all-in” on Arm with their own products, including Jetson and Grace. So arm64 is heavily invested in and well supported by NVIDIA.
Various recent AMD GPUs also work. Those run as-is with AmpereOne and require a PCIe patch with Altra, for example the Radeon RX 7800 XT used by user cryptoclixer.
I know I can use it on the host OS; I need to use it inside a VM with GPU passthrough, and so far I haven’t been able to do that. Drivers installed and modules loaded OK (on Debian trixie/testing). I could even run nvitop, but as soon as I tried running actual software it would just throw multiple PCI errors on both the host kernel and the guest. I tried a 4060.
Hmm, that is puzzling. Automakers use it in VMs, I think via virtio-gpu, without issue. Maybe start a topic and share details so we can sort it out with you. @bexcran
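If you do start a topic, one detail worth including is whether the 4060 sits in its own IOMMU group, since PCI errors on both host and guest often point at passthrough setup rather than the driver. A minimal sketch, assuming the standard sysfs layout on the host, to list the devices in each IOMMU group:

```python
#!/usr/bin/env python3
"""List PCI devices per IOMMU group (run on the host)."""
from pathlib import Path

groups = Path("/sys/kernel/iommu_groups")
if not groups.is_dir():
    raise SystemExit("No IOMMU groups found - is the IOMMU enabled in firmware/kernel?")

for group in sorted(groups.iterdir(), key=lambda p: int(p.name)):
    # Each entry under devices/ is a symlink named after the PCI address.
    devices = sorted(d.name for d in (group / "devices").iterdir())
    print(f"IOMMU group {group.name}: {', '.join(devices)}")
```

If the GPU shares a group with other devices, those generally need to be passed through (or bound to vfio-pci) together, which is a common source of exactly this kind of error.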
Just tried to run GPU passthrough on my VM; everything is OK.
Tried to run an Ollama model inside the VM, and there is no problem.
Host: ALTRAD8UD + Altra Max
GPU: NVIDIA RTX 4080 16 GB
Host OS: Ubuntu 22.04 + HWE kernel
Guest OS: Ubuntu 22.04 default kernel
If you see the NVIDIA device in the VM but cannot install the NVIDIA driver, just disable the Secure Boot option in UEFI.
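To double-check the Secure Boot state from inside the guest before touching UEFI settings, one option is to read the SecureBoot EFI variable directly. A minimal sketch, assuming a UEFI boot with efivarfs mounted at the usual path:

```python
#!/usr/bin/env python3
"""Report Secure Boot state by reading the SecureBoot EFI variable (efivarfs)."""
from pathlib import Path

# The GUID is the standard EFI global variable namespace.
var = Path("/sys/firmware/efi/efivars/SecureBoot-8be4df61-93ca-11d2-aa0d-00e098032b8c")

if not var.exists():
    print("SecureBoot variable not found - probably not booted with UEFI Secure Boot support")
else:
    data = var.read_bytes()
    # The first 4 bytes are variable attributes; the last byte is the value (1 = enabled).
    print("Secure Boot is", "enabled" if data[-1] == 1 else "disabled")
```

mokutil --sb-state reports the same thing if it happens to be installed.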
Tested with an RTX A4000; it works well on Ubuntu 22.04 and 24.04, both server and desktop. For the 24.04 desktop it’s a bit tricky, but doable to get it working from the Ubuntu ISO.
Don’t know if any of you have noticed, but there are NVIDIA kernels, like linux-image-6.5.0-31-nvidia, 6.8.0-1009-nvidia and -64k, as well as 6.11.0-1003-nvidia and -64k.
There are more, but I have tested only the three mentioned above.
Now on 6.11.0-1003-nvidia-64k with CUDA toolkit 12.8; so far it works well, though I haven’t done any more in-depth testing. Ollama with DeepSeek-R1 32B works, with a bit of slow “thinking”, but that’s the issue with 16 GB of VRAM. Waiting for the RTX 6000 Blackwell with, as they say, 96 GB of VRAM :).
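For a quick sanity check that the driver actually sees the card under the -64k kernel, here is a small sketch that talks to the CUDA driver API through ctypes; it only assumes libcuda.so.1 is on the loader path and needs no extra Python packages:

```python
#!/usr/bin/env python3
"""Minimal CUDA driver-API sanity check via ctypes (no extra packages needed)."""
import ctypes

cuda = ctypes.CDLL("libcuda.so.1")

def check(result, call):
    # Any non-zero CUresult means the call failed.
    if result != 0:
        raise SystemExit(f"{call} failed with CUresult {result}")

check(cuda.cuInit(0), "cuInit")

count = ctypes.c_int()
check(cuda.cuDeviceGetCount(ctypes.byref(count)), "cuDeviceGetCount")
print(f"CUDA devices visible: {count.value}")

for dev in range(count.value):
    name = ctypes.create_string_buffer(256)
    check(cuda.cuDeviceGetName(name, len(name), dev), "cuDeviceGetName")
    mem = ctypes.c_size_t()
    check(cuda.cuDeviceTotalMem_v2(ctypes.byref(mem), dev), "cuDeviceTotalMem")
    print(f"  device {dev}: {name.value.decode()} with {mem.value / 2**30:.1f} GiB VRAM")
```

If the device count comes back as zero here, the kernel module or driver install is the place to look before blaming Ollama.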