I have a number of other cards I’d like to test out, but I was wondering if anyone else has tested any and found them to work (either Nvidia or AMD).
I just plugged in my RTX 8000 and tried the latest Nvidia Professional driver for aarch64, version 525.105.17, but when it tries initializing the GPU, dmesg outputs:
[ 477.995897] NVRM: GPU 000d:01:00.0: RmInitAdapter failed! (0x24:0x65:1423)
[ 477.995979] NVRM: GPU 000d:01:00.0: rm_init_adapter failed, device minor number 0
I am able to get video output over DisplayPort, and Ubuntu 22.04 sees the card just fine, but it doesn’t seem to be able to use any of the card’s features, and nvidia-smi returns "No devices were found".
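For anyone who wants to compare failures across cards, here is a rough Python sketch that pulls the PCI address and the status triple out of NVRM lines like the two above. The regular expression is only inferred from those two log lines, so treat the pattern (and the helper name `find_init_failures`) as assumptions, not as a documented log format.

```python
import re

# Pattern inferred from the dmesg lines quoted above (an assumption, not an
# official NVRM log format): PCI address, then the code in parentheses.
NVRM_FAIL = re.compile(
    r"NVRM: GPU (?P<pci>[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]): "
    r"RmInitAdapter failed! \((?P<code>[^)]+)\)"
)


def find_init_failures(log_text):
    """Return (pci_address, code) pairs for each RmInitAdapter failure line."""
    return [(m.group("pci"), m.group("code")) for m in NVRM_FAIL.finditer(log_text)]


# The two lines from the post, used here as sample input.
SAMPLE = (
    "[ 477.995897] NVRM: GPU 000d:01:00.0: RmInitAdapter failed! (0x24:0x65:1423)\n"
    "[ 477.995979] NVRM: GPU 000d:01:00.0: rm_init_adapter failed, device minor number 0\n"
)

for pci, code in find_init_failures(SAMPLE):
    print(pci, code)
```

In practice you would feed it the real kernel log text (for example, the saved output of dmesg) instead of the sample string.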
I know cloud providers are pairing up Altra Max with cards like the A100, so there must be known-good combinations. Do I have to drop back to Ubuntu 20.04 to make things work?
From what I can tell, Ampere again has some of the same memory-mapping/caching issues (related to write combining, etc.) that you’re already very familiar with.
The good news is, there are some hacks/workarounds:
Those won’t be good for performance at all, but they could (should?) work.
What kernel version are you using? You might even want to just compile yourself a linux/master kernel.
There was some work on AMD GPU support on Arm64 in the recent 6.2 kernel release, so you might even give your AMD GPU a shot with a very recent kernel.
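If you want a quick way to check whether the running kernel is at least 6.2, something like the following Python sketch should do. The helper name `kernel_at_least` is made up for illustration, and it only compares the leading major.minor numbers of the release string; it says nothing about whether a distro has backported the relevant patches.

```python
import platform
import re


def kernel_at_least(release, major, minor):
    """Compare the leading 'major.minor' of a release string against a minimum.

    Returns False when the string does not start with 'N.N', since the
    version cannot be confirmed in that case.
    """
    m = re.match(r"(\d+)\.(\d+)", release)
    if m is None:
        return False
    return (int(m.group(1)), int(m.group(2))) >= (major, minor)


# platform.release() reports the same string as `uname -r` on Linux.
running = platform.release()
print(running, "is 6.2 or newer:", kernel_at_least(running, 6, 2))
```

Comparing the numbers as an integer tuple avoids the string-comparison trap where "6.10" would sort before "6.2".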
Unfortunately, I don’t have an Ampere workstation yet, only servers in the datacenter, so I can’t join in the fun yet.