I have a number of other cards I’d like to test out, but I was wondering if anyone else has tested any and found them to work (either Nvidia or AMD).
I just plugged in my RTX 8000 and tried the latest Nvidia Professional driver for aarch64, version 525.105.17, but when the driver tries to initialize the GPU, dmesg outputs:
[ 477.995897] NVRM: GPU 000d:01:00.0: RmInitAdapter failed! (0x24:0x65:1423)
[ 477.995979] NVRM: GPU 000d:01:00.0: rm_init_adapter failed, device minor number 0
I am able to get video output over DisplayPort, and Ubuntu 22.04 sees the card just fine, but it doesn’t seem to be able to use any of the card’s features, and nvidia-smi returns “No devices were found”.
I know cloud providers are pairing up Altra Max with cards like the A100, so there must be known-good combinations. Do I have to drop back to Ubuntu 20.04 to make things work?
From what I can tell, Ampere has some of the same memory-mapping/caching issues (related to write combining, etc.) again, which you’re already very familiar with.
The good news is, there are some hacks/workarounds:
Those won’t be good for performance at all, but they could (should?) work.
What kernel version are you using? You might even want to compile a linux/master kernel yourself.
There was some work on AMD GPU support on Arm64 in the recent 6.2 kernel release, so you might even give your AMD GPU a shot with a very recent kernel.
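If you do go the linux/master route, a minimal build sketch (assuming an Ubuntu-style /boot layout; adjust the config steps for your distro) looks like:

git clone --depth=1 https://github.com/torvalds/linux.git
cd linux
cp "/boot/config-$(uname -r)" .config   # start from the running kernel's config
make olddefconfig                       # accept defaults for any new options
make -j"$(nproc)" Image modules
sudo make modules_install install       # then update your bootloader entries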
Unfortunately, I don’t have a workstation Ampere yet, only servers in the datacenter, so I can’t join in the fun yet.
Nice! So the ones I now know of being used with Ampere are: NVIDIA T600, RTX A4500, A6000, 4060, 4070, 4080, 4000 Ada SFF, 3070 Ti, 3060, GTX 1050, A100, A100X, A10, A10G, A16, and A30. I think also the L4 and L40.
I just picked up an ASRock Rack ALTRAD8UD-1L2T and plugged an RTX 4090 into it, with no luck yet. It goes into a bootloop prior to loading the OS. ASRock Rack Tech Support thinks it might be some BIOS settings for PCIe. I’ll keep trying when I get some downtime.
The graphics cards on the supported list are all Nvidia.
Does this imply Nvidia’s closed-source drivers are required?
Or would the open-source “nouveau” driver work?
I’m really having a hard time getting any GPU to work properly on my ASRock Rack board using Fedora 40.
@bexcran, @Civiloid, @shadethegrey, you have had more success. Any ideas?
What really stumps me is the “no signal” part. I have tried three different cables: mini-DP to mini-DP, mini-DP to DP, and mini-DP to HDMI.
I was really reluctant to get an NVidia T1000, because I want to use manufacturer-supported open-source drivers, but even that did not work.
I am using AlmaLinux 9.4 with an RTX 3060. I started with Fedora 40, but the newer gcc and kernel 6.9 led to compile errors for the nvidia 550 akmod from rpmfusion. I switched to AlmaLinux, where the nvidia 550 drivers compile successfully (older kernel 5.14, gcc 11). I started with the nvidia driver installer from their website, but I have since switched, not to rpmfusion, but to the nvidia driver repos from https://negativo17.org/nvidia-driver/.
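For anyone following along, enabling that repo looks roughly like this; the repo file name and package names here are my best guess, so double-check them against the negativo17.org page:

sudo dnf config-manager --add-repo=https://negativo17.org/repos/fedora-nvidia.repo
sudo dnf install nvidia-driver nvidia-driver-cuda   # kernel modules are built at install time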
I haven’t circled back to Fedora 40 yet, but I would suggest trying the negativo17 repo for Fedora 40. They use akmod for Fedora 40 just like rpmfusion, but their Fedora 40 repo carries the 555 version of the nvidia driver, which I assume compiles successfully with the newer kernel and gcc 14 in Fedora 40. Their packages also enable two important nvidia_drm module options (via a file in /etc/modprobe.d) that get you a text console on your nvidia card, namely modeset=1 and fbdev=1. Without those two options, X will use your nvidia card but you won’t have a text tty console on it, and wayland compositors likely won’t work either.
I should mention that the negativo17 nvidia repo packages do not add the nvidia modules to your initramfs, so the nvidia display won’t initialize (and start showing console messages) until after the root fs is mounted; i.e., you won’t see console output quite as early as when the modules are loaded from the initramfs. You could add the modules to the initramfs yourself if desired.
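If you do want them in the initramfs, something like this should do it on a dracut-based distro (the exact module list is my assumption of what’s needed):

echo 'force_drivers+=" nvidia nvidia_modeset nvidia_uvm nvidia_drm "' | sudo tee /etc/dracut.conf.d/nvidia.conf
sudo dracut --force   # rebuild the initramfs for the running kernel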
An alternative option would be to download the nvidia installer for version 555. In that case, you will need to enable the two nvidia_drm module options mentioned above manually if you want to see any text console, as the nvidia installer doesn’t enable them (at least it didn’t when I used it to install version 550).
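For reference, enabling them by hand is just a one-line modprobe.d file (the file name is arbitrary):

echo 'options nvidia-drm modeset=1 fbdev=1' | sudo tee /etc/modprobe.d/nvidia-drm.conf

If the modules ever end up in your initramfs, rebuild it afterwards so the options get picked up there too.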
Then there’s the whole arm64 option ROM issue. My RX 6800 came out of the box with an arm64 option ROM, but I didn’t want to maintain the PCIe erratum kernel patch needed to make the amdgpu driver work, so I switched to an nvidia card. It’s disappointing to hear that even with the patch you weren’t having much success with your AMD GPUs. Of course, nvidia cards have no arm64 option ROM, so there can never be a UEFI GOP to display boot menus, etc. Perhaps they will release one at some point…
I was on Fedora 39, but I used a mixture of the Nvidia driver directly from nvidia and the nvidia-xconfig command from the Ampere script for Ubuntu after I installed the nvidia driver (sudo nvidia-xconfig -a --cool-bits=31 --allow-empty-initial-configuration). If the nvidia driver install does not complete cleanly, it won’t work. I think I also had to set something in grub at boot or I wouldn’t get any display at all, but I cannot remember what; I think nomodeset, but I’m not sure. I haven’t used Fedora in a while; I switched to Ubuntu for the 32-bit libraries for gaming.
The specific problem with Fedora 40 is compile errors with nvidia driver 550, regardless of how it is packaged. And because there is no EFI GOP framebuffer, you need nvidia_drm.modeset=1 and nvidia_drm.fbdev=1 to get a framebuffer for the linux console.
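If you’d rather set them on the kernel command line than in modprobe.d, grubby can do it on Fedora-family distros (a sketch, assuming BLS boot entries):

sudo grubby --update-kernel=ALL --args="nvidia_drm.modeset=1 nvidia_drm.fbdev=1"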
Thanks for pointing out Nvidia driver, CUDA tools and libraries – negativo17.org. Indeed, it installs and builds fine. I might have achieved the same result by switching to rpmfusion rawhide. I can’t try the effect right now: I am accessing my server remotely, and there is no GPU card in it at the moment.
The modeset instructions might help, perhaps for nouveau too. The reason the AMD cards gave me a picture at all may be that there is a GOP for them. For Intel and NVidia there is none, which makes my monitor go into deep sleep immediately, or switch to another active port. That might cause the driver not to detect the monitor.
The initial framebuffer the linux kernel uses for the console with these cards is typically the EFI GOP framebuffer; the linux device driver for it is efifb. With cards that have no arm64 option ROM, there is no UEFI GOP driver to create that framebuffer, so nothing gets passed to the kernel for the efifb driver to use. By setting fbdev=1 on the nvidia_drm module, it will create a framebuffer that the linux console can use. I’m not sure what equivalent, if any, there is for the Intel GPUs.
Of course, the cards will output video if the X server starts and is configured to use the card, even if there is no linux console framebuffer, since the X server creates its own. But when the X server quits, the monitor will go blank again. So in your nice table above, I wonder if some of the “driver loads but monitor is blank” cases are because there is no linux console framebuffer and the X server has not been started on that card.
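An easy way to tell which case you’re in, before any X server starts, is to check whether the kernel has a console framebuffer at all:

cat /proc/fb          # empty output means no console framebuffer
ls -l /dev/fb*        # framebuffer device nodes, if any exist
sudo dmesg | grep -iE 'efifb|fbcon|nvidia'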
Finally got a picture from the T1000 on my 4K monitor, using the negativo17 repo with the nvidia-open kernel modules. There’s an rpmfusion nvidia beta (555) repo in Copr that I tried first, but it does not support aarch64.
But I did not get a picture immediately: I switched from the default Wayland to Xorg, restarted gdm, then switched back to Wayland, restarted gdm again, and finally got a login screen.
I have yet to see whether this is going to be a standard routine after each boot.
It surprises me that nothing in dmesg or journalctl seems to log the detection of displays.
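The connector state is exposed in sysfs even when nothing is logged; something like this dumps what each DRM connector currently reports (connected/disconnected):

for c in /sys/class/drm/card*-*; do
  printf '%s: %s\n' "${c##*/}" "$(cat "$c/status")"
done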
The emulator driver loads from the UEFI shell on the ASRock board with the latest 2.06 firmware, and the device it creates shows up as described in the documentation. However, X64 binaries (e.g. the X64 build of the EDK2 UEFI shell) fail with an “unsupported” error when I attempt to execute them. The same emulator driver and X64 binaries work as expected in a QEMU/KVM aarch64 virtual machine. I haven’t delved into what AMI might have done to prevent this from working… I assume the AMI UEFI firmware doesn’t support the EFI protocol the emulator driver produces, EDKII_PECOFF_IMAGE_EMULATOR_PROTOCOL, but I haven’t done the work to prove that yet.
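For anyone who wants to reproduce this, the test sequence from the UEFI shell is roughly the following; the file names are placeholders for whatever your emulator and X64 builds are actually called:

FS0:
load EmulatorDxe.efi    # load the PE/COFF emulator driver
ShellX64.efi            # attempting to run an X64 binary then fails with "unsupported" here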