AMD GPUs on the Altra devkit and other Altras - patches available now

Hi,

I’ve made it possible to use AMD GPUs on the Altras without graphical glitches.
As it turns out, the Altra has a buggy PCIe controller, which means you can
technically modeset with the GPUs and launch a desktop, but you will get
garbage in the output, like this:

I’ve come across some patches in Tencent’s kernel tree, which is a fork of
an old LTS. The patches do not apply on modern kernels:

I ended up rebasing them, however:

For 6.1: main/linux-lts: update to 6.1.30 and fix GPUs on ampere altra (chimera-linux/cports@da73b68)
For 6.3: main/linux-stable: update to 6.3.4 and fix GPUs on ampere altra (chimera-linux/cports@38692a6)

Now you get this:

You just need to apply them to the kernel. The kernels in my distribution
will come patched out of the box; people on other distributions will need
to apply the patches and rebuild their kernel. Mesa does not need to be rebuilt,
it works as is.

Tested GPUs are the Radeon Pro WX 2100 and the Radeon RX 5500 XT. The WX 2100 works out of the box on any kernel version, as long as it is patched. The RX 5500 XT needs kernel 6.2 or newer, just like everything newer than AMD Polaris, because before 6.2 the DC component was not ready on AArch64 due to trouble with hardware floating point in the kernel. It also needs the kernel command line parameters pcie_aspm=off amdgpu.aspm=0 amdgpu.runpm=0.
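
A quick way to confirm that the three parameters above actually made it onto the kernel command line is to look at /proc/cmdline. The following is only a minimal sketch: the function names are made up for illustration, and it checks for the exact strings quoted above without validating anything else about the boot configuration.

```python
from pathlib import Path

# Parameters quoted in the post for the Radeon RX 5500 XT.
REQUIRED_PARAMS = ("pcie_aspm=off", "amdgpu.aspm=0", "amdgpu.runpm=0")


def missing_params(cmdline, required=REQUIRED_PARAMS):
    """Return the required parameters that are absent from a command line string."""
    present = set(cmdline.split())
    return [param for param in required if param not in present]


def missing_on_running_kernel(path="/proc/cmdline"):
    """Check the command line the running kernel was booted with."""
    return missing_params(Path(path).read_text())
```

Calling `missing_on_running_kernel()` on the booted system returns an empty list when all three parameters are present.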

Note that these patches are a workaround and come with a performance penalty,
especially when unaligned access is involved. However, things run quite
satisfactorily as far as I can tell.

@q66 thank you for the post, that is awesome…

Thanks a lot for sharing @q66 and an even bigger thanks for the solution itself!

I really hope the forum gets more people writing about their personal projects or anything AArch64-related done as a part of their regular work.

Even outside our group, there’s a lot of great stuff going on that I’m sure many end up reading about and not everyone here is necessarily aware of.

@q66 Very cool, great job - thanks for sharing!
Question - does the AMD GPU work during POST and EFI boot?
I was informed that it only works once you load the OS, due to some option ROM thing missing from AMD cards or something.

no, it only works once the kernel loads

AFAIK, EDK2 has the option of an emulator to run x86 oproms on other architectures (i had that working in an older UEFI build for solidrun honeycomb lx2) but it does not seem to be enabled here

AMD has an AArch64 option ROM for Radeon cards. You can either flash it to the card or integrate it into the firmware.

I have a WX 2100 which gives graphics output when used in a Solidrun Honeycomb. The card was reflashed.

https://www.workofard.com/2020/12/aarch64-option-roms-for-amd-gpus/

that’s good to know, unfortunately these instructions no longer work. For one, the ati-branch flashrom will not detect any devices. This can be patched around, it’s just broken usage of the pciutils API where the necessary info in pci_dev is not filled in because of a missing pci_fill_info call. After patching that, it will try to access the wrong device because the domain is hardcoded to 0. After patching around that too, flashrom_pci_mmio_map fails with “invalid argument” when trying to map resource5 (on this GPU at least).
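
To illustrate why a hardcoded domain of 0 picks the wrong device on a machine that exposes more than one PCI domain, here is a small sketch that splits a sysfs-style PCI address into its four parts. It is not flashrom or pciutils code; the function name and the example addresses are made up for illustration.

```python
import re

# [domain:]bus:device.function, all hexadecimal, as shown by `lspci -D` and sysfs.
_PCI_ADDRESS = re.compile(
    r"^(?:([0-9a-fA-F]{4,}):)?([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])$"
)


def parse_pci_address(address):
    """Return (domain, bus, device, function) as integers.

    The domain falls back to 0 only when the address does not carry one.
    """
    match = _PCI_ADDRESS.match(address)
    if match is None:
        raise ValueError(f"not a PCI address: {address!r}")
    domain, bus, device, function = match.groups()
    return (int(domain or "0", 16), int(bus, 16), int(device, 16), int(function, 16))
```

Two addresses with the same bus, device and function but different domains, such as the hypothetical `0000:01:00.0` and `000d:01:00.0`, refer to different devices, so a tool has to carry the domain through instead of assuming 0.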

This question comes up in @geerlingguy’s video at 13 min :)

This thread shall live on in infamy!

Agree completely! Please post things that you find interesting, odds are others will too. We have all written blog posts that got no comments or anything, so when we learn that someone read it, liked it and then posted it somewhere, it really does make your day. :)

Why are only AMD cards affected by the “buggy PCIe controller”?
Or would Nvidia cards using nouveau require the same patches?

I am using Nouveau without any patches. I have a few different Radeon cards; the Radeon RX 6500 XT shipped with an AArch64 option ROM as well as an x86_64 one. I assume that the issue is probably a bad interaction between whatever switch AMD uses and Ampere’s PCIe implementation. Whatever Nvidia uses is seemingly not as bad, though I have some major regressions with 6.7 and 6.8: the video stack blew up on 6.7, while 6.8 is better and mostly works.

Kernel 6.8 introduces Intel’s Xe driver, which is architecture independent.
It would be interesting to see an Intel Arc card working on aarch64.
See https://www.phoronix.com/review/intel-arc-early-2024

@pimzand I’m currently trying to get an Intel Arc A310 working on aarch64 (Rockchip RK3588) but not having any luck yet, as the GPU seems to shut down during or after the boot process. Under Linux 6.8 it’s as if there is no card plugged in (compared to Nvidia GPUs, which I got much further with).

You did compile your own kernel, right?
You need the Xe driver that is not being built by default in kernel 6.8, even though the code is present.
On x86_64, you would need to blacklist the old i915 driver, but in aarch64 there is no i915 driver.
You would probably still need to whitelist the Xe driver for the pci id of your card.
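
For the whitelisting step, my understanding is that the Xe driver takes a `force_probe` module parameter with a comma-separated list of PCI device IDs, the same way i915 does; treat that as an assumption and check it against the kernel you built. The sketch below reads the IDs of Intel display-class devices from sysfs and formats them as a kernel command line argument. The function names are made up for illustration.

```python
from pathlib import Path

INTEL_VENDOR = "0x8086"


def intel_display_ids(sysfs_root="/sys/bus/pci/devices"):
    """Return PCI device IDs (hex, without the 0x prefix) of Intel display devices."""
    ids = []
    for dev in sorted(Path(sysfs_root).glob("*")):
        try:
            vendor = (dev / "vendor").read_text().strip()
            device = (dev / "device").read_text().strip()
            pci_class = (dev / "class").read_text().strip()
        except OSError:
            continue  # not a device directory, or attributes unreadable
        # PCI base class 0x03 is "display controller".
        if vendor == INTEL_VENDOR and pci_class.startswith("0x03"):
            ids.append(device[2:])
    return ids


def force_probe_arg(device_ids):
    """Format the IDs as an xe.force_probe= argument (assumed parameter name)."""
    return "xe.force_probe=" + ",".join(device_ids)
```

`print(force_probe_arg(intel_display_ids()))` gives a string to append to the kernel command line. If the list comes back empty, the card is not showing up on the PCI bus at all, which is a different problem from the driver refusing to bind.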

I did. I need to look into the whitelisting bit. Not sure if u-boot also plays a role

Edit: Progress here: Intel Arc on aarch64: Sparkle A310 (Issue #3, HeyMeco/Rockchip-pcie-devices)

The NVIDIA driver has included the fix for years

You mean the closed source NVIDIA driver includes the fixes for Ampere Altra erratum #82288?
If that is the case, then potentially installing NVIDIA closed source drivers might fix issues with AMD or Intel cards.

@meco not sure if that would help, but I managed to get a picture out of my Intel Arc 750 on 6.9@master. One change was required:

--- a/drivers/gpu/drm/i915/display/intel_vga.c
+++ b/drivers/gpu/drm/i915/display/intel_vga.c
@@ -80,6 +80,7 @@ void intel_vga_redisable(struct drm_i915_private *i915)

 void intel_vga_reset_io_mem(struct drm_i915_private *i915)
 {
+       return;
        struct pci_dev *pdev = to_pci_dev(i915->drm.dev);

        /*

That is an ugly hack though. I’ve described the journey to it in a thread about the ASRock Rack Altra board.

Can you link to the thread? Awesome that you got it this far though! Before you had any output, was it recognized in lspci?