Nvidia GPU troubleshooting help

I am trying to troubleshoot problems running Ubuntu with the NVIDIA drivers on my ASRock board. I am getting a strange error and I am not sure how to dig into it. It doesn’t matter whether I install the driver with the NVIDIA installer or the Ubuntu installer, which driver version I use, or which Ubuntu version I run: I tried 22.04, 23.10, and the 24.04 beta, and they all do the same thing.

The system appears to hang when I reboot after installing the driver. The console output shows rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:

then a lot of output about RCU cputime, jiffies, etc. I could use some help with what to look at next, and whether there are logs for this. I have never seen it before. The system is unresponsive and just keeps showing the rcu_preempt error, reporting it for different CPUs.
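On the question of logs: if the journal is persistent, `journalctl -k -b -1` should print the kernel messages from the previous (stalled) boot once you are back in a working system. Below is a minimal Python sketch for picking the RCU lines out of that output. The function names are made up for illustration, and it assumes only that stall reports contain the same "rcu_preempt detected stalls" text shown on the console.

```python
import subprocess

# Text taken from the console message quoted above.
STALL_MARKER = "rcu_preempt detected stalls"


def rcu_lines(log_text):
    """Return every log line that mentions RCU (case-insensitive)."""
    return [line for line in log_text.splitlines() if "rcu" in line.lower()]


def count_stall_reports(log_text):
    """Count how many separate stall reports appear in the log."""
    return sum(STALL_MARKER in line for line in log_text.splitlines())


def previous_boot_kernel_log():
    """Read kernel messages from the previous boot via journalctl.

    Assumes systemd-journald keeps a persistent journal; otherwise the
    output is empty and journalctl reports that the boot is unavailable.
    """
    result = subprocess.run(
        ["journalctl", "-k", "-b", "-1", "--no-pager"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout
```

To use it, call `rcu_lines(previous_boot_kernel_log())` and print the result. The lines just before the first stall report are usually the interesting ones, since they show what the kernel was doing when it stopped making progress.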

Were you able to find a solution to this?

I have not yet. I know ASRock doesn’t list Ubuntu as supported, but I’m trying to see if I can get it to work. I can install Ubuntu Server fine, but as soon as I load any kind of NVIDIA driver, it stalls out like that. I’ve tried 22.04 HWE, 23.10, and the 24.04 beta (desktop and server). They all stall out the same way, even with different NVIDIA driver versions. There aren’t many settings in the BIOS to check either; I’ve tried it with Resizable BAR on and off, and that’s about all you get for options. I figured it was worth asking if anyone knew how to dig into this deeper, but it may not be worth the time.

Just use Ubuntu 22.04 with the regular 5.15 kernel; don’t use the HWE kernel.

The system I have will not even boot with the 22.04 regular kernel.

It will work. Did you use KVM or a serial console?
I suppose the serial console will always work.

Could you provide more information about why you cannot boot with the Ubuntu 22.04 regular kernel?

Ok, after battling with it for quite a while, I’ve now got 22.04 booted with the NVIDIA drivers loaded. Apparently the Ubuntu installer keeps creating two UEFI boot entries. Booting the first one causes problems; the second one seems to work. I’m still on the HWE kernel, but it’s working! Thanks Richard for the reinforcement to get it to work! So now I’m dual booting Fedora and Ubuntu.
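For anyone who wants to check for the same duplicate boot entries: running `efibootmgr` lists the firmware boot entries. Here is a rough Python sketch that parses that output and flags labels that appear more than once. It assumes the usual `BootNNNN* label` line format, and the function names are just for illustration. I don’t know whether the two entries on my system share a label, so treat this as a starting point only.

```python
import re
from collections import defaultdict

# Matches entry lines such as "Boot0000* ubuntu"; header lines like
# "BootCurrent:" and "BootOrder:" are skipped because they lack the
# four hex digits right after "Boot".
ENTRY_RE = re.compile(r"^Boot([0-9A-Fa-f]{4})(\*?)\s+(.*)$")


def parse_boot_entries(efibootmgr_output):
    """Return a list of (number, active, label) tuples."""
    entries = []
    for line in efibootmgr_output.splitlines():
        match = ENTRY_RE.match(line.strip())
        if match is None:
            continue
        number, star, rest = match.groups()
        # Some efibootmgr versions append the device path after a tab.
        label = rest.split("\t", 1)[0].strip()
        entries.append((number, star == "*", label))
    return entries


def duplicate_labels(entries):
    """Map each label that occurs more than once to its entry numbers."""
    by_label = defaultdict(list)
    for number, _active, label in entries:
        by_label[label].append(number)
    return {label: nums for label, nums in by_label.items() if len(nums) > 1}
```

Feed it the text printed by `efibootmgr` (run as root). Any label returned by `duplicate_labels` is worth comparing entry by entry, for example with `efibootmgr -v`, to see which one points at the loader that actually boots.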

I have seen some people say that the HWE kernel can work with the latest NVIDIA driver, but I cannot find anything about it in the driver README.
That combination also doesn’t work for me, so I just stay on the 5.15 kernel to run my applications, and it works well.
