Intel Arc on Ampere Altra (unstable but somewhat working)

So, I’ve compiled a kernel from the drm/xe kernel driver tree on GitLab, with Altra’s PCIe errata patch applied and with my patch to stop it from oopsing. With the mesa and libdrm patches I’ve mentioned, I can actually run doom3 there (though it sometimes locks up, and I don’t know whether that is because of the Xe driver state or not).

It is slow (probably because I still have debug drm flags enabled), but it works.

When I tried to run Quake 2 RTX, I got a kernel panic:

[  524.968400] SError Interrupt on CPU38, code 0x00000000be000411 -- SError
[  524.968409] CPU: 38 PID: 5610 Comm: kworker/u295:4 Tainted: G     U             6.9.0-rc6+ #2
[  524.968412] Hardware name:  ALTRAD8UD-1L2T/ALTRAD8UD-1L2T, BIOS 2.05 04/12/2024
[  524.968413] Workqueue: events_unbound xe_devcoredump_deferred_snap_work [xe]
[  524.968461] pstate: 20400009 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  524.968463] pc : __memcpy_fromio+0x54/0x98
[  524.968469] lr : xe_vm_snapshot_capture_delayed+0x25c/0x310 [xe]
[  524.968508] sp : ffff80008d2ebd00
[  524.968509] x29: ffff80008d2ebd20 x28: ffff0800d2ab61b0 x27: ffff07ffebe28000
[  524.968512] x26: 0000000000120000 x25: ffff07ffd2400000 x24: ffff0800c3b32c00
[  524.968514] x23: 0000000000000000 x22: 0000000000100cc0 x21: 0000000000000009
[  524.968516] x20: ffff0800d2ab6000 x19: 0000000000000009 x18: ffffffffffffffff
[  524.968518] x17: 4d45545359534255 x16: ffffd09354044d18 x15: 2f706d756465726f
[  524.968520] x14: 0000000000000000 x13: 0000000000000030 x12: ffff800080000000
[  524.968523] x11: 0000000000040dc0 x10: dead000000000040 x9 : ffffd0931cc4f05c
[  524.968525] x8 : 000028000011f000 x7 : 000000000000003f x6 : 0000000000120000
[  524.968527] x5 : 0000000000000000 x4 : ffff80008e000060 x3 : ffff07ffd2520000
[  524.968529] x2 : 0000000000120000 x1 : ffff80008e000000 x0 : ffff07ffd2400060
[  524.968532] Kernel panic - not syncing: Asynchronous SError Interrupt
[  524.968534] CPU: 38 PID: 5610 Comm: kworker/u295:4 Tainted: G     U             6.9.0-rc6+ #2
[  524.968536] Hardware name:  ALTRAD8UD-1L2T/ALTRAD8UD-1L2T, BIOS 2.05 04/12/2024
[  524.968537] Workqueue: events_unbound xe_devcoredump_deferred_snap_work [xe]
[  524.968576] Call trace:
[  524.968577]  dump_backtrace+0x9c/0x128
[  524.968580]  show_stack+0x20/0x38
[  524.968582]  dump_stack_lvl+0x34/0x90
[  524.968587]  dump_stack+0x18/0x28
[  524.968589]  panic+0x3b4/0x3f0
[  524.968593]  nmi_panic+0x50/0xa8
[  524.968596]  arm64_serror_panic+0x78/0x90
[  524.968598]  do_serror+0x30/0x78
[  524.968600]  el1h_64_error_handler+0x30/0x48
[  524.968602]  el1h_64_error+0x64/0x68
[  524.968604]  __memcpy_fromio+0x54/0x98
[  524.968606]  xe_devcoredump_deferred_snap_work+0x5c/0x90 [xe]
[  524.968644]  process_one_work+0x18c/0x400
[  524.968648]  worker_thread+0x204/0x420
[  524.968650]  kthread+0xe8/0xf8
[  524.968653]  ret_from_fork+0x10/0x20
[  524.968656] SMP: stopping secondary CPUs
[  524.968673] Kernel Offset: 0x5092d4020000 from 0xffff800080000000
[  524.968675] PHYS_OFFSET: 0xfff1000080000000
[  524.968676] CPU features: 0x0,0000010b,80140528,4241720b
[  524.968677] Memory Limit: none
[  525.304772] ---[ end Kernel panic - not syncing: Asynchronous SError Interrupt ]---

It was showing a picture, but extremely slowly.

Also, so far I see a pattern: on the first run of GDM it locks up when I try to enter my password; then, if I restart gdm (over ssh, for example), it lets me log in and OpenGL seems to work to some extent (doom3 ran idle for 30 minutes without any issues).

At some point I managed to get an OOPS instead of a panic:

[   77.164211] Unable to handle kernel paging request at virtual address 007e78c50c533354
[   77.172131] Mem abort info:
[   77.174912]   ESR = 0x0000000096000004
[   77.178655]   EC = 0x25: DABT (current EL), IL = 32 bits
[   77.183959]   SET = 0, FnV = 0
[   77.187000]   EA = 0, S1PTW = 0
[   77.190132]   FSC = 0x04: level 0 translation fault
[   77.195000] Data abort info:
[   77.197872]   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[   77.203347]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[   77.208389]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[   77.213691] [007e78c50c533354] address between user and kernel address ranges
[   77.220817] Internal error: Oops: 0000000096000004 [#1] SMP
[   77.226378] Modules linked in: snd_seq_dummy snd_hrtimer snd_seq snd_seq_device qrtr snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hda_core binfmt_misc snd_hwdep snd_pcm aes_ce_blk aes_ce_cipher polyval_ce snd_timer acpi_ipmi polyval_generic ghash_ce snd ipmi_ssif gf128mul nls_ascii sha2_ce nls_cp437 ipmi_devintf vfat sha256_arm64 arm_spe_pmu ipmi_msghandler soundcore arm_cmn sbsa_gwdt fat sha1_ce xgene_hwmon arm_dsu_pmu joydev cppc_cpufreq acpi_tad evdev dm_mod configfs loop dax efi_pstore nfnetlink efivarfs ip_tables x_tables autofs4 ext4 crc32c_generic crc16 mbcache jbd2 cdc_ether usbnet mii hid_generic usbhid hid xe drm_gpuvm drm_exec drm_buddy gpu_sched video drm_suballoc_helper drm_ttm_helper ttm cec rc_core drm_display_helper drm_kms_helper xhci_pci ixgbe drm xhci_hcd nvme usbcore nvme_core xfrm_algo mdio_devres igb of_mdio fixed_phy fwnode_mdio t10_pi libphy crc64_rocksoft crc64 crc_t10dif crct10dif_generic i2c_designware_platform crct10dif_ce usb_common i2c_algo_bit mdio crct10dif_common
[   77.226454]  i2c_designware_core
[   77.319778] CPU: 16 PID: 341 Comm: kworker/u259:0 Tainted: G     U             6.9.0-rc6+ #2
[   77.328203] Hardware name:  ALTRAD8UD-1L2T/ALTRAD8UD-1L2T, BIOS 2.05 04/12/2024
[   77.335498] Workqueue: events_unbound xe_devcoredump_deferred_snap_work [xe]
[   77.342578] pstate: 20400009 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   77.349527] pc : __ww_mutex_lock.constprop.0+0xa8/0xae8
[   77.354744] lr : __ww_mutex_lock.constprop.0+0x4c/0xae8
[   77.359957] sp : ffff800082043c40
[   77.363259] x29: ffff800082043c70 x28: ffff07ff81ba9440 x27: ffff07ff8001f098
[   77.370383] x26: ffff07ff8001f000 x25: ffff07ff81ba94c0 x24: ffff07ff80dcf605
[   77.377506] x23: ffff0800cbfec640 x22: 0000000000000002 x21: ffff800082043c48
[   77.384628] x20: 0000000000000000 x19: ffff07ff8e344900 x18: 0000000000000000
[   77.391751] x17: ffff55fb02c14000 x16: ffffb223ebd3a5b8 x15: 0000000000000000
[   77.398874] x14: 0000000000000004 x13: 0000000000000000 x12: 0000000000000000
[   77.405997] x11: 0000000000052cc0 x10: ffff081f700512e8 x9 : ffffb223ebd37b88
[   77.413120] x8 : 0000000000000030 x7 : 0000000000002c38 x6 : 0000000000001000
[   77.420243] x5 : 00000000ffffffff x4 : 0000000000012cc0 x3 : ffff07ff81da2480
[   77.427366] x2 : 0000000000000000 x1 : 257e78c50c533320 x0 : 257e78c50c533320
[   77.434489] Call trace:
[   77.436923]  __ww_mutex_lock.constprop.0+0xa8/0xae8
[   77.441790]  __ww_mutex_lock_slowpath+0x20/0x38
[   77.446309]  ww_mutex_lock+0x98/0x148
[   77.449960]  xe_bo_lock+0x24/0x50 [xe]
[   77.453734]  xe_lrc_snapshot_capture_delayed+0x64/0x188 [xe]
[   77.459417]  xe_guc_exec_queue_snapshot_capture_delayed+0x48/0x70 [xe]
[   77.465967]  xe_devcoredump_deferred_snap_work+0x64/0x90 [xe]
[   77.471736]  process_one_work+0x18c/0x400
[   77.475735]  worker_thread+0x204/0x420
[   77.479472]  kthread+0xe8/0xf8
[   77.482514]  ret_from_fork+0x10/0x20
[   77.486080] Code: f9400260 f1001c1f 54000ee9 927df000 (b9403401)
[   77.492160] ---[ end trace 0000000000000000 ]---
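
For anyone comparing the two logs, here is a minimal sketch of how the syndrome values can be split into fields. It assumes the standard AArch64 ESR_ELx layout (EC in bits 31:26, IL in bit 25, ISS in bits 24:0) and that the `code` value on the SError line is an ESR value as well. The function names and the tiny EC table are my own, purely for illustration.

```python
# Sketch: split an AArch64 ESR_ELx value into its top-level fields.
# Assumed layout: EC = bits [31:26], IL = bit 25, ISS = bits [24:0].
# The table below is deliberately tiny; it only covers the two
# exception classes that show up in the logs of this post.
EC_NAMES = {
    0x25: "Data Abort (current EL)",
    0x2F: "SError interrupt",
}


def decode_esr(esr: int) -> dict:
    """Return the EC, IL and ISS fields of an ESR_ELx value."""
    ec = (esr >> 26) & 0x3F
    il = (esr >> 25) & 0x1
    iss = esr & 0x1FFFFFF
    return {
        "ec": ec,
        "ec_name": EC_NAMES.get(ec, "not in this small table"),
        "il_bits": 32 if il else 16,
        "iss": iss,
    }


def decode_abort_iss(iss: int) -> dict:
    """Pick a few ISS bits that the oops log also prints."""
    return {
        "isv": (iss >> 24) & 0x1,   # instruction syndrome valid
        "wnr": (iss >> 6) & 0x1,    # 1 = write, 0 = read
        "fsc": iss & 0x3F,          # fault status code
    }


if __name__ == "__main__":
    # Values copied from the two logs in this post.
    for label, value in (("oops", 0x96000004), ("panic", 0xBE000411)):
        top = decode_esr(value)
        print(label, hex(value), top, decode_abort_iss(top["iss"]))
```

For the oops this gives EC = 0x25, IL = 32 bits, WnR = 0 and FSC = 0x04, which matches what the kernel printed in the "Mem abort info" lines; for the panic the EC comes out as 0x2f.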

Actually, reproducing it is fairly easy: you need to build and run GPSnoopy/RayTracingInVulkan from GitHub (an implementation of Peter Shirley's Ray Tracing In One Weekend book using Vulkan and NVIDIA's RTX extension).
