Results for igt@xe_exec_system_allocator@once-free-race

Result: Dmesg-Fail 29 Warning(s)

Detail       Value
Duration     28.37 seconds
Hostname     shard-bmg-2
Igt-Version  IGT-Version: 2.4-g2d1f7b290 (x86_64) (Linux: 7.0.0-rc7-lgci-xe-xe-pw-164479v2-debug+ x86_64)
Out
Using IGT_SRANDOM=1775814917 for randomisation
Opened device: /dev/dri/card0
Starting subtest: once-free-race
Stack trace:
  #0 ../lib/igt_core.c:2075 __igt_fail_assert()
  #1 [xe_wait_ufence+0x57]
  #2 ../tests/intel/xe_exec_system_allocator.c:1757 test_exec()
  #3 ../tests/intel/xe_exec_system_allocator.c:2537 __igt_unique____real_main2349()
  #4 ../tests/intel/xe_exec_system_allocator.c:2349 main()
  #5 [__libc_init_first+0x8a]
  #6 [__libc_start_main+0x8b]
  #7 [_start+0x25]
Subtest once-free-race: FAIL (28.374s)
Err
Starting subtest: once-free-race
(xe_exec_system_allocator:4796) xe/xe_ioctl-CRITICAL: Test assertion failure function xe_wait_ufence, file ../lib/xe/xe_ioctl.c:763:
(xe_exec_system_allocator:4796) xe/xe_ioctl-CRITICAL: Failed assertion: __xe_wait_ufence(fd, addr, value, exec_queue, &timeout) == 0
(xe_exec_system_allocator:4796) xe/xe_ioctl-CRITICAL: Last errno: 62, Timer expired
(xe_exec_system_allocator:4796) xe/xe_ioctl-CRITICAL: error: -62 != 0
Subtest once-free-race failed.
**** DEBUG ****
(xe_exec_system_allocator:4796) xe/xe_ioctl-CRITICAL: Test assertion failure function xe_wait_ufence, file ../lib/xe/xe_ioctl.c:763:
(xe_exec_system_allocator:4796) xe/xe_ioctl-CRITICAL: Failed assertion: __xe_wait_ufence(fd, addr, value, exec_queue, &timeout) == 0
(xe_exec_system_allocator:4796) xe/xe_ioctl-CRITICAL: Last errno: 62, Timer expired
(xe_exec_system_allocator:4796) xe/xe_ioctl-CRITICAL: error: -62 != 0
(xe_exec_system_allocator:4796) igt_core-INFO: Stack trace:
(xe_exec_system_allocator:4796) igt_core-INFO:   #0 ../lib/igt_core.c:2075 __igt_fail_assert()
(xe_exec_system_allocator:4796) igt_core-INFO:   #1 [xe_wait_ufence+0x57]
(xe_exec_system_allocator:4796) igt_core-INFO:   #2 ../tests/intel/xe_exec_system_allocator.c:1757 test_exec()
(xe_exec_system_allocator:4796) igt_core-INFO:   #3 ../tests/intel/xe_exec_system_allocator.c:2537 __igt_unique____real_main2349()
(xe_exec_system_allocator:4796) igt_core-INFO:   #4 ../tests/intel/xe_exec_system_allocator.c:2349 main()
(xe_exec_system_allocator:4796) igt_core-INFO:   #5 [__libc_init_first+0x8a]
(xe_exec_system_allocator:4796) igt_core-INFO:   #6 [__libc_start_main+0x8b]
(xe_exec_system_allocator:4796) igt_core-INFO:   #7 [_start+0x25]
****  END  ****
Subtest once-free-race: FAIL (28.374s)
Dmesg

<6> [200.877726] Console: switching to colour dummy device 80x25
<6> [200.878047] [IGT] xe_exec_system_allocator: executing
<3> [203.181433] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=18560 recv=18559
<6> [203.302699] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [203.302725] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [203.302735] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [203.302745] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [205.484797] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=18560 recv=18559
<3> [207.788170] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=18561 recv=18559
<6> [209.511054] xe 0000:03:00.0: [drm] PL2 disabled for channel 0, val 0x00000000
<7> [209.523950] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 0 val 0x2a2a2728
<7> [209.524113] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 1 val 0x2a2a2a2b
<3> [210.092068] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=18561 recv=18559
<6> [210.210141] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [210.210167] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [210.210177] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [210.210186] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [212.396104] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=18562 recv=18559
<3> [214.700004] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=18562 recv=18559
<6> [214.709935] [IGT] xe_exec_system_allocator: starting subtest once-free-race
<7> [214.710987] xe 0000:03:00.0: [drm:drm_pagemap_dev_unhold_work [drm_gpusvm_helper]] Releasing reference on provider device and module.
<3> [217.003981] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=18563 recv=18559
<3> [219.307854] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=18563 recv=18559
<3> [221.611896] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=18564 recv=18559
<3> [221.621115] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=18565 recv=18559
<3> [223.915898] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=18564 recv=18559
<3> [223.924809] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=18565 recv=18559
<6> [224.043333] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [224.043359] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [224.043370] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [224.043379] nvme 0000:05:00.0: [ 0] RxErr (First)
<6> [224.152864] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [224.152891] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [224.152901] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [224.152910] nvme 0000:05:00.0: [ 0] RxErr (First)
<6> [224.261741] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [224.261767] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [224.261777] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [224.261786] nvme 0000:05:00.0: [ 0] RxErr (First)
<6> [224.533764] xe 0000:03:00.0: [drm] PL2 disabled for channel 0, val 0x00000000
<7> [224.550789] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 0 val 0x2b2b2829
<7> [224.551049] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 1 val 0x2b2b2a2b
<6> [224.656323] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [224.656349] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [224.656359] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [224.656369] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [226.219806] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=18566 recv=18559
<6> [226.335100] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [226.335126] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [226.335136] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [226.335145] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [228.523832] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=18566 recv=18559
<3> [230.827737] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=18567 recv=18559
<6> [231.893493] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [231.893527] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [231.893538] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [231.893547] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [233.131698] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=18567 recv=18559
<3> [235.435699] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=18568 recv=18559
<6> [235.552152] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [235.552179] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [235.552189] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [235.552199] nvme 0000:05:00.0: [ 0] RxErr (First)
<6> [235.659723] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [235.659750] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [235.659760] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [235.659769] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [237.739560] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=18568 recv=18559
<6> [237.854893] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [237.854920] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [237.854930] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [237.854940] nvme 0000:05:00.0: [ 0] RxErr (First)
<6> [239.509321] xe 0000:03:00.0: [drm] PL2 disabled for channel 0, val 0x00000000
<7> [239.512431] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 0 val 0x2c2c292a
<7> [239.512572] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 1 val 0x2c2b2b2c
<6> [239.616651] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [239.616677] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [239.616687] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [239.616697] nvme 0000:05:00.0: [ 0] RxErr (First)
<6> [243.085069] [IGT] xe_exec_system_allocator: finished subtest once-free-race, FAIL
<6> [243.085728] [IGT] xe_exec_system_allocator: exiting, ret=98
<6> [243.086242] Console: switching to colour frame buffer device 240x67
<6> [247.257664] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [247.257690] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [247.257700] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [247.257710] nvme 0000:05:00.0: [ 0] RxErr (First)
<4> [248.171443] xe 0000:03:00.0: [drm] Tile0: GT0: Schedule disable failed to respond, guc_id=2
<7> [248.359892] xe 0000:03:00.0: [drm:xe_hw_engine_snapshot_capture [xe]] Tile0: GT0: Proceeding with manual engine snapshot
<6> [248.360165] xe 0000:03:00.0: [drm] Xe device coredump has been created
<6> [248.360197] xe 0000:03:00.0: [drm] Check your /sys/class/drm/card0/device/devcoredump/data
<6> [248.360198] xe 0000:03:00.0: [drm] Tile0: GT0: trying reset from guc_exec_queue_timedout_job [xe]
<6> [248.360295] xe 0000:03:00.0: [drm] Tile0: GT0: reset queued
<3> [248.360565] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=18569 recv=18559
<6> [248.369922] xe 0000:03:00.0: [drm] Tile0: GT0: reset started
<7> [248.370060] xe 0000:03:00.0: [drm:guc_ct_change_state [xe]] Tile0: GT0: GuC CT communication channel stopped
<7> [248.370628] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: Applying GT save-restore MMIOs
<7> [248.370718] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0x4148] = 0x00000000
<7> [248.370806] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0x8828] = 0x00800000
<7> [248.370889] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0xb0c8] = 0x11111440
<7> [248.370972] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0xb104] = 0x08104440
<7> [248.371060] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0xb108] = 0x30200000
<7> [248.371149] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0xb158] = 0x0000007f
<7> [248.371239] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0xe7cc] = 0x00000100
<7> [248.371325] xe 0000:03:00.0: [drm:xe_wopcm_init [xe]] WOPCM: 4096K
<7> [248.371440] xe 0000:03:00.0: [drm:xe_wopcm_init [xe]] GuC WOPCM is already locked [6144K, 832K)
<7> [248.371595] xe 0000:03:00.0: [drm:guc_ct_change_state [xe]] Tile0: GT0: GuC CT communication channel disabled
<7> [248.372667] xe 0000:03:00.0: [drm:xe_guc_ads_populate [xe]] Tile0: GT0: Updated ADS capture size 20480 (was 49152)
<3> [248.383398] xe 0000:03:00.0: [drm] *ERROR* Tile0: GT0: load failed: status = 0x400000A0, time = 10ms, freq = 2150MHz (req 2133MHz)
<3> [248.395518] xe 0000:03:00.0: [drm] *ERROR* Tile0: GT0: load failed: status: Reset = 0, BootROM = 0x50, UKernel = 0x00, MIA = 0x00, Auth = 0x01
<3> [248.408314] xe 0000:03:00.0: [drm] *ERROR* Tile0: GT0: firmware signature verification failed
<3> [248.417053] xe 0000:03:00.0: [drm] *ERROR* Tile0: GT0: reset failed (-EPROTO)
<3> [248.424209] xe 0000:03:00.0: [drm] *ERROR* CRITICAL: Xe has declared device 0000:03:00.0 as wedged.
IOCTLs and executions are blocked.
For recovery procedure, refer to https://docs.kernel.org/gpu/drm-uapi.html#device-wedging
Please file a _new_ bug report at https://gitlab.freedesktop.org/drm/xe/kernel/issues/new
<7> [248.456076] xe 0000:03:00.0: [drm:guc_ct_change_state [xe]] Tile0: GT0: GuC CT communication channel stopped
<7> [248.456257] xe 0000:03:00.0: [drm:guc_ct_change_state [xe]] Tile0: GT1: GuC CT communication channel stopped
<3> [248.521685] xe 0000:03:00.0: [drm] *ERROR* Tile0: GT1: GuC mmio request 0x5507: no reply 0x5507
<6> [248.531471] xe 0000:03:00.0: [drm] device wedged, needs recovery
<7> [248.534111] xe 0000:03:00.0: [drm:drm_pagemap_dev_unhold_work [drm_gpusvm_helper]] Releasing reference on provider device and module.
<7> [248.545861] xe 0000:03:00.0: [drm:drm_client_dev_restore] fbdev: ret=0
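
In the dmesg above, every TLB invalidation timeout reports recv=18559 while seqno climbs from 18560 to 18569, so no invalidation appears to be acknowledged after seqno 18559. The following is a minimal sketch for extracting that pattern from a saved dmesg file. It is not part of IGT or the CI tooling: the function name `summarize_dmesg` and both regular expressions are hypothetical helpers, written on the assumption that each record starts with the `<level> [timestamp]` prefix shown above.

```python
import re
from collections import Counter

# Assumed record prefix, e.g. "<3> [203.181433] xe 0000:03:00.0: ..."
LINE_RE = re.compile(r"^<(\d)> \[\s*(\d+\.\d+)\] (.*)$")
# Message text copied from the xe TLB invalidation timeout lines above
TLB_RE = re.compile(r"TLB invalidation fence timeout, seqno=(\d+) recv=(\d+)")


def summarize_dmesg(lines):
    """Return (count of records per log level, list of TLB timeouts).

    Each TLB timeout is a (timestamp, seqno, recv) tuple.
    Lines without a "<level> [timestamp]" prefix are skipped; in this
    report those are continuation lines of the "wedged" message.
    """
    levels = Counter()
    tlb_timeouts = []
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue
        level = int(match.group(1))
        timestamp = float(match.group(2))
        message = match.group(3)
        levels[level] += 1
        tlb = TLB_RE.search(message)
        if tlb:
            tlb_timeouts.append((timestamp, int(tlb.group(1)), int(tlb.group(2))))
    return levels, tlb_timeouts
```

Passing the lines of the dmesg attachment to `summarize_dmesg` and comparing `seqno - recv` across the returned tuples shows how far the acknowledged sequence number has fallen behind the requested one at each timeout.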
Created at 2026-04-10 10:22:02