Results for igt@xe_exec_system_allocator@twice-large-mmap-shared-remap

Result: Dmesg-Fail 29 Warning(s)

Detail Value
Duration 28.50 seconds
Hostname shard-bmg-2
Igt-Version IGT-Version: 2.4-g22222b7d9 (x86_64) (Linux: 7.1.0-rc3-lgci-xe-xe-pw-166583v1-debug+ x86_64)
Out
Using IGT_SRANDOM=1778836380 for randomisation
Opened device: /dev/dri/card0
Starting subtest: twice-large-mmap-shared-remap
Stack trace:
  #0 ../lib/igt_core.c:2074 __igt_fail_assert()
  #1 [xe_wait_ufence+0x57]
  #2 ../tests/intel/xe_exec_system_allocator.c:1789 test_exec()
  #3 ../tests/intel/xe_exec_system_allocator.c:2588 __igt_unique____real_main2385()
  #4 ../tests/intel/xe_exec_system_allocator.c:2385 main()
  #5 [__libc_init_first+0x8a]
  #6 [__libc_start_main+0x8b]
  #7 [_start+0x25]
Subtest twice-large-mmap-shared-remap: FAIL (28.496s)
Err
Starting subtest: twice-large-mmap-shared-remap
[126.943774] (xe_exec_system_allocator:2259) xe/xe_ioctl-CRITICAL: Test assertion failure function xe_wait_ufence, file ../lib/xe/xe_ioctl.c:763:
[126.943890] (xe_exec_system_allocator:2259) xe/xe_ioctl-CRITICAL: Failed assertion: __xe_wait_ufence(fd, addr, value, exec_queue, &timeout) == 0
[126.943945] (xe_exec_system_allocator:2259) xe/xe_ioctl-CRITICAL: Last errno: 62, Timer expired
[126.943986] (xe_exec_system_allocator:2259) xe/xe_ioctl-CRITICAL: error: -62 != 0
Subtest twice-large-mmap-shared-remap failed.
**** DEBUG ****
[126.943774] (xe_exec_system_allocator:2259) xe/xe_ioctl-CRITICAL: Test assertion failure function xe_wait_ufence, file ../lib/xe/xe_ioctl.c:763:
[126.943890] (xe_exec_system_allocator:2259) xe/xe_ioctl-CRITICAL: Failed assertion: __xe_wait_ufence(fd, addr, value, exec_queue, &timeout) == 0
[126.943945] (xe_exec_system_allocator:2259) xe/xe_ioctl-CRITICAL: Last errno: 62, Timer expired
[126.943986] (xe_exec_system_allocator:2259) xe/xe_ioctl-CRITICAL: error: -62 != 0
[126.955928] (xe_exec_system_allocator:2259) igt_core-INFO: Stack trace:
[126.969621] (xe_exec_system_allocator:2259) igt_core-INFO:   #0 ../lib/igt_core.c:2074 __igt_fail_assert()
[126.971938] (xe_exec_system_allocator:2259) igt_core-INFO:   #1 [xe_wait_ufence+0x57]
[126.973015] (xe_exec_system_allocator:2259) igt_core-INFO:   #2 ../tests/intel/xe_exec_system_allocator.c:1789 test_exec()
[126.973057] (xe_exec_system_allocator:2259) igt_core-INFO:   #3 ../tests/intel/xe_exec_system_allocator.c:2588 __igt_unique____real_main2385()
[126.973094] (xe_exec_system_allocator:2259) igt_core-INFO:   #4 ../tests/intel/xe_exec_system_allocator.c:2385 main()
[126.975563] (xe_exec_system_allocator:2259) igt_core-INFO:   #5 [__libc_init_first+0x8a]
[126.976313] (xe_exec_system_allocator:2259) igt_core-INFO:   #6 [__libc_start_main+0x8b]
[126.976493] (xe_exec_system_allocator:2259) igt_core-INFO:   #7 [_start+0x25]
****  END  ****
Subtest twice-large-mmap-shared-remap: FAIL (28.496s)
Dmesg

<6> [86.157432] Console: switching to colour dummy device 80x25
<6> [86.157584] [IGT] xe_exec_system_allocator: executing
<7> [86.520533] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 0 val 0x2b2b2828
<7> [86.520704] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 1 val 0x2b2b2a2b
<3> [88.517032] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=94 recv=93
<3> [90.820386] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=94 recv=93
<3> [93.123749] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=95 recv=93
<6> [93.243666] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [93.243691] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [93.243702] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [93.243711] nvme 0000:05:00.0: [ 0] RxErr (First)
<6> [93.352033] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [93.352058] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [93.352068] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [93.352077] nvme 0000:05:00.0: [ 0] RxErr (First)
<6> [93.459787] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [93.459811] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [93.459822] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [93.459831] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [95.427786] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=95 recv=93
<3> [97.731714] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=96 recv=93
<6> [97.851241] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [97.851266] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [97.851275] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [97.851285] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [100.036674] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=96 recv=93
<7> [100.045704] xe 0000:03:00.0: [drm:drm_pagemap_dev_unhold_work [drm_gpusvm_helper]] Releasing reference on provider device and module.
<6> [100.052174] [IGT] xe_exec_system_allocator: starting subtest twice-large-mmap-shared-remap
<7> [101.536883] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 0 val 0x2b2b2829
<7> [101.537048] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 1 val 0x2b2b2a2c
<3> [102.339627] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=97 recv=93
<3> [104.643695] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=97 recv=93
<3> [106.947623] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=98 recv=93
<3> [106.956254] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=99 recv=93
<3> [109.251608] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=98 recv=93
<3> [109.259990] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=99 recv=93
<3> [111.555619] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=100 recv=93
<6> [111.675115] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [111.675140] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [111.675149] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [111.675159] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [113.859613] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=100 recv=93
<3> [116.163617] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=101 recv=93
<6> [116.513147] pcieport 0000:00:06.0: AER: Correctable error message received from 0000:05:00.0
<4> [116.513153] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [116.513155] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [116.513157] nvme 0000:05:00.0: [ 0] RxErr (First)
<7> [116.539663] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 0 val 0x2c2c2929
<7> [116.539816] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 1 val 0x2c2b2b2c
<3> [118.467542] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=101 recv=93
<3> [120.771554] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=102 recv=93
<3> [123.075559] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=102 recv=93
<6> [128.548957] [IGT] xe_exec_system_allocator: finished subtest twice-large-mmap-shared-remap, FAIL
<6> [128.549516] [IGT] xe_exec_system_allocator: exiting, ret=98
<6> [128.550047] Console: switching to colour frame buffer device 240x67
<7> [131.536925] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 0 val 0x2b2c292a
<7> [131.537075] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 1 val 0x2c2c2b2c
<4> [133.635544] xe 0000:03:00.0: [drm] Tile0: GT0: Schedule disable failed to respond, guc_id=2
<7> [133.824957] xe 0000:03:00.0: [drm:xe_hw_engine_snapshot_capture [xe]] Tile0: GT0: Proceeding with manual engine snapshot
<6> [133.825184] xe 0000:03:00.0: [drm] Xe device coredump has been created
<6> [133.825202] xe 0000:03:00.0: [drm] Check your /sys/class/drm/card0/device/devcoredump/data
<6> [133.825203] xe 0000:03:00.0: [drm] Tile0: GT0: trying reset from guc_exec_queue_timedout_job [xe]
<6> [133.825497] xe 0000:03:00.0: [drm] Tile0: GT0: reset queued
<3> [133.825670] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=103 recv=93
<6> [133.834721] xe 0000:03:00.0: [drm] Tile0: GT0: reset started
<7> [133.834816] xe 0000:03:00.0: [drm:guc_ct_change_state [xe]] Tile0: GT0: GuC CT communication channel stopped
<7> [133.835222] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: Applying GT save-restore MMIOs
<7> [133.835369] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0x4148] = 0x00000000
<7> [133.835475] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0x8828] = 0x00800000
<7> [133.835558] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0xb0c8] = 0x11111440
<7> [133.835639] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0xb104] = 0x08104440
<7> [133.835723] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0xb108] = 0x30200000
<7> [133.835811] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0xb158] = 0x0000007f
<7> [133.835900] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0xe7cc] = 0x00000100
<7> [133.835984] xe 0000:03:00.0: [drm:xe_wopcm_init [xe]] WOPCM: 4096K
<7> [133.836084] xe 0000:03:00.0: [drm:xe_wopcm_init [xe]] GuC WOPCM is already locked [6144K, 832K)
<7> [133.836179] xe 0000:03:00.0: [drm:guc_ct_change_state [xe]] Tile0: GT0: GuC CT communication channel disabled
<7> [133.837353] xe 0000:03:00.0: [drm:xe_guc_ads_populate [xe]] Tile0: GT0: Updated ADS capture size 20480 (was 49152)
<3> [133.847243] xe 0000:03:00.0: [drm] *ERROR* Tile0: GT0: load failed: status = 0x400000A0, time = 9ms, freq = 2150MHz (req 2133MHz)
<3> [133.859329] xe 0000:03:00.0: [drm] *ERROR* Tile0: GT0: load failed: status: Reset = 0, BootROM = 0x50, UKernel = 0x00, MIA = 0x00, Auth = 0x01
<3> [133.872149] xe 0000:03:00.0: [drm] *ERROR* Tile0: GT0: firmware signature verification failed
<3> [133.880838] xe 0000:03:00.0: [drm] *ERROR* Tile0: GT0: reset failed (-EPROTO)
<3> [133.888015] xe 0000:03:00.0: [drm] *ERROR* CRITICAL: Xe has declared device 0000:03:00.0 as wedged.
IOCTLs and executions are blocked.
For recovery procedure, refer to https://docs.kernel.org/gpu/drm-uapi.html#device-wedging
Please file a _new_ bug report at https://gitlab.freedesktop.org/drm/xe/kernel/issues/new
<7> [133.920012] xe 0000:03:00.0: [drm:guc_ct_change_state [xe]] Tile0: GT0: GuC CT communication channel stopped
<7> [133.920174] xe 0000:03:00.0: [drm:guc_ct_change_state [xe]] Tile0: GT1: GuC CT communication channel stopped
<3> [133.985324] xe 0000:03:00.0: [drm] *ERROR* Tile0: GT1: GuC mmio request 0x5507: no reply 0x5507
<6> [133.994482] xe 0000:03:00.0: [drm] device wedged, needs recovery
<7> [133.995543] xe 0000:03:00.0: [drm:drm_pagemap_dev_unhold_work [drm_gpusvm_helper]] Releasing reference on provider device and module.
<7> [134.004751] xe 0000:03:00.0: [drm:drm_client_dev_restore] fbdev: ret=0
Created at 2026-05-15 11:36:25