Result: 29 Warning(s)
i915_display_info7 igt_runner7 results7.json results7-xe-load.json guc_logs7.tar i915_display_info_post_exec7 boot7 dmesg7
| Detail | Value |
|---|---|
| Duration | 23.96 seconds |
| Hostname | shard-bmg-2 |
| Igt-Version | IGT-Version: 2.3-g0a4274645 (x86_64) (Linux: 7.0.0-rc4-lgci-xe-xe-4752-7535044a2418d22b5-debug+ x86_64) |
| Out | Using IGT_SRANDOM=1774088439 for randomisation<br>Opened device: /dev/dri/card0<br>Starting subtest: once-large-new<br>Stack trace:<br>#0 ../lib/igt_core.c:2075 __igt_fail_assert()<br>#1 [xe_wait_ufence+0x57]<br>#2 ../tests/intel/xe_exec_system_allocator.c:1757 test_exec()<br>#3 ../tests/intel/xe_exec_system_allocator.c:2542 __igt_unique____real_main2349()<br>#4 ../tests/intel/xe_exec_system_allocator.c:2349 main()<br>#5 [__libc_init_first+0x8a]<br>#6 [__libc_start_main+0x8b]<br>#7 [_start+0x25]<br>Subtest once-large-new: FAIL (23.956s) |
| Err | Starting subtest: once-large-new<br>(xe_exec_system_allocator:2214) xe/xe_ioctl-CRITICAL: Test assertion failure function xe_wait_ufence, file ../lib/xe/xe_ioctl.c:758:<br>(xe_exec_system_allocator:2214) xe/xe_ioctl-CRITICAL: Failed assertion: __xe_wait_ufence(fd, addr, value, exec_queue, &timeout) == 0<br>(xe_exec_system_allocator:2214) xe/xe_ioctl-CRITICAL: Last errno: 62, Timer expired<br>(xe_exec_system_allocator:2214) xe/xe_ioctl-CRITICAL: error: -62 != 0<br>Subtest once-large-new failed.<br>**** DEBUG ****<br>(xe_exec_system_allocator:2214) xe/xe_ioctl-CRITICAL: Test assertion failure function xe_wait_ufence, file ../lib/xe/xe_ioctl.c:758:<br>(xe_exec_system_allocator:2214) xe/xe_ioctl-CRITICAL: Failed assertion: __xe_wait_ufence(fd, addr, value, exec_queue, &timeout) == 0<br>(xe_exec_system_allocator:2214) xe/xe_ioctl-CRITICAL: Last errno: 62, Timer expired<br>(xe_exec_system_allocator:2214) xe/xe_ioctl-CRITICAL: error: -62 != 0<br>(xe_exec_system_allocator:2214) igt_core-INFO: Stack trace:<br>(xe_exec_system_allocator:2214) igt_core-INFO: #0 ../lib/igt_core.c:2075 __igt_fail_assert()<br>(xe_exec_system_allocator:2214) igt_core-INFO: #1 [xe_wait_ufence+0x57]<br>(xe_exec_system_allocator:2214) igt_core-INFO: #2 ../tests/intel/xe_exec_system_allocator.c:1757 test_exec()<br>(xe_exec_system_allocator:2214) igt_core-INFO: #3 ../tests/intel/xe_exec_system_allocator.c:2542 __igt_unique____real_main2349()<br>(xe_exec_system_allocator:2214) igt_core-INFO: #4 ../tests/intel/xe_exec_system_allocator.c:2349 main()<br>(xe_exec_system_allocator:2214) igt_core-INFO: #5 [__libc_init_first+0x8a]<br>(xe_exec_system_allocator:2214) igt_core-INFO: #6 [__libc_start_main+0x8b]<br>(xe_exec_system_allocator:2214) igt_core-INFO: #7 [_start+0x25]<br>**** END ****<br>Subtest once-large-new: FAIL (23.956s) |
| Dmesg |
<6> [60.072032] Console: switching to colour dummy device 80x25
<6> [60.072186] [IGT] xe_exec_system_allocator: executing
<3> [62.431393] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=94 recv=93
<6> [62.543111] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [62.543135] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [62.543144] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [62.543153] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [64.734786] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=94 recv=93
<7> [66.321988] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 0 val 0x2c2c2829
<7> [66.322133] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 1 val 0x2c2c2b2c
<3> [67.038055] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=95 recv=93
<3> [69.342310] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=95 recv=93
<3> [71.646255] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=96 recv=93
<6> [71.759081] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [71.759112] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [71.759125] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [71.759139] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [73.950262] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=96 recv=93
<7> [73.955584] xe 0000:03:00.0: [drm:drm_pagemap_dev_unhold_work [drm_gpusvm_helper]] Releasing reference on provider device and module.
<6> [73.959255] [IGT] xe_exec_system_allocator: starting subtest once-large-new
<3> [76.254276] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=97 recv=93
<3> [76.254345] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=98 recv=93
<6> [76.366226] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [76.366250] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [76.366259] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [76.366268] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [78.558706] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=97 recv=93
<3> [78.558779] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=98 recv=93
<3> [80.863123] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=99 recv=93
<3> [80.863195] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=100 recv=93
<6> [80.975163] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [80.975187] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [80.975196] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [80.975205] nvme 0000:05:00.0: [ 0] RxErr (First)
<7> [81.311227] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 0 val 0x2c2c292a
<7> [81.311371] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 1 val 0x2c2c2c2d
<6> [81.847093] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [81.847125] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [81.847138] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [81.847151] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [83.167483] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=99 recv=93
<3> [83.167556] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=100 recv=93
<3> [85.471734] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=101 recv=93
<6> [85.584275] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [85.584301] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [85.584310] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [85.584319] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [87.776076] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=101 recv=93
<6> [87.887951] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [87.887975] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [87.887984] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [87.887993] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [90.080191] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=102 recv=93
<6> [90.192294] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [90.192319] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [90.192327] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [90.192336] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [92.384500] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=102 recv=93
<6> [95.950498] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [95.950522] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [95.950531] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [95.950540] nvme 0000:05:00.0: [ 0] RxErr (First)
<7> [96.327702] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 0 val 0x2d2d292b
<7> [96.327846] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 1 val 0x2d2d2c2d
<6> [97.916048] [IGT] xe_exec_system_allocator: finished subtest once-large-new, FAIL
<6> [97.916571] [IGT] xe_exec_system_allocator: exiting, ret=98
<6> [97.917226] Console: switching to colour frame buffer device 240x67
<4> [103.008973] xe 0000:03:00.0: [drm] Tile0: GT0: Schedule disable failed to respond, guc_id=2
<7> [103.196842] xe 0000:03:00.0: [drm:xe_hw_engine_snapshot_capture [xe]] Tile0: GT0: Proceeding with manual engine snapshot
<6> [103.197066] xe 0000:03:00.0: [drm] Xe device coredump has been created
<6> [103.197083] xe 0000:03:00.0: [drm] Check your /sys/class/drm/card0/device/devcoredump/data
<6> [103.197084] xe 0000:03:00.0: [drm] Tile0: GT0: trying reset from guc_exec_queue_timedout_job [xe]
<6> [103.197157] xe 0000:03:00.0: [drm] Tile0: GT0: reset queued
<3> [103.197315] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=103 recv=93
<6> [103.197618] xe 0000:03:00.0: [drm] Tile0: GT0: reset started
<7> [103.197800] xe 0000:03:00.0: [drm:guc_ct_change_state [xe]] Tile0: GT0: GuC CT communication channel stopped
<7> [103.198221] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: Applying GT save-restore MMIOs
<7> [103.198307] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0x4148] = 0x00000000
<7> [103.198394] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0x8828] = 0x00800000
<7> [103.198475] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0xb0c8] = 0x11111440
<7> [103.198560] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0xb104] = 0x08104440
<7> [103.198640] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0xb108] = 0x30200000
<7> [103.198721] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0xb158] = 0x0000007f
<7> [103.198968] xe 0000:03:00.0: [drm:xe_wopcm_init [xe]] WOPCM: 4096K
<7> [103.199057] xe 0000:03:00.0: [drm:xe_wopcm_init [xe]] GuC WOPCM is already locked [6144K, 832K)
<7> [103.199178] xe 0000:03:00.0: [drm:guc_ct_change_state [xe]] Tile0: GT0: GuC CT communication channel disabled
<7> [103.200299] xe 0000:03:00.0: [drm:xe_guc_ads_populate [xe]] Tile0: GT0: Updated ADS capture size 20480 (was 49152)
<3> [103.211077] xe 0000:03:00.0: [drm] *ERROR* Tile0: GT0: load failed: status = 0x400000A0, time = 10ms, freq = 2150MHz (req 2133MHz)
<3> [103.211374] xe 0000:03:00.0: [drm] *ERROR* Tile0: GT0: load failed: status: Reset = 0, BootROM = 0x50, UKernel = 0x00, MIA = 0x00, Auth = 0x01
<3> [103.211394] xe 0000:03:00.0: [drm] *ERROR* Tile0: GT0: firmware signature verification failed
<3> [103.211527] xe 0000:03:00.0: [drm] *ERROR* Tile0: GT0: reset failed (-EPROTO)
<3> [103.211581] xe 0000:03:00.0: [drm] *ERROR* CRITICAL: Xe has declared device 0000:03:00.0 as wedged.
IOCTLs and executions are blocked.
For recovery procedure, refer to https://docs.kernel.org/gpu/drm-uapi.html#device-wedging
Please file a _new_ bug report at https://gitlab.freedesktop.org/drm/xe/kernel/issues/new
<7> [103.211624] xe 0000:03:00.0: [drm:guc_ct_change_state [xe]] Tile0: GT0: GuC CT communication channel stopped
<7> [103.211924] xe 0000:03:00.0: [drm:guc_ct_change_state [xe]] Tile0: GT1: GuC CT communication channel stopped
<3> [103.265009] xe 0000:03:00.0: [drm] *ERROR* Tile0: GT1: GuC mmio request 0x5507: no reply 0x5507
<6> [103.265817] xe 0000:03:00.0: [drm] device wedged, needs recovery
<7> [103.268544] xe 0000:03:00.0: [drm:drm_pagemap_dev_unhold_work [drm_gpusvm_helper]] Releasing reference on provider device and module.
<7> [103.299560] xe 0000:03:00.0: [drm:drm_client_dev_restore] fbdev: ret=0
|