Result: 29 Warning(s)
i915_display_info0 igt_runner0 results0.json results0-xe-load.json guc_logs0.tar i915_display_info_post_exec0 boot0 dmesg0
| Detail | Value |
|---|---|
| Duration | 24.08 seconds |
| Hostname | shard-bmg-2 |
| Igt-Version | IGT-Version: 2.4-g57e42adc9 (x86_64) (Linux: 7.1.0-rc1-lgci-xe-xe-4944-aea2c496abcf55b64-debug+ x86_64) |
| Out |
Using IGT_SRANDOM=1777373151 for randomisation
Opened device: /dev/dri/card0
Starting subtest: many-userptr-invalidate-race
Stack trace:
  #0 ../lib/igt_core.c:2075 __igt_fail_assert()
  #1 ../tests/intel/xe_exec_fault_mode.c:395 test_exec()
  #2 ../tests/intel/xe_exec_fault_mode.c:451 test_exec_main()
  #3 ../tests/intel/xe_exec_fault_mode.c:571 __igt_unique____real_main456()
  #4 ../tests/intel/xe_exec_fault_mode.c:456 main()
  #5 [__libc_init_first+0x8a]
  #6 [__libc_start_main+0x8b]
  #7 [_start+0x25]
Subtest many-userptr-invalidate-race: FAIL (24.083s) |
| Err |
Starting subtest: many-userptr-invalidate-race
(xe_exec_fault_mode:5599) CRITICAL: Test assertion failure function test_exec, file ../tests/intel/xe_exec_fault_mode.c:383:
(xe_exec_fault_mode:5599) CRITICAL: Failed assertion: __xe_wait_ufence(fd, &exec_sync[i], 0xdeadbeefdeadbeefull, exec_queues[i % n_exec_queues], &timeout) == 0
(xe_exec_fault_mode:5599) CRITICAL: Last errno: 62, Timer expired
(xe_exec_fault_mode:5599) CRITICAL: error: -62 != 0
Subtest many-userptr-invalidate-race failed.
**** DEBUG ****
(xe_exec_fault_mode:5599) DEBUG: test_exec running on: DRM_XE_ENGINE_CLASS_RENDER
(xe_exec_fault_mode:5599) CRITICAL: Test assertion failure function test_exec, file ../tests/intel/xe_exec_fault_mode.c:383:
(xe_exec_fault_mode:5599) CRITICAL: Failed assertion: __xe_wait_ufence(fd, &exec_sync[i], 0xdeadbeefdeadbeefull, exec_queues[i % n_exec_queues], &timeout) == 0
(xe_exec_fault_mode:5599) CRITICAL: Last errno: 62, Timer expired
(xe_exec_fault_mode:5599) CRITICAL: error: -62 != 0
(xe_exec_fault_mode:5599) igt_core-INFO: Stack trace:
(xe_exec_fault_mode:5599) igt_core-INFO: #0 ../lib/igt_core.c:2075 __igt_fail_assert()
(xe_exec_fault_mode:5599) igt_core-INFO: #1 ../tests/intel/xe_exec_fault_mode.c:395 test_exec()
(xe_exec_fault_mode:5599) igt_core-INFO: #2 ../tests/intel/xe_exec_fault_mode.c:451 test_exec_main()
(xe_exec_fault_mode:5599) igt_core-INFO: #3 ../tests/intel/xe_exec_fault_mode.c:571 __igt_unique____real_main456()
(xe_exec_fault_mode:5599) igt_core-INFO: #4 ../tests/intel/xe_exec_fault_mode.c:456 main()
(xe_exec_fault_mode:5599) igt_core-INFO: #5 [__libc_init_first+0x8a]
(xe_exec_fault_mode:5599) igt_core-INFO: #6 [__libc_start_main+0x8b]
(xe_exec_fault_mode:5599) igt_core-INFO: #7 [_start+0x25]
**** END ****
Subtest many-userptr-invalidate-race: FAIL (24.083s) |
| Dmesg |
<6> [308.492659] Console: switching to colour dummy device 80x25
<6> [308.492955] [IGT] xe_exec_fault_mode: executing
<7> [310.334479] xe 0000:03:00.0: [drm:intel_power_well_enable [xe]] enabling AUX_TC2
<7> [310.446446] xe 0000:03:00.0: [drm:intel_power_well_disable [xe]] disabling AUX_TC2
<3> [310.783936] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=22272 recv=22271
<3> [313.086866] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=22272 recv=22271
<3> [315.390094] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=22273 recv=22271
<7> [317.452430] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 0 val 0x2d2d2a2a
<7> [317.452581] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 1 val 0x2d2d2d2d
<6> [317.558756] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [317.558783] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [317.558793] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [317.558802] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [317.694132] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=22273 recv=22271
<3> [319.999176] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=22274 recv=22271
<3> [322.301974] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=22274 recv=22271
<6> [322.311446] [IGT] xe_exec_fault_mode: starting subtest many-userptr-invalidate-race
<7> [322.311934] xe 0000:03:00.0: [drm:drm_pagemap_dev_unhold_work [drm_gpusvm_helper]] Releasing reference on provider device and module.
<3> [324.605860] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=22275 recv=22271
<3> [326.909739] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=22275 recv=22271
<3> [329.213621] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=22276 recv=22271
<3> [329.222522] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=22277 recv=22271
<3> [331.517586] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=22276 recv=22271
<3> [331.526493] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=22277 recv=22271
<7> [332.451596] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 0 val 0x2e2e2b2c
<7> [332.451749] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 1 val 0x2e2e2d2e
<3> [333.821612] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=22278 recv=22271
<6> [333.936502] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [333.936528] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [333.936538] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [333.936548] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [336.125379] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=22278 recv=22271
<3> [338.431586] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=22279 recv=22271
<3> [340.733233] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=22279 recv=22271
<3> [343.037150] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=22280 recv=22271
<3> [345.341086] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=22280 recv=22271
<6> [345.456571] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [345.456596] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [345.456606] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [345.456616] nvme 0000:05:00.0: [ 0] RxErr (First)
<6> [346.395099] [IGT] xe_exec_fault_mode: finished subtest many-userptr-invalidate-race, FAIL
<7> [347.391132] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 0 val 0x2e2f2c2c
<7> [347.391281] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 1 val 0x2f2e2e2f
<4> [351.740908] xe 0000:03:00.0: [drm] Tile0: GT0: Schedule disable failed to respond, guc_id=2
<7> [351.926426] xe 0000:03:00.0: [drm:xe_hw_engine_snapshot_capture [xe]] Tile0: GT0: Proceeding with manual engine snapshot
<6> [351.926669] xe 0000:03:00.0: [drm] Xe device coredump has been created
<6> [351.926687] xe 0000:03:00.0: [drm] Check your /sys/class/drm/card0/device/devcoredump/data
<6> [351.926689] xe 0000:03:00.0: [drm] Tile0: GT0: trying reset from guc_exec_queue_timedout_job [xe]
<6> [351.926820] xe 0000:03:00.0: [drm] Tile0: GT0: reset queued
<3> [351.926967] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=22281 recv=22271
<6> [351.935903] xe 0000:03:00.0: [drm] Tile0: GT0: reset started
<7> [351.936040] xe 0000:03:00.0: [drm:guc_ct_change_state [xe]] Tile0: GT0: GuC CT communication channel stopped
<7> [351.936487] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: Applying GT save-restore MMIOs
<7> [351.936584] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0x4148] = 0x00000000
<7> [351.936703] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0x8828] = 0x00800000
<7> [351.936813] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0xb0c8] = 0x11111440
<7> [351.936909] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0xb104] = 0x08104440
<7> [351.937000] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0xb108] = 0x30200000
<7> [351.937088] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0xb158] = 0x0000007f
<7> [351.937174] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0xe7cc] = 0x00000100
<7> [351.937255] xe 0000:03:00.0: [drm:xe_wopcm_init [xe]] WOPCM: 4096K
<7> [351.937349] xe 0000:03:00.0: [drm:xe_wopcm_init [xe]] GuC WOPCM is already locked [6144K, 832K)
<7> [351.937438] xe 0000:03:00.0: [drm:guc_ct_change_state [xe]] Tile0: GT0: GuC CT communication channel disabled
<7> [351.938555] xe 0000:03:00.0: [drm:xe_guc_ads_populate [xe]] Tile0: GT0: Updated ADS capture size 20480 (was 49152)
<3> [351.949299] xe 0000:03:00.0: [drm] *ERROR* Tile0: GT0: load failed: status = 0x400000A0, time = 10ms, freq = 2150MHz (req 2133MHz)
<3> [351.961147] xe 0000:03:00.0: [drm] *ERROR* Tile0: GT0: load failed: status: Reset = 0, BootROM = 0x50, UKernel = 0x00, MIA = 0x00, Auth = 0x01
<3> [351.974247] xe 0000:03:00.0: [drm] *ERROR* Tile0: GT0: firmware signature verification failed
<3> [351.983438] xe 0000:03:00.0: [drm] *ERROR* Tile0: GT0: reset failed (-EPROTO)
<3> [351.990595] xe 0000:03:00.0: [drm] *ERROR* CRITICAL: Xe has declared device 0000:03:00.0 as wedged.
IOCTLs and executions are blocked.
For recovery procedure, refer to https://docs.kernel.org/gpu/drm-uapi.html#device-wedging
Please file a _new_ bug report at https://gitlab.freedesktop.org/drm/xe/kernel/issues/new
<7> [352.022433] xe 0000:03:00.0: [drm:guc_ct_change_state [xe]] Tile0: GT0: GuC CT communication channel stopped
<7> [352.022784] xe 0000:03:00.0: [drm:guc_ct_change_state [xe]] Tile0: GT1: GuC CT communication channel stopped
<3> [352.082454] xe 0000:03:00.0: [drm] *ERROR* Tile0: GT1: GuC mmio request 0x5507: no reply 0x5507
<6> [352.093004] xe 0000:03:00.0: [drm] device wedged, needs recovery
<7> [352.093104] xe 0000:03:00.0: [drm:drm_pagemap_dev_unhold_work [drm_gpusvm_helper]] Releasing reference on provider device and module.
<7> [352.101353] xe 0000:03:00.0: [drm:drm_client_dev_restore] fbdev: ret=0
<6> [352.102808] [IGT] xe_exec_fault_mode: exiting, ret=98
<6> [352.118312] Console: switching to colour frame buffer device 240x67
|
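
The dmesg output above repeats `TLB invalidation fence timeout` with an advancing `seqno` while `recv` stays at 22271. A minimal sketch for summarizing those lines, assuming they keep the `[timestamp] ... seqno=N recv=M` layout shown in this log; the function name `summarize_tlb_timeouts` and the returned dictionary keys are illustrative and not part of IGT or the xe driver:

```python
import re

# Regex assumed from the dmesg lines in this report, not from any kernel ABI.
TLB_TIMEOUT = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s.*TLB invalidation fence timeout, "
    r"seqno=(?P<seqno>\d+) recv=(?P<recv>\d+)"
)


def summarize_tlb_timeouts(dmesg_text):
    """Summarize TLB invalidation fence timeouts found in dmesg text.

    Returns None when no matching line is present.
    """
    hits = [m for m in map(TLB_TIMEOUT.search, dmesg_text.splitlines()) if m]
    if not hits:
        return None
    timestamps = [float(m.group("ts")) for m in hits]
    return {
        "count": len(hits),
        "first_ts": min(timestamps),
        "last_ts": max(timestamps),
        # Distinct sequence numbers that were still waiting for an ack.
        "pending_seqnos": sorted({int(m.group("seqno")) for m in hits}),
        # Distinct "last received" values; a single value means no progress.
        "recv_values": sorted({int(m.group("recv")) for m in hits}),
    }


# Three lines copied from the dmesg section of this report.
SAMPLE = """\
<3> [310.783936] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=22272 recv=22271
<3> [315.390094] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=22273 recv=22271
<3> [351.926967] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=22281 recv=22271
"""

summary = summarize_tlb_timeouts(SAMPLE)
print(summary)
```

A single entry in `recv_values` alongside several `pending_seqnos` is the pattern seen in this run: invalidations keep being issued but none is acknowledged after 22271.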