Result: 29 Warning(s)
i915_display_info4 igt_runner4 results4.json results4-xe-load.json guc_logs4.tar i915_display_info_post_exec4 boot4 dmesg4
| Detail | Value |
|---|---|
| Duration | 32.72 seconds |
| Hostname | shard-bmg-2 |
| Igt-Version | IGT-Version: 2.4-g9b95600c4 (x86_64) (Linux: 7.0.0-lgci-xe-xe-pw-165014v1-debug+ x86_64) |
| Out |
Using IGT_SRANDOM=1776368772 for randomisation
Opened device: /dev/dri/card0
Starting subtest: twice-new-nomemset
Stack trace:
#0 ../lib/igt_core.c:2075 __igt_fail_assert()
#1 [xe_wait_ufence+0x57]
#2 ../tests/intel/xe_exec_system_allocator.c:1757 test_exec()
#3 ../tests/intel/xe_exec_system_allocator.c:2547 __igt_unique____real_main2349()
#4 ../tests/intel/xe_exec_system_allocator.c:2349 main()
#5 [__libc_init_first+0x8a]
#6 [__libc_start_main+0x8b]
#7 [_start+0x25]
Subtest twice-new-nomemset: FAIL (32.719s)
|
| Err |
Starting subtest: twice-new-nomemset
(xe_exec_system_allocator:5662) xe/xe_ioctl-CRITICAL: Test assertion failure function xe_wait_ufence, file ../lib/xe/xe_ioctl.c:763:
(xe_exec_system_allocator:5662) xe/xe_ioctl-CRITICAL: Failed assertion: __xe_wait_ufence(fd, addr, value, exec_queue, &timeout) == 0
(xe_exec_system_allocator:5662) xe/xe_ioctl-CRITICAL: Last errno: 62, Timer expired
(xe_exec_system_allocator:5662) xe/xe_ioctl-CRITICAL: error: -62 != 0
Subtest twice-new-nomemset failed.
**** DEBUG ****
(xe_exec_system_allocator:5662) xe/xe_ioctl-CRITICAL: Test assertion failure function xe_wait_ufence, file ../lib/xe/xe_ioctl.c:763:
(xe_exec_system_allocator:5662) xe/xe_ioctl-CRITICAL: Failed assertion: __xe_wait_ufence(fd, addr, value, exec_queue, &timeout) == 0
(xe_exec_system_allocator:5662) xe/xe_ioctl-CRITICAL: Last errno: 62, Timer expired
(xe_exec_system_allocator:5662) xe/xe_ioctl-CRITICAL: error: -62 != 0
(xe_exec_system_allocator:5662) igt_core-INFO: Stack trace:
(xe_exec_system_allocator:5662) igt_core-INFO: #0 ../lib/igt_core.c:2075 __igt_fail_assert()
(xe_exec_system_allocator:5662) igt_core-INFO: #1 [xe_wait_ufence+0x57]
(xe_exec_system_allocator:5662) igt_core-INFO: #2 ../tests/intel/xe_exec_system_allocator.c:1757 test_exec()
(xe_exec_system_allocator:5662) igt_core-INFO: #3 ../tests/intel/xe_exec_system_allocator.c:2547 __igt_unique____real_main2349()
(xe_exec_system_allocator:5662) igt_core-INFO: #4 ../tests/intel/xe_exec_system_allocator.c:2349 main()
(xe_exec_system_allocator:5662) igt_core-INFO: #5 [__libc_init_first+0x8a]
(xe_exec_system_allocator:5662) igt_core-INFO: #6 [__libc_start_main+0x8b]
(xe_exec_system_allocator:5662) igt_core-INFO: #7 [_start+0x25]
**** END ****
Subtest twice-new-nomemset: FAIL (32.719s)
|
| Dmesg |
<6> [279.878402] Console: switching to colour dummy device 80x25
<6> [279.878707] [IGT] xe_exec_system_allocator: executing
<7> [281.514628] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 0 val 0x2b2b2829
<7> [281.514824] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 1 val 0x2b2b2a2b
<7> [281.728965] xe 0000:03:00.0: [drm:intel_power_well_enable [xe]] enabling AUX_TC2
<7> [281.840970] xe 0000:03:00.0: [drm:intel_power_well_disable [xe]] disabling AUX_TC2
<3> [282.178912] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=38819 recv=38818
<3> [284.481386] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=38819 recv=38818
<3> [286.784874] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=38820 recv=38818
<6> [286.908807] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [286.908834] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [286.908844] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [286.908854] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [289.088881] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=38820 recv=38818
<3> [291.392841] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=38821 recv=38818
<3> [293.696800] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=38821 recv=38818
<6> [293.706541] [IGT] xe_exec_system_allocator: starting subtest twice-new-nomemset
<7> [293.707408] xe 0000:03:00.0: [drm:drm_pagemap_dev_unhold_work [drm_gpusvm_helper]] Releasing reference on provider device and module.
<3> [296.000797] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=38822 recv=38818
<7> [296.457996] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 0 val 0x2d2d2a2b
<7> [296.458145] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 1 val 0x2d2d2c2d
<3> [298.304792] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=38822 recv=38818
<3> [300.608718] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=38823 recv=38818
<3> [302.912718] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=38823 recv=38818
<3> [305.216619] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=38824 recv=38818
<6> [305.335563] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [305.335589] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [305.335600] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [305.335609] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [307.520591] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=38824 recv=38818
<3> [309.824619] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=38825 recv=38818
<7> [311.483545] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 0 val 0x2e2e2b2b
<7> [311.483774] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 1 val 0x2e2d2d2e
<6> [312.020334] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [312.020359] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [312.020369] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [312.020379] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [312.128555] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=38825 recv=38818
<3> [314.432513] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=38826 recv=38818
<6> [314.547234] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [314.547319] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [314.547329] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [314.547338] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [316.736485] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=38826 recv=38818
<6> [316.855466] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [316.855492] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [316.855502] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [316.855511] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [319.040449] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=38827 recv=38818
<6> [319.159442] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [319.159467] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [319.159477] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [319.159486] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [321.344415] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=38827 recv=38818
<6> [326.426512] [IGT] xe_exec_system_allocator: finished subtest twice-new-nomemset, FAIL
<6> [326.427076] [IGT] xe_exec_system_allocator: exiting, ret=98
<6> [326.427609] Console: switching to colour frame buffer device 240x67
<7> [326.464751] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 0 val 0x2e2e2b2c
<7> [326.464932] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 1 val 0x2e2e2d2e
<6> [326.568827] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [326.568853] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [326.568863] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [326.568872] nvme 0000:05:00.0: [ 0] RxErr (First)
<6> [326.677359] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [326.677386] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [326.677396] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [326.677405] nvme 0000:05:00.0: [ 0] RxErr (First)
<4> [331.520329] xe 0000:03:00.0: [drm] Tile0: GT0: Schedule disable failed to respond, guc_id=2
<7> [331.520354] xe 0000:03:00.0: [drm:xe_devcoredump [xe]] Multiple hangs are occurring, but only the first snapshot was taken
<6> [331.520738] xe 0000:03:00.0: [drm] Tile0: GT0: trying reset from guc_exec_queue_timedout_job [xe]
<6> [331.521240] xe 0000:03:00.0: [drm] Tile0: GT0: reset queued
<3> [331.521312] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=38828 recv=38818
<6> [331.530745] xe 0000:03:00.0: [drm] Tile0: GT0: reset started
<7> [331.530851] xe 0000:03:00.0: [drm:guc_ct_change_state [xe]] Tile0: GT0: GuC CT communication channel stopped
<7> [331.531297] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: Applying GT save-restore MMIOs
<7> [331.531391] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0x4148] = 0x00000000
<7> [331.531485] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0x8828] = 0x00800000
<7> [331.531571] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0xb0c8] = 0x11111440
<7> [331.531655] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0xb104] = 0x08104440
<7> [331.531737] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0xb108] = 0x30200000
<7> [331.531818] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0xb158] = 0x0000007f
<7> [331.531904] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0xe7cc] = 0x00000100
<7> [331.531989] xe 0000:03:00.0: [drm:xe_wopcm_init [xe]] WOPCM: 4096K
<7> [331.532096] xe 0000:03:00.0: [drm:xe_wopcm_init [xe]] GuC WOPCM is already locked [6144K, 832K)
<7> [331.532213] xe 0000:03:00.0: [drm:guc_ct_change_state [xe]] Tile0: GT0: GuC CT communication channel disabled
<7> [331.533323] xe 0000:03:00.0: [drm:xe_guc_ads_populate [xe]] Tile0: GT0: Updated ADS capture size 20480 (was 49152)
<3> [331.544137] xe 0000:03:00.0: [drm] *ERROR* Tile0: GT0: load failed: status = 0x400000A0, time = 10ms, freq = 2150MHz (req 2133MHz)
<3> [331.556387] xe 0000:03:00.0: [drm] *ERROR* Tile0: GT0: load failed: status: Reset = 0, BootROM = 0x50, UKernel = 0x00, MIA = 0x00, Auth = 0x01
<3> [331.569158] xe 0000:03:00.0: [drm] *ERROR* Tile0: GT0: firmware signature verification failed
<3> [331.577955] xe 0000:03:00.0: [drm] *ERROR* Tile0: GT0: reset failed (-EPROTO)
<3> [331.585111] xe 0000:03:00.0: [drm] *ERROR* CRITICAL: Xe has declared device 0000:03:00.0 as wedged.
IOCTLs and executions are blocked.
For recovery procedure, refer to https://docs.kernel.org/gpu/drm-uapi.html#device-wedging
Please file a _new_ bug report at https://gitlab.freedesktop.org/drm/xe/kernel/issues/new
<7> [331.616939] xe 0000:03:00.0: [drm:guc_ct_change_state [xe]] Tile0: GT0: GuC CT communication channel stopped
<7> [331.617081] xe 0000:03:00.0: [drm:guc_ct_change_state [xe]] Tile0: GT1: GuC CT communication channel stopped
<3> [331.676312] xe 0000:03:00.0: [drm] *ERROR* Tile0: GT1: GuC mmio request 0x5507: no reply 0x5507
<6> [331.686101] xe 0000:03:00.0: [drm] device wedged, needs recovery
<7> [331.688836] xe 0000:03:00.0: [drm:drm_pagemap_dev_unhold_work [drm_gpusvm_helper]] Releasing reference on provider device and module.
<7> [331.693730] xe 0000:03:00.0: [drm:drm_client_dev_restore] fbdev: ret=0
|