Result: 90 Warning(s)
i915_display_info22 igt_runner22 results22.json results22-xe-load.json guc_logs22.tar i915_display_info_post_exec22 boot22 dmesg22
| Detail | Value |
|---|---|
| Duration | 166.93 seconds |
| Hostname | shard-bmg-2 |
| Igt-Version | IGT-Version: 2.4-g93abaf017 (x86_64) (Linux: 7.0.0-lgci-xe-xe-pw-163428v3-debug+ x86_64) |
| Out | Using IGT_SRANDOM=1776113976 for randomisation<br>Opened device: /dev/dri/card0<br>Starting subtest: many-execqueues-userptr-rebind<br>Stack trace:<br>#0 ../lib/igt_core.c:2075 __igt_fail_assert()<br>#1 [xe_wait_ufence+0x57]<br>#2 ../tests/intel/xe_exec_fault_mode.c:307 test_exec()<br>#3 ../tests/intel/xe_exec_fault_mode.c:451 test_exec_main()<br>#4 ../tests/intel/xe_exec_fault_mode.c:577 __igt_unique____real_main456()<br>#5 ../tests/intel/xe_exec_fault_mode.c:456 main()<br>#6 [__libc_init_first+0x8a]<br>#7 [__libc_start_main+0x8b]<br>#8 [_start+0x25]<br>Subtest many-execqueues-userptr-rebind: FAIL (166.931s) |
| Err | Starting subtest: many-execqueues-userptr-rebind<br>(xe_exec_fault_mode:10532) xe/xe_ioctl-CRITICAL: Test assertion failure function xe_wait_ufence, file ../lib/xe/xe_ioctl.c:763:<br>(xe_exec_fault_mode:10532) xe/xe_ioctl-CRITICAL: Failed assertion: __xe_wait_ufence(fd, addr, value, exec_queue, &timeout) == 0<br>(xe_exec_fault_mode:10532) xe/xe_ioctl-CRITICAL: Last errno: 62, Timer expired<br>(xe_exec_fault_mode:10532) xe/xe_ioctl-CRITICAL: error: -62 != 0<br>Subtest many-execqueues-userptr-rebind failed.<br>**** DEBUG ****<br>(xe_exec_fault_mode:10532) DEBUG: test_exec running on: DRM_XE_ENGINE_CLASS_RENDER<br>(xe_exec_fault_mode:10532) xe/xe_ioctl-CRITICAL: Test assertion failure function xe_wait_ufence, file ../lib/xe/xe_ioctl.c:763:<br>(xe_exec_fault_mode:10532) xe/xe_ioctl-CRITICAL: Failed assertion: __xe_wait_ufence(fd, addr, value, exec_queue, &timeout) == 0<br>(xe_exec_fault_mode:10532) xe/xe_ioctl-CRITICAL: Last errno: 62, Timer expired<br>(xe_exec_fault_mode:10532) xe/xe_ioctl-CRITICAL: error: -62 != 0<br>(xe_exec_fault_mode:10532) igt_core-INFO: Stack trace:<br>(xe_exec_fault_mode:10532) igt_core-INFO: #0 ../lib/igt_core.c:2075 __igt_fail_assert()<br>(xe_exec_fault_mode:10532) igt_core-INFO: #1 [xe_wait_ufence+0x57]<br>(xe_exec_fault_mode:10532) igt_core-INFO: #2 ../tests/intel/xe_exec_fault_mode.c:307 test_exec()<br>(xe_exec_fault_mode:10532) igt_core-INFO: #3 ../tests/intel/xe_exec_fault_mode.c:451 test_exec_main()<br>(xe_exec_fault_mode:10532) igt_core-INFO: #4 ../tests/intel/xe_exec_fault_mode.c:577 __igt_unique____real_main456()<br>(xe_exec_fault_mode:10532) igt_core-INFO: #5 ../tests/intel/xe_exec_fault_mode.c:456 main()<br>(xe_exec_fault_mode:10532) igt_core-INFO: #6 [__libc_init_first+0x8a]<br>(xe_exec_fault_mode:10532) igt_core-INFO: #7 [__libc_start_main+0x8b]<br>(xe_exec_fault_mode:10532) igt_core-INFO: #8 [_start+0x25]<br>**** END ****<br>Subtest many-execqueues-userptr-rebind: FAIL (166.931s) |
| Dmesg |
<6> [384.107822] Console: switching to colour dummy device 80x25
<6> [384.108722] [IGT] xe_exec_fault_mode: executing
<3> [386.402330] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31805 recv=31794
<3> [388.706331] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31805 recv=31794
<6> [388.825303] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [388.825329] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [388.825339] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [388.825349] nvme 0000:05:00.0: [ 0] RxErr (First)
<7> [389.167626] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 0 val 0x2e2e2b2c
<7> [389.167788] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 1 val 0x2e2e2d2e
<3> [391.010324] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31806 recv=31794
<3> [393.314263] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31806 recv=31794
<3> [395.618290] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31807 recv=31794
<3> [397.922256] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31807 recv=31794
<6> [397.931743] [IGT] xe_exec_fault_mode: starting subtest many-execqueues-userptr-rebind
<7> [397.932746] xe 0000:03:00.0: [drm:drm_pagemap_dev_unhold_work [drm_gpusvm_helper]] Releasing reference on provider device and module.
<3> [400.226242] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31808 recv=31794
<3> [402.530245] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31808 recv=31794
<6> [404.126824] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [404.126830] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [404.126832] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [404.126834] nvme 0000:05:00.0: [ 0] RxErr (First)
<7> [404.162241] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 0 val 0x2e2e2b2c
<7> [404.162385] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 1 val 0x2e2e2d2f
<6> [404.267788] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [404.267821] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [404.267834] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [404.267847] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [404.834223] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31809 recv=31794
<3> [407.138157] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31809 recv=31794
<3> [409.442161] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31810 recv=31794
<6> [409.561415] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [409.561449] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [409.561462] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [409.561474] nvme 0000:05:00.0: [ 0] RxErr (First)
<6> [409.670130] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [409.670156] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [409.670166] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [409.670175] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [411.746106] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31810 recv=31794
<3> [414.050172] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31811 recv=31794
<6> [414.860271] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [414.860297] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [414.860308] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [414.860317] nvme 0000:05:00.0: [ 0] RxErr (First)
<6> [414.969063] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [414.969090] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [414.969100] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [414.969109] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [416.354101] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31811 recv=31794
<3> [418.658105] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31812 recv=31794
<7> [419.195103] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 0 val 0x2e2e2b2c
<7> [419.195324] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 1 val 0x2f2e2e2f
<6> [419.301738] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [419.301764] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [419.301773] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [419.301783] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [420.962083] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31812 recv=31794
<3> [423.266102] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31813 recv=31794
<6> [423.386531] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [423.386557] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [423.386567] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [423.386576] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [425.570054] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31813 recv=31794
<3> [427.874031] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31814 recv=31794
<3> [430.177990] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31814 recv=31794
<3> [432.481964] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31815 recv=31794
<7> [434.154548] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 0 val 0x2e2f2b2c
<7> [434.154768] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 1 val 0x2e2e2e2f
<6> [434.629778] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [434.629804] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [434.629814] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [434.629824] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [434.786001] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31815 recv=31794
<3> [437.089943] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31816 recv=31794
<3> [439.393918] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31816 recv=31794
<3> [441.697945] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31817 recv=31794
<3> [444.001941] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31817 recv=31794
<3> [446.305929] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31818 recv=31794
<6> [446.420939] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [446.420966] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [446.420976] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [446.420985] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [448.609864] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31818 recv=31794
<6> [448.728708] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [448.728734] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [448.728744] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [448.728754] nvme 0000:05:00.0: [ 0] RxErr (First)
<6> [448.838277] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [448.838302] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [448.838313] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [448.838322] nvme 0000:05:00.0: [ 0] RxErr (First)
<7> [449.195254] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 0 val 0x2f2f2c2c
<7> [449.195435] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 1 val 0x2f2e2e2f
<6> [449.301103] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [449.301135] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [449.301148] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [449.301159] nvme 0000:05:00.0: [ 0] RxErr (First)
<6> [449.409824] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [449.409850] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [449.409860] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [449.409870] nvme 0000:05:00.0: [ 0] RxErr (First)
<6> [449.518925] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [449.518951] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [449.518962] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [449.518971] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [450.913853] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31819 recv=31794
<3> [453.219076] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31819 recv=31794
<3> [455.521866] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31820 recv=31794
<3> [457.825811] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31820 recv=31794
<6> [457.945120] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [457.945146] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [457.945156] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [457.945165] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [460.129839] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31821 recv=31794
<3> [462.433821] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31821 recv=31794
<6> [464.159856] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [464.159889] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [464.159904] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [464.159920] nvme 0000:05:00.0: [ 0] RxErr (First)
<7> [464.207915] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 0 val 0x2f2f2c2d
<7> [464.208058] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 1 val 0x2f2f2e2f
<3> [464.737806] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31822 recv=31794
<6> [465.547437] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [465.547463] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [465.547474] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [465.547483] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [467.041762] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31822 recv=31794
<3> [469.345735] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31823 recv=31794
<3> [471.649733] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31823 recv=31794
<3> [473.953716] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31824 recv=31794
<3> [476.257715] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31824 recv=31794
<6> [476.370784] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [476.370810] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [476.370820] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [476.370830] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [478.561733] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31825 recv=31794
<6> [478.680764] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [478.680790] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [478.680800] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [478.680810] nvme 0000:05:00.0: [ 0] RxErr (First)
<7> [479.168807] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 0 val 0x2f2f2c2d
<7> [479.168966] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 1 val 0x2f2f2e2f
<6> [479.274822] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [479.274849] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [479.274859] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [479.274869] nvme 0000:05:00.0: [ 0] RxErr (First)
<6> [479.384659] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [479.384686] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [479.384695] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [479.384705] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [480.865605] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31825 recv=31794
<6> [481.930848] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [481.930875] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [481.930885] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [481.930894] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [483.169691] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31826 recv=31794
<3> [485.473679] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31826 recv=31794
<3> [487.777671] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31827 recv=31794
<6> [487.896242] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [487.896269] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [487.896279] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [487.896289] nvme 0000:05:00.0: [ 0] RxErr (First)
<6> [488.003633] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [488.003659] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [488.003669] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [488.003678] nvme 0000:05:00.0: [ 0] RxErr (First)
<6> [488.112619] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [488.112645] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [488.112655] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [488.112665] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [490.081609] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31827 recv=31794
<3> [492.385642] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31828 recv=31794
<6> [492.505884] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [492.505910] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [492.505919] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [492.505929] nvme 0000:05:00.0: [ 0] RxErr (First)
<7> [494.166060] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 0 val 0x2f2f2c2d
<7> [494.166221] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 1 val 0x2f2f2e30
<3> [494.689576] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31828 recv=31794
<3> [496.993574] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31829 recv=31794
<3> [499.297652] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31829 recv=31794
<3> [501.601535] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31830 recv=31794
<3> [503.905327] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31830 recv=31794
<3> [506.209453] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31831 recv=31794
<3> [508.513341] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31831 recv=31794
<7> [509.192696] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 0 val 0x2f2f2c2d
<7> [509.192948] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 1 val 0x2f2f2f30
<3> [510.817262] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31832 recv=31794
<6> [512.650491] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [512.650518] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [512.650528] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [512.650537] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [513.121234] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31832 recv=31794
<3> [515.426062] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31833 recv=31794
<6> [515.546168] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [515.546194] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [515.546205] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [515.546214] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [517.729163] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31833 recv=31794
<3> [520.033036] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31834 recv=31794
<3> [522.336997] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31834 recv=31794
<6> [522.457216] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [522.457243] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [522.457253] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [522.457262] nvme 0000:05:00.0: [ 0] RxErr (First)
<7> [524.145920] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 0 val 0x2f302c2d
<7> [524.146080] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 1 val 0x2f2f2f2f
<3> [524.640931] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31835 recv=31794
<3> [526.944931] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31835 recv=31794
<3> [529.248829] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31836 recv=31794
<3> [531.552791] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31836 recv=31794
<3> [533.856655] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31837 recv=31794
<3> [536.160716] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31837 recv=31794
<6> [536.280049] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [536.280076] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [536.280087] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [536.280096] nvme 0000:05:00.0: [ 0] RxErr (First)
<6> [536.387642] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [536.387668] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [536.387679] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [536.387688] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [538.464689] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31838 recv=31794
<6> [538.579923] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [538.579950] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [538.579960] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [538.579970] nvme 0000:05:00.0: [ 0] RxErr (First)
<7> [539.202212] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 0 val 0x2f2f2c2d
<7> [539.202356] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 1 val 0x302f2f30
<6> [539.307762] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [539.307789] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [539.307798] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [539.307808] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [540.768634] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31838 recv=31794
<3> [543.072624] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31839 recv=31794
<6> [543.369803] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [543.369830] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [543.369840] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [543.369850] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [545.376535] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31839 recv=31794
<3> [547.680470] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31840 recv=31794
<6> [547.800834] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [547.800860] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [547.800870] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [547.800879] nvme 0000:05:00.0: [ 0] RxErr (First)
<6> [547.908553] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [547.908580] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [547.908590] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [547.908599] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [549.984442] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31840 recv=31794
<3> [552.288340] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31841 recv=31794
<7> [554.185628] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 0 val 0x2f2f2c2d
<7> [554.185774] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 1 val 0x2f302f30
<6> [554.291688] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [554.291715] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [554.291725] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [554.291734] nvme 0000:05:00.0: [ 0] RxErr (First)
<6> [554.400240] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [554.400267] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [554.400277] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [554.400287] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [554.592268] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31841 recv=31794
<6> [554.707074] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [554.707100] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [554.707110] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [554.707119] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [556.896283] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31842 recv=31794
<3> [559.200203] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31842 recv=31794
<6> [559.320542] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [559.320568] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [559.320578] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [559.320587] nvme 0000:05:00.0: [ 0] RxErr (First)
<6> [559.428314] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [559.428340] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [559.428351] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [559.428361] nvme 0000:05:00.0: [ 0] RxErr (First)
<6> [559.537319] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [559.537347] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [559.537357] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [559.537366] nvme 0000:05:00.0: [ 0] RxErr (First)
<3> [561.504176] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31843 recv=31794
<3> [563.808122] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31843 recv=31794
<6> [563.928736] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [563.928763] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [563.928773] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [563.928782] nvme 0000:05:00.0: [ 0] RxErr (First)
<6> [564.863989] [IGT] xe_exec_fault_mode: finished subtest many-execqueues-userptr-rebind, FAIL
<6> [564.980197] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [564.980223] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [564.980233] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [564.980243] nvme 0000:05:00.0: [ 0] RxErr (First)
<6> [565.090290] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [565.090316] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [565.090326] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [565.090335] nvme 0000:05:00.0: [ 0] RxErr (First)
<6> [565.199538] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [565.199565] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [565.199575] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [565.199584] nvme 0000:05:00.0: [ 0] RxErr (First)
<7> [569.153061] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 0 val 0x2f302d2d
<7> [569.153221] xe 0000:03:00.0: [drm:xe_hwmon_read [xe]] thermal data for group 1 val 0x2f2f2f30
<4> [569.888074] xe 0000:03:00.0: [drm] Tile0: GT0: Schedule disable failed to respond, guc_id=2
<6> [569.996826] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0
<4> [569.996832] nvme 0000:05:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
<4> [569.996833] nvme 0000:05:00.0: device [15b7:5017] error status/mask=00000001/0000e000
<4> [569.996835] nvme 0000:05:00.0: [ 0] RxErr (First)
<7> [570.075560] xe 0000:03:00.0: [drm:xe_hw_engine_snapshot_capture [xe]] Tile0: GT0: Proceeding with manual engine snapshot
<6> [570.075846] xe 0000:03:00.0: [drm] Xe device coredump has been created
<6> [570.075867] xe 0000:03:00.0: [drm] Check your /sys/class/drm/card0/device/devcoredump/data
<6> [570.075870] xe 0000:03:00.0: [drm] Tile0: GT0: trying reset from guc_exec_queue_timedout_job [xe]
<6> [570.075946] xe 0000:03:00.0: [drm] Tile0: GT0: reset queued
<3> [570.076148] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31844 recv=31794
<3> [570.085054] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31845 recv=31794
<6> [570.093952] xe 0000:03:00.0: [drm] Tile0: GT0: reset started
<7> [570.094069] xe 0000:03:00.0: [drm:guc_ct_change_state [xe]] Tile0: GT0: GuC CT communication channel stopped
<7> [570.094549] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: Applying GT save-restore MMIOs
<7> [570.094638] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0x4148] = 0x00000000
<7> [570.094741] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0x8828] = 0x00800000
<7> [570.094843] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0xb0c8] = 0x11111440
<7> [570.094936] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0xb104] = 0x08104440
<7> [570.095018] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0xb108] = 0x30200000
<7> [570.095099] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0xb158] = 0x0000007f
<7> [570.095180] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0xe7cc] = 0x00000100
<7> [570.095256] xe 0000:03:00.0: [drm:xe_wopcm_init [xe]] WOPCM: 4096K
<7> [570.095352] xe 0000:03:00.0: [drm:xe_wopcm_init [xe]] GuC WOPCM is already locked [6144K, 832K)
<7> [570.095498] xe 0000:03:00.0: [drm:guc_ct_change_state [xe]] Tile0: GT0: GuC CT communication channel disabled
<7> [570.096671] xe 0000:03:00.0: [drm:xe_guc_ads_populate [xe]] Tile0: GT0: Updated ADS capture size 20480 (was 49152)
<3> [570.107432] xe 0000:03:00.0: [drm] *ERROR* Tile0: GT0: load failed: status = 0x400000A0, time = 10ms, freq = 2150MHz (req 2133MHz)
<3> [570.119271] xe 0000:03:00.0: [drm] *ERROR* Tile0: GT0: load failed: status: Reset = 0, BootROM = 0x50, UKernel = 0x00, MIA = 0x00, Auth = 0x01
<3> [570.132046] xe 0000:03:00.0: [drm] *ERROR* Tile0: GT0: firmware signature verification failed
<3> [570.140819] xe 0000:03:00.0: [drm] *ERROR* Tile0: GT0: reset failed (-EPROTO)
<3> [570.147966] xe 0000:03:00.0: [drm] *ERROR* CRITICAL: Xe has declared device 0000:03:00.0 as wedged.
IOCTLs and executions are blocked.
For recovery procedure, refer to https://docs.kernel.org/gpu/drm-uapi.html#device-wedging
Please file a _new_ bug report at https://gitlab.freedesktop.org/drm/xe/kernel/issues/new
<7> [570.179805] xe 0000:03:00.0: [drm:guc_ct_change_state [xe]] Tile0: GT0: GuC CT communication channel stopped
<7> [570.179985] xe 0000:03:00.0: [drm:guc_ct_change_state [xe]] Tile0: GT1: GuC CT communication channel stopped
<3> [570.239558] xe 0000:03:00.0: [drm] *ERROR* Tile0: GT1: GuC mmio request 0x5507: no reply 0x5507
<6> [570.249354] xe 0000:03:00.0: [drm] device wedged, needs recovery
<7> [570.253526] xe 0000:03:00.0: [drm:drm_pagemap_dev_unhold_work [drm_gpusvm_helper]] Releasing reference on provider device and module.
<7> [570.269024] xe 0000:03:00.0: [drm:drm_client_dev_restore] fbdev: ret=0
<6> [570.271354] [IGT] xe_exec_fault_mode: exiting, ret=98
<6> [570.285855] Console: switching to colour frame buffer device 240x67
|
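
The Dmesg cell above repeats the same `TLB invalidation fence timeout` error many times, with `recv` staying at 31794 while `seqno` climbs from 31805 to 31845. A minimal sketch of how such a log could be condensed for triage follows. It is not part of IGT or the CI tooling: the function name `summarize_tlb_timeouts`, the regular expression, and the `SAMPLE` list (three lines copied from the log above) are assumptions made for illustration only.

```python
import re

# Hypothetical pattern, written against the xe timeout lines seen in the Dmesg cell.
TLB_TIMEOUT = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+xe\s+(?P<dev>\S+):\s+\[drm\]\s+\*ERROR\*\s+"
    r"TLB invalidation fence timeout, seqno=(?P<seqno>\d+) recv=(?P<recv>\d+)"
)


def summarize_tlb_timeouts(lines):
    """Summarize TLB invalidation fence timeouts; return None if none are found."""
    hits = []
    for line in lines:
        m = TLB_TIMEOUT.search(line)
        if m:
            hits.append((float(m["ts"]), int(m["seqno"]), int(m["recv"])))
    if not hits:
        return None
    return {
        "count": len(hits),                      # number of timeout lines matched
        "first_ts": hits[0][0],                  # kernel timestamp of the first match
        "last_ts": hits[-1][0],                  # kernel timestamp of the last match
        "seqno_range": (min(h[1] for h in hits), max(h[1] for h in hits)),
        "recv_values": sorted({h[2] for h in hits}),  # distinct acknowledged seqnos
    }


# Three lines copied verbatim from the log above; a real run would read the dmesg file.
SAMPLE = [
    "<3> [386.402330] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31805 recv=31794",
    "<6> [388.825303] pcieport 0000:00:06.0: AER: Multiple Correctable error message received from 0000:05:00.0",
    "<3> [570.085054] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=31845 recv=31794",
]

summary = summarize_tlb_timeouts(SAMPLE)
print(summary)
```

A single entry in `recv_values` alongside a widening `seqno_range` matches what the log shows: the requested sequence number keeps advancing while the received one does not.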