Result: 28 Warning(s)
| Detail | Value |
|---|---|
| Duration | unknown |
| Hostname | shard-bmg-2 |
| Igt-Version | IGT-Version: 2.3-g24cec315d (x86_64) (Linux: 7.0.0-rc3-lgci-xe-xe-4719-bbe6ae2e40f59b05f-debug+ x86_64) |
| Out |
Using IGT_SRANDOM=1773550762 for randomisation
Opened device: /dev/dri/card0
Starting subtest: bind-one-bo-many-times
Stack trace:
  #0 ../lib/igt_core.c:2075 __igt_fail_assert()
  #1 ../lib/igt_syncobj.c:71 syncobj_create()
  #2 ../lib/xe/xe_ioctl.c:316 __xe_vm_bind_sync()
  #3 ../lib/xe/xe_ioctl.c:327 xe_vm_bind_sync()
  #4 ../tests/intel/xe_vm.c:126 __test_bind_one_bo()
  #5 ../tests/intel/xe_vm.c:2578 __igt_unique____real_main2453()
  #6 ../tests/intel/xe_vm.c:2453 main()
  #7 [__libc_init_first+0x8a]
  #8 [__libc_start_main+0x8b]
  #9 [_start+0x25]
Subtest bind-one-bo-many-times: FAIL (5.196s)
This test caused an abort condition: Kernel badly tainted (0x40244, 0x200) (check dmesg for details):
TAINT_WARN: WARN_ON has happened. |
| Err |
Starting subtest: bind-one-bo-many-times
(xe_vm:11661) igt_syncobj-CRITICAL: Test assertion failure function syncobj_create, file ../lib/igt_syncobj.c:68:
(xe_vm:11661) igt_syncobj-CRITICAL: Failed assertion: __syncobj_create(fd, &handle, flags) == 0
(xe_vm:11661) igt_syncobj-CRITICAL: error: -125 != 0
Subtest bind-one-bo-many-times failed.
**** DEBUG ****
(xe_vm:11661) DEBUG: Binding addr 0
(xe_vm:11661) igt_syncobj-CRITICAL: Test assertion failure function syncobj_create, file ../lib/igt_syncobj.c:68:
(xe_vm:11661) igt_syncobj-CRITICAL: Failed assertion: __syncobj_create(fd, &handle, flags) == 0
(xe_vm:11661) igt_syncobj-CRITICAL: error: -125 != 0
(xe_vm:11661) igt_core-INFO: Stack trace:
(xe_vm:11661) igt_core-INFO: #0 ../lib/igt_core.c:2075 __igt_fail_assert()
(xe_vm:11661) igt_core-INFO: #1 ../lib/igt_syncobj.c:71 syncobj_create()
(xe_vm:11661) igt_core-INFO: #2 ../lib/xe/xe_ioctl.c:316 __xe_vm_bind_sync()
(xe_vm:11661) igt_core-INFO: #3 ../lib/xe/xe_ioctl.c:327 xe_vm_bind_sync()
(xe_vm:11661) igt_core-INFO: #4 ../tests/intel/xe_vm.c:126 __test_bind_one_bo()
(xe_vm:11661) igt_core-INFO: #5 ../tests/intel/xe_vm.c:2578 __igt_unique____real_main2453()
(xe_vm:11661) igt_core-INFO: #6 ../tests/intel/xe_vm.c:2453 main()
(xe_vm:11661) igt_core-INFO: #7 [__libc_init_first+0x8a]
(xe_vm:11661) igt_core-INFO: #8 [__libc_start_main+0x8b]
(xe_vm:11661) igt_core-INFO: #9 [_start+0x25]
**** END ****
Subtest bind-one-bo-many-times: FAIL (5.196s) |
| Dmesg |
<6> [519.760082] Console: switching to colour dummy device 80x25
<6> [519.760366] [IGT] xe_vm: executing
<6> [519.768892] [IGT] xe_vm: starting subtest bind-one-bo-many-times
<4> [524.751603] xe 0000:03:00.0: [drm] Tile0: GT0: Schedule disable failed to respond, guc_id=2
<4> [524.751644] xe 0000:03:00.0: [drm] Tile0: GT1: Schedule disable failed to respond, guc_id=0
<7> [524.939109] xe 0000:03:00.0: [drm:xe_hw_engine_snapshot_capture [xe]] Tile0: GT0: Proceeding with manual engine snapshot
<6> [524.939344] xe 0000:03:00.0: [drm] Xe device coredump has been created
<6> [524.939369] xe 0000:03:00.0: [drm] Check your /sys/class/drm/card0/device/devcoredump/data
<6> [524.939395] xe 0000:03:00.0: [drm] Tile0: GT0: trying reset from guc_exec_queue_timedout_job [xe]
<7> [524.939458] xe 0000:03:00.0: [drm:xe_devcoredump [xe]] Multiple hangs are occurring, but only the first snapshot was taken
<6> [524.939506] xe 0000:03:00.0: [drm] Tile0: GT0: reset queued
<6> [524.939641] xe 0000:03:00.0: [drm] Tile0: GT1: trying reset from guc_exec_queue_timedout_job [xe]
<6> [524.939722] xe 0000:03:00.0: [drm] Tile0: GT1: reset queued
<7> [524.939948] xe 0000:03:00.0: [drm:xe_devcoredump [xe]] Multiple hangs are occurring, but only the first snapshot was taken
<6> [524.940007] xe 0000:03:00.0: [drm] Tile0: GT0: trying reset from guc_exec_queue_timedout_job [xe]
<7> [524.940018] xe 0000:03:00.0: [drm:xe_devcoredump [xe]] Multiple hangs are occurring, but only the first snapshot was taken
<7> [524.940082] xe 0000:03:00.0: [drm:xe_devcoredump [xe]] Multiple hangs are occurring, but only the first snapshot was taken
<6> [524.940079] xe 0000:03:00.0: [drm] Tile0: GT1: trying reset from guc_exec_queue_timedout_job [xe]
<6> [524.940139] xe 0000:03:00.0: [drm] Tile0: GT0: trying reset from guc_exec_queue_timedout_job [xe]
<7> [524.940162] xe 0000:03:00.0: [drm:xe_devcoredump [xe]] Multiple hangs are occurring, but only the first snapshot was taken
<3> [524.940210] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=15660 recv=15637
<3> [524.940214] xe 0000:03:00.0: [drm] *ERROR* TLB invalidation fence timeout, seqno=15661 recv=15637
<6> [524.940396] xe 0000:03:00.0: [drm] Tile0: GT0: reset started
<6> [524.940406] xe 0000:03:00.0: [drm] Tile0: GT1: trying reset from guc_exec_queue_timedout_job [xe]
<7> [524.940478] xe 0000:03:00.0: [drm:xe_devcoredump [xe]] Multiple hangs are occurring, but only the first snapshot was taken
<7> [524.940528] xe 0000:03:00.0: [drm:guc_ct_change_state [xe]] Tile0: GT0: GuC CT communication channel stopped
<6> [524.940536] xe 0000:03:00.0: [drm] Tile0: GT1: trying reset from guc_exec_queue_timedout_job [xe]
<6> [524.940658] xe 0000:03:00.0: [drm] Tile0: GT1: reset started
<7> [524.940959] xe 0000:03:00.0: [drm:guc_ct_change_state [xe]] Tile0: GT1: GuC CT communication channel stopped
<7> [524.941161] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: Applying GT save-restore MMIOs
<7> [524.941249] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0x4148] = 0x00000000
<7> [524.941356] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0x8828] = 0x00800000
<7> [524.941438] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0xb0c8] = 0x11111440
<7> [524.941531] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0xb104] = 0x08104440
<7> [524.941663] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0xb108] = 0x30200000
<7> [524.941757] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT0: REG[0xb158] = 0x0000007f
<7> [524.941779] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT1: Applying GT save-restore MMIOs
<7> [524.941834] xe 0000:03:00.0: [drm:xe_wopcm_init [xe]] WOPCM: 4096K
<7> [524.941926] xe 0000:03:00.0: [drm:xe_wopcm_init [xe]] GuC WOPCM is already locked [6144K, 832K)
<7> [524.941989] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT1: REG[0x4148] = 0x00000000
<7> [524.942035] xe 0000:03:00.0: [drm:guc_ct_change_state [xe]] Tile0: GT0: GuC CT communication channel disabled
<7> [524.942078] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT1: REG[0xc50c] = 0x00010002
<7> [524.942162] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT1: REG[0x1c3f08] = 0x00000020
<7> [524.942242] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT1: REG[0x1c3f1c] = 0x00800008
<7> [524.942376] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT1: REG[0x1d3f08] = 0x00000020
<7> [524.942492] xe 0000:03:00.0: [drm:xe_reg_sr_apply_mmio [xe]] Tile0: GT1: REG[0x1d3f1c] = 0x00800008
<7> [524.942565] xe 0000:03:00.0: [drm:xe_wopcm_init [xe]] WOPCM: 4096K
<7> [524.942794] xe 0000:03:00.0: [drm:xe_wopcm_init [xe]] GuC WOPCM is already locked [5136K, 832K)
<7> [524.943137] xe 0000:03:00.0: [drm:guc_ct_change_state [xe]] Tile0: GT1: GuC CT communication channel disabled
<7> [524.943225] xe 0000:03:00.0: [drm:xe_guc_ads_populate [xe]] Tile0: GT0: Updated ADS capture size 20480 (was 49152)
<7> [524.944220] xe 0000:03:00.0: [drm:xe_guc_ads_populate [xe]] Tile0: GT1: Updated ADS capture size 20480 (was 49152)
<3> [524.954307] xe 0000:03:00.0: [drm] *ERROR* Tile0: GT0: load failed: status = 0x400000A0, time = 9ms, freq = 2150MHz (req 2133MHz)
<3> [524.954437] xe 0000:03:00.0: [drm] *ERROR* Tile0: GT0: load failed: status: Reset = 0, BootROM = 0x50, UKernel = 0x00, MIA = 0x00, Auth = 0x01
<3> [524.954458] xe 0000:03:00.0: [drm] *ERROR* Tile0: GT0: firmware signature verification failed
<3> [524.954732] xe 0000:03:00.0: [drm] *ERROR* Tile0: GT0: reset failed (-EPROTO)
<3> [524.954843] xe 0000:03:00.0: [drm] *ERROR* CRITICAL: Xe has declared device 0000:03:00.0 as wedged.
IOCTLs and executions are blocked.
For recovery procedure, refer to https://docs.kernel.org/gpu/drm-uapi.html#device-wedging
Please file a _new_ bug report at https://gitlab.freedesktop.org/drm/xe/kernel/issues/new
<7> [524.954892] xe 0000:03:00.0: [drm:guc_ct_change_state [xe]] Tile0: GT0: GuC CT communication channel stopped
<3> [524.954977] xe 0000:03:00.0: [drm] *ERROR* Tile0: GT1: load failed: status = 0x400000A0, time = 10ms, freq = 1500MHz (req 1500MHz)
<3> [524.954981] xe 0000:03:00.0: [drm] *ERROR* Tile0: GT1: load failed: status: Reset = 0, BootROM = 0x50, UKernel = 0x00, MIA = 0x00, Auth = 0x01
<3> [524.954983] xe 0000:03:00.0: [drm] *ERROR* Tile0: GT1: firmware signature verification failed
<3> [524.955531] xe 0000:03:00.0: [drm] *ERROR* Tile0: GT1: reset failed (-EPROTO)
<7> [524.955553] xe 0000:03:00.0: [drm:guc_ct_change_state [xe]] Tile0: GT1: GuC CT communication channel stopped
<6> [524.955694] xe 0000:03:00.0: [drm] device wedged, needs recovery
<7> [524.955693] xe 0000:03:00.0: [drm:guc_ct_change_state [xe]] Tile0: GT0: GuC CT communication channel stopped
<7> [524.955848] xe 0000:03:00.0: [drm:guc_ct_change_state [xe]] Tile0: GT1: GuC CT communication channel stopped
<6> [524.955957] xe 0000:03:00.0: [drm] device wedged, needs recovery
<4> [524.956113] ------------[ cut here ]------------
<4> [524.956116] xe 0000:03:00.0: [drm] Tile0: GT0: Kernel-submitted job timed out
<4> [524.956119] WARNING: drivers/gpu/drm/xe/xe_guc_submit.c:1641 at guc_exec_queue_timedout_job+0x1424/0x2400 [xe], CPU#1: kworker/u64:46/7753
<4> [524.956231] Modules linked in: snd_hda_codec_intelhdmi snd_hda_codec_hdmi pmt_crashlog mei_lb mei_gsc_proxy mtd_intel_dg mei_gsc xe drm_gpuvm drm_gpusvm_helper drm_buddy drm_ttm_helper ttm gpu_sched drm_suballoc_helper drm_exec drm_display_helper cec rc_core drm_kunit_helpers i2c_algo_bit kunit intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common intel_tcc_cooling x86_pkg_temp_thermal cmdlinepart intel_powerclamp hid_generic spi_nor eeepc_wmi mtd coretemp asus_wmi sparse_keymap mei_pxp platform_profile mei_hdcp wmi_bmof kvm_intel snd_intel_dspcfg kvm snd_hda_codec irqbypass r8169 ghash_clmulni_intel aesni_intel snd_hda_core snd_hwdep rapl usbhid snd_pcm binfmt_misc intel_cstate realtek hid snd_timer spi_intel_pci i2c_i801 snd i2c_mux soundcore i2c_smbus idma64 spi_intel video intel_pmc_core pmt_telemetry pmt_discovery nls_iso8859_1 mei_me pmt_class intel_pmc_ssram_telemetry mei wmi intel_vsec pinctrl_alderlake acpi_pad acpi_tad dm_multipath msr nvme_fabrics fuse efi_pstore nfnetlink
<4> [524.956368] autofs4 [last unloaded: xe_live_test]
<4> [524.956375] CPU: 1 UID: 0 PID: 7753 Comm: kworker/u64:46 Tainted: G S U N 7.0.0-rc3-lgci-xe-xe-4719-bbe6ae2e40f59b05f-debug+ #1 PREEMPT(lazy)
<4> [524.956380] Tainted: [S]=CPU_OUT_OF_SPEC, [U]=USER, [N]=TEST
<4> [524.956383] Hardware name: ASUS System Product Name/PRIME Z790-P WIFI, BIOS 1645 03/15/2024
<4> [524.956385] Workqueue: gt-ordered-wq drm_sched_job_timedout [gpu_sched]
<4> [524.956395] RIP: 0010:guc_exec_queue_timedout_job+0x142d/0x2400 [xe]
<4> [524.956502] Code: 74 04 48 8b 7f 08 4c 8b 6f 50 4d 85 ed 75 03 4c 8b 2f e8 66 56 5e e1 48 89 c6 48 8d 3d dc a1 39 00 41 89 d8 44 89 e1 4c 89 ea <67> 48 0f b9 3a 48 8b 45 90 48 8b 40 60 e9 c6 ee ff ff 8b 70 08 49
<4> [524.956505] RSP: 0018:ffffc90005fbfca0 EFLAGS: 00010246
<4> [524.956510] RAX: ffffffffa11fe932 RBX: 0000000000000000 RCX: 0000000000000000
<4> [524.956513] RDX: ffff888104d6a510 RSI: ffffffffa11fe932 RDI: ffffffffa1003dc0
<4> [524.956515] RBP: ffffc90005fbfdb0 R08: 0000000000000000 R09: 0000000000000000
<4> [524.956518] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
<4> [524.956520] R13: ffff888104d6a510 R14: ffff8882afd2d018 R15: 00000000ffffffc2
<4> [524.956523] FS: 0000000000000000(0000) GS:ffff8888dad1b000(0000) knlGS:0000000000000000
<4> [524.956526] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [524.956528] CR2: 00007f0a1d8227fd CR3: 000000000344c003 CR4: 0000000000f72ef0
<4> [524.956531] PKRU: 55555554
<4> [524.956534] Call Trace:
<4> [524.956536] <TASK>
<4> [524.956544] ? lock_acquire+0x40/0x2f0
<4> [524.956554] ? lock_release+0xd0/0x2b0
<4> [524.956563] drm_sched_job_timedout+0x94/0x1a0 [gpu_sched]
<4> [524.956574] process_one_work+0x22e/0x740
<4> [524.956585] worker_thread+0x1e8/0x3d0
<4> [524.956590] ? __pfx_worker_thread+0x10/0x10
<4> [524.956594] kthread+0x10d/0x150
<4> [524.956598] ? __pfx_kthread+0x10/0x10
<4> [524.956604] ret_from_fork+0x3d4/0x480
<4> [524.956607] ? __pfx_kthread+0x10/0x10
<4> [524.956613] ret_from_fork_asm+0x1a/0x30
<4> [524.956625] </TASK>
<4> [524.956627] irq event stamp: 760725
<4> [524.956629] hardirqs last enabled at (760731): [<ffffffff814a9d69>] __up_console_sem+0x79/0xa0
<4> [524.956633] hardirqs last disabled at (760736): [<ffffffff814a9d4e>] __up_console_sem+0x5e/0xa0
<4> [524.956636] softirqs last enabled at (760208): [<ffffffff813d0eef>] __irq_exit_rcu+0x13f/0x160
<4> [524.956640] softirqs last disabled at (760203): [<ffffffff813d0eef>] __irq_exit_rcu+0x13f/0x160
<4> [524.956643] ---[ end trace 0000000000000000 ]---
<6> [524.956646] xe 0000:03:00.0: [drm] Tile0: GT0: trying reset from guc_exec_queue_timedout_job [xe]
<4> [524.957173] ------------[ cut here ]------------
<4> [524.957176] xe 0000:03:00.0: [drm] Tile0: GT0: Kernel-submitted job timed out
<4> [524.957179] WARNING: drivers/gpu/drm/xe/xe_guc_submit.c:1641 at guc_exec_queue_timedout_job+0x1424/0x2400 [xe], CPU#13: kworker/u64:26/7733
<4> [524.957309] Modules linked in: snd_hda_codec_intelhdmi snd_hda_codec_hdmi pmt_crashlog mei_lb mei_gsc_proxy mtd_intel_dg mei_gsc xe drm_gpuvm drm_gpusvm_helper drm_buddy drm_ttm_helper ttm gpu_sched drm_suballoc_helper drm_exec drm_display_helper cec rc_core drm_kunit_helpers i2c_algo_bit kunit intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common intel_tcc_cooling x86_pkg_temp_thermal cmdlinepart intel_powerclamp hid_generic spi_nor eeepc_wmi mtd coretemp asus_wmi sparse_keymap mei_pxp platform_profile mei_hdcp wmi_bmof kvm_intel snd_intel_dspcfg kvm snd_hda_codec irqbypass r8169 ghash_clmulni_intel aesni_intel snd_hda_core snd_hwdep rapl usbhid snd_pcm binfmt_misc intel_cstate realtek hid snd_timer spi_intel_pci i2c_i801 snd i2c_mux soundcore i2c_smbus idma64 spi_intel video intel_pmc_core pmt_telemetry pmt_discovery nls_iso8859_1 mei_me pmt_class intel_pmc_ssram_telemetry mei wmi intel_vsec pinctrl_alderlake acpi_pad acpi_tad dm_multipath msr nvme_fabrics fuse efi_pstore nfnetlink
<4> [524.957412] autofs4 [last unloaded: xe_live_test]
<4> [524.957418] CPU: 13 UID: 0 PID: 7733 Comm: kworker/u64:26 Tainted: G S U W N 7.0.0-rc3-lgci-xe-xe-4719-bbe6ae2e40f59b05f-debug+ #1 PREEMPT(lazy)
<4> [524.957423] Tainted: [S]=CPU_OUT_OF_SPEC, [U]=USER, [W]=WARN, [N]=TEST
<4> [524.957425] Hardware name: ASUS System Product Name/PRIME Z790-P WIFI, BIOS 1645 03/15/2024
<4> [524.957427] Workqueue: gt-ordered-wq drm_sched_job_timedout [gpu_sched]
<4> [524.957436] RIP: 0010:guc_exec_queue_timedout_job+0x142d/0x2400 [xe]
<4> [524.957547] Code: 74 04 48 8b 7f 08 4c 8b 6f 50 4d 85 ed 75 03 4c 8b 2f e8 66 56 5e e1 48 89 c6 48 8d 3d dc a1 39 00 41 89 d8 44 89 e1 4c 89 ea <67> 48 0f b9 3a 48 8b 45 90 48 8b 40 60 e9 c6 ee ff ff 8b 70 08 49
<4> [524.957550] RSP: 0018:ffffc90005f27ca0 EFLAGS: 00010246
<4> [524.957553] RAX: ffffffffa11fe932 RBX: 0000000000000000 RCX: 0000000000000000
<4> [524.957555] RDX: ffff888104d6a510 RSI: ffffffffa11fe932 RDI: ffffffffa1003dc0
<4> [524.957557] RBP: ffffc90005f27db0 R08: 0000000000000000 R09: 0000000000000000
<4> [524.957559] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
<4> [524.957560] R13: ffff888104d6a510 R14: ffff8882afd2d018 R15: 00000000ffffffc2
<4> [524.957562] FS: 0000000000000000(0000) GS:ffff8888db31b000(0000) knlGS:0000000000000000
<4> [524.957564] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [524.957566] CR2: 0000705cb340b048 CR3: 00000001346a3006 CR4: 0000000000f72ef0
<4> [524.957568] PKRU: 55555554
<4> [524.957570] Call Trace:
<4> [524.957571] <TASK>
<4> [524.957578] ? lock_acquire+0x40/0x2f0
<4> [524.957587] ? lock_release+0xd0/0x2b0
<4> [524.957595] drm_sched_job_timedout+0x94/0x1a0 [gpu_sched]
<4> [524.957604] process_one_work+0x22e/0x740
<4> [524.957615] worker_thread+0x1e8/0x3d0
<4> [524.957619] ? __pfx_worker_thread+0x10/0x10
<4> [524.957622] kthread+0x10d/0x150
<4> [524.957626] ? __pfx_kthread+0x10/0x10
<4> [524.957631] ret_from_fork+0x3d4/0x480
<4> [524.957634] ? __pfx_kthread+0x10/0x10
<4> [524.957638] ret_from_fork_asm+0x1a/0x30
<4> [524.957650] </TASK>
<4> [524.957652] irq event stamp: 541841
<4> [524.957653] hardirqs last enabled at (541847): [<ffffffff814a9d69>] __up_console_sem+0x79/0xa0
<4> [524.957657] hardirqs last disabled at (541852): [<ffffffff814a9d4e>] __up_console_sem+0x5e/0xa0
<4> [524.957660] softirqs last enabled at (540364): [<ffffffff813d0eef>] __irq_exit_rcu+0x13f/0x160
<4> [524.957663] softirqs last disabled at (540359): [<ffffffff813d0eef>] __irq_exit_rcu+0x13f/0x160
<4> [524.957666] ---[ end trace 0000000000000000 ]---
<6> [524.957669] xe 0000:03:00.0: [drm] Tile0: GT0: trying reset from guc_exec_queue_timedout_job [xe]
<4> [524.957789] ------------[ cut here ]------------
<4> [524.957791] xe 0000:03:00.0: [drm] Tile0: GT0: Kernel-submitted job timed out
<4> [524.957794] WARNING: drivers/gpu/drm/xe/xe_guc_submit.c:1641 at guc_exec_queue_timedout_job+0x1424/0x2400 [xe], CPU#1: kworker/u64:46/7753
<4> [524.957896] Modules linked in: snd_hda_codec_intelhdmi snd_hda_codec_hdmi pmt_crashlog mei_lb mei_gsc_proxy mtd_intel_dg mei_gsc xe drm_gpuvm drm_gpusvm_helper drm_buddy drm_ttm_helper ttm gpu_sched drm_suballoc_helper drm_exec drm_display_helper cec rc_core drm_kunit_helpers i2c_algo_bit kunit intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common intel_tcc_cooling x86_pkg_temp_thermal cmdlinepart intel_powerclamp hid_generic spi_nor eeepc_wmi mtd coretemp asus_wmi sparse_keymap mei_pxp platform_profile mei_hdcp wmi_bmof kvm_intel snd_intel_dspcfg kvm snd_hda_codec irqbypass r8169 ghash_clmulni_intel aesni_intel snd_hda_core snd_hwdep rapl usbhid snd_pcm binfmt_misc intel_cstate realtek hid snd_timer spi_intel_pci i2c_i801 snd i2c_mux soundcore i2c_smbus idma64 spi_intel video intel_pmc_core pmt_telemetry pmt_discovery nls_iso8859_1 mei_me pmt_class intel_pmc_ssram_telemetry mei wmi intel_vsec pinctrl_alderlake acpi_pad acpi_tad dm_multipath msr nvme_fabrics fuse efi_pstore nfnetlink
<4> [524.958014] autofs4 [last unloaded: xe_live_test]
<4> [524.958020] CPU: 1 UID: 0 PID: 7753 Comm: kworker/u64:46 Tainted: G S U W N 7.0.0-rc3-lgci-xe-xe-4719-bbe6ae2e40f59b05f-debug+ #1 PREEMPT(lazy)
<4> [524.958025] Tainted: [S]=CPU_OUT_OF_SPEC, [U]=USER, [W]=WARN, [N]=TEST
<4> [524.958027] Hardware name: ASUS System Product Name/PRIME Z790-P WIFI, BIOS 1645 03/15/2024
<4> [524.958029] Workqueue: gt-ordered-wq drm_sched_job_timedout [gpu_sched]
<4> [524.958037] RIP: 0010:guc_exec_queue_timedout_job+0x142d/0x2400 [xe]
<4> [524.958136] Code: 74 04 48 8b 7f 08 4c 8b 6f 50 4d 85 ed 75 03 4c 8b 2f e8 66 56 5e e1 48 89 c6 48 8d 3d dc a1 39 00 41 89 d8 44 89 e1 4c 89 ea <67> 48 0f b9 3a 48 8b 45 90 48 8b 40 60 e9 c6 ee ff ff 8b 70 08 49
<4> [524.958139] RSP: 0018:ffffc90005fbfca0 EFLAGS: 00010246
<4> [524.958143] RAX: ffffffffa11fe932 RBX: 0000000000000000 RCX: 0000000000000000
<4> [524.958145] RDX: ffff888104d6a510 RSI: ffffffffa11fe932 RDI: ffffffffa1003dc0
<4> [524.958147] RBP: ffffc90005fbfdb0 R08: 0000000000000000 R09: 0000000000000000
<4> [524.958149] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
<4> [524.958151] R13: ffff888104d6a510 R14: ffff8882afd2d018 R15: 00000000ffffffc2
<4> [524.958154] FS: 0000000000000000(0000) GS:ffff8888dad1b000(0000) knlGS:0000000000000000
<4> [524.958156] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [524.958159] CR2: 00007f0a1d8227fd CR3: 000000000344c003 CR4: 0000000000f72ef0
<4> [524.958161] PKRU: 55555554
<4> [524.958163] Call Trace:
<4> [524.958165] <TASK>
<4> [524.958173] ? lock_acquire+0x40/0x2f0
<4> [524.958181] ? lock_release+0xd0/0x2b0
<4> [524.958190] drm_sched_job_timedout+0x94/0x1a0 [gpu_sched]
<4> [524.958200] process_one_work+0x22e/0x740
<4> [524.958211] worker_thread+0x1e8/0x3d0
<4> [524.958216] ? __pfx_worker_thread+0x10/0x10
<4> [524.958221] kthread+0x10d/0x150
<4> [524.958225] ? __pfx_kthread+0x10/0x10
<4> [524.958231] ret_from_fork+0x3d4/0x480
<4> [524.958234] ? __pfx_kthread+0x10/0x10
<4> [524.958240] ret_from_fork_asm+0x1a/0x30
<4> [524.958253] </TASK>
<4> [524.958255] irq event stamp: 761695
<4> [524.958257] hardirqs last enabled at (761701): [<ffffffff814a9d69>] __up_console_sem+0x79/0xa0
<4> [524.958262] hardirqs last disabled at (761706): [<ffffffff814a9d4e>] __up_console_sem+0x5e/0xa0
<4> [524.958265] softirqs last enabled at (760208): [<ffffffff813d0eef>] __irq_exit_rcu+0x13f/0x160
<4> [524.958269] softirqs last disabled at (760203): [<ffffffff813d0eef>] __irq_exit_rcu+0x13f/0x160
<4> [524.958273] ---[ end trace 0000000000000000 ]---
<6> [524.965604] [IGT] xe_vm: finished subtest bind-one-bo-many-times, FAIL
<6> [524.966171] [IGT] xe_vm: exiting, ret=98
<6> [524.966748] Console: switching to colour frame buffer device 240x67
<7> [524.980074] xe 0000:03:00.0: [drm:drm_client_dev_restore] fbdev: ret=0
|
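
The two raw bitmasks in this report can be decoded by hand: the taint value `0x40244` from the Out field and the GuC load status `0x400000A0` from the dmesg. The sketch below is one way to do that, not part of IGT or the xe driver. The taint letters follow the kernel's tainted-kernels documentation. The GuC status field layout (bit 0 reset, bits 7:1 BootROM, bits 15:8 UKernel, bits 18:16 MIA, bits 31:30 Auth) is an assumption, chosen because it reproduces the decoded values that the driver itself prints in the log, and it may differ between driver versions. The function names `decode_taint` and `decode_guc_status` are hypothetical.

```python
# Kernel taint bits: bit number -> (letter, short meaning).
TAINT_FLAGS = {
    0: ("P", "proprietary module loaded"),
    1: ("F", "module force-loaded"),
    2: ("S", "CPU/system out of spec"),
    3: ("R", "module force-unloaded"),
    4: ("M", "machine check exception"),
    5: ("B", "bad page"),
    6: ("U", "taint requested by userspace"),
    7: ("D", "kernel died recently (OOPS/BUG)"),
    8: ("A", "ACPI table overridden"),
    9: ("W", "kernel issued a warning"),
    10: ("C", "staging driver loaded"),
    11: ("I", "firmware bug workaround applied"),
    12: ("O", "out-of-tree module loaded"),
    13: ("E", "unsigned module loaded"),
    14: ("L", "soft lockup occurred"),
    15: ("K", "kernel live-patched"),
    16: ("X", "auxiliary taint"),
    17: ("T", "built with randstruct"),
    18: ("N", "in-kernel test run"),
}


def decode_taint(mask):
    """Return (bit, letter, meaning) for every bit set in a taint mask."""
    flags = []
    bit = 0
    while mask >> bit:
        if (mask >> bit) & 1:
            letter, meaning = TAINT_FLAGS.get(bit, ("?", "unknown taint bit"))
            flags.append((bit, letter, meaning))
        bit += 1
    return flags


def decode_guc_status(status):
    """Split a GuC status word into fields (layout is an assumption, see above)."""
    return {
        "reset": status & 0x1,
        "bootrom": (status >> 1) & 0x7F,
        "ukernel": (status >> 8) & 0xFF,
        "mia": (status >> 16) & 0x7,
        "auth": (status >> 30) & 0x3,
    }


if __name__ == "__main__":
    for bit, letter, meaning in decode_taint(0x40244):
        print(f"bit {bit:2d} [{letter}] {meaning}")
    for name, value in decode_guc_status(0x400000A0).items():
        print(f"{name} = 0x{value:02x}")
```

For `0x40244` this yields the letters S, U, W, N, matching the `Tainted: G S U W N` lines in the dmesg, and the second value in the abort message, `0x200`, is the W (warning) bit alone. For `0x400000A0` the fields come out as BootROM = 0x50 and Auth = 0x01 with the rest zero, the same values as on the `load failed: status:` lines.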