Machine description: shard-bmg-1

Result files: i915_display_info21, igt_runner21, results21.json, results21-xe-load.json, i915_display_info_post_exec21, boot21, dmesg21
| Detail | Value |
|---|---|
| Duration | unknown |
| Hostname | shard-bmg-1 |
| IGT-Version | IGT-Version: 1.30-g822e95b25 (x86_64) (Linux: 6.14.0-rc3-xe+ x86_64) |
Out:
Using IGT_SRANDOM=1740150991 for randomisation
Opened device: /dev/dri/card0
Starting subtest: sanity-after-timeout
Starting dynamic subtest: DRM_XE_ENGINE_CLASS_RENDER0
Dynamic subtest DRM_XE_ENGINE_CLASS_RENDER0: SUCCESS (6.131s)
This test caused an abort condition: Lockdep not active
/proc/lockdep_stats contents:
lock-classes: 2197 [max: 8192]
direct dependencies: 24522 [max: 524288]
indirect dependencies: 174127
all direct dependencies: 473386
dependency chains: 37013 [max: 524288]
dependency chain hlocks used: 163050 [max: 2621440]
dependency chain hlocks lost: 0
in-hardirq chains: 317
in-softirq chains: 754
in-process chains: 35942
stack-trace entries: 253576 [max: 524288]
number of stack traces: 11996
number of stack hash chains: 8543
combined max dependencies: 39620278
hardirq-safe locks: 98
hardirq-unsafe locks: 1328
softirq-safe locks: 228
softirq-unsafe locks: 1229
irq-safe locks: 240
irq-unsafe locks: 1328
hardirq-read-safe locks: 4
hardirq-read-unsafe locks: 400
softirq-read-safe locks: 9
softirq-read-unsafe locks: 395
irq-read-safe locks: 9
irq-read-unsafe locks: 400
uncategorized locks: 361
unused locks: 1
max locking depth: 21
max bfs queue depth: 375
max lock class index: 2196
debug_locks: 0
zapped classes: 3
zapped lock chains: 150
large chain blocks: 1
Starting dynamic subtest: DRM_XE_ENGINE_CLASS_COMPUTE0
Dynamic subtest DRM_XE_ENGINE_CLASS_COMPUTE0: SUCCESS (6.123s)
Subtest sanity-after-timeout: SUCCESS (12.255s)
Err:
Starting subtest: sanity-after-timeout
Starting dynamic subtest: DRM_XE_ENGINE_CLASS_RENDER0
Dynamic subtest DRM_XE_ENGINE_CLASS_RENDER0: SUCCESS (6.131s)
Starting dynamic subtest: DRM_XE_ENGINE_CLASS_COMPUTE0
Dynamic subtest DRM_XE_ENGINE_CLASS_COMPUTE0: SUCCESS (6.123s)
Subtest sanity-after-timeout: SUCCESS (12.255s)
Dmesg:
<6> [167.052544] Console: switching to colour dummy device 80x25
<6> [167.052797] [IGT] xe_exec_sip: executing
<6> [167.060683] [IGT] xe_exec_sip: starting subtest sanity-after-timeout
<6> [167.060978] [IGT] xe_exec_sip: starting dynamic subtest DRM_XE_ENGINE_CLASS_RENDER0
<7> [172.508395] xe 0000:03:00.0: [drm:xe_hw_engine_snapshot_capture [xe]] GT0: Proceeding with manual engine snapshot
<6> [173.151830] xe 0000:03:00.0: [drm] GT0: Engine reset: engine_class=rcs, logical_mask: 0x1, guc_id=2
<4> [173.152128]
<4> [173.152146] ======================================================
<4> [173.152159] WARNING: possible circular locking dependency detected
<4> [173.152171] 6.14.0-rc3-xe+ #1 Tainted: G W
<4> [173.152184] ------------------------------------------------------
<4> [173.152195] kworker/u64:44/4198 is trying to acquire lock:
<4> [173.152206] ffff88813959ab38 (&hwe->pf.lock){+.+.}-{2:2}, at: xe_guc_exec_queue_reset_handler+0xb5/0x200 [xe]
<4> [173.152549]
but task is already holding lock:
<4> [173.152560] ffff888139599428 (&ct->lock){+.+.}-{3:3}, at: receive_g2h+0x47/0x100 [xe]
<4> [173.152887]
which lock already depends on the new lock.
<4> [173.152905]
the existing dependency chain (in reverse order) is:
<4> [173.152920]
-> #2 (&ct->lock){+.+.}-{3:3}:
<4> [173.152943] xe_guc_ct_init+0x2b4/0x4c0 [xe]
<4> [173.153266] xe_guc_init+0xe9/0x360 [xe]
<4> [173.153583] xe_uc_init+0x1e/0x1f0 [xe]
<4> [173.153998] xe_gt_init_hwconfig+0x4c/0xb0 [xe]
<4> [173.154318] xe_device_probe+0x3ce/0x820 [xe]
<4> [173.154628] xe_pci_probe+0x35b/0x5f0 [xe]
<4> [173.154768] local_pci_probe+0x44/0xb0
<4> [173.154773] pci_device_probe+0xf4/0x270
<4> [173.154777] really_probe+0xee/0x3c0
<4> [173.154780] __driver_probe_device+0x8c/0x180
<4> [173.154783] driver_probe_device+0x24/0xd0
<4> [173.154786] __driver_attach+0x10f/0x220
<4> [173.154789] bus_for_each_dev+0x8d/0xf0
<4> [173.154792] driver_attach+0x1e/0x30
<4> [173.154794] bus_add_driver+0x151/0x290
<4> [173.154797] driver_register+0x5e/0x130
<4> [173.154800] __pci_register_driver+0x7d/0x90
<4> [173.154803] xe_register_pci_driver+0x23/0x30 [xe]
<4> [173.154864] soundcore_open+0x83/0x210 [soundcore]
<4> [173.154868] do_one_initcall+0x76/0x400
<4> [173.154871] do_init_module+0x97/0x2a0
<4> [173.154874] load_module+0x2c23/0x2f60
<4> [173.154877] init_module_from_file+0x97/0xe0
<4> [173.154879] idempotent_init_module+0x134/0x350
<4> [173.154882] __x64_sys_finit_module+0x77/0x100
<4> [173.154885] x64_sys_call+0x1f37/0x2650
<4> [173.154887] do_syscall_64+0x91/0x180
<4> [173.154891] entry_SYSCALL_64_after_hwframe+0x76/0x7e
<4> [173.154895]
-> #1 (fs_reclaim){+.+.}-{0:0}:
<4> [173.154899] fs_reclaim_acquire+0xc5/0x100
<4> [173.154903] __kmalloc_cache_noprof+0x58/0x490
<4> [173.154906] pf_queue_work_func+0x53e/0x5d0 [xe]
<4> [173.154962] process_one_work+0x21c/0x740
<4> [173.154965] worker_thread+0x1db/0x3c0
<4> [173.154968] kthread+0x10d/0x270
<4> [173.154971] ret_from_fork+0x44/0x70
<4> [173.154975] ret_from_fork_asm+0x1a/0x30
<4> [173.154978]
-> #0 (&hwe->pf.lock){+.+.}-{2:2}:
<4> [173.154981] __lock_acquire+0x1637/0x2810
<4> [173.154985] lock_acquire+0xc9/0x300
<4> [173.154988] _raw_spin_lock+0x2f/0x60
<4> [173.154991] xe_guc_exec_queue_reset_handler+0xb5/0x200 [xe]
<4> [173.155049] dequeue_one_g2h+0x4e5/0x900 [xe]
<4> [173.155106] receive_g2h+0x4f/0x100 [xe]
<4> [173.155161] g2h_worker_func+0x15/0x20 [xe]
<4> [173.155217] process_one_work+0x21c/0x740
<4> [173.155219] worker_thread+0x1db/0x3c0
<4> [173.155222] kthread+0x10d/0x270
<4> [173.155224] ret_from_fork+0x44/0x70
<4> [173.155227] ret_from_fork_asm+0x1a/0x30
<4> [173.155230]
other info that might help us debug this:
<4> [173.155233] Chain exists of:
&hwe->pf.lock --> fs_reclaim --> &ct->lock
<4> [173.155238] Possible unsafe locking scenario:
<4> [173.155241] CPU0 CPU1
<4> [173.155243] ---- ----
<4> [173.155245] lock(&ct->lock);
<4> [173.155247] lock(fs_reclaim);
<4> [173.155250] lock(&ct->lock);
<4> [173.155253] lock(&hwe->pf.lock);
<4> [173.155255]
*** DEADLOCK ***
<4> [173.155258] 3 locks held by kworker/u64:44/4198:
<4> [173.155260] #0: ffff88812ba44d48 ((wq_completion)xe-g2h-wq){+.+.}-{0:0}, at: process_one_work+0x444/0x740
<4> [173.155266] #1: ffffc9000af1fe20 ((work_completion)(&ct->g2h_worker)){+.+.}-{0:0}, at: process_one_work+0x1da/0x740
<4> [173.155273] #2: ffff888139599428 (&ct->lock){+.+.}-{3:3}, at: receive_g2h+0x47/0x100 [xe]
<4> [173.155331]
stack backtrace:
<4> [173.155334] CPU: 1 UID: 0 PID: 4198 Comm: kworker/u64:44 Tainted: G W 6.14.0-rc3-xe+ #1
<4> [173.155335] Tainted: [W]=WARN
<4> [173.155336] Hardware name: ASUS System Product Name/PRIME Z790-P WIFI, BIOS 1645 03/15/2024
<4> [173.155337] Workqueue: xe-g2h-wq g2h_worker_func [xe]
<4> [173.155391] Call Trace:
<4> [173.155392] <TASK>
<4> [173.155393] dump_stack_lvl+0x91/0xf0
<4> [173.155395] dump_stack+0x10/0x20
<4> [173.155397] print_circular_bug+0x285/0x360
<4> [173.155399] check_noncircular+0x150/0x170
<4> [173.155400] ? __irq_work_queue_local+0x43/0x120
<4> [173.155403] __lock_acquire+0x1637/0x2810
<4> [173.155405] ? dev_printk_emit+0xa1/0xe0
<4> [173.155407] lock_acquire+0xc9/0x300
<4> [173.155409] ? xe_guc_exec_queue_reset_handler+0xb5/0x200 [xe]
<4> [173.155464] ? __dev_printk+0x39/0xa0
<4> [173.155465] ? _dev_info+0x75/0xa0
<4> [173.155467] _raw_spin_lock+0x2f/0x60
<4> [173.155468] ? xe_guc_exec_queue_reset_handler+0xb5/0x200 [xe]
<4> [173.155523] xe_guc_exec_queue_reset_handler+0xb5/0x200 [xe]
<4> [173.155577] dequeue_one_g2h+0x4e5/0x900 [xe]
<4> [173.155630] ? receive_g2h+0x47/0x100 [xe]
<4> [173.155684] receive_g2h+0x4f/0x100 [xe]
<4> [173.155738] g2h_worker_func+0x15/0x20 [xe]
<4> [173.155791] process_one_work+0x21c/0x740
<4> [173.155793] worker_thread+0x1db/0x3c0
<4> [173.155795] ? __pfx_worker_thread+0x10/0x10
<4> [173.155796] kthread+0x10d/0x270
<4> [173.155797] ? __pfx_kthread+0x10/0x10
<4> [173.155798] ret_from_fork+0x44/0x70
<4> [173.155799] ? __pfx_kthread+0x10/0x10
<4> [173.155800] ret_from_fork_asm+0x1a/0x30
<4> [173.155803] </TASK>
<7> [173.156089] xe 0000:03:00.0: [drm:guc_exec_queue_timedout_job [xe]] GT0: Check job timeout: seqno=4294967169, lrc_seqno=4294967169, guc_id=2, running_time_ms=0, timeout_ms=5000, diff=0x00000000
<5> [173.156164] xe 0000:03:00.0: [drm] GT0: Timedout job: seqno=4294967169, lrc_seqno=4294967169, guc_id=2, flags=0x0 in xe_exec_sip [4546]
<6> [173.179059] xe 0000:03:00.0: [drm] Xe device coredump has been created
<6> [173.179062] xe 0000:03:00.0: [drm] Check your /sys/class/drm/card0/device/devcoredump/data
<6> [173.192268] [IGT] xe_exec_sip: finished subtest DRM_XE_ENGINE_CLASS_RENDER0, SUCCESS
<6> [173.192522] [IGT] xe_exec_sip: starting dynamic subtest DRM_XE_ENGINE_CLASS_COMPUTE0
<6> [179.293652] xe 0000:03:00.0: [drm] GT0: Engine reset: engine_class=ccs, logical_mask: 0x1, guc_id=2
<7> [179.293850] xe 0000:03:00.0: [drm:guc_exec_queue_timedout_job [xe]] GT0: Check job timeout: seqno=4294967169, lrc_seqno=4294967169, guc_id=2, running_time_ms=0, timeout_ms=5000, diff=0x00000000
<5> [179.294260] xe 0000:03:00.0: [drm] GT0: Timedout job: seqno=4294967169, lrc_seqno=4294967169, guc_id=2, flags=0x0 in xe_exec_sip [4546]
<7> [179.294288] xe 0000:03:00.0: [drm:xe_devcoredump [xe]] Multiple hangs are occurring, but only the first snapshot was taken
<6> [179.315945] [IGT] xe_exec_sip: finished subtest DRM_XE_ENGINE_CLASS_COMPUTE0, SUCCESS
<6> [179.316191] [IGT] xe_exec_sip: finished subtest sanity-after-timeout, SUCCESS
<7> [179.323293] xe 0000:03:00.0: [drm:drm_client_dev_restore] intel-fbdev: ret=0
<6> [179.323559] [IGT] xe_exec_sip: exiting, ret=0
<6> [179.340193] Console: switching to colour frame buffer device 240x67