Machine description: shard-adlp-4
Result: 1 Warning(s)
Detail | Value |
---|---|
Duration | unknown |
Hostname | shard-adlp-4 |
Igt-Version | IGT-Version: 1.30-gf0b668833 (x86_64) (Linux: 6.14.0-rc4-xe+ x86_64) |
Out |
Using IGT_SRANDOM=1740704283 for randomisation
Opened device: /dev/dri/card0
Starting subtest: virtual-cat-error
Subtest virtual-cat-error: SUCCESS (0.017s)
This test caused an abort condition: Lockdep not active
/proc/lockdep_stats contents:
lock-classes: 2182 [max: 8192]
direct dependencies: 23414 [max: 524288]
indirect dependencies: 143466
all direct dependencies: 455610
dependency chains: 33806 [max: 524288]
dependency chain hlocks used: 140784 [max: 2621440]
dependency chain hlocks lost: 0
in-hardirq chains: 254
in-softirq chains: 701
in-process chains: 32851
stack-trace entries: 247677 [max: 524288]
number of stack traces: 11453
number of stack hash chains: 8259
combined max dependencies: 1585869224
hardirq-safe locks: 95
hardirq-unsafe locks: 1294
softirq-safe locks: 213
softirq-unsafe locks: 1194
irq-safe locks: 239
irq-unsafe locks: 1294
hardirq-read-safe locks: 5
hardirq-read-unsafe locks: 421
softirq-read-safe locks: 9
softirq-read-unsafe locks: 416
irq-read-safe locks: 10
irq-read-unsafe locks: 421
uncategorized locks: 357
unused locks: 1
max locking depth: 18
max bfs queue depth: 360
max lock class index: 2181
debug_locks: 0
zapped classes: 3
zapped lock chains: 168
large chain blocks: 1 |
Err |
Starting subtest: virtual-cat-error Subtest virtual-cat-error: SUCCESS (0.017s) |
Dmesg |
<6> [88.277707] Console: switching to colour dummy device 80x25
<6> [88.277788] [IGT] xe_exec_reset: executing
<6> [88.294356] [IGT] xe_exec_reset: starting subtest virtual-cat-error
<7> [88.305913] xe 0000:00:02.0: [drm:xe_guc_exec_queue_memory_cat_error_handler [xe]] GT0: Engine memory cat error: engine_class=vcs, logical_mask: 0x3, guc_id=2
<4> [88.306425]
<4> [88.306430] ======================================================
<4> [88.306434] WARNING: possible circular locking dependency detected
<4> [88.306437] 6.14.0-rc4-xe+ #1 Tainted: G U
<4> [88.306441] ------------------------------------------------------
<4> [88.306444] kworker/u80:7/900 is trying to acquire lock:
<4> [88.306447] ffffffff834c95c0 (fs_reclaim){+.+.}-{0:0}, at: __kmalloc_cache_noprof+0x58/0x490
<4> [88.306458]
but task is already holding lock:
<4> [88.306461] ffff888144a49430 (&ct->lock){+.+.}-{3:3}, at: receive_g2h+0x47/0x100 [xe]
<4> [88.306550]
which lock already depends on the new lock.
<4> [88.306553]
the existing dependency chain (in reverse order) is:
<4> [88.306557]
-> #1 (&ct->lock){+.+.}-{3:3}:
<4> [88.306562] xe_guc_ct_init+0x2b4/0x4c0 [xe]
<7> [88.306596] xe 0000:00:02.0: [drm:xe_hw_engine_snapshot_capture [xe]] GT0: Proceeding with manual engine snapshot
<4> [88.306648] xe_guc_init+0xe9/0x360 [xe]
<4> [88.306729] xe_uc_init+0x1e/0x1f0 [xe]
<4> [88.306833] xe_gt_init_hwconfig+0x4c/0xb0 [xe]
<4> [88.306911] xe_device_probe+0x3b8/0x7f0 [xe]
<4> [88.306986] xe_pci_probe+0x372/0x5f0 [xe]
<4> [88.307076] local_pci_probe+0x44/0xb0
<4> [88.307082] pci_device_probe+0xf4/0x270
<4> [88.307086] really_probe+0xee/0x3c0
<4> [88.307091] __driver_probe_device+0x8c/0x180
<4> [88.307095] driver_probe_device+0x24/0xd0
<4> [88.307099] __driver_attach+0x10f/0x220
<4> [88.307104] bus_for_each_dev+0x8d/0xf0
<4> [88.307107] driver_attach+0x1e/0x30
<4> [88.307111] bus_add_driver+0x151/0x290
<4> [88.307115] driver_register+0x5e/0x130
<4> [88.307119] __pci_register_driver+0x7d/0x90
<4> [88.307123] xe_register_pci_driver+0x23/0x30 [xe]
<4> [88.307216] 0xffffffffa0a170f3
<4> [88.307231] do_one_initcall+0x76/0x400
<4> [88.307235] do_init_module+0x97/0x2a0
<4> [88.307240] load_module+0x2c23/0x2f60
<4> [88.307243] init_module_from_file+0x97/0xe0
<4> [88.307246] idempotent_init_module+0x134/0x350
<4> [88.307250] __x64_sys_finit_module+0x77/0x100
<4> [88.307254] x64_sys_call+0x1f37/0x2650
<4> [88.307258] do_syscall_64+0x91/0x180
<4> [88.307262] entry_SYSCALL_64_after_hwframe+0x76/0x7e
<4> [88.307267]
-> #0 (fs_reclaim){+.+.}-{0:0}:
<4> [88.307273] __lock_acquire+0x1637/0x2810
<4> [88.307280] lock_acquire+0xc9/0x300
<4> [88.307284] fs_reclaim_acquire+0xc5/0x100
<4> [88.307288] __kmalloc_cache_noprof+0x58/0x490
<4> [88.307292] xe_vm_add_ban_entry+0x64/0x2a0 [xe]
<4> [88.307396] xe_guc_exec_queue_memory_cat_error_handler+0x19d/0x230 [xe]
<4> [88.307482] dequeue_one_g2h+0x349/0x900 [xe]
<4> [88.307563] receive_g2h+0x4f/0x100 [xe]
<4> [88.307644] g2h_worker_func+0x15/0x20 [xe]
<4> [88.307725] process_one_work+0x21c/0x740
<4> [88.307730] worker_thread+0x1db/0x3c0
<4> [88.307734] kthread+0x10d/0x270
<4> [88.307738] ret_from_fork+0x44/0x70
<4> [88.307742] ret_from_fork_asm+0x1a/0x30
<4> [88.307746]
other info that might help us debug this:
<4> [88.307750] Possible unsafe locking scenario:
<4> [88.307753]        CPU0                    CPU1
<4> [88.307756]        ----                    ----
<4> [88.307758]   lock(&ct->lock);
<4> [88.307761]                                lock(fs_reclaim);
<4> [88.307765]                                lock(&ct->lock);
<4> [88.307769]   lock(fs_reclaim);
<4> [88.307772]
*** DEADLOCK ***
<4> [88.307775] 3 locks held by kworker/u80:7/900:
<4> [88.307779] #0: ffff888144a54148 ((wq_completion)xe-g2h-wq){+.+.}-{0:0}, at: process_one_work+0x444/0x740
<4> [88.307787] #1: ffffc90004233e20 ((work_completion)(&ct->g2h_worker)){+.+.}-{0:0}, at: process_one_work+0x1da/0x740
<4> [88.307795] #2: ffff888144a49430 (&ct->lock){+.+.}-{3:3}, at: receive_g2h+0x47/0x100 [xe]
<4> [88.307881]
stack backtrace:
<4> [88.307884] CPU: 13 UID: 0 PID: 900 Comm: kworker/u80:7 Tainted: G U 6.14.0-rc4-xe+ #1
<4> [88.307886] Tainted: [U]=USER
<4> [88.307887] Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-P DDR5 RVP, BIOS RPLPFWI1.R00.4035.A00.2301200723 01/20/2023
<4> [88.307888] Workqueue: xe-g2h-wq g2h_worker_func [xe]
<4> [88.307968] Call Trace:
<4> [88.307969] <TASK>
<4> [88.307970] dump_stack_lvl+0x91/0xf0
<4> [88.307973] dump_stack+0x10/0x20
<4> [88.307975] print_circular_bug+0x285/0x360
<4> [88.307978] check_noncircular+0x150/0x170
<4> [88.307981] __lock_acquire+0x1637/0x2810
<4> [88.307983] ? try_to_wake_up+0x447/0xbc0
<4> [88.307987] lock_acquire+0xc9/0x300
<4> [88.307989] ? __kmalloc_cache_noprof+0x58/0x490
<4> [88.307992] ? lock_release+0xd4/0x2b0
<4> [88.307994] ? xe_vm_add_ban_entry+0x64/0x2a0 [xe]
<4> [88.308094] fs_reclaim_acquire+0xc5/0x100
<4> [88.308095] ? __kmalloc_cache_noprof+0x58/0x490
<4> [88.308097] __kmalloc_cache_noprof+0x58/0x490
<4> [88.308099] ? find_held_lock+0x31/0x90
<4> [88.308101] xe_vm_add_ban_entry+0x64/0x2a0 [xe]
<4> [88.308204] ? xe_vm_add_ban_entry+0x64/0x2a0 [xe]
<4> [88.308303] ? _raw_spin_unlock+0x22/0x50
<4> [88.308305] ? drm_sched_tdr_queue_imm+0x36/0x50 [gpu_sched]
<4> [88.308309] xe_guc_exec_queue_memory_cat_error_handler+0x19d/0x230 [xe]
<4> [88.308392] dequeue_one_g2h+0x349/0x900 [xe]
<4> [88.308471] ? receive_g2h+0x47/0x100 [xe]
<4> [88.308552] receive_g2h+0x4f/0x100 [xe]
<4> [88.308631] g2h_worker_func+0x15/0x20 [xe]
<4> [88.308710] process_one_work+0x21c/0x740
<4> [88.308713] worker_thread+0x1db/0x3c0
<4> [88.308715] ? __pfx_worker_thread+0x10/0x10
<4> [88.308717] kthread+0x10d/0x270
<4> [88.308719] ? __pfx_kthread+0x10/0x10
<4> [88.308720] ret_from_fork+0x44/0x70
<4> [88.308722] ? __pfx_kthread+0x10/0x10
<4> [88.308723] ret_from_fork_asm+0x1a/0x30
<4> [88.308726] </TASK>
<7> [88.308937] xe 0000:00:02.0: [drm:xe_guc_exec_queue_memory_cat_error_handler [xe]] GT0: Engine memory cat error: engine_class=vcs, logical_mask: 0x3, guc_id=3
<6> [88.309444] xe 0000:00:02.0: [drm] GT0: Engine reset: engine_class=vcs, logical_mask: 0x3, guc_id=2
<5> [88.309472] xe 0000:00:02.0: [drm] GT0: Timedout job: seqno=4294967169, lrc_seqno=4294967169, guc_id=2, flags=0x0 in xe_exec_reset [2204]
<6> [88.309882] xe 0000:00:02.0: [drm] Xe device coredump has been created
<6> [88.309886] xe 0000:00:02.0: [drm] Check your /sys/class/drm/card0/device/devcoredump/data
<6> [88.310402] xe 0000:00:02.0: [drm] GT0: Engine reset: engine_class=vcs, logical_mask: 0x3, guc_id=3
<5> [88.310415] xe 0000:00:02.0: [drm] GT0: Timedout job: seqno=4294967169, lrc_seqno=4294967169, guc_id=3, flags=0x0 in xe_exec_reset [2204]
<7> [88.310423] xe 0000:00:02.0: [drm:xe_devcoredump [xe]] Multiple hangs are occurring, but only the first snapshot was taken
<6> [88.311713] [IGT] xe_exec_reset: finished subtest virtual-cat-error, SUCCESS
<7> [88.312569] xe 0000:00:02.0: [drm:intel_power_well_enable [xe]] enabling DC_off
<7> [88.312687] xe 0000:00:02.0: [drm:gen9_set_dc_state.part.0 [xe]] Setting DC state from 02 to 00
<7> [88.313223] xe 0000:00:02.0: [drm:drm_client_dev_restore] intel-fbdev: ret=0
<6> [88.313272] [IGT] xe_exec_reset: exiting, ret=0
<6> [88.330068] Console: switching to colour frame buffer device 240x67
|
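
The lockdep report in the dmesg above names two locks: the one the worker is trying to acquire (`fs_reclaim`) and the one it already holds (`&ct->lock`). For triage across many such logs, a minimal sketch like the following could pull those two entries out of a dmesg dump. The function name `summarize_lockdep_splat` and the regular expression are illustrative assumptions, not part of IGT or the CI tooling, and the pattern only targets the line layout seen in this log.

```python
import re

# Assumed layout of a lock line, as it appears in this log:
#   <addr> (<lock name>){<usage>}-{<wait types>}, at: <symbol+offset/size>
LOCK_RE = re.compile(
    r"\b[0-9a-f]{16} \((?P<name>[^)]+)\)\{[^}]*\}-\{[^}]*\}, at: (?P<site>\S+)"
)


def summarize_lockdep_splat(dmesg_text):
    """Return {'acquiring': (name, site), 'holding': (name, site)} if found.

    Hypothetical helper: it looks for the two marker sentences that lockdep
    prints and takes the first lock line that follows each of them.
    """
    wanted = None
    summary = {}
    for line in dmesg_text.splitlines():
        if "is trying to acquire lock:" in line:
            wanted = "acquiring"
        elif "but task is already holding lock:" in line:
            wanted = "holding"
        elif wanted:
            match = LOCK_RE.search(line)
            if match:
                summary[wanted] = (match.group("name"), match.group("site"))
                wanted = None
    return summary
```

Fed the dmesg section of this report, the helper would be expected to report `fs_reclaim` as the lock being acquired and `&ct->lock` as the lock already held, together with the call sites printed after `at:`.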