Results for igt@xe_exec_reset@virtual-cat-error

Machine description: shard-bmg-8

Result: Abort (1 warning)

Artifacts: i915_display_info6, igt_runner6, results6.json, results6-xe-load.json, i915_display_info_post_exec6, boot6, dmesg6

Detail        Value
Duration      unknown
Hostname      shard-bmg-8
Igt-Version   IGT-Version: 1.30-gf0b668833 (x86_64) (Linux: 6.14.0-rc4-xe+ x86_64)

Out
Using IGT_SRANDOM=1740703858 for randomisation
Opened device: /dev/dri/card0
Starting subtest: virtual-cat-error
Subtest virtual-cat-error: SUCCESS (0.127s)

This test caused an abort condition: Lockdep not active

/proc/lockdep_stats contents:
 lock-classes:                         2083 [max: 8192]
 direct dependencies:                 21534 [max: 524288]
 indirect dependencies:              134805
 all direct dependencies:            400666
 dependency chains:                   31016 [max: 524288]
 dependency chain hlocks used:       129161 [max: 2621440]
 dependency chain hlocks lost:            0
 in-hardirq chains:                     226
 in-softirq chains:                     679
 in-process chains:                   30111
 stack-trace entries:                227694 [max: 524288]
 number of stack traces:              10756
 number of stack hash chains:          7895
 combined max dependencies:       353121024
 hardirq-safe locks:                     84
 hardirq-unsafe locks:                 1268
 softirq-safe locks:                    201
 softirq-unsafe locks:                 1170
 irq-safe locks:                        215
 irq-unsafe locks:                     1268
 hardirq-read-safe locks:                 4
 hardirq-read-unsafe locks:             382
 softirq-read-safe locks:                 9
 softirq-read-unsafe locks:             377
 irq-read-safe locks:                     9
 irq-read-unsafe locks:                 382
 uncategorized locks:                   347
 unused locks:                            1
 max locking depth:                      18
 max bfs queue depth:                   336
 max lock class index:                 2082
 debug_locks:                             0

 zapped classes:                          3
 zapped lock chains:                    164
 large chain blocks:                      1
Err
Starting subtest: virtual-cat-error
Subtest virtual-cat-error: SUCCESS (0.127s)
Dmesg

<6> [68.118577] Console: switching to colour dummy device 80x25
<6> [68.118624] [IGT] xe_exec_reset: executing
<6> [68.147457] [IGT] xe_exec_reset: starting subtest virtual-cat-error
<7> [68.163800] xe 0000:03:00.0: [drm:pf_queue_work_func [xe]]
ASID: 1048575
VFID: 0
PDATA: 0x0450
Faulted Address: 0x0000000000320000
FaultType: 0
AccessType: 0
FaultLevel: 1
EngineClass: 1 vcs
EngineInstance: 0
<7> [68.163910] xe 0000:03:00.0: [drm:pf_queue_work_func [xe]] Fault response: Unsuccessful -22
<7> [68.164008] xe 0000:03:00.0: [drm:pf_queue_work_func [xe]]
ASID: 1048575
VFID: 0
PDATA: 0x0451
Faulted Address: 0x0000000000320000
FaultType: 0
AccessType: 0
FaultLevel: 1
EngineClass: 1 vcs
EngineInstance: 2
<7> [68.164093] xe 0000:03:00.0: [drm:pf_queue_work_func [xe]] Fault response: Unsuccessful -22
<7> [68.164149] xe 0000:03:00.0: [drm:xe_guc_exec_queue_memory_cat_error_handler [xe]] GT1: Engine memory cat error: engine_class=vcs, logical_mask: 0x3, guc_id=0
<4> [68.164387]
<4> [68.164392] ======================================================
<4> [68.164395] WARNING: possible circular locking dependency detected
<4> [68.164397] 6.14.0-rc4-xe+ #1 Tainted: G U
<4> [68.164400] ------------------------------------------------------
<4> [68.164402] kworker/u64:1/115 is trying to acquire lock:
<4> [68.164404] ffffffff834c95c0 (fs_reclaim){+.+.}-{0:0}, at: __kmalloc_cache_noprof+0x58/0x490
<4> [68.164413]
but task is already holding lock:
<4> [68.164415] ffff88812ac49430 (&ct->lock){+.+.}-{3:3}, at: receive_g2h+0x47/0x100 [xe]
<4> [68.164475]
which lock already depends on the new lock.
<4> [68.164478]
the existing dependency chain (in reverse order) is:
<4> [68.164480]
-> #1 (&ct->lock){+.+.}-{3:3}:
<4> [68.164484] xe_guc_ct_init+0x2b4/0x4c0 [xe]
<4> [68.164538] xe_guc_init+0xe9/0x360 [xe]
<7> [68.164517] xe 0000:03:00.0: [drm:xe_hw_engine_snapshot_capture [xe]] GT1: Proceeding with manual engine snapshot
<4> [68.164590] xe_uc_init+0x1e/0x1f0 [xe]
<4> [68.164664] xe_gt_init_hwconfig+0x4c/0xb0 [xe]
<4> [68.164713] xe_device_probe+0x3b8/0x7f0 [xe]
<4> [68.164759] xe_pci_probe+0x372/0x5f0 [xe]
<4> [68.164826] local_pci_probe+0x44/0xb0
<4> [68.164831] pci_device_probe+0xf4/0x270
<4> [68.164834] really_probe+0xee/0x3c0
<4> [68.164838] __driver_probe_device+0x8c/0x180
<4> [68.164841] driver_probe_device+0x24/0xd0
<4> [68.164845] __driver_attach+0x10f/0x220
<4> [68.164848] bus_for_each_dev+0x8d/0xf0
<4> [68.164851] driver_attach+0x1e/0x30
<4> [68.164854] bus_add_driver+0x151/0x290
<4> [68.164857] driver_register+0x5e/0x130
<4> [68.164860] __pci_register_driver+0x7d/0x90
<4> [68.164864] xe_register_pci_driver+0x23/0x30 [xe]
<4> [68.164935] 0xffffffffa06b80f3
<4> [68.164939] do_one_initcall+0x76/0x400
<4> [68.164943] do_init_module+0x97/0x2a0
<4> [68.164947] load_module+0x2c23/0x2f60
<4> [68.164950] init_module_from_file+0x97/0xe0
<4> [68.164953] idempotent_init_module+0x134/0x350
<4> [68.164956] __x64_sys_finit_module+0x77/0x100
<4> [68.164959] x64_sys_call+0x1f37/0x2650
<4> [68.164962] do_syscall_64+0x91/0x180
<4> [68.164966] entry_SYSCALL_64_after_hwframe+0x76/0x7e
<4> [68.164970]
-> #0 (fs_reclaim){+.+.}-{0:0}:
<4> [68.164975] __lock_acquire+0x1637/0x2810
<4> [68.164979] lock_acquire+0xc9/0x300
<4> [68.164983] fs_reclaim_acquire+0xc5/0x100
<4> [68.164986] __kmalloc_cache_noprof+0x58/0x490
<4> [68.164990] xe_vm_add_ban_entry+0x64/0x2a0 [xe]
<4> [68.165071] xe_guc_exec_queue_memory_cat_error_handler+0x19d/0x230 [xe]
<4> [68.165138] dequeue_one_g2h+0x349/0x900 [xe]
<4> [68.165202] receive_g2h+0x4f/0x100 [xe]
<4> [68.165265] g2h_worker_func+0x15/0x20 [xe]
<4> [68.165327] process_one_work+0x21c/0x740
<4> [68.165331] worker_thread+0x1db/0x3c0
<4> [68.165335] kthread+0x10d/0x270
<4> [68.165338] ret_from_fork+0x44/0x70
<4> [68.165341] ret_from_fork_asm+0x1a/0x30
<4> [68.165344]
other info that might help us debug this:
<4> [68.165348] Possible unsafe locking scenario:
<4> [68.165350]        CPU0                    CPU1
<4> [68.165353]        ----                    ----
<4> [68.165355]   lock(&ct->lock);
<4> [68.165357]                                lock(fs_reclaim);
<4> [68.165361]                                lock(&ct->lock);
<4> [68.165364]   lock(fs_reclaim);
<4> [68.165366]
*** DEADLOCK ***
<4> [68.165369] 3 locks held by kworker/u64:1/115:
<4> [68.165372] #0: ffff88812de01d48 ((wq_completion)xe-g2h-wq#2){+.+.}-{0:0}, at: process_one_work+0x444/0x740
<4> [68.165379] #1: ffffc90000533e20 ((work_completion)(&ct->g2h_worker)){+.+.}-{0:0}, at: process_one_work+0x1da/0x740
<4> [68.165386] #2: ffff88812ac49430 (&ct->lock){+.+.}-{3:3}, at: receive_g2h+0x47/0x100 [xe]
<4> [68.165453]
stack backtrace:
<4> [68.165456] CPU: 0 UID: 0 PID: 115 Comm: kworker/u64:1 Tainted: G U 6.14.0-rc4-xe+ #1
<4> [68.165457] Tainted: [U]=USER
<4> [68.165458] Hardware name: ASUS System Product Name/PRIME Z790-P WIFI, BIOS 1645 03/15/2024
<4> [68.165459] Workqueue: xe-g2h-wq g2h_worker_func [xe]
<4> [68.165520] Call Trace:
<4> [68.165521] <TASK>
<4> [68.165521] dump_stack_lvl+0x91/0xf0
<4> [68.165524] dump_stack+0x10/0x20
<4> [68.165526] print_circular_bug+0x285/0x360
<4> [68.165528] check_noncircular+0x150/0x170
<4> [68.165531] __lock_acquire+0x1637/0x2810
<4> [68.165533] ? try_to_wake_up+0x447/0xbc0
<4> [68.165536] lock_acquire+0xc9/0x300
<4> [68.165538] ? __kmalloc_cache_noprof+0x58/0x490
<4> [68.165540] ? lock_release+0xd4/0x2b0
<4> [68.165542] ? xe_vm_add_ban_entry+0x64/0x2a0 [xe]
<4> [68.165621] fs_reclaim_acquire+0xc5/0x100
<4> [68.165622] ? __kmalloc_cache_noprof+0x58/0x490
<4> [68.165624] __kmalloc_cache_noprof+0x58/0x490
<4> [68.165625] ? find_held_lock+0x31/0x90
<4> [68.165627] xe_vm_add_ban_entry+0x64/0x2a0 [xe]
<4> [68.165704] ? xe_vm_add_ban_entry+0x64/0x2a0 [xe]
<4> [68.165777] ? _raw_spin_unlock+0x22/0x50
<4> [68.165779] ? drm_sched_tdr_queue_imm+0x36/0x50 [gpu_sched]
<4> [68.165783] xe_guc_exec_queue_memory_cat_error_handler+0x19d/0x230 [xe]
<4> [68.165854] dequeue_one_g2h+0x349/0x900 [xe]
<4> [68.165916] ? receive_g2h+0x47/0x100 [xe]
<4> [68.165978] receive_g2h+0x4f/0x100 [xe]
<4> [68.166039] g2h_worker_func+0x15/0x20 [xe]
<4> [68.166100] process_one_work+0x21c/0x740
<4> [68.166103] worker_thread+0x1db/0x3c0
<4> [68.166105] ? __pfx_worker_thread+0x10/0x10
<4> [68.166106] kthread+0x10d/0x270
<4> [68.166107] ? __pfx_kthread+0x10/0x10
<4> [68.166108] ret_from_fork+0x44/0x70
<4> [68.166110] ? __pfx_kthread+0x10/0x10
<4> [68.166111] ret_from_fork_asm+0x1a/0x30
<4> [68.166114] </TASK>
<7> [68.166409] xe 0000:03:00.0: [drm:xe_guc_exec_queue_memory_cat_error_handler [xe]] GT1: Engine memory cat error: engine_class=vcs, logical_mask: 0x3, guc_id=1
<6> [68.167157] xe 0000:03:00.0: [drm] GT1: Engine reset: engine_class=vcs, logical_mask: 0x3, guc_id=0
<5> [68.167226] xe 0000:03:00.0: [drm] GT1: Timedout job: seqno=4294967169, lrc_seqno=4294967169, guc_id=0, flags=0x0 in xe_exec_reset [2172]
<6> [68.259095] xe 0000:03:00.0: [drm] Xe device coredump has been created
<6> [68.259103] xe 0000:03:00.0: [drm] Check your /sys/class/drm/card0/device/devcoredump/data
<6> [68.259827] xe 0000:03:00.0: [drm] GT1: Engine reset: engine_class=vcs, logical_mask: 0x3, guc_id=1
<5> [68.259905] xe 0000:03:00.0: [drm] GT1: Timedout job: seqno=4294967169, lrc_seqno=4294967169, guc_id=1, flags=0x0 in xe_exec_reset [2172]
<7> [68.259913] xe 0000:03:00.0: [drm:xe_devcoredump [xe]] Multiple hangs are occurring, but only the first snapshot was taken
<7> [68.272001] xe 0000:03:00.0: [drm:pf_queue_work_func [xe]]
ASID: 1
VFID: 0
PDATA: 0x04d0
Faulted Address: 0x0000000000320000
FaultType: 0
AccessType: 0
FaultLevel: 1
EngineClass: 2 vecs
EngineInstance: 0
<7> [68.272086] xe 0000:03:00.0: [drm:pf_queue_work_func [xe]] Fault response: Unsuccessful -22
<7> [68.272149] xe 0000:03:00.0: [drm:pf_queue_work_func [xe]]
ASID: 1
VFID: 0
PDATA: 0x04d1
Faulted Address: 0x0000000000320000
FaultType: 0
AccessType: 0
FaultLevel: 1
EngineClass: 2 vecs
EngineInstance: 1
<7> [68.272209] xe 0000:03:00.0: [drm:pf_queue_work_func [xe]] Fault response: Unsuccessful -22
<7> [68.272224] xe 0000:03:00.0: [drm:xe_guc_exec_queue_memory_cat_error_handler [xe]] GT1: Engine memory cat error: engine_class=vecs, logical_mask: 0x3, guc_id=0
<7> [68.272403] xe 0000:03:00.0: [drm:xe_guc_exec_queue_memory_cat_error_handler [xe]] GT1: Engine memory cat error: engine_class=vecs, logical_mask: 0x3, guc_id=1
<6> [68.272900] xe 0000:03:00.0: [drm] GT1: Engine reset: engine_class=vecs, logical_mask: 0x3, guc_id=0
<5> [68.272933] xe 0000:03:00.0: [drm] GT1: Timedout job: seqno=4294967169, lrc_seqno=4294967169, guc_id=0, flags=0x0 in xe_exec_reset [2172]
<7> [68.272941] xe 0000:03:00.0: [drm:xe_devcoredump [xe]] Multiple hangs are occurring, but only the first snapshot was taken
<6> [68.273778] xe 0000:03:00.0: [drm] GT1: Engine reset: engine_class=vecs, logical_mask: 0x3, guc_id=1
<5> [68.273812] xe 0000:03:00.0: [drm] GT1: Timedout job: seqno=4294967169, lrc_seqno=4294967169, guc_id=1, flags=0x0 in xe_exec_reset [2172]
<7> [68.273817] xe 0000:03:00.0: [drm:xe_devcoredump [xe]] Multiple hangs are occurring, but only the first snapshot was taken
<6> [68.274733] [IGT] xe_exec_reset: finished subtest virtual-cat-error, SUCCESS
<7> [68.291105] xe 0000:03:00.0: [drm:drm_client_dev_restore] intel-fbdev: ret=0
<6> [68.291211] [IGT] xe_exec_reset: exiting, ret=0
<6> [68.307859] Console: switching to colour frame buffer device 240x67
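The lockdep report in the dmesg above is a classic AB-BA order inversion: during driver init (xe_guc_ct_init) &ct->lock was recorded as taken while fs_reclaim was held, and now the g2h worker attempts to enter fs_reclaim (via the kmalloc in xe_vm_add_ban_entry) while already holding &ct->lock, closing the cycle. As a rough illustration of the order-tracking idea lockdep uses (a minimal Python sketch, not the kernel's implementation; lock names taken from the report):

```python
from collections import defaultdict

class LockOrderChecker:
    """Record 'lock B taken while lock A held' edges and flag a new
    acquisition that would close a cycle (a potential deadlock)."""

    def __init__(self):
        # after[a] = set of locks ever taken while `a` was held
        self.after = defaultdict(set)

    def _reachable(self, src, dst):
        # Depth-first search over recorded ordering edges.
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.after[node])
        return False

    def acquire(self, held, new):
        # Taking `new` while `held` is held inverts an existing order
        # if `new` can already reach `held` in the dependency graph.
        if self._reachable(new, held):
            return f"possible circular locking dependency: {held} -> {new}"
        self.after[held].add(new)
        return None

chk = LockOrderChecker()
# Init path: &ct->lock taken in a context where fs_reclaim is held.
assert chk.acquire("fs_reclaim", "&ct->lock") is None
# g2h worker path: allocation (fs_reclaim) attempted under &ct->lock.
print(chk.acquire("&ct->lock", "fs_reclaim"))
```

The real lockdep additionally tracks irq contexts, read/write variants, and lock classes rather than instances, but the cycle check over recorded ordering edges is the core of the "possible circular locking dependency" warning shown here.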
Created at 2025-02-28 01:59:27