Results for igt@xe_exec_reset@cat-error

Machine description: shard-bmg-6

Result: Abort, 1 Warning(s)

Attached files: i915_display_info8, igt_runner8, results8.json, results8-xe-load.json, i915_display_info_post_exec8, boot8, dmesg8

Detail        Value
Duration      unknown
Hostname      shard-bmg-6
Igt-Version   IGT-Version: 1.30-gf0b668833 (x86_64) (Linux: 6.14.0-rc4-xe+ x86_64)
Out
Using IGT_SRANDOM=1740704346 for randomisation
Opened device: /dev/dri/card0
Starting subtest: cat-error
Subtest cat-error: SUCCESS (0.081s)

This test caused an abort condition: Lockdep not active

/proc/lockdep_stats contents:
 lock-classes:                         2196 [max: 8192]
 direct dependencies:                 25354 [max: 524288]
 indirect dependencies:              181792
 all direct dependencies:            475885
 dependency chains:                   39110 [max: 524288]
 dependency chain hlocks used:       176656 [max: 2621440]
 dependency chain hlocks lost:            0
 in-hardirq chains:                     332
 in-softirq chains:                     767
 in-process chains:                   38011
 stack-trace entries:                265641 [max: 524288]
 number of stack traces:              12477
 number of stack hash chains:          8722
 combined max dependencies:      1131406336
 hardirq-safe locks:                     97
 hardirq-unsafe locks:                 1329
 softirq-safe locks:                    225
 softirq-unsafe locks:                 1225
 irq-safe locks:                        242
 irq-unsafe locks:                     1329
 hardirq-read-safe locks:                 5
 hardirq-read-unsafe locks:             396
 softirq-read-safe locks:                 9
 softirq-read-unsafe locks:             391
 irq-read-safe locks:                     9
 irq-read-unsafe locks:                 396
 uncategorized locks:                   367
 unused locks:                            1
 max locking depth:                      22
 max bfs queue depth:                   369
 max lock class index:                 2198
 debug_locks:                             0

 zapped classes:                         33
 zapped lock chains:                   1085
 large chain blocks:                      1
Err
Starting subtest: cat-error
Subtest cat-error: SUCCESS (0.081s)
Dmesg

<6> [164.740576] Console: switching to colour dummy device 80x25
<6> [164.740694] [IGT] xe_exec_reset: executing
<6> [164.768429] [IGT] xe_exec_reset: starting subtest cat-error
<7> [164.774363] xe 0000:03:00.0: [drm:pf_queue_work_func [xe]]
ASID: 65
VFID: 0
PDATA: 0x0490
Faulted Address: 0x00000000002a0000
FaultType: 0
AccessType: 0
FaultLevel: 1
EngineClass: 0 rcs
EngineInstance: 0
<7> [164.774498] xe 0000:03:00.0: [drm:pf_queue_work_func [xe]] Fault response: Unsuccessful -22
<7> [164.774842] xe 0000:03:00.0: [drm:xe_guc_exec_queue_memory_cat_error_handler [xe]] GT0: Engine memory cat error: engine_class=rcs, logical_mask: 0x1, guc_id=2
<4> [164.775196]
<4> [164.775203] ======================================================
<4> [164.775207] WARNING: possible circular locking dependency detected
<4> [164.775211] 6.14.0-rc4-xe+ #1 Tainted: G U
<4> [164.775215] ------------------------------------------------------
<4> [164.775218] kworker/u64:2/119 is trying to acquire lock:
<4> [164.775222] ffffffff834c95c0 (fs_reclaim){+.+.}-{0:0}, at: __kmalloc_cache_noprof+0x58/0x490
<4> [164.775236]
but task is already holding lock:
<4> [164.775240] ffff88811fcf1430 (&ct->lock){+.+.}-{3:3}, at: receive_g2h+0x47/0x100 [xe]
<4> [164.775337]
which lock already depends on the new lock.
<4> [164.775341]
the existing dependency chain (in reverse order) is:
<4> [164.775345]
-> #1 (&ct->lock){+.+.}-{3:3}:
<4> [164.775351] xe_guc_ct_init+0x2b4/0x4c0 [xe]
<4> [164.775441] xe_guc_init+0xe9/0x360 [xe]
<4> [164.775526] xe_uc_init+0x1e/0x1f0 [xe]
<7> [164.775541] xe 0000:03:00.0: [drm:xe_hw_engine_snapshot_capture [xe]] GT0: Proceeding with manual engine snapshot
<4> [164.775656] xe_gt_init_hwconfig+0x4c/0xb0 [xe]
<4> [164.775761] xe_device_probe+0x3b8/0x7f0 [xe]
<4> [164.775862] xe_pci_probe+0x372/0x5f0 [xe]
<4> [164.775980] local_pci_probe+0x44/0xb0
<4> [164.775986] pci_device_probe+0xf4/0x270
<4> [164.775992] really_probe+0xee/0x3c0
<4> [164.775998] __driver_probe_device+0x8c/0x180
<4> [164.776003] driver_probe_device+0x24/0xd0
<4> [164.776009] __driver_attach+0x10f/0x220
<4> [164.776014] bus_for_each_dev+0x8d/0xf0
<4> [164.776018] driver_attach+0x1e/0x30
<4> [164.776023] bus_add_driver+0x151/0x290
<4> [164.776028] driver_register+0x5e/0x130
<4> [164.776033] __pci_register_driver+0x7d/0x90
<4> [164.776039] xe_register_pci_driver+0x23/0x30 [xe]
<4> [164.776150] soundcore_open+0x83/0x210 [soundcore]
<4> [164.776156] do_one_initcall+0x76/0x400
<4> [164.776161] do_init_module+0x97/0x2a0
<4> [164.776166] load_module+0x2c23/0x2f60
<4> [164.776171] init_module_from_file+0x97/0xe0
<4> [164.776175] idempotent_init_module+0x134/0x350
<4> [164.776180] __x64_sys_finit_module+0x77/0x100
<4> [164.776185] x64_sys_call+0x1f37/0x2650
<4> [164.776190] do_syscall_64+0x91/0x180
<4> [164.776195] entry_SYSCALL_64_after_hwframe+0x76/0x7e
<4> [164.776202]
-> #0 (fs_reclaim){+.+.}-{0:0}:
<4> [164.776209] __lock_acquire+0x1637/0x2810
<4> [164.776215] lock_acquire+0xc9/0x300
<4> [164.776220] fs_reclaim_acquire+0xc5/0x100
<4> [164.776225] __kmalloc_cache_noprof+0x58/0x490
<4> [164.776231] xe_vm_add_ban_entry+0x64/0x2a0 [xe]
<4> [164.776354] xe_guc_exec_queue_memory_cat_error_handler+0x19d/0x230 [xe]
<4> [164.776457] dequeue_one_g2h+0x349/0x900 [xe]
<4> [164.776554] receive_g2h+0x4f/0x100 [xe]
<4> [164.776649] g2h_worker_func+0x15/0x20 [xe]
<4> [164.776741] process_one_work+0x21c/0x740
<4> [164.776747] worker_thread+0x1db/0x3c0
<4> [164.776752] kthread+0x10d/0x270
<4> [164.776757] ret_from_fork+0x44/0x70
<4> [164.776762] ret_from_fork_asm+0x1a/0x30
<4> [164.776767]
other info that might help us debug this:
<4> [164.776772] Possible unsafe locking scenario:
<4> [164.776775]        CPU0                    CPU1
<4> [164.776779]        ----                    ----
<4> [164.776782]   lock(&ct->lock);
<4> [164.776786]                                lock(fs_reclaim);
<4> [164.776790]                                lock(&ct->lock);
<4> [164.776795]   lock(fs_reclaim);
<4> [164.776799]
*** DEADLOCK ***
<4> [164.776803] 3 locks held by kworker/u64:2/119:
<4> [164.776806] #0: ffff888151e4a948 ((wq_completion)xe-g2h-wq#3){+.+.}-{0:0}, at: process_one_work+0x444/0x740
<4> [164.776818] #1: ffffc90000553e20 ((work_completion)(&ct->g2h_worker)){+.+.}-{0:0}, at: process_one_work+0x1da/0x740
<4> [164.776828] #2: ffff88811fcf1430 (&ct->lock){+.+.}-{3:3}, at: receive_g2h+0x47/0x100 [xe]
<4> [164.776926]
stack backtrace:
<4> [164.776930] CPU: 2 UID: 0 PID: 119 Comm: kworker/u64:2 Tainted: G U 6.14.0-rc4-xe+ #1
<4> [164.776932] Tainted: [U]=USER
<4> [164.776933] Hardware name: ASUS System Product Name/PRIME Z790-P WIFI, BIOS 1656 04/18/2024
<4> [164.776934] Workqueue: xe-g2h-wq g2h_worker_func [xe]
<4> [164.777024] Call Trace:
<4> [164.777025] <TASK>
<4> [164.777026] dump_stack_lvl+0x91/0xf0
<4> [164.777030] dump_stack+0x10/0x20
<4> [164.777032] print_circular_bug+0x285/0x360
<4> [164.777036] check_noncircular+0x150/0x170
<4> [164.777040] __lock_acquire+0x1637/0x2810
<4> [164.777042] ? try_to_wake_up+0x447/0xbc0
<4> [164.777047] lock_acquire+0xc9/0x300
<4> [164.777049] ? __kmalloc_cache_noprof+0x58/0x490
<4> [164.777052] ? lock_release+0xd4/0x2b0
<4> [164.777055] ? xe_vm_add_ban_entry+0x64/0x2a0 [xe]
<4> [164.777167] fs_reclaim_acquire+0xc5/0x100
<4> [164.777168] ? __kmalloc_cache_noprof+0x58/0x490
<4> [164.777171] __kmalloc_cache_noprof+0x58/0x490
<4> [164.777173] ? find_held_lock+0x31/0x90
<4> [164.777175] xe_vm_add_ban_entry+0x64/0x2a0 [xe]
<4> [164.777281] ? xe_vm_add_ban_entry+0x64/0x2a0 [xe]
<4> [164.777384] ? _raw_spin_unlock+0x22/0x50
<4> [164.777387] ? drm_sched_tdr_queue_imm+0x36/0x50 [gpu_sched]
<4> [164.777391] xe_guc_exec_queue_memory_cat_error_handler+0x19d/0x230 [xe]
<4> [164.777482] dequeue_one_g2h+0x349/0x900 [xe]
<4> [164.777566] ? receive_g2h+0x47/0x100 [xe]
<4> [164.777649] receive_g2h+0x4f/0x100 [xe]
<4> [164.777730] g2h_worker_func+0x15/0x20 [xe]
<4> [164.777810] process_one_work+0x21c/0x740
<4> [164.777815] worker_thread+0x1db/0x3c0
<4> [164.777817] ? __pfx_worker_thread+0x10/0x10
<4> [164.777819] kthread+0x10d/0x270
<4> [164.777821] ? __pfx_kthread+0x10/0x10
<4> [164.777822] ret_from_fork+0x44/0x70
<4> [164.777824] ? __pfx_kthread+0x10/0x10
<4> [164.777825] ret_from_fork_asm+0x1a/0x30
<4> [164.777829] </TASK>
<6> [164.781073] xe 0000:03:00.0: [drm] GT0: Engine reset: engine_class=rcs, logical_mask: 0x1, guc_id=2
<5> [164.781111] xe 0000:03:00.0: [drm] GT0: Timedout job: seqno=4294967169, lrc_seqno=4294967169, guc_id=2, flags=0x0 in xe_exec_reset [5203]
<6> [164.813036] xe 0000:03:00.0: [drm] Xe device coredump has been created
<6> [164.813043] xe 0000:03:00.0: [drm] Check your /sys/class/drm/card0/device/devcoredump/data
<7> [164.818550] xe 0000:03:00.0: [drm:pf_queue_work_func [xe]]
ASID: 66
VFID: 0
PDATA: 0x0c10
Faulted Address: 0x00000000002a0000
FaultType: 0
AccessType: 0
FaultLevel: 1
EngineClass: 3 bcs
EngineInstance: 0
<7> [164.818676] xe 0000:03:00.0: [drm:pf_queue_work_func [xe]] Fault response: Unsuccessful -22
<7> [164.818906] xe 0000:03:00.0: [drm:xe_guc_exec_queue_memory_cat_error_handler [xe]] GT0: Engine memory cat error: engine_class=bcs, logical_mask: 0x1, guc_id=2
<6> [164.819906] xe 0000:03:00.0: [drm] GT0: Engine reset: engine_class=bcs, logical_mask: 0x1, guc_id=2
<5> [164.819935] xe 0000:03:00.0: [drm] GT0: Timedout job: seqno=4294967169, lrc_seqno=4294967169, guc_id=2, flags=0x0 in xe_exec_reset [5203]
<7> [164.819944] xe 0000:03:00.0: [drm:xe_devcoredump [xe]] Multiple hangs are occurring, but only the first snapshot was taken
<7> [164.824132] xe 0000:03:00.0: [drm:pf_queue_work_func [xe]]
ASID: 67
VFID: 0
PDATA: 0x0411
Faulted Address: 0x00000000002a0000
FaultType: 0
AccessType: 0
FaultLevel: 1
EngineClass: 5 ccs
EngineInstance: 0
<7> [164.824239] xe 0000:03:00.0: [drm:pf_queue_work_func [xe]] Fault response: Unsuccessful -22
<7> [164.824496] xe 0000:03:00.0: [drm:xe_guc_exec_queue_memory_cat_error_handler [xe]] GT0: Engine memory cat error: engine_class=ccs, logical_mask: 0x1, guc_id=2
<6> [164.827775] xe 0000:03:00.0: [drm] GT0: Engine reset: engine_class=ccs, logical_mask: 0x1, guc_id=2
<5> [164.827809] xe 0000:03:00.0: [drm] GT0: Timedout job: seqno=4294967169, lrc_seqno=4294967169, guc_id=2, flags=0x0 in xe_exec_reset [5203]
<7> [164.827814] xe 0000:03:00.0: [drm:xe_devcoredump [xe]] Multiple hangs are occurring, but only the first snapshot was taken
<7> [164.831520] xe 0000:03:00.0: [drm:pf_queue_work_func [xe]]
ASID: 68
VFID: 0
PDATA: 0x0450
Faulted Address: 0x00000000002a0000
FaultType: 0
AccessType: 0
FaultLevel: 1
EngineClass: 1 vcs
EngineInstance: 0
<7> [164.831622] xe 0000:03:00.0: [drm:pf_queue_work_func [xe]] Fault response: Unsuccessful -22
<7> [164.831863] xe 0000:03:00.0: [drm:xe_guc_exec_queue_memory_cat_error_handler [xe]] GT1: Engine memory cat error: engine_class=vcs, logical_mask: 0x1, guc_id=0
<6> [164.832828] xe 0000:03:00.0: [drm] GT1: Engine reset: engine_class=vcs, logical_mask: 0x1, guc_id=0
<5> [164.832858] xe 0000:03:00.0: [drm] GT1: Timedout job: seqno=4294967169, lrc_seqno=4294967169, guc_id=0, flags=0x0 in xe_exec_reset [5203]
<7> [164.832865] xe 0000:03:00.0: [drm:xe_devcoredump [xe]] Multiple hangs are occurring, but only the first snapshot was taken
<7> [164.836519] xe 0000:03:00.0: [drm:pf_queue_work_func [xe]]
ASID: 69
VFID: 0
PDATA: 0x0451
Faulted Address: 0x00000000002a0000
FaultType: 0
AccessType: 0
FaultLevel: 1
EngineClass: 1 vcs
EngineInstance: 2
<7> [164.836633] xe 0000:03:00.0: [drm:pf_queue_work_func [xe]] Fault response: Unsuccessful -22
<7> [164.836844] xe 0000:03:00.0: [drm:xe_guc_exec_queue_memory_cat_error_handler [xe]] GT1: Engine memory cat error: engine_class=vcs, logical_mask: 0x2, guc_id=0
<6> [164.837825] xe 0000:03:00.0: [drm] GT1: Engine reset: engine_class=vcs, logical_mask: 0x2, guc_id=0
<5> [164.837855] xe 0000:03:00.0: [drm] GT1: Timedout job: seqno=4294967169, lrc_seqno=4294967169, guc_id=0, flags=0x0 in xe_exec_reset [5203]
<7> [164.837861] xe 0000:03:00.0: [drm:xe_devcoredump [xe]] Multiple hangs are occurring, but only the first snapshot was taken
<7> [164.842010] xe 0000:03:00.0: [drm:pf_queue_work_func [xe]]
ASID: 70
VFID: 0
PDATA: 0x04d0
Faulted Address: 0x00000000002a0000
FaultType: 0
AccessType: 0
FaultLevel: 1
EngineClass: 2 vecs
EngineInstance: 0
<7> [164.842134] xe 0000:03:00.0: [drm:pf_queue_work_func [xe]] Fault response: Unsuccessful -22
<7> [164.842401] xe 0000:03:00.0: [drm:xe_guc_exec_queue_memory_cat_error_handler [xe]] GT1: Engine memory cat error: engine_class=vecs, logical_mask: 0x1, guc_id=0
<6> [164.843390] xe 0000:03:00.0: [drm] GT1: Engine reset: engine_class=vecs, logical_mask: 0x1, guc_id=0
<5> [164.843422] xe 0000:03:00.0: [drm] GT1: Timedout job: seqno=4294967169, lrc_seqno=4294967169, guc_id=0, flags=0x0 in xe_exec_reset [5203]
<7> [164.843430] xe 0000:03:00.0: [drm:xe_devcoredump [xe]] Multiple hangs are occurring, but only the first snapshot was taken
<7> [164.847791] xe 0000:03:00.0: [drm:pf_queue_work_func [xe]]
ASID: 71
VFID: 0
PDATA: 0x04d1
Faulted Address: 0x00000000002a0000
FaultType: 0
AccessType: 0
FaultLevel: 1
EngineClass: 2 vecs
EngineInstance: 1
<7> [164.847943] xe 0000:03:00.0: [drm:pf_queue_work_func [xe]] Fault response: Unsuccessful -22
<7> [164.848229] xe 0000:03:00.0: [drm:xe_guc_exec_queue_memory_cat_error_handler [xe]] GT1: Engine memory cat error: engine_class=vecs, logical_mask: 0x2, guc_id=0
<6> [164.849208] xe 0000:03:00.0: [drm] GT1: Engine reset: engine_class=vecs, logical_mask: 0x2, guc_id=0
<5> [164.849243] xe 0000:03:00.0: [drm] GT1: Timedout job: seqno=4294967169, lrc_seqno=4294967169, guc_id=0, flags=0x0 in xe_exec_reset [5203]
<7> [164.849251] xe 0000:03:00.0: [drm:xe_devcoredump [xe]] Multiple hangs are occurring, but only the first snapshot was taken
<6> [164.849853] [IGT] xe_exec_reset: finished subtest cat-error, SUCCESS
<7> [164.857685] xe 0000:03:00.0: [drm:drm_client_dev_restore] intel-fbdev: ret=0
<6> [164.857956] [IGT] xe_exec_reset: exiting, ret=0
<6> [164.875222] Console: switching to colour frame buffer device 240x67
Created at 2025-02-28 01:59:28