Results for igt@xe_exec_reset@cm-cat-error

Machine description: shard-adlp-8

Result: Abort, 5 Warning(s)


Detail          Value
Duration        unknown
Hostname        shard-adlp-8
Igt-Version     IGT-Version: 1.30-gd063ceaea (x86_64) (Linux: 6.14.0-rc3-xe+ x86_64)
Out
Using IGT_SRANDOM=1739983082 for randomisation
Opened device: /dev/dri/card1
Starting subtest: cm-cat-error
Stack trace:
  #0 ../lib/igt_core.c:2051 __igt_fail_assert()
  #1 ../tests/intel/xe_exec_reset.c:570 test_compute_mode()
  #2 ../tests/intel/xe_exec_reset.c:806 __igt_unique____real_main758()
  #3 ../tests/intel/xe_exec_reset.c:758 main()
  #4 [__libc_init_first+0x8a]
  #5 [__libc_start_main+0x8b]
  #6 [_start+0x25]
Subtest cm-cat-error: FAIL (1.010s)

This test caused an abort condition: Lockdep not active

/proc/lockdep_stats contents:
 lock-classes:                         2323 [max: 8192]
 direct dependencies:                 27476 [max: 524288]
 indirect dependencies:              218349
 all direct dependencies:            556081
 dependency chains:                   41781 [max: 524288]
 dependency chain hlocks used:       185858 [max: 2621440]
 dependency chain hlocks lost:            0
 in-hardirq chains:                     353
 in-softirq chains:                     917
 in-process chains:                   40511
 stack-trace entries:                284398 [max: 524288]
 number of stack traces:              13190
 number of stack hash chains:          9101
 combined max dependencies:       280363776
 hardirq-safe locks:                    110
 hardirq-unsafe locks:                 1373
 softirq-safe locks:                    258
 softirq-unsafe locks:                 1262
 irq-safe locks:                        275
 irq-unsafe locks:                     1373
 hardirq-read-safe locks:                 4
 hardirq-read-unsafe locks:             443
 softirq-read-safe locks:                11
 softirq-read-unsafe locks:             438
 irq-read-safe locks:                    11
 irq-read-unsafe locks:                 443
 uncategorized locks:                   374
 unused locks:                            1
 max locking depth:                      18
 max bfs queue depth:                   399
 max lock class index:                 2322
 debug_locks:                             0

 zapped classes:                         45
 zapped lock chains:                   1646
 large chain blocks:                      1
Err
Starting subtest: cm-cat-error
(xe_exec_reset:4131) CRITICAL: Test assertion failure function test_compute_mode, file ../tests/intel/xe_exec_reset.c:570:
(xe_exec_reset:4131) CRITICAL: Failed assertion: err == 0
(xe_exec_reset:4131) CRITICAL: Last errno: 5, Input/output error
(xe_exec_reset:4131) CRITICAL: error: -5 != 0
Subtest cm-cat-error failed.
**** DEBUG ****
(xe_exec_reset:4131) CRITICAL: Test assertion failure function test_compute_mode, file ../tests/intel/xe_exec_reset.c:570:
(xe_exec_reset:4131) CRITICAL: Failed assertion: err == 0
(xe_exec_reset:4131) CRITICAL: Last errno: 5, Input/output error
(xe_exec_reset:4131) CRITICAL: error: -5 != 0
(xe_exec_reset:4131) igt_core-INFO: Stack trace:
(xe_exec_reset:4131) igt_core-INFO:   #0 ../lib/igt_core.c:2051 __igt_fail_assert()
(xe_exec_reset:4131) igt_core-INFO:   #1 ../tests/intel/xe_exec_reset.c:570 test_compute_mode()
(xe_exec_reset:4131) igt_core-INFO:   #2 ../tests/intel/xe_exec_reset.c:806 __igt_unique____real_main758()
(xe_exec_reset:4131) igt_core-INFO:   #3 ../tests/intel/xe_exec_reset.c:758 main()
(xe_exec_reset:4131) igt_core-INFO:   #4 [__libc_init_first+0x8a]
(xe_exec_reset:4131) igt_core-INFO:   #5 [__libc_start_main+0x8b]
(xe_exec_reset:4131) igt_core-INFO:   #6 [_start+0x25]
****  END  ****
Subtest cm-cat-error: FAIL (1.010s)
Dmesg

<6> [373.982128] Console: switching to colour dummy device 80x25
<6> [373.982493] [IGT] xe_exec_reset: executing
<6> [373.985263] [IGT] xe_exec_reset: starting subtest cm-cat-error
<7> [373.987391] xe 0000:00:02.0: [drm:xe_guc_exec_queue_memory_cat_error_handler [xe]] GT0: Engine memory cat error: engine_class=rcs, logical_mask: 0x1, guc_id=2
<3> [374.988715] xe 0000:00:02.0: [drm] *ERROR* GT0: GuC engine reset request failed on 0:0 because 0x00000000
<6> [374.988963] xe 0000:00:02.0: [drm] GT0: trying reset from xe_guc_exec_queue_reset_failure_handler [xe]
<6> [374.989061] xe 0000:00:02.0: [drm] GT0: reset queued
<7> [374.989198] xe 0000:00:02.0: [drm:xe_devcoredump [xe]] Multiple hangs are occurring, but only the first snapshot was taken
<6> [374.989600] xe 0000:00:02.0: [drm] GT0: reset started
<4> [374.989796]
<4> [374.989799] ======================================================
<4> [374.989801] WARNING: possible circular locking dependency detected
<4> [374.989804] 6.14.0-rc3-xe+ #1 Not tainted
<4> [374.989806] ------------------------------------------------------
<4> [374.989808] kworker/u80:74/3056 is trying to acquire lock:
<4> [374.989811] ffffffff834c9500 (fs_reclaim){+.+.}-{0:0}, at: __kmalloc_cache_noprof+0x58/0x490
<4> [374.989820]
but task is already holding lock:
<4> [374.989823] ffff8881ceb6a158 (&guc->submission_state.lock){+.+.}-{3:3}, at: xe_guc_submit_stop+0x6c/0x590 [xe]
<4> [374.989892]
which lock already depends on the new lock.
<4> [374.989895]
the existing dependency chain (in reverse order) is:
<4> [374.989898]
-> #1 (&guc->submission_state.lock){+.+.}-{3:3}:
<4> [374.989902] __mutex_lock+0xdc/0xe60
<4> [374.989908] mutex_lock_nested+0x1b/0x30
<4> [374.989911] xe_guc_submit_init+0xf0/0x130 [xe]
<4> [374.989972] xe_guc_init_post_hwconfig+0x352/0x11c0 [xe]
<4> [374.990027] xe_uc_init_post_hwconfig+0x3c/0x70 [xe]
<4> [374.990105] xe_gt_init+0x3df/0x910 [xe]
<4> [374.990157] xe_device_probe+0x5d1/0x820 [xe]
<4> [374.990205] xe_pci_probe+0x35b/0x5f0 [xe]
<4> [374.990288] local_pci_probe+0x44/0xb0
<4> [374.990295] pci_device_probe+0xf4/0x270
<4> [374.990299] really_probe+0xee/0x3c0
<4> [374.990305] __driver_probe_device+0x8c/0x180
<4> [374.990309] driver_probe_device+0x24/0xd0
<4> [374.990313] __driver_attach+0x10f/0x220
<4> [374.990316] bus_for_each_dev+0x8d/0xf0
<4> [374.990320] driver_attach+0x1e/0x30
<4> [374.990323] bus_add_driver+0x151/0x290
<4> [374.990327] driver_register+0x5e/0x130
<4> [374.990331] __pci_register_driver+0x7d/0x90
<4> [374.990335] xe_register_pci_driver+0x23/0x30 [xe]
<4> [374.990420] soundcore_open+0x83/0x210 [soundcore]
<4> [374.990425] do_one_initcall+0x76/0x400
<4> [374.990433] do_init_module+0x97/0x2a0
<4> [374.990437] load_module+0x2c23/0x2f60
<4> [374.990439] init_module_from_file+0x97/0xe0
<4> [374.990442] idempotent_init_module+0x134/0x350
<4> [374.990445] __x64_sys_finit_module+0x77/0x100
<4> [374.990448] x64_sys_call+0x1f37/0x2650
<4> [374.990451] do_syscall_64+0x91/0x180
<4> [374.990455] entry_SYSCALL_64_after_hwframe+0x76/0x7e
<4> [374.990460]
-> #0 (fs_reclaim){+.+.}-{0:0}:
<4> [374.990464] __lock_acquire+0x1637/0x2810
<4> [374.990469] lock_acquire+0xc9/0x300
<4> [374.990472] fs_reclaim_acquire+0xc5/0x100
<4> [374.990477] __kmalloc_cache_noprof+0x58/0x490
<4> [374.990481] xe_drm_client_add_blame+0x6c/0x310 [xe]
<4> [374.990540] xe_guc_submit_stop+0x21e/0x590 [xe]
<4> [374.990606] xe_guc_stop+0x21/0x30 [xe]
<4> [374.990668] xe_uc_stop+0x2a/0x40 [xe]
<4> [374.990749] gt_reset_worker+0x13e/0x1e0 [xe]
<4> [374.990810] process_one_work+0x21c/0x740
<4> [374.990814] worker_thread+0x1db/0x3c0
<4> [374.990817] kthread+0x10d/0x270
<4> [374.990820] ret_from_fork+0x44/0x70
<4> [374.990825] ret_from_fork_asm+0x1a/0x30
<4> [374.990828]
other info that might help us debug this:
<4> [374.990831] Possible unsafe locking scenario:
<4> [374.990834]        CPU0                    CPU1
<4> [374.990836]        ----                    ----
<4> [374.990838]   lock(&guc->submission_state.lock);
<4> [374.990841]                                lock(fs_reclaim);
<4> [374.990844]                                lock(&guc->submission_state.lock);
<4> [374.990848]   lock(fs_reclaim);
<4> [374.990850]
*** DEADLOCK ***
<4> [374.990853] 3 locks held by kworker/u80:74/3056:
<4> [374.990855] #0: ffff8881cd0b0548 ((wq_completion)gt-ordered-wq){+.+.}-{0:0}, at: process_one_work+0x444/0x740
<4> [374.990862] #1: ffffc90002e4fe20 ((work_completion)(&gt->reset.worker)){+.+.}-{0:0}, at: process_one_work+0x1da/0x740
<4> [374.990869] #2: ffff8881ceb6a158 (&guc->submission_state.lock){+.+.}-{3:3}, at: xe_guc_submit_stop+0x6c/0x590 [xe]
<4> [374.990938]
stack backtrace:
<4> [374.990941] CPU: 10 UID: 0 PID: 3056 Comm: kworker/u80:74 Not tainted 6.14.0-rc3-xe+ #1
<4> [374.990942] Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-P DDR5 RVP, BIOS RPLPFWI1.R00.4035.A00.2301200723 01/20/2023
<4> [374.990943] Workqueue: gt-ordered-wq gt_reset_worker [xe]
<4> [374.991001] Call Trace:
<4> [374.991002] <TASK>
<4> [374.991003] dump_stack_lvl+0x91/0xf0
<4> [374.991006] dump_stack+0x10/0x20
<4> [374.991008] print_circular_bug+0x285/0x360
<4> [374.991010] check_noncircular+0x150/0x170
<4> [374.991013] __lock_acquire+0x1637/0x2810
<4> [374.991016] lock_acquire+0xc9/0x300
<4> [374.991018] ? __kmalloc_cache_noprof+0x58/0x490
<4> [374.991020] ? __lock_acquire+0x1166/0x2810
<4> [374.991022] ? __flush_work+0x4a5/0x5f0
<4> [374.991024] ? xe_drm_client_add_blame+0x6c/0x310 [xe]
<4> [374.991080] fs_reclaim_acquire+0xc5/0x100
<4> [374.991081] ? __kmalloc_cache_noprof+0x58/0x490
<4> [374.991083] __kmalloc_cache_noprof+0x58/0x490
<4> [374.991085] xe_drm_client_add_blame+0x6c/0x310 [xe]
<4> [374.991141] ? xe_drm_client_add_blame+0x6c/0x310 [xe]
<4> [374.991198] ? xe_lrc_read_ctx_reg+0x41/0x80 [xe]
<4> [374.991270] xe_guc_submit_stop+0x21e/0x590 [xe]
<4> [374.991345] ? trace_hardirqs_on+0x1e/0xe0
<4> [374.991349] ? enable_work+0x8c/0x110
<4> [374.991353] xe_guc_stop+0x21/0x30 [xe]
<4> [374.991417] xe_uc_stop+0x2a/0x40 [xe]
<4> [374.991495] gt_reset_worker+0x13e/0x1e0 [xe]
<4> [374.991552] process_one_work+0x21c/0x740
<4> [374.991555] worker_thread+0x1db/0x3c0
<4> [374.991557] ? __pfx_worker_thread+0x10/0x10
<4> [374.991558] kthread+0x10d/0x270
<4> [374.991559] ? __pfx_kthread+0x10/0x10
<4> [374.991560] ret_from_fork+0x44/0x70
<4> [374.991562] ? __pfx_kthread+0x10/0x10
<4> [374.991563] ret_from_fork_asm+0x1a/0x30
<4> [374.991566] </TASK>
<7> [374.991655] xe 0000:00:02.0: [drm:xe_reg_sr_apply_mmio [xe]] GT0: Applying GT save-restore MMIOs
<6> [374.991762] xe 0000:00:02.0: [drm] exec queue reset detected
<7> [374.991746] xe 0000:00:02.0: [drm:xe_reg_sr_apply_mmio [xe]] GT0: REG[0x9424] = 0xfffffffc
<7> [374.991827] xe 0000:00:02.0: [drm:xe_reg_sr_apply_mmio [xe]] GT0: REG[0x9550] = 0x000003ff
<7> [374.991908] xe 0000:00:02.0: [drm:xe_wopcm_init [xe]] WOPCM: 2048K
<7> [374.992002] xe 0000:00:02.0: [drm:xe_wopcm_init [xe]] GuC WOPCM is already locked [592K, 1420K)
<7> [374.993326] xe 0000:00:02.0: [drm:xe_guc_ads_populate [xe]] GT0: ADS capture alloc size changed from 36864 to 32768
<7> [374.994106] xe 0000:00:02.0: [drm:__xe_guc_upload [xe]] GT0: load still in progress, timeouts = 0, freq = 1250MHz (req 1300MHz), status = 0x00000072 [0x39/00]
<7> [374.994316] xe 0000:00:02.0: [drm:__xe_guc_upload [xe]] GT0: load still in progress, timeouts = 0, freq = 1250MHz (req 1300MHz), status = 0x00000074 [0x3A/00]
<7> [374.994450] xe 0000:00:02.0: [drm:__xe_guc_upload [xe]] GT0: load still in progress, timeouts = 0, freq = 1250MHz (req 1300MHz), status = 0x800005EC [0x76/05]
<6> [374.995714] [IGT] xe_exec_reset: finished subtest cm-cat-error, FAIL
<6> [374.996049] [IGT] xe_exec_reset: exiting, ret=98
<6> [374.996419] Console: switching to colour frame buffer device 240x67
<7> [375.001521] xe 0000:00:02.0: [drm:intel_power_well_enable [xe]] enabling DC_off
<7> [375.001733] xe 0000:00:02.0: [drm:gen9_set_dc_state.part.0 [xe]] Setting DC state from 02 to 00
<7> [375.009030] xe 0000:00:02.0: [drm:drm_client_dev_restore] intel-fbdev: ret=0
<7> [375.014333] xe 0000:00:02.0: [drm:__xe_guc_upload [xe]] GT0: init took 20ms, freq = 1250MHz (req = 1300MHz), before = 1250MHz, status = 0x8002F0EC, timeouts = 0
<7> [375.014610] xe 0000:00:02.0: [drm:xe_guc_ct_enable [xe]] GT0: GuC CT communication channel enabled
Created at 2025-02-19 17:01:44