Results for igt@xe_exec_reset@cm-cat-error

Machine description: shard-adlp-2

Result: Abort, 5 Warning(s)

Detail        Value
Duration      unknown
Hostname      shard-adlp-2
Igt-Version   IGT-Version: 1.30-g822e95b25 (x86_64) (Linux: 6.14.0-rc3-xe+ x86_64)
Out
Using IGT_SRANDOM=1740150350 for randomisation
Opened device: /dev/dri/card0
Starting subtest: cm-cat-error
Stack trace:
  #0 ../lib/igt_core.c:2051 __igt_fail_assert()
  #1 ../tests/intel/xe_exec_reset.c:570 test_compute_mode()
  #2 ../tests/intel/xe_exec_reset.c:806 __igt_unique____real_main758()
  #3 ../tests/intel/xe_exec_reset.c:758 main()
  #4 [__libc_init_first+0x8a]
  #5 [__libc_start_main+0x8b]
  #6 [_start+0x25]
Subtest cm-cat-error: FAIL (1.030s)

This test caused an abort condition: Lockdep not active

/proc/lockdep_stats contents:
 lock-classes:                         2166 [max: 8192]
 direct dependencies:                 22920 [max: 524288]
 indirect dependencies:              140024
 all direct dependencies:            434440
 dependency chains:                   32751 [max: 524288]
 dependency chain hlocks used:       135563 [max: 2621440]
 dependency chain hlocks lost:            0
 in-hardirq chains:                     232
 in-softirq chains:                     651
 in-process chains:                   31868
 stack-trace entries:                240164 [max: 524288]
 number of stack traces:              11212
 number of stack hash chains:          8111
 combined max dependencies:       546443708
 hardirq-safe locks:                     91
 hardirq-unsafe locks:                 1286
 softirq-safe locks:                    212
 softirq-unsafe locks:                 1187
 irq-safe locks:                        229
 irq-unsafe locks:                     1286
 hardirq-read-safe locks:                 4
 hardirq-read-unsafe locks:             419
 softirq-read-safe locks:                 9
 softirq-read-unsafe locks:             414
 irq-read-safe locks:                     9
 irq-read-unsafe locks:                 419
 uncategorized locks:                   358
 unused locks:                            2
 max locking depth:                      18
 max bfs queue depth:                   352
 max lock class index:                 2165
 debug_locks:                             0

 zapped classes:                          3
 zapped lock chains:                    166
 large chain blocks:                      1
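
The abort condition above is reported alongside `debug_locks: 0` in the statistics dump. As a rough illustration of how that field could be pulled out of such a dump, here is a minimal Python sketch. The helper names `parse_lockdep_stats` and `lockdep_active` are hypothetical, and the assumption that "Lockdep not active" corresponds to `debug_locks` not being 1 is an inference from this report, not a statement about the runner's actual implementation.

```python
import re


def parse_lockdep_stats(text):
    """Parse 'name: value [max: N]' lines into a dict of ints.

    Lines without a numeric value after the colon are skipped.
    """
    stats = {}
    for line in text.splitlines():
        m = re.match(r"\s*([^:]+):\s+(\d+)", line)
        if m:
            stats[m.group(1).strip()] = int(m.group(2))
    return stats


def lockdep_active(stats):
    # Assumption: lockdep is considered active only while debug_locks is 1.
    return stats.get("debug_locks") == 1


# A few lines copied from the dump above.
sample = """\
 lock-classes:                         2166 [max: 8192]
 max locking depth:                      18
 debug_locks:                             0
"""

stats = parse_lockdep_stats(sample)
print(lockdep_active(stats))
```

On a live system the same parser could be fed the contents of /proc/lockdep_stats, which normally requires a kernel built with lockdep support.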
Err
Starting subtest: cm-cat-error
(xe_exec_reset:2124) CRITICAL: Test assertion failure function test_compute_mode, file ../tests/intel/xe_exec_reset.c:570:
(xe_exec_reset:2124) CRITICAL: Failed assertion: err == 0
(xe_exec_reset:2124) CRITICAL: Last errno: 5, Input/output error
(xe_exec_reset:2124) CRITICAL: error: -5 != 0
Subtest cm-cat-error failed.
**** DEBUG ****
(xe_exec_reset:2124) CRITICAL: Test assertion failure function test_compute_mode, file ../tests/intel/xe_exec_reset.c:570:
(xe_exec_reset:2124) CRITICAL: Failed assertion: err == 0
(xe_exec_reset:2124) CRITICAL: Last errno: 5, Input/output error
(xe_exec_reset:2124) CRITICAL: error: -5 != 0
(xe_exec_reset:2124) igt_core-INFO: Stack trace:
(xe_exec_reset:2124) igt_core-INFO:   #0 ../lib/igt_core.c:2051 __igt_fail_assert()
(xe_exec_reset:2124) igt_core-INFO:   #1 ../tests/intel/xe_exec_reset.c:570 test_compute_mode()
(xe_exec_reset:2124) igt_core-INFO:   #2 ../tests/intel/xe_exec_reset.c:806 __igt_unique____real_main758()
(xe_exec_reset:2124) igt_core-INFO:   #3 ../tests/intel/xe_exec_reset.c:758 main()
(xe_exec_reset:2124) igt_core-INFO:   #4 [__libc_init_first+0x8a]
(xe_exec_reset:2124) igt_core-INFO:   #5 [__libc_start_main+0x8b]
(xe_exec_reset:2124) igt_core-INFO:   #6 [_start+0x25]
****  END  ****
Subtest cm-cat-error: FAIL (1.030s)
Dmesg

<6> [92.806715] Console: switching to colour dummy device 80x25
<6> [92.806766] [IGT] xe_exec_reset: executing
<6> [92.811616] [IGT] xe_exec_reset: starting subtest cm-cat-error
<7> [92.816894] xe 0000:00:02.0: [drm:xe_guc_exec_queue_memory_cat_error_handler [xe]] GT0: Engine memory cat error: engine_class=rcs, logical_mask: 0x1, guc_id=2
<3> [93.818714] xe 0000:00:02.0: [drm] *ERROR* GT0: GuC engine reset request failed on 0:0 because 0x00000000
<6> [93.820252] xe 0000:00:02.0: [drm] GT0: trying reset from xe_guc_exec_queue_reset_failure_handler [xe]
<6> [93.820918] xe 0000:00:02.0: [drm] GT0: reset queued
<7> [93.825211] xe 0000:00:02.0: [drm:xe_hw_engine_snapshot_capture [xe]] GT0: Proceeding with manual engine snapshot
<6> [93.826385] xe 0000:00:02.0: [drm] Xe device coredump has been created
<6> [93.826454] xe 0000:00:02.0: [drm] Check your /sys/class/drm/card0/device/devcoredump/data
<6> [93.826576] xe 0000:00:02.0: [drm] GT0: reset started
<4> [93.826830]
<4> [93.826833] ======================================================
<4> [93.826837] WARNING: possible circular locking dependency detected
<4> [93.826840] 6.14.0-rc3-xe+ #1 Not tainted
<4> [93.826843] ------------------------------------------------------
<4> [93.826845] kworker/u64:5/767 is trying to acquire lock:
<4> [93.826848] ffffffff834c9500 (fs_reclaim){+.+.}-{0:0}, at: __kmalloc_cache_noprof+0x58/0x490
<4> [93.826859]
but task is already holding lock:
<4> [93.826863] ffff88813830a158 (&guc->submission_state.lock){+.+.}-{3:3}, at: xe_guc_submit_stop+0x6c/0x590 [xe]
<4> [93.826995]
which lock already depends on the new lock.
<4> [93.826999]
the existing dependency chain (in reverse order) is:
<4> [93.827002]
-> #1 (&guc->submission_state.lock){+.+.}-{3:3}:
<4> [93.827007] __mutex_lock+0xdc/0xe60
<4> [93.827013] mutex_lock_nested+0x1b/0x30
<4> [93.827017] xe_guc_submit_init+0xf0/0x130 [xe]
<4> [93.827090] xe_guc_init_post_hwconfig+0x352/0x11c0 [xe]
<4> [93.827159] xe_uc_init_post_hwconfig+0x3c/0x70 [xe]
<4> [93.827250] xe_gt_init+0x3df/0x910 [xe]
<4> [93.827317] xe_device_probe+0x5d1/0x820 [xe]
<4> [93.827382] xe_pci_probe+0x35b/0x5f0 [xe]
<4> [93.827460] local_pci_probe+0x44/0xb0
<4> [93.827466] pci_device_probe+0xf4/0x270
<4> [93.827470] really_probe+0xee/0x3c0
<4> [93.827475] __driver_probe_device+0x8c/0x180
<4> [93.827479] driver_probe_device+0x24/0xd0
<4> [93.827482] __driver_attach+0x10f/0x220
<4> [93.827485] bus_for_each_dev+0x8d/0xf0
<4> [93.827488] driver_attach+0x1e/0x30
<4> [93.827491] bus_add_driver+0x151/0x290
<4> [93.827494] driver_register+0x5e/0x130
<4> [93.827499] __pci_register_driver+0x7d/0x90
<4> [93.827502] xe_register_pci_driver+0x23/0x30 [xe]
<4> [93.827579] soundcore_open+0x83/0x210 [soundcore]
<4> [93.827584] do_one_initcall+0x76/0x400
<4> [93.827588] do_init_module+0x97/0x2a0
<4> [93.827592] load_module+0x2c23/0x2f60
<4> [93.827595] init_module_from_file+0x97/0xe0
<4> [93.827598] idempotent_init_module+0x134/0x350
<4> [93.827601] __x64_sys_finit_module+0x77/0x100
<4> [93.827605] x64_sys_call+0x1f37/0x2650
<4> [93.827608] do_syscall_64+0x91/0x180
<4> [93.827612] entry_SYSCALL_64_after_hwframe+0x76/0x7e
<4> [93.827617]
-> #0 (fs_reclaim){+.+.}-{0:0}:
<4> [93.827622] __lock_acquire+0x1637/0x2810
<4> [93.827627] lock_acquire+0xc9/0x300
<4> [93.827631] fs_reclaim_acquire+0xc5/0x100
<4> [93.827635] __kmalloc_cache_noprof+0x58/0x490
<4> [93.827639] xe_drm_client_add_blame+0x6c/0x320 [xe]
<4> [93.827703] xe_guc_submit_stop+0x21e/0x590 [xe]
<4> [93.827776] xe_guc_stop+0x21/0x30 [xe]
<4> [93.827843] xe_uc_stop+0x2a/0x40 [xe]
<4> [93.827933] gt_reset_worker+0x13e/0x1e0 [xe]
<4> [93.828001] process_one_work+0x21c/0x740
<4> [93.828009] worker_thread+0x1db/0x3c0
<4> [93.828013] kthread+0x10d/0x270
<4> [93.828016] ret_from_fork+0x44/0x70
<4> [93.828020] ret_from_fork_asm+0x1a/0x30
<4> [93.828024]
other info that might help us debug this:
<4> [93.828027] Possible unsafe locking scenario:
<4> [93.828030]        CPU0                    CPU1
<4> [93.828033]        ----                    ----
<4> [93.828035]   lock(&guc->submission_state.lock);
<4> [93.828038]                                lock(fs_reclaim);
<4> [93.828041]                                lock(&guc->submission_state.lock);
<4> [93.828045]   lock(fs_reclaim);
<4> [93.828048]
*** DEADLOCK ***
<4> [93.828051] 3 locks held by kworker/u64:5/767:
<4> [93.828054] #0: ffff888138313548 ((wq_completion)gt-ordered-wq){+.+.}-{0:0}, at: process_one_work+0x444/0x740
<4> [93.828061] #1: ffffc9000206be20 ((work_completion)(&gt->reset.worker)){+.+.}-{0:0}, at: process_one_work+0x1da/0x740
<4> [93.828068] #2: ffff88813830a158 (&guc->submission_state.lock){+.+.}-{3:3}, at: xe_guc_submit_stop+0x6c/0x590 [xe]
<4> [93.828144]
stack backtrace:
<4> [93.828147] CPU: 6 UID: 0 PID: 767 Comm: kworker/u64:5 Not tainted 6.14.0-rc3-xe+ #1
<4> [93.828149] Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-P DDR5 RVP, BIOS RPLPFWI1.R00.4035.A00.2301200723 01/20/2023
<4> [93.828150] Workqueue: gt-ordered-wq gt_reset_worker [xe]
<4> [93.828214] Call Trace:
<4> [93.828215] <TASK>
<4> [93.828216] dump_stack_lvl+0x91/0xf0
<4> [93.828219] dump_stack+0x10/0x20
<4> [93.828221] print_circular_bug+0x285/0x360
<4> [93.828224] check_noncircular+0x150/0x170
<4> [93.828227] __lock_acquire+0x1637/0x2810
<4> [93.828230] lock_acquire+0xc9/0x300
<4> [93.828232] ? __kmalloc_cache_noprof+0x58/0x490
<4> [93.828234] ? __lock_acquire+0x1166/0x2810
<4> [93.828236] ? __flush_work+0x4a5/0x5f0
<4> [93.828238] ? xe_drm_client_add_blame+0x6c/0x320 [xe]
<4> [93.828301] fs_reclaim_acquire+0xc5/0x100
<4> [93.828302] ? __kmalloc_cache_noprof+0x58/0x490
<4> [93.828304] __kmalloc_cache_noprof+0x58/0x490
<4> [93.828306] xe_drm_client_add_blame+0x6c/0x320 [xe]
<4> [93.828368] ? xe_drm_client_add_blame+0x6c/0x320 [xe]
<4> [93.828431] ? xe_lrc_read_ctx_reg+0x41/0x80 [xe]
<4> [93.828504] xe_guc_submit_stop+0x21e/0x590 [xe]
<4> [93.828573] ? trace_hardirqs_on+0x1e/0xe0
<4> [93.828577] ? enable_work+0x8c/0x110
<4> [93.828581] xe_guc_stop+0x21/0x30 [xe]
<4> [93.828646] xe_uc_stop+0x2a/0x40 [xe]
<4> [93.828731] gt_reset_worker+0x13e/0x1e0 [xe]
<4> [93.828794] process_one_work+0x21c/0x740
<4> [93.828797] worker_thread+0x1db/0x3c0
<4> [93.828799] ? __pfx_worker_thread+0x10/0x10
<4> [93.828801] kthread+0x10d/0x270
<4> [93.828802] ? __pfx_kthread+0x10/0x10
<4> [93.828803] ret_from_fork+0x44/0x70
<4> [93.828804] ? __pfx_kthread+0x10/0x10
<4> [93.828805] ret_from_fork_asm+0x1a/0x30
<4> [93.828809] </TASK>
<6> [93.828978] xe 0000:00:02.0: [drm] exec queue reset detected
<7> [93.828942] xe 0000:00:02.0: [drm:xe_reg_sr_apply_mmio [xe]] GT0: Applying GT save-restore MMIOs
<7> [93.829064] xe 0000:00:02.0: [drm:xe_reg_sr_apply_mmio [xe]] GT0: REG[0x9424] = 0xfffffffc
<7> [93.829179] xe 0000:00:02.0: [drm:xe_reg_sr_apply_mmio [xe]] GT0: REG[0x9550] = 0x000003ff
<7> [93.829286] xe 0000:00:02.0: [drm:xe_wopcm_init [xe]] WOPCM: 2048K
<7> [93.829406] xe 0000:00:02.0: [drm:xe_wopcm_init [xe]] GuC WOPCM is already locked [592K, 1420K)
<7> [93.831969] xe 0000:00:02.0: [drm:xe_guc_ads_populate [xe]] GT0: ADS capture alloc size changed from 36864 to 32768
<7> [93.832669] xe 0000:00:02.0: [drm:__xe_guc_upload [xe]] GT0: load still in progress, timeouts = 0, freq = 1250MHz (req 1300MHz), status = 0x00000072 [0x39/00]
<7> [93.832992] xe 0000:00:02.0: [drm:__xe_guc_upload [xe]] GT0: load still in progress, timeouts = 0, freq = 1250MHz (req 1300MHz), status = 0x00000074 [0x3A/00]
<7> [93.833137] xe 0000:00:02.0: [drm:__xe_guc_upload [xe]] GT0: load still in progress, timeouts = 0, freq = 1250MHz (req 1300MHz), status = 0x800030EC [0x76/30]
<7> [93.833254] xe 0000:00:02.0: [drm:__xe_guc_upload [xe]] GT0: load still in progress, timeouts = 0, freq = 1250MHz (req 1300MHz), status = 0x800005EC [0x76/05]
<6> [93.841678] [IGT] xe_exec_reset: finished subtest cm-cat-error, FAIL
<6> [93.841776] [IGT] xe_exec_reset: exiting, ret=98
<6> [93.842467] Console: switching to colour frame buffer device 240x67
<7> [93.848950] xe 0000:00:02.0: [drm:intel_power_well_enable [xe]] enabling DC_off
<7> [93.849144] xe 0000:00:02.0: [drm:gen9_set_dc_state.part.0 [xe]] Setting DC state from 02 to 00
<7> [93.852941] xe 0000:00:02.0: [drm:__xe_guc_upload [xe]] GT0: init took 20ms, freq = 1250MHz (req = 1300MHz), before = 1250MHz, status = 0x8002F0EC, timeouts = 0
<7> [93.853340] xe 0000:00:02.0: [drm:xe_guc_ct_enable [xe]] GT0: GuC CT communication channel enabled
<7> [93.858905] xe 0000:00:02.0: [drm:xe_huc_auth [xe]] GT0: HuC: authenticated via GuC
<7> [93.859071] xe 0000:00:02.0: [drm:xe_mocs_init [xe]] GT0: flag:0x3
<7> [93.859200] xe 0000:00:02.0: [drm:xe_mocs_init [xe]] GT0: mocs entries: 64
<7> [93.859331] xe 0000:00:02.0: [drm:xe_mocs_init [xe]] GT0: GLOB_MOCS[0] 0x4000 0x37
<7> [93.859464] xe 0000:00:02.0: [drm:xe_mocs_init [xe]] GT0: GLOB_MOCS[1] 0x4004 0x37
<7> [93.859596] xe 0000:00:02.0: [drm:xe_mocs_init [xe]] GT0: GLOB_MOCS[2] 0x4008 0x37
<7> [93.859729] xe 0000:00:02.0: [drm:xe_mocs_init [xe]] GT0: GLOB_MOCS[3] 0x400c 0x5
<7> [93.859866] xe 0000:00:02.0: [drm:xe_mocs_init [xe]] GT0: GLOB_MOCS[4] 0x4010 0x5
<7> [93.859997] xe 0000:00:02.0: [drm:xe_mocs_init [xe]] GT0: GLOB_MOCS[5] 0x4014 0x37
<7> [93.860132] xe 0000:00:02.0: [drm:xe_mocs_init [xe]] GT0: GLOB_MOCS[6] 0x4018 0x17
<7> [93.860269] xe 0000:00:02.0: [drm:xe_mocs_init [xe]] GT0: GLOB_MOCS[7] 0x401c 0x17
<7> [93.860407] xe 0000:00:02.0: [drm:xe_mocs_init [xe]] GT0: GLOB_MOCS[8] 0x4020 0x27
<7> [93.860546] xe 0000:00:02.0: [drm:xe_mocs_init [xe]] GT0: GLOB_MOCS[9] 0x4024 0x27
<7> [93.860684] xe 0000:00:02.0: [drm:xe_mocs_init [xe]] GT0: GLOB_MOCS[10] 0x4028 0x77
<7> [93.860818] xe 0000:00:02.0: [drm:xe_mocs_init [xe]] GT0: GLOB_MOCS[11] 0x402c 0x77
<7> [93.860964] xe 0000:00:02.0: [drm:xe_mocs_init [xe]] GT0: GLOB_MOCS[12] 0x4030 0x57
<7> [93.861099] xe 0000:00:02.0: [drm:xe_mocs_init [xe]] GT0: GLOB_MOCS[13] 0x4034 0x57
<7> [93.861237] xe 0000:00:02.0: [drm:xe_mocs_init [xe]] GT0: GLOB_MOCS[14] 0x4038 0x67
<7> [93.861377] xe 0000:00:02.0: [drm:xe_mocs_init [xe]] GT0: GLOB_MOCS[15] 0x403c 0x67
<7> [93.861515] xe 0000:00:02.0: [drm:xe_mocs_init [xe]] GT0: GLOB_MOCS[16] 0x4040 0x37
<7> [93.861662] xe 0000:00:02.0: [drm:xe_mocs_init [xe]] GT0: GLOB_MOCS[17] 0x4044 0x37
<7> [93.861805] xe 0000:00:02.0: [drm:xe_mocs_init [xe]] GT0: GLOB_MOCS[18] 0x4048 0x60037
<7> [93.861964] xe 0000:00:02.0: [drm:xe_mocs_init [xe]] GT0: GLOB_MOCS[19] 0x404c 0x737
<7> [93.862117] xe 0000:00:02.0: [drm:xe_mocs_init [xe]] GT0: GLOB_MOCS[20] 0x4050 0x337
<7> [93.862268] xe 0000:00:02.0: [drm:xe_mocs_init [xe]] GT0: GLOB_MOCS[21] 0x4054 0x137
<7> [93.862420] xe 0000:00:02.0: [drm:xe_mocs_init [xe]] GT0: GLOB_MOCS[22] 0x4058 0x3b7
<7> [93.862569] xe 0000:00:02.0: [drm:xe_mocs_init [xe]] GT0: GLOB_MOCS[23] 0x405c 0x7b7
<7> [93.862714] xe 0000:00:02.0: [drm:xe_mocs_init [xe]] GT0: GLOB_MOCS[24] 0x4060 0x37
<7> [93.862851] xe 0000:00:02.0: [drm:xe_mocs_init [xe]] GT0: GLOB_MOCS[25] 0x4064 0x37
<7> [93.862995] xe 0000:00:02.0: [drm:xe_mocs_init [xe]] GT0: GLOB_MOCS[26] 0x4068 0x37
<7> [93.863136] xe 0000:00:02.0: [drm:xe_mocs_init [xe]] GT0: GLOB_MOCS[27] 0x406c 0x37
<7> [93.863278] xe 0000:00:02.0: [drm:xe_mocs_init [xe]] GT0: GLOB_MOCS[28] 0x4070 0x37
<7> [93.863417] xe 0000:00:02.0: [drm:xe_mocs_init [xe]] GT0: GLOB_MOCS[29] 0x4074 0x37
<7> [93.863558] xe 0000:00:02.0: [drm:xe_mocs_init [xe]] GT0: GLOB_MOCS[30] 0x4078 0x37
<7> [93.863701] xe 0000:00:02.0: [drm:xe_mocs_init [xe]] GT0: GLOB_MOCS[31] 0x407c 0x37
<7> [93.863844] xe 0000:00:02.0: [drm:xe_mocs_init [xe]] GT0: GLOB_MOCS[32] 0x4080 0x37
<7> [93.863995] xe 0000:00:02.0: [drm:xe_mocs_init [xe]] GT0: GLOB_MOCS[33] 0x4084 0x37
<7> [93.864143] xe 0000:00:02.0: [drm:xe_mocs_init [xe]] GT0: GLOB_MOCS[34] 0x4088 0x37
<7> [93.865218] xe 0000:00:02.0: [drm:xe_mocs_init [xe]] GT0: GLOB_MOCS[35] 0x408c 0x37
<7> [93.865793] xe 0000:00:02.0: [drm:drm_client_dev_restore] intel-fbdev: ret=0
<7> [93.866281] xe 0000:00:02.0: [drm:xe_mocs_init [xe]] GT0: GLOB_MOCS[36] 0x4090 0x37
<7> [93.867791] xe 0000:00:02.0: [drm:xe_mocs_init [xe]] GT0: GLOB_MOCS[37] 0x4094 0x37
Created at 2025-02-21 15:34:58