Related to pmc, was running "pmcstat -S INST_RETIRED -O sample.out" while ld was running.

root@cavium:~ # hwpmc: SOFT/16/64/0x67<INT,USR,SYS,REA,WRI> ARMV8/6/32/0x1ff<INT,USR,SYS,EDG,THR,REA,WRI,INV,QUA>
panic: critical_exit: td_critnest == 0
cpuid = 15
KDB: stack backtrace:
db_trace_self() at db_trace_self_wrapper+0x28
	 pc = 0xffffff80004d6434  lr = 0xffffff800006d888
	 sp = 0xffffff87cc0c4590  fp = 0xffffff87cc0c46b0
db_trace_self_wrapper() at vpanic+0x170
	 pc = 0xffffff800006d888  lr = 0xffffff800025ad54
	 sp = 0xffffff87cc0c46c0  fp = 0xffffff87cc0c4740
vpanic() at kassert_panic+0x160
	 pc = 0xffffff800025ad54  lr = 0xffffff800025abe0
	 sp = 0xffffff87cc0c4750  fp = 0xffffff87cc0c4810
kassert_panic() at critical_exit+0xc8
	 pc = 0xffffff800025abe0  lr = 0xffffff8000261510
	 sp = 0xffffff87cc0c4820  fp = 0xffffff87cc0c4830
critical_exit() at spinlock_exit+0x10
	 pc = 0xffffff8000261510  lr = 0xffffff80004ddf10
	 sp = 0xffffff87cc0c4840  fp = 0xffffff87cc0c4850
spinlock_exit() at pmc_select_cpu+0x80
	 pc = 0xffffff80004ddf10  lr = 0xffffff8047a5ba0c
	 sp = 0xffffff87cc0c4860  fp = 0xffffff87cc0c4870
pmc_select_cpu() at pmc_syscall_handler+0x15f8
	 pc = 0xffffff8047a5ba0c  lr = 0xffffff8047a5f824
	 sp = 0xffffff87cc0c4880  fp = 0xffffff87cc0c49e0
pmc_syscall_handler() at do_el0_sync+0x478
	 pc = 0xffffff8047a5f824  lr = 0xffffff80004e7e5c
	 sp = 0xffffff87cc0c49f0  fp = 0xffffff87cc0c4aa0
do_el0_sync() at handle_el0_sync+0x58
	 pc = 0xffffff80004e7e5c  lr = 0xffffff80004d79d4
	 sp = 0xffffff87cc0c4ab0  fp = 0x0000007ffffff500
KDB: enter: panic
[ thread pid 8261 tid 100471 ]
Stopped at      kdb_enter+0x40:
db>
I encountered similar problems with FreeBSD on ARM64 while using hwpmc. Some of the errors that I found are listed below:

* panic: Unknown kernel exception 0 esr_el1 2000000
* panic: data abort in critical section or under mutex
* panic: VFP exception in the kernel
* panic: Unknown kernel exception 21 esr_el1 86000006

This can be easily reproduced by invoking, for example:

$ pmcstat -S CPU_CYCLES -O cpu_cycles.pmc

wait ~30 seconds or more and hit ctrl + C

Platform: ThunderX CRB (single socket)
SVN rev: 291651

Example:

root@thunderx_crb4:~ #
  x0: 5b4fc4            x1: 0                 x2: 0                 x3: ffffff800080d048
  x4: ffffff87cc051cd0  x5: ffffff87cc051510  x6: 40761000          x7: 4
  x8: ffffff800082be00  x9: 1                x10: 4                x11: 0
 x12: 2af8             x13: 7ffe7ec0         x14: b                x15: 296c
 x16: 7ffe7d60         x17: b                x18: ffffff87cc051640 x19: 8
 x20: 7                x21: 40461000         x22: ffffffc08b5d2438 x23: ffffffc04c8009a0
 x24: 0                x25: 68               x26: 0                x27: 168
 x28: 0                x29: 4045d000         x30: 4045d000
  sp: ffffff87cc051640
  lr: ffffffc01a4dd200
 elr: ffffffc01a4dd200
spsr: 20000085
panic: Unknown kernel exception 0 esr_el1 2000000
cpuid = 0
KDB: stack backtrace:
db_trace_self() at db_trace_self_wrapper+0x28
	 pc = 0xffffff80005a6d04  lr = 0xffffff8000070b84
	 sp = 0xffffff87cc051220  fp = 0xffffff87cc051340
db_trace_self_wrapper() at vpanic+0x170
	 pc = 0xffffff8000070b84  lr = 0xffffff80002bc468
	 sp = 0xffffff87cc051350  fp = 0xffffff87cc0513d0
vpanic() at panic+0x4c
	 pc = 0xffffff80002bc468  lr = 0xffffff80002bc2f4
	 sp = 0xffffff87cc0513e0  fp = 0xffffff87cc051460
panic() at do_el1h_sync+0x128
	 pc = 0xffffff80002bc2f4  lr = 0xffffff80005b87d4
	 sp = 0xffffff87cc051470  fp = 0xffffff87cc051490
do_el1h_sync() at handle_el1h_sync+0x68
	 pc = 0xffffff80005b87d4  lr = 0xffffff80005a8068
	 sp = 0xffffff87cc0514a0  fp = 0xffffff87cc0515b0
handle_el1h_sync() at 0xffffffc01a4dd1fc
	 pc = 0xffffff80005a8068  lr = 0xffffffc01a4dd1fc
	 sp = 0xffffff87cc0515c0  fp = 0x000000004045d000
KDB: enter: panic
[ thread pid 1022 tid 100194 ]
Stopped at      kdb_enter+0x40:
db>

root@thunderx_crb4:~ #
  x0: 0                 x1: 0                 x2: 0                 x3: ffffff800080d048
  x4: ffffff87cc146cd0  x5: ffffff87cc146510  x6: 40761000          x7: 100
  x8: b9041a6951000529  x9: 1                x10: 4                x11: 0
 x12: 2af8             x13: 8004190c         x14: b                x15: 2923
 x16: 80041afd         x17: b                x18: ffffff87cc146640 x19: ffffffc00e71cd00
 x20: ffffff8000722e50 x21: ffffff80005aeb08 x22: ffffff87cc146610 x23: 0
 x24: 0                x25: 68               x26: 0                x27: 168
 x28: 0                x29: ffffff87cc1466d0 x30: ffffff87cc1466d0
  sp: ffffff87cc146640
  lr: ffffff80002c6e1c
 elr: ffffff80002c6e38
spsr: 60000085
 far: b9041a6951000851
 esr: 96000004
timeout stopping cpus
panic: data abort in critical section or under mutex
cpuid = 0
KDB: stack backtrace:
db_trace_self() at db_trace_self_wrapper+0x28
	 pc = 0xffffff80005a6d04  lr = 0xffffff8000070b84
	 sp = 0xffffff87cc146190  fp = 0xffffff87cc1462b0
db_trace_self_wrapper() at vpanic+0x170
	 pc = 0xffffff8000070b84  lr = 0xffffff80002bc468
	 sp = 0xffffff87cc1462c0  fp = 0xffffff87cc146340
vpanic() at panic+0x4c
	 pc = 0xffffff80002bc468  lr = 0xffffff80002bc2f4
	 sp = 0xffffff87cc146350  fp = 0xffffff87cc1463d0
panic() at data_abort+0x1f0
	 pc = 0xffffff80002bc2f4  lr = 0xffffff80005b8a74
	 sp = 0xffffff87cc1463e0  fp = 0xffffff87cc146490
data_abort() at handle_el1h_sync+0x68
	 pc = 0xffffff80005b8a74  lr = 0xffffff80005a8068
	 sp = 0xffffff87cc1464a0  fp = 0xffffff87cc1465b0
handle_el1h_sync() at _sleep+0x2f8
	 pc = 0xffffff80005a8068  lr = 0xffffff80002c6e18
	 sp = 0xffffff87cc1465c0  fp = 0xffffff87cc1466d0
_sleep() at kqueue_kevent+0xd18
	 pc = 0xffffff80002c6e18  lr = 0xffffff8000268ae8
	 sp = 0xffffff87cc1466e0  fp = 0xffffffc00c3a8200
KDB: enter: panic
[ thread pid 1027 tid 100243 ]
Stopped at      kdb_enter+0x40:
db>

root@thunderx_crb4:~ # pmcstat -S CPU_CYCLES -O cpu_cycles.pmc
^C
  x0: 5b4fc4            x1: 0                 x2: 0                 x3: ffffff800080d048
  x4: ffffff87cc079cd0  x5: ffffff87cc079510  x6: 40761000          x7: 100
  x8: ffffff800082be00  x9: 1                x10: 4                x11: 0
 x12: 2af8             x13: 7ffae9a1         x14: b                x15: 2714
 x16: 7ffae969         x17: b                x18: ffffff87cc079640 x19: 8
 x20: 7                x21: 40461000         x22: ffffffc087bb2eb8 x23: ffffffc018c714d0
 x24: 0                x25: 68               x26: 0                x27: 168
 x28: 0                x29: 4045d000         x30: 4045d000
  sp: ffffff87cc079640
  lr: ffffffc00e272480
 elr: ffffffc00e272480
spsr: 20000085
 esr: 1fe00000
panic: VFP exception in the kernel
cpuid = 0
KDB: stack backtrace:
db_trace_self() at db_trace_self_wrapper+0x28
	 pc = 0xffffff80005a6d04  lr = 0xffffff8000070b84
	 sp = 0xffffff87cc079220  fp = 0xffffff87cc079340
db_trace_self_wrapper() at vpanic+0x170
	 pc = 0xffffff8000070b84  lr = 0xffffff80002bc468
	 sp = 0xffffff87cc079350  fp = 0xffffff87cc0793d0
vpanic() at panic+0x4c
	 pc = 0xffffff80002bc468  lr = 0xffffff80002bc2f4
	 sp = 0xffffff87cc0793e0  fp = 0xffffff87cc079460
panic() at do_el1h_sync+0x10c
	 pc = 0xffffff80002bc2f4  lr = 0xffffff80005b87b8
	 sp = 0xffffff87cc079470  fp = 0xffffff87cc079490
do_el1h_sync() at handle_el1h_sync+0x68
	 pc = 0xffffff80005b87b8  lr = 0xffffff80005a8068
	 sp = 0xffffff87cc0794a0  fp = 0xffffff87cc0795b0
handle_el1h_sync() at 0xffffffc00e27247c
	 pc = 0xffffff80005a8068  lr = 0xffffffc00e27247c
	 sp = 0xffffff87cc0795c0  fp = 0x000000004045d000
KDB: enter: panic
[ thread pid 810 tid 100202 ]
Stopped at      kdb_enter+0x40:
db>
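
For anyone comparing these panics, the ESR values can be split into their architectural fields. The following is only a minimal sketch, not part of the kernel or of hwpmc: it assumes the standard ESR_ELx layout (EC in bits 31:26, IL in bit 25, ISS in bits 24:0), and the helper name decode_esr and the short EC_NAMES table are made up for illustration. Read this way, the "21" in "Unknown kernel exception 21" lines up with EC 0x21 if it is taken as hex.

```python
# Illustrative helper (hypothetical name), assuming the ARMv8 ESR_ELx layout:
# EC = bits [31:26], IL = bit 25, ISS = bits [24:0].
EC_NAMES = {
    0x00: "unknown reason",
    0x07: "trapped SIMD/FP access",
    0x21: "instruction abort, same EL",
    0x25: "data abort, same EL",
}


def decode_esr(esr):
    """Split a 32-bit ESR value into (EC, IL, ISS)."""
    ec = (esr >> 26) & 0x3F
    il = (esr >> 25) & 0x1
    iss = esr & 0x1FFFFFF
    return ec, il, iss


if __name__ == "__main__":
    # ESR values copied from the panic messages and register dumps above.
    for esr in (0x02000000, 0x86000006, 0x96000004, 0x1FE00000):
        ec, il, iss = decode_esr(esr)
        name = EC_NAMES.get(ec, "not in this small table")
        print(f"esr={esr:#010x} EC={ec:#04x} ({name}) IL={il} ISS={iss:#x}")
```

The table covers only the four classes seen in this report; any other EC value is reported as not present in it.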
Have you seen any of these recently on Pass 2.0+ hardware? I can trigger them on Pass 1.1, but I have not been able to on the other hardware I tried, which leads me to think it may be a hardware issue.