Running FreeBSD 12.0-CURRENT r331531, machine crashes after having decompressed about 100GB of a very large (>850GB) gzip file using gzip/unpigz.

st_mtime and st_birthtime of the output file show
"Mar 25 22:20:22 2018" "Mar 25 21:34:22 2018"

Unread portion of the kernel message buffer:
MCA: Bank 1, Status 0xcc00001000010151
MCA: Global Cap 0x0000000000000107, Status 0x0000000000000004
MCA: Vendor "AuthenticAMD", ID 0x610f01, APIC ID 18
MCA: CPU 2 COR OVER ICACHE L1 IRD error
MCA: Address 0xffff80f8aa21
MCA: Misc 0xc01b0fff00000000
MCA: Bank 5, Status 0xb0800000000c0e0f
MCA: Global Cap 0x0000000000000107, Status 0x0000000000000004
MCA: Vendor "AuthenticAMD", ID 0x610f01, APIC ID 18
MCA: CPU 2 UNCOR BUSLG ??? ERR Other
panic: Unrecoverable machine check exception
cpuid = 2
time = 1522009224
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0003581e40
vpanic() at vpanic+0x18d/frame 0xfffffe0003581ea0
panic() at panic+0x43/frame 0xfffffe0003581f00
mca_intr() at mca_intr+0x9b/frame 0xfffffe0003581f20
mchk_calltrap() at mchk_calltrap+0x8/frame 0xfffffe0003581f20
--- trap 0x1c, rip = 0xffffffff80f8aa33, rsp = 0xfffffe003dab07d0, rbp = 0xfffffe003dab0890 ---
apic_isr1_u() at apic_isr1_u+0xa9/frame 0xfffffe003dab0890
acpi_cpu_idle() at acpi_cpu_idle+0x2ee/frame 0xfffffe003dab08e0
cpu_idle_acpi() at cpu_idle_acpi+0x3f/frame 0xfffffe003dab0900
cpu_idle() at cpu_idle+0x8f/frame 0xfffffe003dab0920
sched_idletd() at sched_idletd+0x517/frame 0xfffffe003dab09f0
fork_exit() at fork_exit+0x84/frame 0xfffffe003dab0a30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe003dab0a30
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic

__curthread () at ./machine/pcpu.h:230
230 __asm("movq %%gs:%1,%0" : "=r" (td)
(kgdb) list *0xffffffff80f8aa33
0xffffffff80f8aa33 is at /usr/src/sys/amd64/amd64/apic_vector.S:118.
113 SUPERALIGN_TEXT
114 IDTVEC(spuriousint)
115 /* No EOI cycle used here */
116 jmp doreti_iret
117
118 ISR_VEC 1, apic_isr1
119 ISR_VEC 2, apic_isr2
120 ISR_VEC 3, apic_isr3
121 ISR_VEC 4, apic_isr4
122 ISR_VEC 5, apic_isr5
(kgdb) bt
#0 __curthread () at ./machine/pcpu.h:230
#1 doadump (textdump=0) at /usr/src/sys/kern/kern_shutdown.c:361
#2 0xffffffff80424aeb in db_dump (dummy=<optimized out>, dummy2=<unavailable>, dummy3=<unavailable>, dummy4=<unavailable>) at /usr/src/sys/ddb/db_command.c:574
#3 0xffffffff804248b9 in db_command (last_cmdp=<optimized out>, cmd_table=<optimized out>, dopager=<optimized out>) at /usr/src/sys/ddb/db_command.c:481
#4 0xffffffff80424634 in db_command_loop () at /usr/src/sys/ddb/db_command.c:534
#5 0xffffffff8042785f in db_trap (type=<optimized out>, code=<optimized out>) at /usr/src/sys/ddb/db_main.c:250
#6 0xffffffff80b35273 in kdb_trap (type=3, code=-61456, tf=<optimized out>) at /usr/src/sys/kern/subr_kdb.c:697
#7 0xffffffff80fad8a8 in trap (frame=0xfffffe0003581d70) at /usr/src/sys/amd64/amd64/trap.c:547
#8 0xffffffff80f89a8c in alltraps_pushregs_no_rax () at /usr/src/sys/amd64/amd64/exception.S:223
#9 0xffffffff81be9b78 in ?? ()
#10 0x0000000000000080 in ?? ()
#11 0xfffffe0003581d30 in ?? ()
#12 0x0000000000000080 in ?? ()
#13 0x0000000000000278 in ?? ()
#14 0x00000000000001d0 in ?? ()
#15 0x0000000000000012 in ?? ()
#16 0xffffffff8121d87b in ?? ()
#17 0xfffffe0003581e40 in ?? ()
#18 0xffffffff81adb7c0 in local_info ()
#19 0x0000000000000010 in ?? ()
#20 0xffffffff81e10901 in __pcpu ()
#21 0x0000000002814000 in ?? ()
#22 0xfffffe0003581ee0 in ?? ()
#23 0xfffff80003346000 in ?? ()
#24 0x001b001300000003 in ?? ()
#25 0x0000000000000000 in ?? ()
This seems like a problem with the hardware. Either it is failing, or it is being operated outside of its proper conditions, or there is some other kind of problem. Maybe it's an early Ryzen? There have been numerous reports of problems with it. Also, mcelog helps to get more human-readable information about MCA / MCE conditions.
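For readers without mcelog at hand, the high flag bits of the two Status values in the report can be decoded by hand. The following is only a minimal sketch: it assumes the architectural MCi_STATUS layout in which bits 63 down to 57 are VAL, OVER, UC, EN, MISCV, ADDRV and PCC, and the helper name decode_mci_status is made up for illustration. It does not interpret the model-specific error code in the low bits.

```python
# Assumed architectural flag bits in the upper part of a 64-bit MCi_STATUS value.
MCI_STATUS_FLAGS = {
    63: "VAL",    # register holds a valid error record
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # error was not corrected by hardware
    60: "EN",     # reporting of this error was enabled
    59: "MISCV",  # the MISC register holds valid data
    58: "ADDRV",  # the ADDR register holds valid data
    57: "PCC",    # processor context may be corrupt
}


def decode_mci_status(status):
    """Return the names of the flag bits set in status, highest bit first."""
    return [
        name
        for bit, name in sorted(MCI_STATUS_FLAGS.items(), reverse=True)
        if (status >> bit) & 1
    ]


# Status values copied from the kernel message buffer in the report.
for bank, status in ((1, 0xCC00001000010151), (5, 0xB0800000000C0E0F)):
    print(f"Bank {bank}: {' '.join(decode_mci_status(status))}")
```

Under that assumption, Bank 1 decodes to VAL OVER MISCV ADDRV (no UC bit, matching the "COR OVER" line and the printed Address and Misc values), while Bank 5 decodes to VAL UC EN, matching the "UNCOR" line that precedes the panic.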
Not really about the automounter daemon.
(In reply to Andriy Gapon from comment #1)
This is Hudson-D4 hardware.

(In reply to Mark Linimon from comment #2)
That should have read AMD, as in the hardware manufacturer.
(In reply to Andriy Gapon from comment #1) The family mask is 0x0ff00f00, the printed CPUID is 0x610f01, and the base and extended family fields are summed: 0xf + 0x6 = 0x15. So this is Family 15h, I think.
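The arithmetic can be checked with a short sketch. It assumes the "ID" printed in the MCA lines is the CPUID function 1 EAX signature, and it follows the AMD convention that the extended family and extended model fields only apply when the base family is 0xF. The function name decode_cpuid_signature is hypothetical.

```python
def decode_cpuid_signature(eax):
    """Split a CPUID function 1 EAX signature into (family, model, stepping)."""
    stepping = eax & 0xF
    base_model = (eax >> 4) & 0xF
    base_family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF

    if base_family == 0xF:
        # Extended fields apply: families are summed, models are concatenated.
        family = base_family + ext_family
        model = (ext_model << 4) | base_model
    else:
        family = base_family
        model = base_model
    return family, model, stepping


family, model, stepping = decode_cpuid_signature(0x610F01)
print(f"Family {family:X}h, Model {model:X}h, Stepping {stepping}")
```

For 0x610f01 this yields Family 15h, Model 10h, Stepping 1, which agrees with the Family 15h reading.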
(In reply to vidwer+fbsdbugs from comment #3) I agree that the token is overloaded, but in the old days of us using GNATS, we used the [amd] notation to indicate a problem with the automounter daemon. The fact that you have set this to 'Hardware: amd64' is sufficient to indicate its architecture-specificness these days.