Bug 234296 - FreeBSD 12.0-STABLE r342216 Fatal trap 12
Summary: FreeBSD 12.0-STABLE r342216 Fatal trap 12
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 12.0-STABLE
Hardware: amd64 Any
: --- Affects Only Me
Assignee: Mark Johnston
URL:
Keywords:
: 231080 234042 (view as bug list)
Depends on:
Blocks:
 
Reported: 2018-12-23 12:33 UTC by Victor
Modified: 2019-01-21 05:38 UTC (History)
7 users (show)

See Also:


Attachments
one of the logs of dump (131.86 KB, text/plain)
2018-12-23 12:33 UTC, Victor
no flags Details
patch (2.18 KB, patch)
2019-01-16 03:21 UTC, Mark Johnston
no flags Details | Diff
patch (2.34 KB, patch)
2019-01-16 17:29 UTC, Mark Johnston
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Victor 2018-12-23 12:33:32 UTC
Created attachment 200391 [details]
one of the logs of dump

Hello!

After upgrading servers to FreeBSD 12 stable, the server goes into a reboot, this problem is manifested on 5 servers out of 8 updated, the server data is used as gateways

part of the dump of one of the servers

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x0
fault code              = supervisor write data, page not present
instruction pointer     = 0x20:0xffffffff805dd478
stack pointer           = 0x0:0xfffffe0094f65aa0
frame pointer           = 0x0:0xfffffe0094f65b10
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 41176 (ipfw)
trap number             = 12
panic: page fault
cpuid = 0
time = 1545493415
KDB: stack backtrace:
#0 0xffffffff8060c007 at kdb_backtrace+0x67
#1 0xffffffff805c3ab3 at vpanic+0x1a3
#2 0xffffffff805c3903 at panic+0x43
#3 0xffffffff808e2d5f at trap_fatal+0x35f
#4 0xffffffff808e2db9 at trap_pfault+0x49
#5 0xffffffff808e23fe at trap+0x29e
#6 0xffffffff808bfd15 at calltrap+0x8
#7 0xffffffff80918768 at handleevents+0x1a8
#8 0xffffffff80918dd9 at timercb+0xa9
#9 0xffffffff80953d1d at lapic_handle_timer+0xad
#10 0xffffffff808c181e at Xtimerint+0xae
Uptime: 12h23m19s

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x0
fault code              = supervisor write data, page not present
instruction pointer     = 0x20:0xffffffff805dd478
stack pointer           = 0x28:0xfffffe0075d5a850
frame pointer           = 0x28:0xfffffe0075d5a8c0
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 11 (idle: cpu0)
trap number             = 12
panic: page fault
cpuid = 0
time = 1545530359
KDB: stack backtrace:
#0 0xffffffff8060c007 at kdb_backtrace+0x67
#1 0xffffffff805c3ab3 at vpanic+0x1a3
#2 0xffffffff805c3903 at panic+0x43
#3 0xffffffff808e2d5f at trap_fatal+0x35f
#4 0xffffffff808e2db9 at trap_pfault+0x49
#5 0xffffffff808e23fe at trap+0x29e
#6 0xffffffff808bfd15 at calltrap+0x8
#7 0xffffffff80918768 at handleevents+0x1a8
#8 0xffffffff80918dd9 at timercb+0xa9
#9 0xffffffff80953d1d at lapic_handle_timer+0xad
#10 0xffffffff808c181e at Xtimerint+0xae
#11 0xffffffff8094a83f at cpu_idle_acpi+0x3f
#12 0xffffffff8094a8f7 at cpu_idle+0xa7
#13 0xffffffff805f6275 at sched_idletd+0x515
#14 0xffffffff8058c683 at fork_exit+0x83
#15 0xffffffff808c0cfe at fork_trampoline+0xe
Uptime: 10h14m40s

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 02
fault virtual address   = 0x0
fault code              = supervisor read instruction, page not present
instruction pointer     = 0x20:0x0
stack pointer           = 0x28:0xfffffe000048ea88
frame pointer           = 0x28:0xfffffe000048eb30
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 12 (swi4: clock (0))
trap number             = 12
panic: page fault
cpuid = 1
time = 1545537943
KDB: stack backtrace:
#0 0xffffffff8060c007 at kdb_backtrace+0x67
#1 0xffffffff805c3ab3 at vpanic+0x1a3
#2 0xffffffff805c3903 at panic+0x43
#3 0xffffffff808e2d5f at trap_fatal+0x35f
#4 0xffffffff808e2db9 at trap_pfault+0x49
#5 0xffffffff808e23fe at trap+0x29e
#6 0xffffffff808bfd15 at calltrap+0x8
#7 0xffffffff805ddbf9 at softclock+0x79
#8 0xffffffff8058f519 at ithread_loop+0x169
#9 0xffffffff8058c683 at fork_exit+0x83
#10 0xffffffff808c0cfe at fork_trampoline+0xe
Uptime: 2h5m2s

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x500b0
fault code              = supervisor write data, page not present
instruction pointer     = 0x20:0xffffffff805dd478
stack pointer           = 0x28:0xfffffe0075df23f0
frame pointer           = 0x28:0xfffffe0075df2460
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 0 (if_io_tqg_0)
trap number             = 12
panic: page fault
cpuid = 0
time = 1545562751
KDB: stack backtrace:
#0 0xffffffff8060c007 at kdb_backtrace+0x67
#1 0xffffffff805c3ab3 at vpanic+0x1a3
#2 0xffffffff805c3903 at panic+0x43
#3 0xffffffff808e2d5f at trap_fatal+0x35f
#4 0xffffffff808e2db9 at trap_pfault+0x49
#5 0xffffffff808e23fe at trap+0x29e
#6 0xffffffff808bfd15 at calltrap+0x8
#7 0xffffffff80918768 at handleevents+0x1a8
#8 0xffffffff80918dd9 at timercb+0xa9
#9 0xffffffff80953d1d at lapic_handle_timer+0xad
#10 0xffffffff808c181e at Xtimerint+0xae
#11 0xffffffff806488c2 at m_dup+0x182
#12 0xffffffff806f7c2d at ng_tee_rcvdata+0x3d
#13 0xffffffff806ed419 at ng_apply_item+0x2f9
#14 0xffffffff806eced0 at ng_snd_item+0x130
#15 0xffffffff806f0c5c at ng_ether_input+0x4c
#16 0xffffffff806c200d at ether_nh_input+0x1dd
#17 0xffffffff806d9aa6 at netisr_dispatch_src+0xd6
Uptime: 6h52m21s

sorry for my English
Comment 1 Victor 2018-12-25 07:58:07 UTC
updated to 12.0-STABLE r342388 with CFLAGS= -O0 -pipe build options. The situation has not changed

panic: page fault

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:
kernel trap 12 with interrupts disabled


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x0
fault code              = supervisor write data, page not present
instruction pointer     = 0x20:0xffffffff805dd328
stack pointer           = 0x28:0xfffffe0075d5a850
frame pointer           = 0x28:0xfffffe0075d5a8c0
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 11 (idle: cpu0)
trap number             = 12
panic: page fault
cpuid = 0
time = 1545716928
KDB: stack backtrace:
#0 0xffffffff8060beb7 at kdb_backtrace+0x67
#1 0xffffffff805c3963 at vpanic+0x1a3
#2 0xffffffff805c37b3 at panic+0x43
#3 0xffffffff808e321f at trap_fatal+0x35f
#4 0xffffffff808e3279 at trap_pfault+0x49
#5 0xffffffff808e28be at trap+0x29e
#6 0xffffffff808bfc15 at calltrap+0x8
#7 0xffffffff80918c28 at handleevents+0x1a8
#8 0xffffffff80919299 at timercb+0xa9
#9 0xffffffff809541dd at lapic_handle_timer+0xad
#10 0xffffffff808c171e at Xtimerint+0xae
#11 0xffffffff8094acff at cpu_idle_acpi+0x3f
#12 0xffffffff8094adb7 at cpu_idle+0xa7
#13 0xffffffff805f6125 at sched_idletd+0x515
#14 0xffffffff8058c673 at fork_exit+0x83
#15 0xffffffff808c0bfe at fork_trampoline+0xe
Uptime: 5h53m46s
Comment 2 Mark Johnston freebsd_committer 2019-01-02 19:22:11 UTC
(In reply to Victor from comment #1)
Would you be willing to share the vmcore from the -O0 crash?
Comment 3 Victor 2019-01-03 08:22:32 UTC
(In reply to Mark Johnston from comment #2)
posted on google disk https://drive.google.com/open?id=1E6pFrMYLg_mzWpqTa2hipF3zdORHhs3v
Comment 4 Mark Johnston freebsd_committer 2019-01-03 16:56:01 UTC
(In reply to Victor from comment #3)
Thanks.  Could you also provide the matching /boot/kernel and /usr/lib/debug/boot/kernel?  I didn't notice before that you were on -STABLE and not -RELEASE.
Comment 5 Victor 2019-01-04 16:44:53 UTC
(In reply to Mark Johnston from comment #4)
added to google disk
This bug was observed even with the head and in release it is also present
Comment 6 Victor 2019-01-10 07:09:16 UTC
same problem on another server

12.0-STABLE FreeBSD 12.0-STABLE r342878 OFFICE_GW  amd64

kgdb /boot/kernel/kernel /var/crash/vmcore.6
GNU gdb (GDB) 8.2 [GDB v8.2 for FreeBSD]
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-portbld-freebsd12.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /boot/kernel/kernel...Reading symbols from /usr/lib/debug//boot/kernel/kernel.debug...done.
done.

Unread portion of the kernel message buffer:
kernel trap 9 with interrupts disabled


Fatal trap 9: general protection fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer     = 0x20:0xffffffff8053ee2f
stack pointer           = 0x28:0xfffffe0025d8a770
frame pointer           = 0x28:0xfffffe0025d8a7e0
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 11 (idle: cpu0)
trap number             = 9
panic: general protection fault
cpuid = 0
time = 1547100690
KDB: stack backtrace:
#0 0xffffffff8056da27 at kdb_backtrace+0x67
#1 0xffffffff80525313 at vpanic+0x1a3
#2 0xffffffff80525163 at panic+0x43
#3 0xffffffff808287df at trap_fatal+0x35f
#4 0xffffffff80827c4d at trap+0x6d
#5 0xffffffff808051c5 at calltrap+0x8
#6 0xffffffff80836db8 at handleevents+0x1a8
#7 0xffffffff80837621 at timercb+0x2a1
#8 0xffffffff8082d219 at hpet_intr_single+0x1b9
#9 0xffffffff8082d2be at hpet_intr+0x8e
#10 0xffffffff804ea32e at intr_event_handle+0xbe
#11 0xffffffff8086c1a8 at intr_execute_handlers+0x58
#12 0xffffffff808722b4 at lapic_handle_intr+0x44
#13 0xffffffff80806359 at Xapic_isr1+0xd9
#14 0xffffffff80375417 at acpi_cpu_idle+0x2e7
#15 0xffffffff80868e8f at cpu_idle_acpi+0x3f
#16 0xffffffff80868f47 at cpu_idle+0xa7
#17 0xffffffff80557c95 at sched_idletd+0x515
Uptime: 19h45m31s
Dumping 1359 out of 4069 MB:..2%..11%..21%..31%..41%..51%..61%..71%..81%..91%

__curthread () at ./machine/pcpu.h:230
230             __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (OFFSETOF_CURTHREAD));
(kgdb) where
#0  __curthread () at ./machine/pcpu.h:230
#1  doadump (textdump=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:366
#2  0xffffffff80524efb in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:446
#3  0xffffffff80525373 in vpanic (fmt=<optimized out>, ap=0xfffffe0025d8a520) at /usr/src/sys/kern/kern_shutdown.c:872
#4  0xffffffff80525163 in panic (fmt=<unavailable>) at /usr/src/sys/kern/kern_shutdown.c:799
#5  0xffffffff808287df in trap_fatal (frame=0xfffffe0025d8a6b0, eva=0) at /usr/src/sys/amd64/amd64/trap.c:929
#6  0xffffffff80827c4d in trap (frame=0xfffffe0025d8a6b0) at /usr/src/sys/amd64/amd64/trap.c:217
#7  <signal handler called>
#8  0xffffffff8053ee2f in callout_process (now=305509015149479) at /usr/src/sys/kern/kern_timeout.c:510
#9  0xffffffff80836db8 in handleevents (now=305509015149479, fake=0) at /usr/src/sys/kern/kern_clocksource.c:213
#10 0xffffffff80837621 in timercb (et=<optimized out>, arg=<optimized out>) at /usr/src/sys/kern/kern_clocksource.c:357
#11 0xffffffff8082d219 in hpet_intr_single (arg=0xfffffe00003670b0) at /usr/src/sys/dev/acpica/acpi_hpet.c:309
#12 0xffffffff8082d2be in hpet_intr (arg=0xfffffe0000367000) at /usr/src/sys/dev/acpica/acpi_hpet.c:327
#13 0xffffffff804ea32e in intr_event_handle (ie=0xfffff80002088b00, frame=0xfffffe0025d8a9a0) at /usr/src/sys/kern/kern_intr.c:1225
#14 0xffffffff8086c1a8 in intr_execute_handlers (isrc=0xfffff800021fde48, frame=0xfffffe0025d8a9a0) at /usr/src/sys/x86/x86/intr_machdep.c:358
#15 0xffffffff808722b4 in lapic_handle_intr (vector=<optimized out>, frame=0x115dbe7e8ecf4) at /usr/src/sys/x86/x86/local_apic.c:1293
#16 <signal handler called>
#17 acpi_cpu_c1 () at /usr/src/sys/x86/x86/cpu_machdep.c:140
#18 0xffffffff80375417 in acpi_cpu_idle (sbt=<optimized out>) at /usr/src/sys/dev/acpica/acpi_cpu.c:1187
#19 0xffffffff80868e8f in cpu_idle_acpi (sbt=56223337) at /usr/src/sys/x86/x86/cpu_machdep.c:433
#20 0xffffffff80868f47 in cpu_idle (busy=0) at /usr/src/sys/x86/x86/cpu_machdep.c:581
#21 0xffffffff80557c95 in sched_idletd (dummy=<optimized out>) at /usr/src/sys/kern/sched_ule.c:2829
#22 0xffffffff804e76a3 in fork_exit (callout=0xffffffff80557780 <sched_idletd>, arg=0x0, frame=0xfffffe0025d8ac00) at /usr/src/sys/kern/kern_fork.c:1059
#23 <signal handler called>
(kgdb)
Comment 7 Mark Johnston freebsd_committer 2019-01-10 19:22:21 UTC
I looked at the kernel dumps.  In all three cases, we crashed while processing a callout that had been mostly zeroed out.  However, in all cases, at offset 0x10 into the callout there is a pointer in the direct map.  Following that pointer leads me to the middle of a ZFS vnode:

(kgdb) frame 9
#9  0xffffffff805dd328 in callout_process (now=91167878320709) at /usr/src/sys/kern/kern_timeout.c:510
510                                             LIST_REMOVE(tmp, c_links.le);
(kgdb) p/x *cc->cc_callwheel[firstb & callwheelmask].lh_first
$21 = {
  c_links = {
    le = {
      le_next = 0x0, 
      le_prev = 0x0
    }, 
    sle = {
      sle_next = 0x0
    }, 
    tqe = {
      tqe_next = 0x0, 
      tqe_prev = 0x0
    }
  }, 
  c_time = 0xfffff803773850c0, 
  c_precision = 0x0, 
  c_arg = 0x0, 
  c_func = 0x0, 
  c_lock = 0x0, 
  c_flags = 0x0, 
  c_iflags = 0x0, 
  c_cpu = 0x0
}
(kgdb) x/32g 0xfffff803773850c0
0xfffff803773850c0:     0x0000000000000000      0xfffff8000a161de0
0xfffff803773850d0:     0xffffffff809bf965      0x0000000005230000
0xfffff803773850e0:     0x0000000000000000      0x0000000000000001
0xfffff803773850f0:     0xffffffff80c96170      0xfffff80463b90100
0xfffff80377385100:     0x0000000000000000      0x0000000000000000
0xfffff80377385110:     0xfffff80377385000      0x0000000000000000
0xfffff80377385120:     0xfffff80377385118      0x0000000000000000
0xfffff80377385130:     0x0000000000000000      0x0000000000000000
0xfffff80377385140:     0xfffff80377385138      0x0000000000000000
0xfffff80377385150:     0x0000000000000000      0x0000000000000000
0xfffff80377385160:     0x0000000300000000      0x0000000000020000
0xfffff80377385170:     0x0000000000000000      0x0000000000000000
0xfffff80377385180:     0x0000000000000000      0x0000000000000000
0xfffff80377385190:     0xfffff80377385188      0x0000000000000000
0xfffff803773851a0:     0x0000000000000000      0x0000000000000000
0xfffff803773851b0:     0x0000000000000000      0x0000000200000000

0xffffffff809bf965 is a pointer to the string "bufobj interlock".
So the pointer in this callout is to the v_actfreelist member
of a vnode.  Indeed, 0xfffff803773850c0 & ~0xff is a pointer to
a ZFS vnode.  So, the pointer in the callout structure is probably
the prev pointer, which means the callout structure is actually
embedded in a vnode.

(kgdb) p *(struct vnode * )((uintptr_t)$23 & ~0xff)
$24 = { 
  v_tag = 0x0,   
  v_op = 0x0,      
  v_data = 0x0,
  v_mount = 0xfffff8000a3fb000,
  v_nmntvnodes = {
    tqe_next = 0x0,  
    tqe_prev = 0x0
  },              
...
  v_vnlock = 0x0,
  v_actfreelist = {
    tqe_next = 0x0,
    tqe_prev = 0xfffff803773850c0
  },
...
  v_iflag = 0,
  v_vflag = 0,
  v_mflag = 0,
  v_writecount = 0,
  v_hash = 0,
  v_type = VMARKER
}

So the only non-zero fields are those of a marker
vnode initialized in __mnt_vnode_first_active().  Those
vnodes are allocated using malloc(), not the vnode zone.
Comment 8 Mark Johnston freebsd_committer 2019-01-10 19:33:12 UTC
sizeof(struct vnode) == 480, so it looks like this is a use-after-free in the
512 byte malloc zone.  The callout is at offset 0xb8 into the structure.  Based
on some skimming of the CTF type graph, this is very likely to be a struct
llentry...
Comment 9 Mark Johnston freebsd_committer 2019-01-10 19:55:36 UTC
Do the systems in question have many short-lived ARP or ND cache entries?
Comment 10 Andriy Gapon freebsd_committer 2019-01-10 21:08:44 UTC
(In reply to Mark Johnston from comment #8)
Mark, bug 234567 could be related.
At least an attachment added in bug 234567, comment #6 looks relevant.
It has this stack trace:
--- trap 0xc, rip = 0xffffffff80493fbf, rsp = 0xfffffe0000477940, rbp = 0xfffffe00004779e0 ---
__rw_wlock_hard() at __rw_wlock_hard+0x1bf/frame 0xfffffe00004779e0
nd6_llinfo_timer() at nd6_llinfo_timer+0x47f/frame 0xfffffe0000477a70
softclock_call_cc() at softclock_call_cc+0x12e/frame 0xfffffe0000477b20
softclock() at softclock+0x79/frame 0xfffffe0000477b40
ithread_loop() at ithread_loop+0x1a7/frame 0xfffffe0000477bb0
fork_exit() at fork_exit+0x83/frame 0xfffffe0000477bf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0000477bf0
Comment 11 Mark Johnston freebsd_committer 2019-01-11 00:53:36 UTC
Hmm.  The allocation stats for M_LLTABLE are quite low in all three vmcores, i.e., we had freed fewer than 30 or so items before panicking.  That probably means it's not a race.  There's a couple of possibilities: there is some rarely executed code path that always triggers the bug, or struct llentry is a red herring.  I wrote a program to iterate over all types in the CTF graph looking for a struct callout at offset 0xb8; this catches cases like, struct A contains a struct B at offset 0x40 and struct B contains a struct callout at offset 0x78.  However, it didn't turn up any likely candidates aside from the lle_timer field.

struct llentry -> lle_timer
struct in6_llentry -> base -> lle_timer
struct in_llentry -> base -> lle_timer
struct sdhci_slot -> card_delayed_task -> c
struct mpr_softc -> periodic
struct ata_request -> callout

Of course, it's possible that I did something wrong and missed a candidate.
Comment 12 Victor 2019-01-14 07:37:32 UTC
(In reply to Mark Johnston from comment #11)
dumped dumps of other server, can analyze for comparison
https://drive.google.com/open?id=1SjVwUBIEAgxa-sU1i65NiVOhTlUvjzzE
Comment 13 Victor 2019-01-14 12:18:08 UTC
(In reply to Mark Johnston from comment #9)
The server uses IPv4 / IPv6, arp entries in the table are no more than 10-20 with a standard lifetime, there are servers with a similar problem that use only IPv4
Comment 14 Mark Johnston freebsd_committer 2019-01-15 22:36:57 UTC
Looking at the llentry code, I think there's a race between llentry destruction and lookup introduced by the conversion of the afdata read lock to epoch.

When an ARP entry expires, arptimer() acquires the lle write lock and afdata write lock, and unlinks the lle from the lookup table.  There is no mutual exclusion with a concurrent lookup anymore; previously, the ifdata read lock was required for a lookup.  So, if the hash table lookup succeeds before arptimer() removes the entry, the thread performing the lookup will block on the lle lock, holding a de facto reference without having bumped lle_refcount.  arptimer() may release the last reference on the llentry and schedule it for a deferred free after releasing the lle write lock.  The deferred free is probably enough to ensure that the lookup thread won't access freed memory in most cases, though it doesn't work in general.  Worse, though, the thread performing the lookup may acquire a long-lived reference after the lle is scheduled for destruction, and may schedule a callout for the lle, which would cause this panic.

Seems the ND6 code would be vulnerable to the same problem, though I haven't yet looked closely.
Comment 15 Mark Johnston freebsd_committer 2019-01-16 03:21:53 UTC
Created attachment 201176 [details]
patch

Would you (or anyone else affected by the issue) please try the attached patch?  I did some light testing and verified that it applies to stable/12.
Comment 16 Mark Johnston freebsd_committer 2019-01-16 17:29:31 UTC
Created attachment 201193 [details]
patch

Sorry, the previous patch contained a mistake in the IPv6 code.  This patch should be tested instead unless you're not using IPv6.
Comment 17 Victor 2019-01-16 18:13:53 UTC
(In reply to Mark Johnston from comment #16)
ok, today I will update, I will report on the results
Comment 18 Mark Johnston freebsd_committer 2019-01-20 17:07:34 UTC
*** Bug 231080 has been marked as a duplicate of this bug. ***
Comment 19 Mark Johnston freebsd_committer 2019-01-20 17:09:44 UTC
*** Bug 234042 has been marked as a duplicate of this bug. ***
Comment 20 Mark Johnston freebsd_committer 2019-01-20 17:45:30 UTC
https://reviews.freebsd.org/D18906
Comment 21 Victor 2019-01-21 05:38:55 UTC
(In reply to Mark Johnston from comment #16)
uptime more than 3 days without panic, thanks Mark