Bug 234296 - FreeBSD 12.0-STABLE r342216 Fatal trap 12
Summary: FreeBSD 12.0-STABLE r342216 Fatal trap 12
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 12.0-STABLE
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: Mark Johnston
URL:
Keywords:
Duplicates: 231080 234042 234374 234567 235472 236390
Depends on:
Blocks:
 
Reported: 2018-12-23 12:33 UTC by Victor
Modified: 2019-03-31 14:51 UTC
CC List: 11 users

See Also:


Attachments
one of the logs of dump (131.86 KB, text/plain)
2018-12-23 12:33 UTC, Victor
no flags Details
patch (2.18 KB, patch)
2019-01-16 03:21 UTC, Mark Johnston
no flags Details | Diff
patch (2.34 KB, patch)
2019-01-16 17:29 UTC, Mark Johnston
no flags Details | Diff

Description Victor 2018-12-23 12:33:32 UTC
Created attachment 200391 [details]
one of the logs of dump

Hello!

After upgrading to FreeBSD 12-STABLE, the servers spontaneously reboot. The problem manifests on 5 of the 8 updated servers; these servers are used as gateways.

Part of the dump from one of the servers:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x0
fault code              = supervisor write data, page not present
instruction pointer     = 0x20:0xffffffff805dd478
stack pointer           = 0x0:0xfffffe0094f65aa0
frame pointer           = 0x0:0xfffffe0094f65b10
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 41176 (ipfw)
trap number             = 12
panic: page fault
cpuid = 0
time = 1545493415
KDB: stack backtrace:
#0 0xffffffff8060c007 at kdb_backtrace+0x67
#1 0xffffffff805c3ab3 at vpanic+0x1a3
#2 0xffffffff805c3903 at panic+0x43
#3 0xffffffff808e2d5f at trap_fatal+0x35f
#4 0xffffffff808e2db9 at trap_pfault+0x49
#5 0xffffffff808e23fe at trap+0x29e
#6 0xffffffff808bfd15 at calltrap+0x8
#7 0xffffffff80918768 at handleevents+0x1a8
#8 0xffffffff80918dd9 at timercb+0xa9
#9 0xffffffff80953d1d at lapic_handle_timer+0xad
#10 0xffffffff808c181e at Xtimerint+0xae
Uptime: 12h23m19s

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x0
fault code              = supervisor write data, page not present
instruction pointer     = 0x20:0xffffffff805dd478
stack pointer           = 0x28:0xfffffe0075d5a850
frame pointer           = 0x28:0xfffffe0075d5a8c0
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 11 (idle: cpu0)
trap number             = 12
panic: page fault
cpuid = 0
time = 1545530359
KDB: stack backtrace:
#0 0xffffffff8060c007 at kdb_backtrace+0x67
#1 0xffffffff805c3ab3 at vpanic+0x1a3
#2 0xffffffff805c3903 at panic+0x43
#3 0xffffffff808e2d5f at trap_fatal+0x35f
#4 0xffffffff808e2db9 at trap_pfault+0x49
#5 0xffffffff808e23fe at trap+0x29e
#6 0xffffffff808bfd15 at calltrap+0x8
#7 0xffffffff80918768 at handleevents+0x1a8
#8 0xffffffff80918dd9 at timercb+0xa9
#9 0xffffffff80953d1d at lapic_handle_timer+0xad
#10 0xffffffff808c181e at Xtimerint+0xae
#11 0xffffffff8094a83f at cpu_idle_acpi+0x3f
#12 0xffffffff8094a8f7 at cpu_idle+0xa7
#13 0xffffffff805f6275 at sched_idletd+0x515
#14 0xffffffff8058c683 at fork_exit+0x83
#15 0xffffffff808c0cfe at fork_trampoline+0xe
Uptime: 10h14m40s

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 02
fault virtual address   = 0x0
fault code              = supervisor read instruction, page not present
instruction pointer     = 0x20:0x0
stack pointer           = 0x28:0xfffffe000048ea88
frame pointer           = 0x28:0xfffffe000048eb30
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 12 (swi4: clock (0))
trap number             = 12
panic: page fault
cpuid = 1
time = 1545537943
KDB: stack backtrace:
#0 0xffffffff8060c007 at kdb_backtrace+0x67
#1 0xffffffff805c3ab3 at vpanic+0x1a3
#2 0xffffffff805c3903 at panic+0x43
#3 0xffffffff808e2d5f at trap_fatal+0x35f
#4 0xffffffff808e2db9 at trap_pfault+0x49
#5 0xffffffff808e23fe at trap+0x29e
#6 0xffffffff808bfd15 at calltrap+0x8
#7 0xffffffff805ddbf9 at softclock+0x79
#8 0xffffffff8058f519 at ithread_loop+0x169
#9 0xffffffff8058c683 at fork_exit+0x83
#10 0xffffffff808c0cfe at fork_trampoline+0xe
Uptime: 2h5m2s

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x500b0
fault code              = supervisor write data, page not present
instruction pointer     = 0x20:0xffffffff805dd478
stack pointer           = 0x28:0xfffffe0075df23f0
frame pointer           = 0x28:0xfffffe0075df2460
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 0 (if_io_tqg_0)
trap number             = 12
panic: page fault
cpuid = 0
time = 1545562751
KDB: stack backtrace:
#0 0xffffffff8060c007 at kdb_backtrace+0x67
#1 0xffffffff805c3ab3 at vpanic+0x1a3
#2 0xffffffff805c3903 at panic+0x43
#3 0xffffffff808e2d5f at trap_fatal+0x35f
#4 0xffffffff808e2db9 at trap_pfault+0x49
#5 0xffffffff808e23fe at trap+0x29e
#6 0xffffffff808bfd15 at calltrap+0x8
#7 0xffffffff80918768 at handleevents+0x1a8
#8 0xffffffff80918dd9 at timercb+0xa9
#9 0xffffffff80953d1d at lapic_handle_timer+0xad
#10 0xffffffff808c181e at Xtimerint+0xae
#11 0xffffffff806488c2 at m_dup+0x182
#12 0xffffffff806f7c2d at ng_tee_rcvdata+0x3d
#13 0xffffffff806ed419 at ng_apply_item+0x2f9
#14 0xffffffff806eced0 at ng_snd_item+0x130
#15 0xffffffff806f0c5c at ng_ether_input+0x4c
#16 0xffffffff806c200d at ether_nh_input+0x1dd
#17 0xffffffff806d9aa6 at netisr_dispatch_src+0xd6
Uptime: 6h52m21s

Sorry for my English.
Comment 1 Victor 2018-12-25 07:58:07 UTC
Updated to 12.0-STABLE r342388, built with CFLAGS="-O0 -pipe". The situation has not changed.

panic: page fault

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:
kernel trap 12 with interrupts disabled


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x0
fault code              = supervisor write data, page not present
instruction pointer     = 0x20:0xffffffff805dd328
stack pointer           = 0x28:0xfffffe0075d5a850
frame pointer           = 0x28:0xfffffe0075d5a8c0
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 11 (idle: cpu0)
trap number             = 12
panic: page fault
cpuid = 0
time = 1545716928
KDB: stack backtrace:
#0 0xffffffff8060beb7 at kdb_backtrace+0x67
#1 0xffffffff805c3963 at vpanic+0x1a3
#2 0xffffffff805c37b3 at panic+0x43
#3 0xffffffff808e321f at trap_fatal+0x35f
#4 0xffffffff808e3279 at trap_pfault+0x49
#5 0xffffffff808e28be at trap+0x29e
#6 0xffffffff808bfc15 at calltrap+0x8
#7 0xffffffff80918c28 at handleevents+0x1a8
#8 0xffffffff80919299 at timercb+0xa9
#9 0xffffffff809541dd at lapic_handle_timer+0xad
#10 0xffffffff808c171e at Xtimerint+0xae
#11 0xffffffff8094acff at cpu_idle_acpi+0x3f
#12 0xffffffff8094adb7 at cpu_idle+0xa7
#13 0xffffffff805f6125 at sched_idletd+0x515
#14 0xffffffff8058c673 at fork_exit+0x83
#15 0xffffffff808c0bfe at fork_trampoline+0xe
Uptime: 5h53m46s
Comment 2 Mark Johnston freebsd_committer 2019-01-02 19:22:11 UTC
(In reply to Victor from comment #1)
Would you be willing to share the vmcore from the -O0 crash?
Comment 3 Victor 2019-01-03 08:22:32 UTC
(In reply to Mark Johnston from comment #2)
Posted on Google Drive: https://drive.google.com/open?id=1E6pFrMYLg_mzWpqTa2hipF3zdORHhs3v
Comment 4 Mark Johnston freebsd_committer 2019-01-03 16:56:01 UTC
(In reply to Victor from comment #3)
Thanks.  Could you also provide the matching /boot/kernel and /usr/lib/debug/boot/kernel?  I didn't notice before that you were on -STABLE and not -RELEASE.
Comment 5 Victor 2019-01-04 16:44:53 UTC
(In reply to Mark Johnston from comment #4)
Added to the Google Drive folder.
This bug was also observed on head, and it is present in the release as well.
Comment 6 Victor 2019-01-10 07:09:16 UTC
Same problem on another server:

12.0-STABLE FreeBSD 12.0-STABLE r342878 OFFICE_GW  amd64

kgdb /boot/kernel/kernel /var/crash/vmcore.6
GNU gdb (GDB) 8.2 [GDB v8.2 for FreeBSD]
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-portbld-freebsd12.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /boot/kernel/kernel...Reading symbols from /usr/lib/debug//boot/kernel/kernel.debug...done.
done.

Unread portion of the kernel message buffer:
kernel trap 9 with interrupts disabled


Fatal trap 9: general protection fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer     = 0x20:0xffffffff8053ee2f
stack pointer           = 0x28:0xfffffe0025d8a770
frame pointer           = 0x28:0xfffffe0025d8a7e0
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 11 (idle: cpu0)
trap number             = 9
panic: general protection fault
cpuid = 0
time = 1547100690
KDB: stack backtrace:
#0 0xffffffff8056da27 at kdb_backtrace+0x67
#1 0xffffffff80525313 at vpanic+0x1a3
#2 0xffffffff80525163 at panic+0x43
#3 0xffffffff808287df at trap_fatal+0x35f
#4 0xffffffff80827c4d at trap+0x6d
#5 0xffffffff808051c5 at calltrap+0x8
#6 0xffffffff80836db8 at handleevents+0x1a8
#7 0xffffffff80837621 at timercb+0x2a1
#8 0xffffffff8082d219 at hpet_intr_single+0x1b9
#9 0xffffffff8082d2be at hpet_intr+0x8e
#10 0xffffffff804ea32e at intr_event_handle+0xbe
#11 0xffffffff8086c1a8 at intr_execute_handlers+0x58
#12 0xffffffff808722b4 at lapic_handle_intr+0x44
#13 0xffffffff80806359 at Xapic_isr1+0xd9
#14 0xffffffff80375417 at acpi_cpu_idle+0x2e7
#15 0xffffffff80868e8f at cpu_idle_acpi+0x3f
#16 0xffffffff80868f47 at cpu_idle+0xa7
#17 0xffffffff80557c95 at sched_idletd+0x515
Uptime: 19h45m31s
Dumping 1359 out of 4069 MB:..2%..11%..21%..31%..41%..51%..61%..71%..81%..91%

__curthread () at ./machine/pcpu.h:230
230             __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (OFFSETOF_CURTHREAD));
(kgdb) where
#0  __curthread () at ./machine/pcpu.h:230
#1  doadump (textdump=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:366
#2  0xffffffff80524efb in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:446
#3  0xffffffff80525373 in vpanic (fmt=<optimized out>, ap=0xfffffe0025d8a520) at /usr/src/sys/kern/kern_shutdown.c:872
#4  0xffffffff80525163 in panic (fmt=<unavailable>) at /usr/src/sys/kern/kern_shutdown.c:799
#5  0xffffffff808287df in trap_fatal (frame=0xfffffe0025d8a6b0, eva=0) at /usr/src/sys/amd64/amd64/trap.c:929
#6  0xffffffff80827c4d in trap (frame=0xfffffe0025d8a6b0) at /usr/src/sys/amd64/amd64/trap.c:217
#7  <signal handler called>
#8  0xffffffff8053ee2f in callout_process (now=305509015149479) at /usr/src/sys/kern/kern_timeout.c:510
#9  0xffffffff80836db8 in handleevents (now=305509015149479, fake=0) at /usr/src/sys/kern/kern_clocksource.c:213
#10 0xffffffff80837621 in timercb (et=<optimized out>, arg=<optimized out>) at /usr/src/sys/kern/kern_clocksource.c:357
#11 0xffffffff8082d219 in hpet_intr_single (arg=0xfffffe00003670b0) at /usr/src/sys/dev/acpica/acpi_hpet.c:309
#12 0xffffffff8082d2be in hpet_intr (arg=0xfffffe0000367000) at /usr/src/sys/dev/acpica/acpi_hpet.c:327
#13 0xffffffff804ea32e in intr_event_handle (ie=0xfffff80002088b00, frame=0xfffffe0025d8a9a0) at /usr/src/sys/kern/kern_intr.c:1225
#14 0xffffffff8086c1a8 in intr_execute_handlers (isrc=0xfffff800021fde48, frame=0xfffffe0025d8a9a0) at /usr/src/sys/x86/x86/intr_machdep.c:358
#15 0xffffffff808722b4 in lapic_handle_intr (vector=<optimized out>, frame=0x115dbe7e8ecf4) at /usr/src/sys/x86/x86/local_apic.c:1293
#16 <signal handler called>
#17 acpi_cpu_c1 () at /usr/src/sys/x86/x86/cpu_machdep.c:140
#18 0xffffffff80375417 in acpi_cpu_idle (sbt=<optimized out>) at /usr/src/sys/dev/acpica/acpi_cpu.c:1187
#19 0xffffffff80868e8f in cpu_idle_acpi (sbt=56223337) at /usr/src/sys/x86/x86/cpu_machdep.c:433
#20 0xffffffff80868f47 in cpu_idle (busy=0) at /usr/src/sys/x86/x86/cpu_machdep.c:581
#21 0xffffffff80557c95 in sched_idletd (dummy=<optimized out>) at /usr/src/sys/kern/sched_ule.c:2829
#22 0xffffffff804e76a3 in fork_exit (callout=0xffffffff80557780 <sched_idletd>, arg=0x0, frame=0xfffffe0025d8ac00) at /usr/src/sys/kern/kern_fork.c:1059
#23 <signal handler called>
(kgdb)
Comment 7 Mark Johnston freebsd_committer 2019-01-10 19:22:21 UTC
I looked at the kernel dumps.  In all three cases, we crashed while processing a callout that had been mostly zeroed out.  However, in all cases, at offset 0x10 into the callout there is a pointer in the direct map.  Following that pointer leads me to the middle of a ZFS vnode:

(kgdb) frame 9
#9  0xffffffff805dd328 in callout_process (now=91167878320709) at /usr/src/sys/kern/kern_timeout.c:510
510                                             LIST_REMOVE(tmp, c_links.le);
(kgdb) p/x *cc->cc_callwheel[firstb & callwheelmask].lh_first
$21 = {
  c_links = {
    le = {
      le_next = 0x0, 
      le_prev = 0x0
    }, 
    sle = {
      sle_next = 0x0
    }, 
    tqe = {
      tqe_next = 0x0, 
      tqe_prev = 0x0
    }
  }, 
  c_time = 0xfffff803773850c0, 
  c_precision = 0x0, 
  c_arg = 0x0, 
  c_func = 0x0, 
  c_lock = 0x0, 
  c_flags = 0x0, 
  c_iflags = 0x0, 
  c_cpu = 0x0
}
(kgdb) x/32g 0xfffff803773850c0
0xfffff803773850c0:     0x0000000000000000      0xfffff8000a161de0
0xfffff803773850d0:     0xffffffff809bf965      0x0000000005230000
0xfffff803773850e0:     0x0000000000000000      0x0000000000000001
0xfffff803773850f0:     0xffffffff80c96170      0xfffff80463b90100
0xfffff80377385100:     0x0000000000000000      0x0000000000000000
0xfffff80377385110:     0xfffff80377385000      0x0000000000000000
0xfffff80377385120:     0xfffff80377385118      0x0000000000000000
0xfffff80377385130:     0x0000000000000000      0x0000000000000000
0xfffff80377385140:     0xfffff80377385138      0x0000000000000000
0xfffff80377385150:     0x0000000000000000      0x0000000000000000
0xfffff80377385160:     0x0000000300000000      0x0000000000020000
0xfffff80377385170:     0x0000000000000000      0x0000000000000000
0xfffff80377385180:     0x0000000000000000      0x0000000000000000
0xfffff80377385190:     0xfffff80377385188      0x0000000000000000
0xfffff803773851a0:     0x0000000000000000      0x0000000000000000
0xfffff803773851b0:     0x0000000000000000      0x0000000200000000

0xffffffff809bf965 is a pointer to the string "bufobj interlock".
So the pointer in this callout is to the v_actfreelist member
of a vnode.  Indeed, 0xfffff803773850c0 & ~0xff is a pointer to
a ZFS vnode.  So, the pointer in the callout structure is probably
the prev pointer, which means the callout structure is actually
embedded in a vnode.

(kgdb) p *(struct vnode * )((uintptr_t)$23 & ~0xff)
$24 = { 
  v_tag = 0x0,   
  v_op = 0x0,      
  v_data = 0x0,
  v_mount = 0xfffff8000a3fb000,
  v_nmntvnodes = {
    tqe_next = 0x0,  
    tqe_prev = 0x0
  },              
...
  v_vnlock = 0x0,
  v_actfreelist = {
    tqe_next = 0x0,
    tqe_prev = 0xfffff803773850c0
  },
...
  v_iflag = 0,
  v_vflag = 0,
  v_mflag = 0,
  v_writecount = 0,
  v_hash = 0,
  v_type = VMARKER
}

So the only non-zero fields are those of a marker
vnode initialized in __mnt_vnode_first_active().  Those
vnodes are allocated using malloc(), not the vnode zone.
Comment 8 Mark Johnston freebsd_committer 2019-01-10 19:33:12 UTC
sizeof(struct vnode) == 480, so it looks like this is a use-after-free in the
512-byte malloc zone.  The callout is at offset 0xb8 into the structure.  Based
on some skimming of the CTF type graph, this is very likely to be a struct
llentry...
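The zone-sharing reasoning above can be sketched as follows. This is a toy model, not kernel code: `malloc_zone_size` and its bucket list only approximate how a kernel-style malloc rounds requests up to fixed zone sizes, which is why a freed 480-byte marker vnode and some other sub-512-byte structure can recycle exactly the same memory.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model: a kernel-style malloc rounds each request up to one of a
 * few fixed bucket sizes, so objects of unrelated types can end up
 * reusing exactly the same memory.  The bucket list here is an
 * approximation for illustration, not the kernel's actual zone table. */
static size_t malloc_zone_size(size_t req) {
    static const size_t zones[] = { 16, 32, 64, 128, 256, 384, 512, 1024 };
    for (size_t i = 0; i < sizeof(zones) / sizeof(zones[0]); i++)
        if (req <= zones[i])
            return zones[i];
    return 0;                    /* larger requests handled elsewhere */
}
```

A 480-byte vnode and any other structure between 385 and 512 bytes land in the same 512-byte zone, so a stale pointer into a freed vnode can end up pointing into a live object of a completely different type.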
Comment 9 Mark Johnston freebsd_committer 2019-01-10 19:55:36 UTC
Do the systems in question have many short-lived ARP or ND cache entries?
Comment 10 Andriy Gapon freebsd_committer 2019-01-10 21:08:44 UTC
(In reply to Mark Johnston from comment #8)
Mark, bug 234567 could be related.
At least an attachment added in bug 234567, comment #6 looks relevant.
It has this stack trace:
--- trap 0xc, rip = 0xffffffff80493fbf, rsp = 0xfffffe0000477940, rbp = 0xfffffe00004779e0 ---
__rw_wlock_hard() at __rw_wlock_hard+0x1bf/frame 0xfffffe00004779e0
nd6_llinfo_timer() at nd6_llinfo_timer+0x47f/frame 0xfffffe0000477a70
softclock_call_cc() at softclock_call_cc+0x12e/frame 0xfffffe0000477b20
softclock() at softclock+0x79/frame 0xfffffe0000477b40
ithread_loop() at ithread_loop+0x1a7/frame 0xfffffe0000477bb0
fork_exit() at fork_exit+0x83/frame 0xfffffe0000477bf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0000477bf0
Comment 11 Mark Johnston freebsd_committer 2019-01-11 00:53:36 UTC
Hmm.  The allocation stats for M_LLTABLE are quite low in all three vmcores, i.e., we had freed fewer than 30 or so items before panicking.  That probably means it's not a race.  There are a couple of possibilities: there is some rarely executed code path that always triggers the bug, or struct llentry is a red herring.  I wrote a program to iterate over all types in the CTF graph looking for a struct callout at offset 0xb8; this catches cases like the following: struct A contains a struct B at offset 0x40, and struct B contains a struct callout at offset 0x78.  However, it didn't turn up any likely candidates aside from the lle_timer field.

struct llentry -> lle_timer
struct in6_llentry -> base -> lle_timer
struct in_llentry -> base -> lle_timer
struct sdhci_slot -> card_delayed_task -> c
struct mpr_softc -> periodic
struct ata_request -> callout

Of course, it's possible that I did something wrong and missed a candidate.
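The offset search can be sketched like this. Everything here is hypothetical — `ctf_member`, `ctf_type`, and `find_callout_at` are invented names, and the real program walked the kernel's CTF data (and would also need to skip pointer members to avoid following cycles); it only illustrates the recursive containment check described above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model of the search: walk a type graph looking for a member of
 * type "callout" at a fixed byte offset, recursing into embedded
 * structs.  The representation is invented for illustration. */
struct ctf_type;

struct ctf_member {
    const char *name;
    const struct ctf_type *type;
    size_t offset;               /* byte offset within the parent */
};

struct ctf_type {
    const char *name;
    const struct ctf_member *members;
    size_t nmembers;
};

/* Return the name of a callout-typed member located `want` bytes into
 * `t`, or NULL if no such member exists at that offset. */
const char *find_callout_at(const struct ctf_type *t, size_t want) {
    for (size_t i = 0; i < t->nmembers; i++) {
        const struct ctf_member *m = &t->members[i];
        if (m->offset == want && strcmp(m->type->name, "callout") == 0)
            return m->name;
        /* Recurse into an embedded aggregate that spans the offset. */
        if (m->offset <= want && m->type->nmembers > 0) {
            const char *hit = find_callout_at(m->type, want - m->offset);
            if (hit != NULL)
                return hit;
        }
    }
    return NULL;
}
```

With a struct A embedding a struct B at 0x40 and B embedding a callout at 0x78, the search reports a callout at offset 0xb8 (0x40 + 0x78), matching the nested case described above.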
Comment 12 Victor 2019-01-14 07:37:32 UTC
(In reply to Mark Johnston from comment #11)
Uploaded dumps from another server; you can analyze them for comparison:
https://drive.google.com/open?id=1SjVwUBIEAgxa-sU1i65NiVOhTlUvjzzE
Comment 13 Victor 2019-01-14 12:18:08 UTC
(In reply to Mark Johnston from comment #9)
The server uses IPv4/IPv6.  There are no more than 10-20 ARP entries in the table, with the standard lifetime.  There are also servers with a similar problem that use only IPv4.
Comment 14 Mark Johnston freebsd_committer 2019-01-15 22:36:57 UTC
Looking at the llentry code, I think there's a race between llentry destruction and lookup introduced by the conversion of the afdata read lock to epoch.

When an ARP entry expires, arptimer() acquires the lle write lock and afdata write lock, and unlinks the lle from the lookup table.  There is no mutual exclusion with a concurrent lookup anymore; previously, the afdata read lock was required for a lookup.  So, if the hash table lookup succeeds before arptimer() removes the entry, the thread performing the lookup will block on the lle lock, holding a de facto reference without having bumped lle_refcount.  arptimer() may release the last reference on the llentry and schedule it for a deferred free after releasing the lle write lock.  The deferred free is probably enough to ensure that the lookup thread won't access freed memory in most cases, though it doesn't work in general.  Worse, though, the thread performing the lookup may acquire a long-lived reference after the lle is scheduled for destruction, and may schedule a callout for the lle, which would cause this panic.

It seems the ND6 code would be vulnerable to the same problem, though I haven't yet looked closely.
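A minimal single-threaded model of the race and of the eventual fix (re-checking a linked flag after acquiring the entry lock). All names here are invented analogues — `LLE_LINKED_FLAG`, `lle_model`, `lle_expire` — and the real code's lle/afdata locking and epoch-based deferred free are reduced to comments.

```c
#include <assert.h>
#include <stdlib.h>

#define LLE_LINKED_FLAG 0x0040   /* analogue of LLE_LINKED; value illustrative */

struct lle_model {
    int refcount;
    unsigned flags;
};

/* A freshly created entry holds one reference for the lookup table. */
struct lle_model *lle_alloc(void) {
    struct lle_model *lle = calloc(1, sizeof(*lle));
    lle->refcount = 1;
    lle->flags = LLE_LINKED_FLAG;
    return lle;
}

/* Timer path (arptimer() analogue): with the entry and table write
 * locks held (elided here), unlink the entry and drop the table's
 * reference; the real code defers the actual free. */
void lle_expire(struct lle_model *lle) {
    lle->flags &= ~LLE_LINKED_FLAG;
    lle->refcount--;
}

/* Fixed lookup path: after acquiring the entry lock (elided), re-check
 * the linked flag.  A thread that found the entry in the hash table
 * but blocked while the timer was unlinking it now fails cleanly
 * instead of taking a reference to memory scheduled for freeing. */
struct lle_model *lle_lookup(struct lle_model *lle) {
    if ((lle->flags & LLE_LINKED_FLAG) == 0)
        return NULL;
    lle->refcount++;
    return lle;
}
```

In the buggy ordering, the lookup would take the reference unconditionally after the timer dropped the last one, and a later callout scheduled through that reference would fire on freed memory — the crash signature seen in the backtraces above.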
Comment 15 Mark Johnston freebsd_committer 2019-01-16 03:21:53 UTC
Created attachment 201176 [details]
patch

Would you (or anyone else affected by the issue) please try the attached patch?  I did some light testing and verified that it applies to stable/12.
Comment 16 Mark Johnston freebsd_committer 2019-01-16 17:29:31 UTC
Created attachment 201193 [details]
patch

Sorry, the previous patch contained a mistake in the IPv6 code.  This patch should be tested instead unless you're not using IPv6.
Comment 17 Victor 2019-01-16 18:13:53 UTC
(In reply to Mark Johnston from comment #16)
OK, I will update today and report the results.
Comment 18 Mark Johnston freebsd_committer 2019-01-20 17:07:34 UTC
*** Bug 231080 has been marked as a duplicate of this bug. ***
Comment 19 Mark Johnston freebsd_committer 2019-01-20 17:09:44 UTC
*** Bug 234042 has been marked as a duplicate of this bug. ***
Comment 20 Mark Johnston freebsd_committer 2019-01-20 17:45:30 UTC
https://reviews.freebsd.org/D18906
Comment 21 Victor 2019-01-21 05:38:55 UTC
(In reply to Mark Johnston from comment #16)
Uptime is now more than 3 days without a panic. Thanks, Mark!
Comment 22 commit-hook freebsd_committer 2019-01-23 22:19:17 UTC
A commit references this bug:

Author: markj
Date: Wed Jan 23 22:18:24 UTC 2019
New revision: 343363
URL: https://svnweb.freebsd.org/changeset/base/343363

Log:
  Fix an LLE lookup race.

  After the afdata read lock was converted to epoch(9), readers could
  observe a linked LLE and block on the LLE while a thread was
  unlinking the LLE.  The writer would then release the lock and schedule
  the LLE for deferred free, allowing readers to continue and potentially
  schedule the LLE timer.  By the point the timer fires, the structure is
  freed, typically resulting in a crash in the callout subsystem.

  Fix the problem by modifying the lookup path to check for the LLE_LINKED
  flag upon acquiring the LLE lock.  If it's not set, the lookup fails.

  PR:		234296
  Reviewed by:	bz
  Tested by:	sbruno, Victor <chernov_victor@list.ru>,
  		Mike Andrews <mandrews@bit0.com>
  MFC after:	3 days
  Sponsored by:	The FreeBSD Foundation
  Differential Revision:	https://reviews.freebsd.org/D18906

Changes:
  head/sys/netinet/in.c
  head/sys/netinet6/in6.c
Comment 23 Mark Johnston freebsd_committer 2019-01-23 22:25:21 UTC
(In reply to Victor from comment #21)
Thanks for testing.
Comment 24 commit-hook freebsd_committer 2019-01-25 20:25:27 UTC
A commit references this bug:

Author: markj
Date: Fri Jan 25 20:24:54 UTC 2019
New revision: 343454
URL: https://svnweb.freebsd.org/changeset/base/343454

Log:
  MFC r343363, r343364:
  Fix an LLE lookup race.

  PR:	234296

Changes:
_U  stable/12/
  stable/12/sys/netinet/in.c
  stable/12/sys/netinet6/in6.c
Comment 25 commit-hook freebsd_committer 2019-02-05 18:00:34 UTC
A commit references this bug:

Author: emaste
Date: Tue Feb  5 17:59:50 UTC 2019
New revision: 343787
URL: https://svnweb.freebsd.org/changeset/base/343787

Log:
  MFS12 r343454: Fix an LLE lookup race

  PR:		234296
  Submitted by:	markj
  Approved by:	so

Changes:
_U  releng/12.0/
  releng/12.0/sys/netinet/in.c
  releng/12.0/sys/netinet6/in6.c
Comment 26 Mark Johnston freebsd_committer 2019-02-05 18:43:41 UTC
The issue's fixed in all affected branches.
Comment 27 Mark Johnston freebsd_committer 2019-02-05 18:56:33 UTC
*** Bug 235472 has been marked as a duplicate of this bug. ***
Comment 28 Mark Johnston freebsd_committer 2019-02-06 01:25:45 UTC
*** Bug 234567 has been marked as a duplicate of this bug. ***
Comment 29 Mark Johnston freebsd_committer 2019-02-07 18:28:30 UTC
*** Bug 234374 has been marked as a duplicate of this bug. ***
Comment 30 Mark Johnston freebsd_committer 2019-03-08 15:56:53 UTC
*** Bug 236390 has been marked as a duplicate of this bug. ***
Comment 31 Viktor Dukhovni 2019-03-28 03:23:35 UTC
I am seeing what appears to be the same issue in 12.0-p3, so perhaps the issue is not fully addressed.  The workload that triggers it involves a high rate of DNS queries over IPv4 (2 to 3 thousand per second!) mixed with some TCP connections, both IPv4 and IPv6.  My IPv6 connectivity is 6to4 (stf0).  The panic message (seen multiple times) is:

Fatal trap 12: page fault while in kernel mode
cpuid = 7; apic id = 07
fault virtual address   = 0x0
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80b4ef6b
stack pointer           = 0x28:0xfffffe01bea18490
frame pointer           = 0x28:0xfffffe01bea18750
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 1820 (danescan)
trap number             = 12
panic: page fault
cpuid = 1
time = 1553727560
KDB: stack backtrace:
#0 0xffffffff80be7977 at kdb_backtrace+0x67
#1 0xffffffff80b9b563 at vpanic+0x1a3
#2 0xffffffff80b9b3b3 at panic+0x43
#3 0xffffffff8107496f at trap_fatal+0x35f
#4 0xffffffff810749c9 at trap_pfault+0x49
#5 0xffffffff81073fee at trap+0x29e
#6 0xffffffff8104f465 at calltrap+0x8
#7 0xffffffff80b4e1a5 at kern_kevent+0xb5
#8 0xffffffff80b4deb1 at kern_kevent_generic+0x71
#9 0xffffffff80b4df81 at freebsd11_kevent+0x61
#10 0xffffffff81075449 at amd64_syscall+0x369
#11 0xffffffff8104fd4d at fast_syscall_common+0x101


The "danescan" process was built with GHC 8.6.3 on FreeBSD 11.2, and so was using the FreeBSD 11 kevent interface (freebsd11_kevent).  Just in case that's relevant, I've installed a native FreeBSD 12.0 GHC, and recompiled the code.  I suspect this will not make a difference, but time will tell...
Comment 32 Mark Johnston freebsd_committer 2019-03-28 04:29:07 UTC
(In reply to Viktor Dukhovni from comment #31)
If the backtrace is in kevent it's not going to be the same issue.  Though, we did fix some kevent races late in the 12.0 release cycle.  Would you be willing to provide a kernel dump to me to help diagnose the problem?  If not, it would be helpful to see the output of "addr2line -afi -e /usr/lib/debug/boot/kernel/kernel.debug 0xffffffff80b4ef6b".  In any case, it would help if you submitted a new PR for the issue.

I guess this is a regression after an update?  What were you updating from?
Comment 33 Viktor Dukhovni 2019-03-28 04:56:13 UTC
> Would you be willing to provide a kernel dump to me to help diagnose the problem?

I did not have kernel dumps enabled at the time.  I've now enabled them, but I've also re-installed GHC and recompiled my code.  It crashed again, but no longer in kevent (the 3 panics with the 11.2 binary were all in the same kevent code path).

> If not, it would be helpful to see the output of "addr2line -afi -e /usr/lib/debug/boot/kernel/kernel.debug 0xffffffff80b4ef6b".

Here it is:

  $ addr2line -afi -e /usr/lib/debug/boot/kernel/kernel.debug 0xffffffff80b4ef6b
  0xffffffff80b4ef6b
  kqueue_kevent
  /usr/src/sys/kern/kern_event.c:1989

> In any case, it would help if you submitted a new PR for the issue.

About to do that for the new crash, for which I do have a crash dump.

> I guess this is a regression after an update?  What were you updating from?

Yes, I upgraded from 11.2-RELEASE-p9 to 12.0-RELEASE-p3.

The new panic (via dmesg) is:

panic: vm_fault_hold: fault on nofault entry, addr: 0xfffffe00c9f87000
cpuid = 0
time = 1553747701
KDB: stack backtrace:
#0 0xffffffff80be7977 at kdb_backtrace+0x67
#1 0xffffffff80b9b563 at vpanic+0x1a3
#2 0xffffffff80b9b3b3 at panic+0x43
#3 0xffffffff80edd120 at unlock_and_deallocate+0
#4 0xffffffff80eda970 at vm_fault+0x60
#5 0xffffffff81074ae3 at trap_pfault+0x163
#6 0xffffffff81073fee at trap+0x29e
#7 0xffffffff8104f465 at calltrap+0x8
#8 0xffffffff80d26cdd at ip_input+0x45d
#9 0xffffffff80cbc576 at netisr_dispatch_src+0xd6
#10 0xffffffff80ca0e63 at ether_demux+0x163
#11 0xffffffff80ca1fc6 at ether_nh_input+0x346
#12 0xffffffff80cbc576 at netisr_dispatch_src+0xd6
#13 0xffffffff80ca1264 at ether_input+0x54
#14 0xffffffff80cb8726 at iflib_rxeof+0xa16
#15 0xffffffff80cb3556 at _task_fn_rx+0x76
#16 0xffffffff80be6204 at gtaskqueue_run_locked+0x144
#17 0xffffffff80be5e68 at gtaskqueue_thread_loop+0x98


I'll reference the new ticket here when I finish writing it up...
Comment 34 Viktor Dukhovni 2019-03-28 05:27:06 UTC
(In reply to Viktor Dukhovni from comment #33)
The new bug is: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=236846
Comment 35 Viktor Dukhovni 2019-03-28 05:28:57 UTC
Oh, and I need to be able to recover to a working system.  What are my options for going back to 11.2?
Comment 36 Mark Johnston freebsd_committer 2019-03-28 15:53:05 UTC
(In reply to Viktor Dukhovni from comment #35)
I don't believe we have any blessed way to downgrade releases, assuming you don't still have the 11.2 installation available as a ZFS snapshot.  I believe it's possible to do it with freebsd-update, but you have to be careful to downgrade both the kernel and userspace before rebooting.  Otherwise the only other option is to rebuild releng/11.2 from source.  Other folks may have suggestions if you ask on the -current mailing list.
Comment 37 Viktor Dukhovni 2019-03-28 16:07:13 UTC
(In reply to Mark Johnston from comment #36)
Sadly, the ZFS snapshot that I should have created, did not happen.

So my best option may be to try to work with you to resolve the 12.0 panics.  Any thoughts on where to go from here?  The panics are now in ip_input (and it turns out that the first 2 of the original 3 with 12.0-RELEASE-p3 were as well; only the last was in kevent).
Comment 38 Martin Birgmeier 2019-03-29 19:05:02 UTC
I have submitted bug report #236121 which might be related. Mark has a few of my crash dumps.

-- Martin