Bug 227654 - Reproducible crash with lagg+vlan+em
Summary: Reproducible crash with lagg+vlan+em
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: CURRENT
Hardware: Any Any
Importance: --- Affects Some People
Assignee: Kristof Provost
URL:
Keywords: crash
Depends on:
Blocks:
 
Reported: 2018-04-20 13:31 UTC by Eugene Grosbein
Modified: 2018-10-24 18:22 UTC

See Also:
kp: mfc-stable12+
kp: mfc-stable11+


Attachments
panic screenshot (701.47 KB, image/png)
2018-04-20 13:31 UTC, Eugene Grosbein
debugging patch for single user only (904 bytes, patch)
2018-04-20 23:19 UTC, Eugene Grosbein

Description Eugene Grosbein freebsd_committer 2018-04-20 13:31:55 UTC
Created attachment 192681
panic screenshot

Hi!

I run my workstation under FreeBSD 11.1-STABLE/amd64 r331249 with a custom kernel that includes IPv6 support and options DDB, KDB, INVARIANTS and WITNESS.

The following sequence of commands reliably produces a panic, even in single-user mode after a cold boot:

ifconfig tap0 create up
ifconfig lagg0 create up
ifconfig lagg0 laggport tap0
ifconfig em0 up
ifconfig vlan61 create vlan 61 vlandev em0
ifconfig vlan61 inet 192.168.0.1/24
ifconfig lagg0 laggport em0 # instant panic

Sometimes the panic messages are printed in bright white and the system jumps to BIOS POST within a second or so. Sometimes they are printed in gray and the system hangs solid: it does not react to Ctrl-Alt-ESC to enter KDB, and no crash dump is generated.

A screenshot is attached.
Comment 1 Eugene Grosbein freebsd_committer 2018-04-20 14:35:13 UTC
$ addr2line -e kernel.debug -i -f -C  ffffffff806fe6ac
ether_output_frame
/data2/src/sys/net/if_ethersubr.c:449
ether_output
/data2/src/sys/net/if_ethersubr.c:435


(kgdb) l /data2/src/sys/net/if_ethersubr.c:449
444     int
445     ether_output_frame(struct ifnet *ifp, struct mbuf *m)
446     {
447             int i;
448     
449             if (PFIL_HOOKED(&V_link_pfil_hook)) {
450                     i = pfil_run_hooks(&V_link_pfil_hook, &m, ifp, PFIL_OUT, NULL);
451     
452                     if (i != 0)
453                             return (EACCES);
Comment 2 Eugene Grosbein freebsd_committer 2018-04-20 23:19:39 UTC
Created attachment 192690
debugging patch for single user only

I forgot to note that my kernel has VIMAGE too.

I've reproduced this on my home desktop, which has a serial console, so I dug a bit deeper, suspecting that curvnet might not be initialized.
I've added some debugging output; the diff is attached.

The KASSERT did not catch this for some unknown reason, so it's commented out.

Anyway, curvnet turned out to be zero, so any attempt to use V_link_pfil_hook dereferences NULL, producing this panic:

ether_output_frame: vlan61: curvnet 0
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0453d37780
ether_output() at ether_output+0x64c/frame 0xfffffe0453d37820
arprequest() at arprequest+0x443/frame 0xfffffe0453d37920
arp_ifinit() at arp_ifinit+0x58/frame 0xfffffe0453d37960
arp_handle_ifllchange() at arp_handle_ifllchange+0x3d/frame 0xfffffe0453d37980
if_setlladdr() at if_setlladdr+0x21e/frame 0xfffffe0453d379e0
taskqueue_run_locked() at taskqueue_run_locked+0x14c/frame 0xfffffe0453d37a40
taskqueue_thread_loop() at taskqueue_thread_loop+0x88/frame 0xfffffe0453d37a70
fork_exit() at fork_exit+0x84/frame 0xfffffe0453d37ab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0453d37ab0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---


Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 03
fault virtual address   = 0x28
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80aca683
stack pointer           = 0x28:0xfffffe0453d37790
frame pointer           = 0x28:0xfffffe0453d37820
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (thread taskq)
trap number             = 12
p
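
The small, non-zero fault address is what one would expect when a member is read through a NULL structure pointer: the faulting address is simply the member's offset. Below is a minimal user-space sketch of that arithmetic. The structure `model_vnet`, its field names and their order are assumptions made for illustration and are not copied from the kernel headers.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the per-vnet state block. Field names and
 * order are assumptions for this sketch, not the real kernel layout.
 */
struct model_vnet {
	struct {
		struct model_vnet *le_next;
		struct model_vnet **le_prev;
	} link;                     /* list linkage: two pointers */
	unsigned int magic;
	unsigned int ifcnt;
	unsigned int sockcnt;
	unsigned int state;
	void *data_mem;
	uintptr_t data_base;        /* base used to locate per-vnet variables */
};

/*
 * Address that would be read when loading data_base through a NULL
 * struct pointer: 0 + offsetof(member).
 */
static size_t
null_deref_fault_address(void)
{
	return ((size_t)0 + offsetof(struct model_vnet, data_base));
}
```

With this assumed layout the function returns 40 (0x28) on a typical LP64 platform and 28 on a typical ILP32 platform. That lines up with the fault address reported here, but the agreement depends entirely on the assumed field order.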
Comment 3 Eugene Grosbein freebsd_committer 2018-04-21 09:36:10 UTC
It seems IPv6 has nothing to do with the problem, as it is reproducible with an INET4-only kernel too.
Comment 4 Eugene Grosbein freebsd_committer 2018-10-20 20:18:50 UTC
This is 100% repeatable using the same command sequence under 12.0-BETA1/i386 installed with all defaults inside a VirtualBox VM.

This time it says:

panic: vm_fault_hold: fault on nofault entry, addr: 0

It generates a nice crash dump and reboots.
I've uploaded kernel.debug (the stock one from the 12.0-BETA1/i386 installation image, 18M compressed) and vmcore.0.xz (9.2MB compressed) here:
http://www.grosbein.net/freebsd/crash/20181021/

Here is the kgdb session transcript:

Script started on Sun Oct 21 02:31:58 2018
Command: kgdb kernel.debug /var/crash/vmcore.0
GNU gdb (GDB) 8.2 [GDB v8.2 for FreeBSD]
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "i386-portbld-freebsd12.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from kernel.debug...done.

Unread portion of the kernel message buffer:
<6>em0: link state changed to DOWN
<6>vlan61: link state changed to DOWN
panic: vm_fault_hold: fault on nofault entry, addr: 0
cpuid = 0
time = 1540062884
KDB: stack backtrace:
#0 0x112d01f at kdb_backtrace+0x4f
#1 0x10e03f7 at vpanic+0x147
#2 0x10e02ab at panic+0x1b
#3 0x14289c5 at vm_fault_hold+0x2a45
#4 0x1425f2e at vm_fault+0x5e
#5 0x16b6ef7 at trap_pfault+0xc7
#6 0x16b64af at trap+0x3cf
#7 0xffc0315d at PTDpde+0x4165
#8 0x11e0122 at ether_output+0x6a2
#9 0x124887d at arprequest+0x44d
#10 0x12493f9 at arp_ifinit+0x59
#11 0x124c5bb at arp_handle_ifllchange+0x3b
#12 0x11db275 at if_setlladdr+0x275
#13 0x11eb900 at vlan_lladdr_fn+0x30
#14 0x113eba9 at taskqueue_run_locked+0x189
#15 0x113fd57 at taskqueue_thread_loop+0x97
#16 0x10a1af1 at fork_exit+0x71
#17 0xffc033ba at PTDpde+0x43c2
Uptime: 7m21s
Physical memory: 2019 MB
Dumping 84 MB: 69 53 37 21 5

__curthread () at ./machine/pcpu.h:226
226	./machine/pcpu.h: No such file or directory.
(kgdb) add-kld if_tap.ko
add symbol table from file "if_tap.ko.debug" at
	.rodata_addr = 0x18c0c134
	set_sysctl_set_addr = 0x18c0c4e0
	set_modmetadata_set_addr = 0x18c0c4f8
	.note.gnu.build-id_addr = 0x18c0c500
	.dynsym_addr = 0x18c0c548
	.gnu.hash_addr = 0x18c0cb18
	.hash_addr = 0x18c0cb58
	.dynstr_addr = 0x18c0ce48
	.rel.dyn_addr = 0x18c0d348
	.text_addr = 0x18c0f000
	.data_addr = 0x18c12000
	set_sysinit_set_addr = 0x18c12270
	set_sysuninit_set_addr = 0x18c12278
	.dynamic_addr = 0x18c13000
	.bss_addr = 0x18c14000
(y or n) y
Reading symbols from if_tap.ko.debug...done.
(kgdb) add-kld if_lagg.ko
add symbol table from file "if_lagg.ko.debug" at
	.rodata_addr = 0x18c15138
	set_sysctl_set_addr = 0x18c16038
	set_modmetadata_set_addr = 0x18c16054
	.note.gnu.build-id_addr = 0x18c16060
	.dynsym_addr = 0x18c160a8
	.gnu.hash_addr = 0x18c16798
	.hash_addr = 0x18c167e8
	.dynstr_addr = 0x18c16b68
	.rel.dyn_addr = 0x18c171e8
	.text_addr = 0x18c1a000
	.data_addr = 0x18c24000
	set_vnet_addr = 0x18c24270
	set_sysinit_set_addr = 0x18c242a0
	set_sysuninit_set_addr = 0x18c242ac
	.dynamic_addr = 0x18c25000
	.bss_addr = 0x18c26000
(y or n) y
Reading symbols from if_lagg.ko.debug...done.
(kgdb) bt full
#0  __curthread () at ./machine/pcpu.h:226
        td = <optimized out>
#1  doadump (textdump=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:366
        error = <error reading variable error (Cannot access memory at address 0x0)>
        coredump = <optimized out>
#2  0x010e0073 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:446
        once = <error reading variable once (Cannot access memory at address 0x0)>
#3  0x010e0444 in vpanic (
    fmt=0x179ab2c "%s: fault on nofault entry, addr: %#lx", 
    ap=0x16a1462c "U\327p\001") at /usr/src/sys/kern/kern_shutdown.c:872
        buf = "vm_fault_hold: fault on nofault entry, addr: 0", '\000' <repeats 209 times>
        td = 0x7369700
        newpanic = <error reading variable newpanic (Cannot access memory at address 0x1)>
        bootopt = 260
        other_cpus = <optimized out>
#4  0x010e02ab in panic (
    fmt=0x179ab2c "%s: fault on nofault entry, addr: %#lx")
    at /usr/src/sys/kern/kern_shutdown.c:799
        ap = <unavailable>
--Type <RET> for more, q to quit, c to continue without paging--c
#5  0x014289c5 in vm_fault_hold (map=0x2bd5000, vaddr=0, fault_type=<optimized out>, fault_flags=0, m_hold=0x0) at /usr/src/sys/vm/vm_fault.c:586
        hardfault = <optimized out>
        nera = <optimized out>
        faultcount = <optimized out>
        wired = 0
        prot = 7 '\a'
        result = 0
        rv = <optimized out>
        behind = <optimized out>
        ahead = <optimized out>
        error = <optimized out>
        locked = <optimized out>
        vp = <optimized out>
        fs = <optimized out>
        dset = <optimized out>
        next_object = <optimized out>
        alloc_req = <optimized out>
        era = <optimized out>
        behavior = <optimized out>
        cluster_offset = <optimized out>
        e_start = <optimized out>
        e_end = <optimized out>
        is_first_object_locked = <error reading variable is_first_object_locked (Cannot access memory at address 0x0)>
        retry_prot = <optimized out>
        retry_pindex = <optimized out>
        retry_object = <optimized out>
        dead = <optimized out>
#6  0x01425f2e in vm_fault (map=0x2bd5000, vaddr=0, fault_type=1 '\001', fault_flags=0) at /usr/src/sys/vm/vm_fault.c:536
        td = 0x7369700
        result = <optimized out>
#7  0x016b6ef7 in trap_pfault (frame=0x16a14894, usermode=0, eva=28) at /usr/src/sys/i386/i386/trap.c:882
        td = 0x7369700
        p = <optimized out>
        va = <optimized out>
        ftype = <unavailable>
        rv = <optimized out>
        map = <optimized out>
#8  0x016b64af in trap (frame=0x16a14894) at /usr/src/sys/i386/i386/trap.c:519
        td = <optimized out>
        dr6 = <optimized out>
        addr = <optimized out>
        ucode = <optimized out>
        signo = <optimized out>
        p = 0x206c1e4 <proc0>
        type = 12
        eva = 28
        ksi = <optimized out>
#9  0xffc0315d in ?? ()
No symbol table info available.
#10 0x16a14894 in ?? ()
No symbol table info available.
#11 0x011e0122 in ether_output (ifp=0x7240400, m=<unavailable>, dst=0x16a149e8, ro=0x16a149a8) at /usr/src/sys/net/if_ethersubr.c:435
        linkhdr = "\001\000\000\000\000\227\066\a\\I\241\026\372\251"
        error = <optimized out>
        pflags = 0
        addref = 0
        phdr = <optimized out>
        hlen = <optimized out>
        lle = <optimized out>
        eh = <optimized out>
        t = <optimized out>
#12 0x0124887d in arprequest (ifp=0x7240400, sip=0x16a14a40, tip=0x16a14a40, enaddr=0x1911d94a "") at /usr/src/sys/netinet/if_ether.c:428
        linkhdr = "\377\377\377\377\377\377\000\275\024\247\377\000\b\006\000\000\000\000\000\000\000\000\000"
        carpaddr = <optimized out>
        m = <optimized out>
        ah = <optimized out>
        linkhdrsize = <error reading variable linkhdrsize (Cannot access memory at address 0x18)>
        error = 0
        ro = {ro_rt = 0x0, ro_lle = 0x0, ro_prepend = 0x16a149d0 "\377\377\377\377\377\377", ro_plen = 14, ro_flags = 0, ro_mtu = 0, spare = 0, ro_dst = {sa_len = 0 '\000', sa_family = 0 '\000', sa_data = '\000' <repeats 13 times>}}
        sa = {sa_len = 2 '\002', sa_family = 35 '#', sa_data = "\000\000J\331\021\031\000\000\000\000\000\227\066\a"}
#13 0x012493f9 in arp_announce_ifaddr (ifp=0x7240400, addr=..., enaddr=<unavailable>) at /usr/src/sys/netinet/if_ether.c:1436
No locals.
#14 arp_ifinit (ifp=0x7240400, ifa=0x1911d200) at /usr/src/sys/netinet/if_ether.c:1423
        dst_in = 0x1911d250
        dst = 0x1911d250
#15 0x0124c5bb in arp_handle_ifllchange (ifp=0x7240400) at /usr/src/sys/netinet/if_ether.c:1450
        ifa = 0x1911d200
#16 0x011db275 in if_setlladdr (ifp=0x7240400, lladdr=0x1911d94a "", len=6) at /usr/src/sys/net/if.c:3867
        _ep = <optimized out>
        _t = <optimized out>
        _el = <optimized out>
        ifa = <optimized out>
        sdl = <optimized out>
        nep_et = {datap = {0x0, 0x16a26c80, 0xdeadbeef}, datai = {1}}
        ifr = <optimized out>
#17 0x011eb900 in vlan_lladdr_fn (arg=0x1911ab00, pending=1) at /usr/src/sys/net/if_vlan.c:1306
        ifv = 0x1911ab00
        ifp = <unavailable>
#18 0x0113eba9 in taskqueue_run_locked (queue=0x736d600) at /usr/src/sys/kern/subr_taskqueue.c:465
        tb_first = <optimized out>
        pending = 1
        task = 0x1911ab28
        tb = <optimized out>
#19 0x0113fd57 in taskqueue_thread_loop (arg=0x206f918 <taskqueue_thread>) at /usr/src/sys/kern/subr_taskqueue.c:757
        tqp = <optimized out>
        tq = 0x736d600
#20 0x010a1af1 in fork_exit (callout=0x113fcc0 <taskqueue_thread_loop>, arg=0x206f918 <taskqueue_thread>, frame=0x16a14ba8) at /usr/src/sys/kern/kern_fork.c:1057
        td = 0x7369700
        p = 0x206c1e4 <proc0>
        dtd = <optimized out>
#21 0xffc033ba in ?? ()
No symbol table info available.
(kgdb) frame 11
#11 0x011e0122 in ether_output (ifp=0x7240400, m=<unavailable>, 
    dst=0x16a149e8, ro=0x16a149a8) at /usr/src/sys/net/if_ethersubr.c:435
435		return ether_output_frame(ifp, m);
(kgdb) p *ifp
$1 = {if_link = {cstqe_next = 0x0}, if_clones = {le_next = 0x0, 
    le_prev = 0x757ff1c}, if_groups = {cstqh_first = 0x757ec70, 
    cstqh_last = 0x756d1a4}, if_alloctype = 6 '\006', if_softc = 0x1911ab00, 
  if_llsoftc = 0x0, if_l2com = 0x0, if_dname = 0x1b08e5c <vlanname> "vlan", 
  if_dunit = 61, if_index = 6, if_index_reserved = 0, 
  if_xname = "vlan61\000\000\000\000\000\000\000\000\000", 
  if_description = 0x0, if_flags = 34819, if_drv_flags = 64, 
  if_capabilities = 3, if_capenable = 3, if_linkmib = 0x1911ab14, 
  if_linkmiblen = 20, if_refcount = 1, if_type = 135 '\207', 
  if_addrlen = 6 '\006', if_hdrlen = 4 '\004', if_link_state = 1 '\001', 
  if_mtu = 1500, if_metric = 0, if_baudrate = 0, if_hwassist = 6, 
  if_epoch = 421, if_lastchange = {tv_sec = 1540062864, tv_usec = 921826}, 
  if_snd = {ifq_head = 0x0, ifq_tail = 0x0, ifq_len = 0, ifq_maxlen = 50, 
    ifq_mtx = {lock_object = {lo_name = 0x7240430 "vlan61", 
        lo_flags = 16973824, lo_data = 0, lo_witness = 0x0}, mtx_lock = 0}, 
    ifq_drv_head = 0x0, ifq_drv_tail = 0x0, ifq_drv_len = 0, 
    ifq_drv_maxlen = 0, altq_type = 0, altq_flags = 0, altq_disc = 0x0, 
    altq_ifp = 0x7240400, altq_enqueue = 0x0, altq_dequeue = 0x0, 
    altq_request = 0x0, altq_clfier = 0x0, altq_classify = 0x0, 
    altq_tbr = 0x0, altq_cdnr = 0x0}, if_linktask = {ta_link = {
      stqe_next = 0x0}, ta_pending = 0, ta_priority = 0, 
    ta_func = 0x11d5460 <do_link_state_change>, ta_context = 0x7240400}, 
  if_addr_lock = {lock_object = {lo_name = 0x1704959 "if_addr_lock", 
      lo_flags = 16973824, lo_data = 0, lo_witness = 0x0}, mtx_lock = 0}, 
--Type <RET> for more, q to quit, c to continue without paging--  c
  if_addrhead = {cstqh_first = 0x1911d900, cstqh_last = 0x1911d214}, if_multiaddrs = {cstqh_first = 0x1911ac80, cstqh_last = 0x1911a0c0}, if_amcount = 0, if_addr = 0x1911d900, if_hw_addr = 0x757ebd0, if_broadcastaddr = 0x1b08424 <etherbroadcastaddr> "\377\377\377\377\377\377", if_afdata_lock = {lock_object = {lo_name = 0x1771f6b "if_afdata", lo_flags = 16973824, lo_data = 0, lo_witness = 0x0}, mtx_lock = 0}, if_afdata = {0x0, 0x0, 0x757ece0, 0x0 <repeats 25 times>, 0x74cff80, 0x0 <repeats 13 times>}, if_afdata_initialized = 2, if_fib = 0, if_vnet = 0x7017060, if_home_vnet = 0x7017060, if_vlantrunk = 0x0, if_bpf = 0x1911aa80, if_pcount = 0, if_bridge = 0x0, if_lagg = 0x0, if_pf_kif = 0x0, if_carp = 0x0, if_label = 0x0, if_netmap = 0x0, if_output = 0x11dfa80 <ether_output>, if_input = 0x11e0720 <ether_input>, if_bridge_input = 0x0, if_bridge_output = 0x0, if_bridge_linkstate = 0x0, if_start = 0x0, if_ioctl = 0x11ea700 <vlan_ioctl>, if_init = 0x11ea470 <vlan_init>, if_resolvemulti = 0x11e0780 <ether_resolvemulti>, if_qflush = 0x11ea6f0 <vlan_qflush>, if_transmit = 0x11ea480 <vlan_transmit>, if_reassign = 0x11e0980 <ether_reassign>, if_get_counter = 0x11d5720 <if_get_counter_default>, if_requestencap = 0x11e08b0 <ether_requestencap>, if_counters = {0x16b61000, 0x16b61010, 0x16b61020, 0x16b61030, 0x16b61040, 0x16b61050, 0x16b61060, 0x16b61070, 0x16b61080, 0x16b61090, 0x16b610a0, 0x16b610b0}, if_hw_tsomax = 65518, if_hw_tsomaxsegcount = 35, if_hw_tsomaxsegsize = 2048, if_snd_tag_alloc = 0x0, if_snd_tag_modify = 0x0, if_snd_tag_query = 0x0, if_snd_tag_free = 0x0, if_pcp = 0 '\000', if_netdump_methods = 0x0, if_epoch_ctx = {data = {0x0, 0x0}}, if_addr_et = {datap = {0x0, 0x0, 0x0}, datai = {0}}, if_maddr_et = {datap = {0x0, 0x0, 0x0}, datai = {0}}, if_ispare = {0, 0, 0, 0}}
(kgdb) p m
$2 = <unavailable>
(kgdb) l
430			if (m == NULL)
431				return (0);
432		}
433	
434		/* Continue with link-layer output */
435		return ether_output_frame(ifp, m);
436	}
437	
438	static bool
439	ether_set_pcp(struct mbuf **mp, struct ifnet *ifp, uint8_t pcp)
(kgdb) p ifp->if_type
$4 = 135 '\207'
(kgdb) p ifp->if_transmit
$5 = (if_transmit_fn_t) 0x11ea480 <vlan_transmit>
(kgdb) p *((struct ifvlan *)ifp->if_softc)->ifv_trunk->parent
$8 = {if_link = {cstqe_next = 0x757d000}, if_clones = {le_next = 0x0, 
    le_prev = 0x0}, if_groups = {cstqh_first = 0x74fe980, 
    cstqh_last = 0x74fe984}, if_alloctype = 6 '\006', if_softc = 0x752a000, 
  if_llsoftc = 0x0, if_l2com = 0x0, if_dname = 0x70444b0 "em", if_dunit = 0, 
  if_index = 1, if_index_reserved = 0, 
  if_xname = "em0", '\000' <repeats 12 times>, if_description = 0x0, 
  if_flags = 34819, if_drv_flags = 64, if_capabilities = 8468635, 
  if_capenable = 8454299, if_linkmib = 0x0, if_linkmiblen = 0, 
  if_refcount = 4, if_type = 161 '\241', if_addrlen = 6 '\006', 
  if_hdrlen = 18 '\022', if_link_state = 1 '\001', if_mtu = 1500, 
  if_metric = 0, if_baudrate = 0, if_hwassist = 6, if_epoch = 1, 
  if_lastchange = {tv_sec = 1540062449, tv_usec = 317821}, if_snd = {
    ifq_head = 0x0, ifq_tail = 0x0, ifq_len = 0, ifq_maxlen = 1023, ifq_mtx = {
      lock_object = {lo_name = 0x752b430 "em0", lo_flags = 16973824, 
        lo_data = 0, lo_witness = 0x0}, mtx_lock = 0}, ifq_drv_head = 0x0, 
    ifq_drv_tail = 0x0, ifq_drv_len = 0, ifq_drv_maxlen = 1023, altq_type = 0, 
    altq_flags = 1, altq_disc = 0x0, altq_ifp = 0x752b400, altq_enqueue = 0x0, 
    altq_dequeue = 0x0, altq_request = 0x0, altq_clfier = 0x0, 
    altq_classify = 0x0, altq_tbr = 0x0, altq_cdnr = 0x0}, if_linktask = {
    ta_link = {stqe_next = 0x0}, ta_pending = 0, ta_priority = 0, 
    ta_func = 0x11d5460 <do_link_state_change>, ta_context = 0x752b400}, 
  if_addr_lock = {lock_object = {lo_name = 0x1704959 "if_addr_lock", 
      lo_flags = 16973824, lo_data = 0, lo_witness = 0x0}, 
    mtx_lock = 129987456}, if_addrhead = {cstqh_first = 0x74e0c00, 
--Type <RET> for more, q to quit, c to continue without paging--q
cstqh_last = Quit
Command exit status: 0
Script done on Sun Oct 21 02:55:39 2018
Comment 5 Eugene Grosbein freebsd_committer 2018-10-21 01:24:30 UTC
I've added more printf calls to sys/net/if_ethersubr.c and found that it panics inside the ether_output_frame() function.

I've added this just before the PFIL_HOOKED(&V_link_pfil_hook) check:

if (ifp->if_index == 6) printf("ether_output_frame: checking curvnet=%p\n", curvnet);
if (ifp->if_index == 6) printf("ether_output_frame: V_link_pfil_hook=%p\n", V_link_pfil_hook);

The last lines of the dmesg buffer after the panic are:

ether_output_frame: checking curvnet=0
panic: vm_fault_hold: fault on nofault entry, addr: 0

So, curvnet is NULL here, hence the panic.
Comment 6 Kristof Provost freebsd_committer 2018-10-21 06:19:58 UTC
This should fix it:

diff --git a/sys/net/if_vlan.c b/sys/net/if_vlan.c
index b75a62c16b3..79ef2422600 100644
--- a/sys/net/if_vlan.c
+++ b/sys/net/if_vlan.c
@@ -1302,8 +1302,13 @@ vlan_lladdr_fn(void *arg, int pending __unused)

        ifv = (struct ifvlan *)arg;
        ifp = ifv->ifv_ifp;
+
+       CURVNET_SET(ifp->if_vnet);
+
        /* The ifv_ifp already has the lladdr copied in. */
        if_setlladdr(ifp, IF_LLADDR(ifp), ifp->if_addrlen);
+
+       CURVNET_RESTORE();
 }

 static int
Comment 7 Eugene Grosbein freebsd_committer 2018-10-21 08:54:24 UTC
(In reply to Kristof Provost from comment #6)

Yes, it helps. Thanks! Please commit.
Do you want to merge it to stable/12 before the release, too?
Comment 8 commit-hook freebsd_committer 2018-10-21 16:52:07 UTC
A commit references this bug:

Author: kp
Date: Sun Oct 21 16:51:36 UTC 2018
New revision: 339547
URL: https://svnweb.freebsd.org/changeset/base/339547

Log:
  vlan: Fix panic with lagg and vlan

  vlan_lladdr_fn() is called from taskqueue, which means there's no vnet context
  set. We can end up trying to send ARP messages (through the iflladdr_event
  event), which requires a vnet context.

  PR:		227654
  MFC after:	3 days

Changes:
  head/sys/net/if_vlan.c
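
The save/set/restore pattern described in the log can be modeled in user space. What follows is a sketch under stated assumptions, not the kernel implementation. Every `model_*` and `MODEL_*` name is hypothetical, the "current vnet" is a plain global here rather than per-thread state, and the output routine returns -1 where the kernel would fault.

```c
#include <stddef.h>

struct model_vnet {
	int pfil_hooked;            /* stand-in for some per-vnet state */
};

struct model_ifnet {
	struct model_vnet *if_vnet; /* vnet the interface belongs to */
};

/* Stand-in for the current-vnet pointer; NULL models a bare worker thread. */
static struct model_vnet *model_curvnet = NULL;

/* Save the previous context, then switch to the interface's vnet. */
#define MODEL_CURVNET_SET(vnet)					\
	struct model_vnet *saved_vnet = model_curvnet;		\
	model_curvnet = (vnet)
/* Put back whatever context was active before MODEL_CURVNET_SET. */
#define MODEL_CURVNET_RESTORE()					\
	model_curvnet = saved_vnet

/* Output path that needs a context; returns -1 where the kernel would fault. */
static int
model_output(struct model_ifnet *ifp)
{
	(void)ifp;
	if (model_curvnet == NULL)
		return (-1);
	return (model_curvnet->pfil_hooked);
}

/* Task handler as it was before the fix: no context is established. */
static int
model_task_unfixed(struct model_ifnet *ifp)
{
	return (model_output(ifp));
}

/* Task handler with the fix: context set around the call, then restored. */
static int
model_task_fixed(struct model_ifnet *ifp)
{
	int error;

	MODEL_CURVNET_SET(ifp->if_vnet);
	error = model_output(ifp);
	MODEL_CURVNET_RESTORE();
	return (error);
}
```

Calling `model_task_unfixed()` from a context where `model_curvnet` is NULL reports the error, while `model_task_fixed()` succeeds and leaves `model_curvnet` as it found it. Restoring rather than clearing matters if the caller already had a context of its own.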
Comment 9 Kristof Provost freebsd_committer 2018-10-21 16:53:02 UTC
(In reply to commit-hook from comment #7)
I'll pick this up with re@ when it's gone through the 3 day MFC period. It might make it before the BETA2 builds start.
Comment 10 Kubilay Kocak freebsd_committer freebsd_triage 2018-10-22 02:57:27 UTC
I'll create an mfc-stable12 flag to use shortly.
Comment 11 commit-hook freebsd_committer 2018-10-24 17:32:35 UTC
A commit references this bug:

Author: kp
Date: Wed Oct 24 17:32:31 UTC 2018
New revision: 339690
URL: https://svnweb.freebsd.org/changeset/base/339690

Log:
  MFC r339547:

  vlan: Fix panic with lagg and vlan

  vlan_lladdr_fn() is called from taskqueue, which means there's no vnet context
  set. We can end up trying to send ARP messages (through the iflladdr_event
  event), which requires a vnet context.

  PR:		227654
  Approved by:	re (kib)

Changes:
_U  stable/12/
  stable/12/sys/net/if_vlan.c
Comment 12 commit-hook freebsd_committer 2018-10-24 18:20:18 UTC
A commit references this bug:

Author: kp
Date: Wed Oct 24 18:19:32 UTC 2018
New revision: 339691
URL: https://svnweb.freebsd.org/changeset/base/339691

Log:
  MFC r339547:

  vlan: Fix panic with lagg and vlan

  vlan_lladdr_fn() is called from taskqueue, which means there's no vnet context
  set. We can end up trying to send ARP messages (through the iflladdr_event
  event), which requires a vnet context.

  PR:		227654

Changes:
_U  stable/11/
  stable/11/sys/net/if_vlan.c