Bug 233622 - panic: page not present fault when stopping VIMAGE jail on 12.0-RC2, netgraph
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 12.0-RELEASE
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: Mark Johnston
URL:
Keywords: crash, vimage
Depends on:
Blocks:
 
Reported: 2018-11-29 08:19 UTC by Jordan Boland
Modified: 2021-01-06 15:00 UTC

Description Jordan Boland 2018-11-29 08:19:29 UTC
This is reproducible for me - any time I stop this jail the system panics.  I appreciate your patience as I am new to kernel debugging, so if I have omitted necessary information it is out of ignorance and not malice.  :-)

===============================================

Unread portion of the kernel message buffer:
<6>in6_purgeaddr: err=65, destination address delete failed
ng node ng0_unifi_1 needs NGF_REALLY_DIE


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x0
fault code              = supervisor write data, page not present
instruction pointer     = 0x20:0xffffffff8263dba6
stack pointer           = 0x28:0xfffffe008caeb6c0
frame pointer           = 0x28:0xfffffe008caeb6e0
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 60579 (jail)
trap number             = 12
panic: page fault
cpuid = 0
time = 1543479102
KDB: stack backtrace:
#0 0xffffffff80be74a7 at kdb_backtrace+0x67
#1 0xffffffff80b9b093 at vpanic+0x1a3
#2 0xffffffff80b9aee3 at panic+0x43
#3 0xffffffff8107394f at trap_fatal+0x35f
#4 0xffffffff810739a9 at trap_pfault+0x49
#5 0xffffffff81072fce at trap+0x29e
#6 0xffffffff8104e865 at calltrap+0x8
#7 0xffffffff80ca0dd5 at ether_ifdetach+0x35
#8 0xffffffff80caab14 at vlan_clone_destroy+0x24
#9 0xffffffff80c9ea26 at if_clone_destroyif+0x116
#10 0xffffffff80c9f338 at if_clone_detach+0xc8
#11 0xffffffff80cc7b3c at vnet_destroy+0x13c
#12 0xffffffff80b63480 at prison_deref+0x2b0
#13 0xffffffff80b64d04 at sys_jail_remove+0x364
#14 0xffffffff81074429 at amd64_syscall+0x369
#15 0xffffffff8104f14d at fast_syscall_common+0x101
Uptime: 1m36s
Dumping 768 out of 16178 MB:..3%..11%..21%..32%..42%..53%..61%..71%..82%..92%

__curthread () at ./machine/pcpu.h:230
230     ./machine/pcpu.h: No such file or directory.
(kgdb) list *0xffffffff8263dba6
0xffffffff8263dba6 is in ng_ether_detach (/usr/src/sys/netgraph/ng_ether.c:367).
362     /usr/src/sys/netgraph/ng_ether.c: No such file or directory.
(kgdb) backtrace
#0  __curthread () at ./machine/pcpu.h:230
#1  doadump (textdump=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:366
#2  0xffffffff80b9ac7b in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:446
#3  0xffffffff80b9b0f3 in vpanic (fmt=<optimized out>, ap=0xfffffe008bcf2410) at /usr/src/sys/kern/kern_shutdown.c:872
#4  0xffffffff80b9aee3 in panic (fmt=<unavailable>) at /usr/src/sys/kern/kern_shutdown.c:799
#5  0xffffffff8107394f in trap_fatal (frame=0xfffffe008bcf2600, eva=0) at /usr/src/sys/amd64/amd64/trap.c:929
#6  0xffffffff810739a9 in trap_pfault (frame=0xfffffe008bcf2600, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:765
#7  0xffffffff81072fce in trap (frame=0xfffffe008bcf2600) at /usr/src/sys/amd64/amd64/trap.c:441
#8  <signal handler called>
#9  ng_ether_detach (ifp=0xfffff800038a6800) at /usr/src/sys/netgraph/ng_ether.c:367
#10 0xffffffff80ca0dd5 in ether_ifdetach (ifp=0xfffff800038a6800) at /usr/src/sys/net/if_ethersubr.c:981
#11 0xffffffff80caab14 in vlan_clone_destroy (ifc=0xfffff80013c55600, ifp=0xfffff800038a6800)
    at /usr/src/sys/net/if_vlan.c:1106
#12 0xffffffff80c9ea26 in if_clone_destroyif (ifc=0xfffff80013c55600, ifp=0xfffff800038a6800)
    at /usr/src/sys/net/if_clone.c:330
#13 0xffffffff80c9f338 in if_clone_detach (ifc=0xfffff80013c55600) at /usr/src/sys/net/if_clone.c:451
#14 0xffffffff80cc7b3c in vnet_sysuninit () at /usr/src/sys/net/vnet.c:597
#15 vnet_destroy (vnet=0xfffff80013c7dd00) at /usr/src/sys/net/vnet.c:284
#16 0xffffffff80b63480 in prison_deref (pr=0xffffffff81b0b3c0 <prison0>, flags=19) at /usr/src/sys/kern/kern_jail.c:2634
#17 0xffffffff80b64d04 in sys_jail_remove (td=<optimized out>, uap=<optimized out>) at /usr/src/sys/kern/kern_jail.c:2257
#18 0xffffffff81074429 in syscallenter (td=<optimized out>) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:135
#19 amd64_syscall (td=0xfffff80117757000, traced=0) at /usr/src/sys/amd64/amd64/trap.c:1076
#20 <signal handler called>
#21 0x000000080030f0aa in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffffffea28
Comment 1 Kristof Provost freebsd_committer freebsd_triage 2018-11-29 08:26:49 UTC
Can you describe your setup? It looks like you might have a vlan interface involved somewhere, but knowing how it's all set up will likely make reproducing this easier.
Comment 2 Jordan Boland 2018-11-29 08:37:57 UTC
Yes, I can go into some more detail about the networking.

On the host system, only igb1 is currently active.  It is a trunk interface, so on the host I have igb1.1 configured.

I am utilizing Devin Teske's jng to create a netgraph bridge to igb1, exposing the trunked interface to the jail.  This jail only needs 1 VLAN, but I anticipate adding others later that will utilize more, and this seemed more elegant than cloning the VLAN interfaces individually.

In the jail, I replicate the same setup as the host to access the tagged interface.

host rc.conf:
=============================
ifconfig_igb1="up"
vlans_igb1="1"
ifconfig_igb1_1="192.168.1.2 netmask 255.255.255.0"
=============================


jail rc.conf:
=============================
vlans_ng0_unifi="1"
ifconfig_ng0_unifi="up"
ifconfig_ng0_unifi_1="inet 192.168.1.3 netmask 255.255.255.0"
=============================


For this particular jail the configuration is overkill (I suppose I could just bridge to igb1.1 on the host).  But I am proving out this strategy for some of the other services that I will need to host on this machine later.  Otherwise, I will have to go back to the strategy of mangling multiple FIBs.
Comment 3 Bjoern A. Zeeb freebsd_committer freebsd_triage 2018-11-29 11:33:29 UTC
I can probably have a quick look this evening unless anyone beats me to it.
Looking at the backtrace I have a suspicion of what's going on.
Comment 4 Jordan Boland 2018-12-04 09:03:01 UTC
Hi Bjoern & Kristof,

I wanted to provide some additional information that I have gathered today.  I have upgraded this system to 12.0-RC3 and can confirm the issue remains.  I also believe it is related to the netgraph module.  I have converted a jail to use jib with the if_bridge/if_epair modules.  I can now stop the jail without experiencing the kernel panic.

I hope this helps confirm or direct your suspicions.  Let me know if I can do any additional tests.  I will look into it further when I can, but it may be some time before I can do so, and it is entirely possible that the networking code exceeds my skill in C.

Best,

Jordan
Comment 5 Bjoern A. Zeeb freebsd_committer freebsd_triage 2019-01-15 23:20:05 UTC
Sorry, I got side-tracked while I was looking at this.  Releasing it to net@ in case someone else beats me to fixing it.
Comment 6 Eugene Grosbein freebsd_committer freebsd_triage 2019-01-16 09:11:45 UTC
This PR lacks many valuable technical details. If the problem is really in netgraph, please describe the nodes used, along with their hooks and settings, because a mere mention of some "jng" (whatever it is) is not enough. A plain ng0 is a point-to-point interface and a vlan is not. You should also supply any additional details that may be relevant, and do not forget to show the output of "ngctl list" while the jail is up and running, before it panics the kernel at shutdown.
Comment 7 Arne Steinkamm 2019-09-03 23:37:09 UTC
I can reproduce this panic with 12.0-RELEASE-p7 r350232 and a pretty straightforward, out-of-the-handbook jail setup.

This should be enough to get a nasty panic:

Host:
/etc/rc.conf:
[...]
vlans_em0="vlcx0"
create_args_vlcx0="vlan 18"
ifconfig_em0="up"
ifconfig_vlcx0="inet 10.8.8.110 netmask 255.255.255.0"
[...]
jail_enable="YES"
jail_confwarn="YES"
jail_parallel_start="NO"
jail_list="jv"
jail_reverse_stop="YES"

---------------------------------------------------------------------

/etc/jail.conf:
exec.start = "/bin/sh /etc/rc";
exec.stop = "/bin/sh /etc/rc.shutdown";
exec.clean;

jv {
        host.hostname = "julesverne.stk.cx";
        path = "/var/local/prison/jv";
        exec.clean;
        exec.system_user = "root";
        exec.jail_user = "root";
        vnet;
        exec.clean;
        vnet.interface = "ng0_jv";
        exec.system_user = "root";
        exec.jail_user = "root";
        exec.prestart += "/l/om/sbin/jng bridge jv em0";
        exec.poststop += "/l/om/sbin/jng shutdown jv";

        # Standard stuff
        exec.consolelog = "/var/local/log/jails/jv_console.log";
        mount.devfs;          #mount devfs
        allow.raw_sockets;    #allow ping-pong
        devfs_ruleset="5";    #devfs ruleset for this jail
        mount.devfs;
}

--------------------------------------------------------------------

Jail /etc/rc.conf:
[...]
ifconfig_ng0_jv="up"
vlans_ng0_jv="jjvcx0"
create_args_jjvcx0="vlan 18"
ifconfig_jjvcx0="inet 10.8.8.190 netmask 255.255.255.0"
[...]


/l/om/sbin/jng is a copy of /usr/src/share/examples/jails/jng

This should be everything you need to get exactly the panic described in this bug report.
Comment 8 xsan 2019-11-24 15:42:32 UTC
I have the same problem, and there is a very easy way to show it.
I use `qjail` tool to manage jails.

# first create jail, and use vnet for jail.
qjail create -4 192.168.1.101 testjail
qjail config -w em0 -v none testjail

# repeat the following commands; a page fault will happen on the stop command, and the system reboots.
qjail start testjail
qjail stop testjail

System: FreeBSD 12.1-RELEASE amd64

Logs:

Nov 24 21:44:09 FingerAge kernel: epair3a: link state changed to DOWN
Nov 24 21:44:09 FingerAge kernel: epair3b: link state changed to DOWN
Nov 24 21:44:52 FingerAge syslogd: kernel boot file is /boot/kernel/kernel
Nov 24 21:44:52 FingerAge kernel:
Nov 24 21:44:52 FingerAge syslogd: last message repeated 1 times
Nov 24 21:44:52 FingerAge kernel: Fatal trap 12: page fault while in kernel mode
Nov 24 21:44:52 FingerAge kernel: cpuid = 7; apic id = 07
Nov 24 21:44:52 FingerAge kernel: fault virtual address = 0x410
Nov 24 21:44:52 FingerAge kernel: fault code            = supervisor read data, page not present
Nov 24 21:44:52 FingerAge kernel: instruction pointer   = 0x20:0xffffffff80baff2d
Nov 24 21:44:52 FingerAge kernel: stack pointer         = 0x28:0xfffffe00403c3940
Nov 24 21:44:52 FingerAge kernel: frame pointer         = 0x28:0xfffffe00403c39c0
Nov 24 21:44:52 FingerAge kernel: code segment          = base rx0, limit 0xfffff, type 0x1b
Nov 24 21:44:52 FingerAge kernel:                       = DPL 0, pres 1, long 1, def32 0, gran 1
Nov 24 21:44:52 FingerAge kernel: processor eflags      = interrupt enabled, resume, IOPL = 0
Nov 24 21:44:52 FingerAge kernel: current process               = 0 (thread taskq)
Nov 24 21:44:52 FingerAge kernel: trap number           = 12
Nov 24 21:44:52 FingerAge kernel: panic: page fault
Nov 24 21:44:52 FingerAge kernel: cpuid = 7
Nov 24 21:44:52 FingerAge kernel: time = 1574603049
Nov 24 21:44:52 FingerAge kernel: KDB: stack backtrace:
Nov 24 21:44:52 FingerAge kernel: #0 0xffffffff80c1d297 at kdb_backtrace+0x67
Nov 24 21:44:52 FingerAge kernel: #1 0xffffffff80bd05cd at vpanic+0x19d
Nov 24 21:44:52 FingerAge kernel: #2 0xffffffff80bd0423 at panic+0x43
Nov 24 21:44:52 FingerAge kernel: #3 0xffffffff810a7dcc at trap_fatal+0x39c
Nov 24 21:44:52 FingerAge kernel: #4 0xffffffff810a7e19 at trap_pfault+0x49
Nov 24 21:44:52 FingerAge kernel: #5 0xffffffff810a740f at trap+0x29f
Nov 24 21:44:52 FingerAge kernel: #6 0xffffffff81081a0c at calltrap+0x8
Nov 24 21:44:52 FingerAge kernel: #7 0xffffffff80ccd5e1 at if_detach_internal+0x261
Nov 24 21:44:52 FingerAge kernel: #8 0xffffffff80cd490c at if_vmove+0x3c
Nov 24 21:44:52 FingerAge kernel: #9 0xffffffff80cd48b8 at vnet_if_return+0x48
Nov 24 21:44:52 FingerAge kernel: #10 0xffffffff80cfe2b4 at vnet_destroy+0x124
Nov 24 21:44:52 FingerAge kernel: #11 0xffffffff80b98870 at prison_deref+0x2a0
Nov 24 21:44:52 FingerAge kernel: #12 0xffffffff80c2fa74 at taskqueue_run_locked+0x154
Nov 24 21:44:52 FingerAge kernel: #13 0xffffffff80c30da8 at taskqueue_thread_loop+0x98
Nov 24 21:44:52 FingerAge kernel: #14 0xffffffff80b90c23 at fork_exit+0x83
Nov 24 21:44:52 FingerAge kernel: #15 0xffffffff81082a4e at fork_trampoline+0xe
Comment 9 SATO 'paina' Taisuke 2020-03-11 01:43:06 UTC
Hi,

I've encountered the same problem on 12.1R and found a workaround.

The log shown below is how to reproduce the problem using qjail(8).
It is as easy as xsan described: stop a VIMAGE jail with qjail.

root@pcv01:~ # uname -a
FreeBSD pcv01.sagamihara.i.paina.net 12.1-RELEASE-p2 FreeBSD 12.1-RELEASE-p2 GENERIC  amd64
root@pcv01:~ # pkg info | grep ^qjail
qjail-5.4                      Utility to quickly deploy and manage jails
root@pcv01:~ # qjail create -4 10.8.0.128 test001
Successfully created  test001
root@pcv01:~ # qjail config -w vmx0 -v none test001
Successfully enabled vnet.interface for test001
Successfully enabled vnet for test001
root@pcv01:~ # qjail start test001
Jail successfully started  test001
root@pcv01:~ # qjail stop test001
(crash!)

I'm using 12.1R on VMware ESXi and the virtual NIC is vmx(4).

I've found that the system crashes when qjail tries to destroy the epairNa interface,
so I've added 'sleep 1' to qjail before the destruction, like this:

*** /usr/local/bin/qjail.ORG    Wed Mar  4 20:13:14 2020
--- /usr/local/bin/qjail        Wed Mar 11 01:16:33 2020
***************
*** 2350,2355 ****
--- 2350,2356 ----
          # Disable vnet jails network configuration.
          #
          vnetid=`echo -n "${vnet}" | awk -F "|" '{print $2}'`                                              
+         sleep 1 # XXX: workaround
          ifconfig epair"${vnetid}"a destroy

          # If host has no more vnet jails then disable bridge.

It seems to happen when epairNa is destroyed just after the jailed processes are killed (stopped).
Therefore, it can be reproduced by commands like:
# jail -q -f /usr/local/etc/qjail.config/test001 -r test001; ifconfig epair1a destroy

I'm not well versed in FreeBSD development, so I can't solve the problem alone.
I hope somebody will fix it in a future release.

Thanks!
Comment 10 Kristof Provost freebsd_committer freebsd_triage 2020-12-15 22:36:33 UTC
I've had a very quick look at this. Happily it's trivial to reproduce: kldload ng_ether and then run the /usr/tests/sys/net/if_vlan test.

I think the problem is that during vnet teardown we run SI_SUB_NETGRAPH first, which calls vnet_netgraph_uninit(), where we free all nodes. Only then do we run SI_SUB_INIT_IF, which does vnet_vlan_uninit(), through which we reach ether_ifdetach() and ng_ether_detach().

ng_ether_detach() tries to remove the node from the ifp and free it, but before it does so it tries to set priv->ifp = NULL (priv being part of the node's private information), which is where we panic because priv is now 0xdeadc0dedeadc0de.
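
The sequence described in this comment can be modelled in user space. The following is only a sketch, not kernel code: the model_* types and functions are hypothetical stand-ins, and a "freed" flag replaces a real free() so that the model itself stays well defined. It shows why the interface stage ends up touching a node that the netgraph stage has already destroyed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for an ng_ether node and its parent ifnet. */
struct model_ifnet;

struct model_node {
	struct model_ifnet *ifp;	/* back-pointer, playing the role of priv->ifp */
	bool freed;			/* set instead of calling free() */
};

struct model_ifnet {
	struct model_node *node;	/* the ifnet's reference to its node */
};

/* Link a node and an interface to each other. */
void
model_attach(struct model_ifnet *ifp, struct model_node *node)
{
	assert(ifp != NULL && node != NULL);
	node->ifp = ifp;
	node->freed = false;
	ifp->node = node;
}

/* Stage 1 (netgraph teardown): the node is destroyed, the ifnet is not told. */
void
model_netgraph_uninit(struct model_node *node)
{
	assert(node != NULL);
	node->freed = true;
}

/*
 * Stage 2 (interface teardown): detach the node from the interface.
 * Returns true when the detach would have written to a destroyed node,
 * which is the point where the real kernel faults.
 */
bool
model_ifnet_detach(struct model_ifnet *ifp)
{
	assert(ifp != NULL);
	struct model_node *node = ifp->node;

	if (node == NULL)
		return (false);		/* nothing attached, nothing to do */
	if (node->freed)
		return (true);		/* stale pointer: use-after-free */
	node->ifp = NULL;
	ifp->node = NULL;
	node->freed = true;
	return (false);
}
```

Calling model_attach(), then model_netgraph_uninit(), then model_ifnet_detach() reports the stale access; skipping the netgraph stage lets the detach complete normally.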
Comment 11 Mark Johnston freebsd_committer freebsd_triage 2020-12-17 19:05:41 UTC
(In reply to Kristof Provost from comment #10)
So vnet_netgraph_uninit() sends a shutdown message to the nodes.  However, ng_ether_shutdown() persists the node, so vnet_netgraph_uninit() tries again to remove it, and succeeds.  ng_ether_shutdown() apparently thinks that the ifnet might have been freed and so doesn't clear itself.  I think it is safe to do so if priv->ifp != NULL, though, and making that change fixes the use-after-free for me.
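
The idea behind this change can be sketched in user space as follows. This is a hedged illustration rather than the actual patch: sketch_priv, sketch_ifnet, sketch_shutdown() and sketch_detach() are hypothetical names, and the real code in ng_ether.c differs in detail. The point is only that the shutdown path clears the interface's reference when priv->ifp is still set, so a later detach finds nothing stale.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the node private data and the parent ifnet. */
struct sketch_ifnet;

struct sketch_priv {
	struct sketch_ifnet *ifp;	/* NULL once the ifnet has detached */
};

struct sketch_ifnet {
	struct sketch_priv *ng_priv;	/* NULL once the node has shut down */
};

/*
 * Shutdown path: the node is going away. If the interface still exists
 * (priv->ifp != NULL), drop its reference to the node first.
 */
void
sketch_shutdown(struct sketch_priv *priv)
{
	assert(priv != NULL);
	if (priv->ifp != NULL) {
		priv->ifp->ng_priv = NULL;
		priv->ifp = NULL;
	}
}

/*
 * Detach path: the interface is going away. Returns true if a node was
 * still attached and had to be cleared, false if shutdown already ran.
 */
bool
sketch_detach(struct sketch_ifnet *ifp)
{
	assert(ifp != NULL);
	struct sketch_priv *priv = ifp->ng_priv;

	if (priv == NULL)
		return (false);
	priv->ifp = NULL;	/* shutdown will now see priv->ifp == NULL */
	ifp->ng_priv = NULL;
	return (true);
}
```

With this guard either order is safe in the model: shutdown followed by detach leaves the detach with nothing to do, and detach followed by shutdown leaves priv->ifp already NULL.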
Comment 12 commit-hook freebsd_committer freebsd_triage 2020-12-23 05:13:30 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=cd698c51790e956fed0975f451d3dfc361dc7c24

commit cd698c51790e956fed0975f451d3dfc361dc7c24
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2020-12-23 05:11:16 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2020-12-23 05:12:16 +0000

    netgraph: Fix ng_ether's shutdown handing

    When tearing down a VNET, netgraph sends shutdown messages to all of the
    nodes before detaching interfaces (SI_SUB_NETGRAPH comes before
    SI_SUB_INIT_IF in teardown order).  ng_ether nodes handle this by
    destroying themselves without detaching from the parent ifnet.  Then,
    when ifnets go away they detach their ng_ether nodes again, triggering a
    use-after-free.

    Handle this by modifying ng_ether_shutdown() to detach from the ifnet.
    If the shutdown was triggered by an ifnet being destroyed, we will clear
    priv->ifp in the ng_ether detach callback, so priv->ifp may be NULL.

    Also get rid of the printf in vnet_netgraph_uninit().  It can be
    triggered trivially by ng_ether since ng_ether_shutdown() persists the
    node unless NG_REALLY_DIE is set.

    PR:             233622
    Reviewed by:    afedorov, kp, Lutz Donnerhacke
    MFC after:      2 weeks
    Sponsored by:   The FreeBSD Foundation
    Differential Revision:  https://reviews.freebsd.org/D27662

 sys/netgraph/ng_base.c  |  4 +---
 sys/netgraph/ng_ether.c | 13 ++++++-------
 2 files changed, 7 insertions(+), 10 deletions(-)
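
The teardown ordering mentioned in the commit message can be illustrated with a small user-space sketch. It assumes that teardown walks the subsystems in the reverse of their start-up order; the struct, the function names and the rank values passed in by the caller are hypothetical and chosen only so that the netgraph stage ranks later at start-up than the interface stage.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record for one teardown stage; not a kernel structure. */
struct stage {
	const char *name;
	unsigned int order;	/* illustrative start-up rank; larger starts later */
};

/* qsort comparator: descending rank, so the last stage started goes first. */
static int
stage_cmp_teardown(const void *a, const void *b)
{
	const struct stage *sa = a;
	const struct stage *sb = b;

	if (sa->order > sb->order)
		return (-1);
	if (sa->order < sb->order)
		return (1);
	return (0);
}

/* Reorder the array into teardown order and return the first stage's name. */
const char *
first_torn_down(struct stage *stages, size_t n)
{
	assert(stages != NULL && n > 0);
	qsort(stages, n, sizeof(stages[0]), stage_cmp_teardown);
	return (stages[0].name);
}
```

Given two entries, one for SI_SUB_INIT_IF with the lower rank and one for SI_SUB_NETGRAPH with the higher rank, first_torn_down() places the netgraph entry first, matching the order the commit message describes.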
Comment 13 Marek Zarychta 2020-12-24 21:20:22 UTC
(In reply to commit-hook from comment #12)
With this patch applied on 12.2-STABLE, VNET jails with VLAN interfaces and NETGRAPH are usable. That's a really great Christmas gift! Thank you.
Comment 14 commit-hook freebsd_committer freebsd_triage 2021-01-06 14:58:55 UTC
A commit in branch stable/12 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=62489e19ebe287034833d07aeb74ca0f0599be6e

commit 62489e19ebe287034833d07aeb74ca0f0599be6e
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2020-12-23 05:11:16 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2021-01-06 14:41:53 +0000

    netgraph: Fix ng_ether's shutdown handing

    When tearing down a VNET, netgraph sends shutdown messages to all of the
    nodes before detaching interfaces (SI_SUB_NETGRAPH comes before
    SI_SUB_INIT_IF in teardown order).  ng_ether nodes handle this by
    destroying themselves without detaching from the parent ifnet.  Then,
    when ifnets go away they detach their ng_ether nodes again, triggering a
    use-after-free.

    Handle this by modifying ng_ether_shutdown() to detach from the ifnet.
    If the shutdown was triggered by an ifnet being destroyed, we will clear
    priv->ifp in the ng_ether detach callback, so priv->ifp may be NULL.

    Also get rid of the printf in vnet_netgraph_uninit().  It can be
    triggered trivially by ng_ether since ng_ether_shutdown() persists the
    node unless NG_REALLY_DIE is set.

    PR:             233622
    Reviewed by:    afedorov, kp, Lutz Donnerhacke

    (cherry picked from commit cd698c51790e956fed0975f451d3dfc361dc7c24)

 sys/netgraph/ng_base.c  |  4 +---
 sys/netgraph/ng_ether.c | 13 ++++++-------
 2 files changed, 7 insertions(+), 10 deletions(-)