Bug 228501 - [VIMAGE JAIL] panic: negative refcount 0xfffff8002717643c (when stopping jail)
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: CURRENT
Hardware: Any Any
Importance: --- Affects Some People
Assignee: freebsd-net mailing list
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-05-26 12:02 UTC by Marie Helene Kvello-Aune
Modified: 2018-05-29 12:10 UTC

See Also:


Attachments
panic_backtrace.txt (1.08 KB, text/plain)
2018-05-26 12:02 UTC, Marie Helene Kvello-Aune
panic_backtrace_ipv6.txt (995 bytes, text/plain)
2018-05-26 13:20 UTC, Marie Helene Kvello-Aune

Description Marie Helene Kvello-Aune 2018-05-26 12:02:40 UTC
Created attachment 193709
panic_backtrace.txt

I have a single jail running on my devbox, and when I stop the jail (either manually or through reboot), the kernel panics with the exact same message and stack trace every time.
Comment 1 Marie Helene Kvello-Aune 2018-05-26 12:04:21 UTC
Additional info:

# uname -a 
FreeBSD venus 12.0-CURRENT FreeBSD 12.0-CURRENT #1 r334213: Sat May 26 13:02:30 CEST 2018 root@venus:/usr/obj/usr/src/amd64.amd64/sys/GENERIC  amd64

# cat /etc/jail.conf
exec.start              = "/bin/sh /etc/rc";
exec.stop               = "/bin/sh /etc/rc.shutdown";
exec.clean;
mount.devfs;
mount.fstab             = "/etc/fstab.$name";
path                    = "/usr/jails/$name";
host.hostname           = "$name";

devsamba {
  vnet;
  vnet.interface            = "epair0b";
}


# kldstat
Id Refs Address            Size     Name
 1   39 0xffffffff80200000  23fb800 kernel
 2    2 0xffffffff825fd000     a838 opensolaris.ko
 3    1 0xffffffff82608000   4345f8 zfs.ko
 4    1 0xffffffff82a3d000     2d58 coretemp.ko
 5    1 0xffffffff82a40000     cde0 aesni.ko
 6    1 0xffffffff82a4d000   579f88 vmm.ko
 7    1 0xffffffff82fc7000     7d10 filemon.ko
 8    1 0xffffffff82fcf000    10130 if_bridge.ko
 9    2 0xffffffff82fe0000     7978 bridgestp.ko
10    1 0xffffffff83964000     19a8 fdescfs.ko
11    1 0xffffffff83966000     1ec9 if_epair.ko
12    1 0xffffffff83968000     2388 ums.ko
13    1 0xffffffff8396b000     1780 uhid.ko
14    1 0xffffffff8396d000     26d0 nullfs.ko
Comment 2 Marie Helene Kvello-Aune 2018-05-26 12:13:36 UTC
Update: Just confirmed the panic does not happen on non-VNET jails.
Comment 3 Bjoern A. Zeeb freebsd_committer 2018-05-26 12:34:37 UTC
Can you try to go back to r334117 or just before and see if it happens there as well?
Comment 4 Marie Helene Kvello-Aune 2018-05-26 13:20:08 UTC
Update:

If I manually remove all IPv4 addresses so that only IPv6 addresses remain (or even if I also remove ::1 but leave fe80::1%lo0 in place) before stopping the jail, I get a kernel panic similar to the previous one (see new attachment: panic_backtrace_ipv6.txt).

I managed to crash the system spectacularly by removing all IP addresses before stopping the jail:
# jexec devsamba ifconfig lo0 inet6 ::1 -alias
# jexec devsamba ifconfig lo0 inet6 fe80::1%lo0 -alias
# jexec devsamba ifconfig lo0 -alias
# service jail stop devsamba

Fatal trap 9: general protection fault while in kernel mode
cpuid = 7; apic id = 07
instruction pointer     = 0x20:0xffffffff80ca2032
stack pointer           = 0x0:0xfffffe0077b96770
frame pointer           = 0x0:0xfffffe0077b96840
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (thread taskq)

## This was printed in console, but not part of dump:
[ thread pid 0 tid 100015 ]
Stopped at rt_foreach_fib_walk_del+0x1c2: call *%eax

I have the default FIB configuration (one fib, which is fib 0).

=====

Summary: each action below was taken inside the jail, followed by stopping the jail:

* Manually remove all IPv4 addresses (but leave the IPv6 addresses): panic (negative refcount)
* Manually remove all IP addresses: panic (general protection fault)
* Just stop the jail: panic (negative refcount)
Comment 5 Marie Helene Kvello-Aune 2018-05-26 13:20:46 UTC
Created attachment 193711
panic_backtrace_ipv6.txt
Comment 6 Matthew Macy 2018-05-26 17:31:23 UTC
This may be fixed in r334222. Please update.
Comment 7 Marie Helene Kvello-Aune 2018-05-26 19:01:47 UTC
Apparently, this was partially fixed in base r334222.

I tried r334237, and the problem is now slightly harder to reproduce. I suspect it now requires two interfaces to have an IP address (previously, it was enough that lo0 had its default configuration of IPv4 and IPv6 addresses).

I'll look into this further when I get time, but to reproduce: create any vnet jail, assign an epairXb device to it, have the jail's rc.conf configure an IPv4 address on it, start the jail, and then stop the jail.
Comment 8 Marie Helene Kvello-Aune 2018-05-27 13:28:21 UTC
Confirmed: there's no panic when repeatedly starting and stopping the jail without an IP address assigned to epair0b.

If I assign an IP address to epair0b (I did so through the jail's rc.conf), start the jail, and then stop it, there's an immediate panic: negative refcount.

During this test, no network daemons were enabled, sendmail was explicitly disabled, and syslogd was passed the "-ss" flags upon startup.

== Panic summary ==
panic: negative refcount 0xfffff8001d12803c
cpuid = 0
time = 1527428596
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0077b96660
vpanic() at vpanic+0x1a3/frame 0xfffffe0077b966c0
doadump() at doadump/frame 0xfffffe0077b96740
ifa_free() at ifa_free+0x35/frame 0xfffffe0077b96760
in_difaddr_ioctl() at in_difaddr_ioctl+0x460/frame 0xfffffe0077b967c0
in_ifscrub_all() at in_ifscrub_all+0xff/frame 0xfffffe0077b96850
ip_destroy() at ip_destroy+0xbd/frame 0xfffffe0077b96870
vnet_destroy() at vnet_destroy+0x12c/frame 0xfffffe0077b968a0
prison_deref() at prison_deref+0x29d/frame 0xfffffe0077b968e0
taskqueue_run_locked() at taskqueue_run_locked+0x14c/frame 0xfffffe0077b96940
taskqueue_thread_loop() at taskqueue_thread_loop+0x88/frame 0xfffffe0077b96970
fork_exit() at fork_exit+0x84/frame 0xfffffe0077b969b0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0077b969b0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic


Additional info about jail:
# jexec devsamba sockstat -l
USER     COMMAND    PID   FD PROTO  LOCAL ADDRESS         FOREIGN ADDRESS
root     casperd    28345 7  stream /var/run/casper
root     syslogd    26069 5  dgram  /var/run/log
root     syslogd    26069 6  dgram  /var/run/logpriv
Comment 9 Matthew Macy 2018-05-28 07:38:10 UTC
Can you tell me what line in in_difaddr_ioctl that corresponds to? There are 3 calls to ifa_free.
Comment 10 Marie Helene Kvello-Aune 2018-05-28 10:01:02 UTC
Does this shed some light on the issue?

----
#0  doadump (textdump=1) at pcpu.h:231
231     pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) #0  doadump (textdump=1) at pcpu.h:231
#1  0xffffffff80b747d2 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:446
#2  0xffffffff80b74db3 in vpanic (fmt=<value optimized out>,
    ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:863
#3  0xffffffff80b74b20 in kassert_panic (fmt=<value optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:749

#4  0xffffffff80c74505 in ifa_free (ifa=0xfffff80022f8c400) at refcount.h:65
#5  0xffffffff80cf9420 in in_difaddr_ioctl (cmd=2149607705,
    data=<value optimized out>, ifp=0xfffff80034428000,
    td=<value optimized out>) at /usr/src/sys/netinet/in.c:648

#6  0xffffffff80cf9c1f in in_ifscrub_all () at /usr/src/sys/netinet/in.c:250
#7  0xffffffff80d0bf4d in ip_destroy (unused=<value optimized out>)
    at /usr/src/sys/netinet/ip_input.c:398
#8  0xffffffff80ca74ec in vnet_destroy (vnet=0xfffff8000408ed40)
    at /usr/src/sys/net/vnet.c:597
#9  0xffffffff80b3b6ad in prison_deref (pr=0xffffffff81b0fae0, flags=20)
    at /usr/src/sys/kern/kern_jail.c:2630
#10 0xffffffff80bd027c in taskqueue_run_locked (queue=0xfffff8000461c000)
    at /usr/src/sys/kern/subr_taskqueue.c:465
#11 0xffffffff80bd1048 in taskqueue_thread_loop (arg=<value optimized out>)
    at /usr/src/sys/kern/subr_taskqueue.c:757
#12 0xffffffff80b33ea4 in fork_exit (
    callout=0xffffffff80bd0fc0 <taskqueue_thread_loop>,
    arg=0xffffffff82024dd0, frame=0xfffffe0077b969c0)
    at /usr/src/sys/kern/kern_fork.c:1039
#13 0xffffffff8101f99e in fork_trampoline ()
    at /usr/src/sys/amd64/amd64/exception.S:971
#14 0x0000000000000000 in ?? ()
Current language:  auto; currently minimal
(kgdb)
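
For readers following frames #3 and #4 above (kassert_panic called from ifa_free at refcount.h:65), the following is a minimal userspace sketch of how a "negative refcount" assertion can trip. It is not the kernel's code: `struct fake_ifa`, `fake_ifa_init`, `fake_ifa_ref`, `fake_ifa_free`, and `enum release_result` are hypothetical names invented for illustration, and the sketch reports the underflow instead of panicking.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a reference-counted interface address. */
struct fake_ifa {
    atomic_uint refcount;
    bool freed;
};

/* Outcome of dropping one reference. */
enum release_result { REL_STILL_HELD, REL_LAST_REF, REL_UNDERFLOW };

/* A new object starts with one reference, held by its creator. */
static void fake_ifa_init(struct fake_ifa *ifa)
{
    assert(ifa != NULL);
    atomic_init(&ifa->refcount, 1);
    ifa->freed = false;
}

/* Take an additional reference. */
static void fake_ifa_ref(struct fake_ifa *ifa)
{
    assert(ifa != NULL);
    atomic_fetch_add(&ifa->refcount, 1);
}

/* Drop one reference and report what happened. */
static enum release_result fake_ifa_free(struct fake_ifa *ifa)
{
    assert(ifa != NULL);
    unsigned old = atomic_fetch_sub(&ifa->refcount, 1);

    if (old == 0) {
        /* One release too many: this is the condition the kernel
         * assertion turns into a panic. Undo the decrement so the
         * unsigned counter does not stay wrapped around. */
        atomic_fetch_add(&ifa->refcount, 1);
        fprintf(stderr, "negative refcount %p\n", (void *)&ifa->refcount);
        return REL_UNDERFLOW;
    }
    if (old == 1) {
        /* Last reference dropped: the object would be freed here. */
        ifa->freed = true;
        return REL_LAST_REF;
    }
    return REL_STILL_HELD;
}
```

With one reference held, a first call to `fake_ifa_free` returns `REL_LAST_REF` and a second returns `REL_UNDERFLOW`, which is the general shape of a teardown path releasing an address more often than it was referenced. The backtrace alone does not show which caller dropped the extra reference.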
Comment 11 Marie Helene Kvello-Aune 2018-05-29 12:10:09 UTC
None of the panics described in this PR occur any more (tested on base r334319); they were probably fixed in base r334311 through base r334314.