Summary: | [VIMAGE JAIL] panic: negative refcount 0xfffff8002717643c (when stopping jail) | |
---|---|---|---
Product: | Base System | Reporter: | Marie Helene Kvello-Aune <freebsd>
Component: | kern | Assignee: | freebsd-net (Nobody) <net>
Status: | Closed FIXED | |
Severity: | Affects Some People | CC: | bz, mmacy
Priority: | --- | |
Version: | CURRENT | |
Hardware: | Any | |
OS: | Any | |
Attachments: |
|
Additional info:

    # uname -a
    FreeBSD venus 12.0-CURRENT FreeBSD 12.0-CURRENT #1 r334213: Sat May 26 13:02:30 CEST 2018 root@venus:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64

    # cat /etc/jail.conf
    exec.start = "/bin/sh /etc/rc";
    exec.stop = "/bin/sh /etc/rc.shutdown";
    exec.clean;
    mount.devfs;
    mount.fstab = "/etc/fstab.$name";
    path = "/usr/jails/$name";
    host.hostname = "$name";

    devsamba {
        vnet;
        vnet.interface = "epair0b";
    }

    # kldstat
    Id Refs Address            Size     Name
     1   39 0xffffffff80200000 23fb800  kernel
     2    2 0xffffffff825fd000 a838     opensolaris.ko
     3    1 0xffffffff82608000 4345f8   zfs.ko
     4    1 0xffffffff82a3d000 2d58     coretemp.ko
     5    1 0xffffffff82a40000 cde0     aesni.ko
     6    1 0xffffffff82a4d000 579f88   vmm.ko
     7    1 0xffffffff82fc7000 7d10     filemon.ko
     8    1 0xffffffff82fcf000 10130    if_bridge.ko
     9    2 0xffffffff82fe0000 7978     bridgestp.ko
    10    1 0xffffffff83964000 19a8     fdescfs.ko
    11    1 0xffffffff83966000 1ec9     if_epair.ko
    12    1 0xffffffff83968000 2388     ums.ko
    13    1 0xffffffff8396b000 1780     uhid.ko
    14    1 0xffffffff8396d000 26d0     nullfs.ko

Update: Just confirmed the panic does not happen on non-VNET jails.

Can you try to go back to r334117 or just before and see if it happens there as well?

Update: If I manually remove all IPv4 addresses so that only IPv6 addresses remain (or even if I remove ::1 too, but leave fe80::1%lo0 there) before stopping the jail, I get a similar kernel panic as previously (see new attachment: panic_backtrace_ipv6.txt).

I managed to crash the system spectacularly by removing all IP addresses before stopping the jail:

    # jexec devsamba ifconfig lo0 inet6 ::1 -alias
    # jexec devsamba ifconfig lo0 inet6 fe80::1%lo0 -alias
    # jexec devsamba ifconfig lo0 -alias
    # service jail stop devsamba

    Fatal trap 9: general protection fault while in kernel mode
    cpuid = 7; apic id = 07
    instruction pointer     = 0x20:0xffffffff80ca2032
    stack pointer           = 0x0:0xfffffe0077b96770
    frame pointer           = 0x0:0xfffffe0077b96840
    code segment            = base rx0, limit 0xfffff, type 0x1b
                            = DPL 0, pres 1, long 1, def32 0, gran 1
    processor eflags        = interrupt enabled, resume, IOPL = 0
    current process         = 0 (thread taskq)

    ## This was printed in console, but not part of dump:
    [ thread pid 0 tid 100015 ]
    Stopped at rt_foreach_fib_walk_del+0x1c2: call *%eax

I have the default FIB configuration (one fib, which is fib 0).

=====

Summary; all actions taken inside the jail, followed by stopping the jail:

* Manually remove all IPv4 addresses (but leaving IPv6 addresses): panic (negative ref count)
* Manually remove all IP addresses: panic (general protection fault)
* Just stopping the jail: panic (negative ref count)

Created attachment 193711 (panic_backtrace_ipv6.txt)

This may be fixed in r334222. Please update.

Apparently, this was partially fixed in base r334222. I tried with r334237 and the problem is slightly harder to reproduce. I suspect it now requires two interfaces to have an IP address (previously, it was enough that lo0 had the default configuration of IPv4+IPv6 addresses). I'll probe on this a bit more when I get time, but to reproduce: create any vnet jail, assign an epairXb device to it, have the jail's rc.conf configure an IPv4 address on it, start the jail and then stop the jail.

Confirmed: There's no panic when repeatedly starting and stopping the jail without an IP assigned to epair0b. If I assign an IP to epair0b (I did so through the jail's rc.conf), start the jail, then stop it, there's an immediate panic: negative refcount. During this test, no network daemons were enabled, sendmail was explicitly disabled, and syslogd was passed the "-ss" flags upon startup.

== Panic summary ==

    panic: negative refcount 0xfffff8001d12803c
    cpuid = 0
    time = 1527428596
    KDB: stack backtrace:
    db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0077b96660
    vpanic() at vpanic+0x1a3/frame 0xfffffe0077b966c0
    doadump() at doadump/frame 0xfffffe0077b96740
    ifa_free() at ifa_free+0x35/frame 0xfffffe0077b96760
    in_difaddr_ioctl() at in_difaddr_ioctl+0x460/frame 0xfffffe0077b967c0
    in_ifscrub_all() at in_ifscrub_all+0xff/frame 0xfffffe0077b96850
    ip_destroy() at ip_destroy+0xbd/frame 0xfffffe0077b96870
    vnet_destroy() at vnet_destroy+0x12c/frame 0xfffffe0077b968a0
    prison_deref() at prison_deref+0x29d/frame 0xfffffe0077b968e0
    taskqueue_run_locked() at taskqueue_run_locked+0x14c/frame 0xfffffe0077b96940
    taskqueue_thread_loop() at taskqueue_thread_loop+0x88/frame 0xfffffe0077b96970
    fork_exit() at fork_exit+0x84/frame 0xfffffe0077b969b0
    fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0077b969b0
    --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
    KDB: enter: panic

Additional info about jail:

    # jexec devsamba sockstat -l
    USER     COMMAND    PID   FD PROTO  LOCAL ADDRESS         FOREIGN ADDRESS
    root     casperd    28345 7  stream /var/run/casper
    root     syslogd    26069 5  dgram  /var/run/log
    root     syslogd    26069 6  dgram  /var/run/logpriv

Can you tell me what line in in_difaddr_ioctl that corresponds to? There are 3 calls to ifa_free.

Does this shed some light on the issue?

----

    #0  doadump (textdump=1) at pcpu.h:231
    231     pcpu.h: No such file or directory.
            in pcpu.h
    (kgdb) #0  doadump (textdump=1) at pcpu.h:231
    #1  0xffffffff80b747d2 in kern_reboot (howto=260)
        at /usr/src/sys/kern/kern_shutdown.c:446
    #2  0xffffffff80b74db3 in vpanic (fmt=<value optimized out>,
        ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:863
    #3  0xffffffff80b74b20 in kassert_panic (fmt=<value optimized out>)
        at /usr/src/sys/kern/kern_shutdown.c:749
    #4  0xffffffff80c74505 in ifa_free (ifa=0xfffff80022f8c400) at refcount.h:65
    #5  0xffffffff80cf9420 in in_difaddr_ioctl (cmd=2149607705,
        data=<value optimized out>, ifp=0xfffff80034428000,
        td=<value optimized out>) at /usr/src/sys/netinet/in.c:648
    #6  0xffffffff80cf9c1f in in_ifscrub_all () at /usr/src/sys/netinet/in.c:250
    #7  0xffffffff80d0bf4d in ip_destroy (unused=<value optimized out>)
        at /usr/src/sys/netinet/ip_input.c:398
    #8  0xffffffff80ca74ec in vnet_destroy (vnet=0xfffff8000408ed40)
        at /usr/src/sys/net/vnet.c:597
    #9  0xffffffff80b3b6ad in prison_deref (pr=0xffffffff81b0fae0, flags=20)
        at /usr/src/sys/kern/kern_jail.c:2630
    #10 0xffffffff80bd027c in taskqueue_run_locked (queue=0xfffff8000461c000)
        at /usr/src/sys/kern/subr_taskqueue.c:465
    #11 0xffffffff80bd1048 in taskqueue_thread_loop (arg=<value optimized out>)
        at /usr/src/sys/kern/subr_taskqueue.c:757
    #12 0xffffffff80b33ea4 in fork_exit (
        callout=0xffffffff80bd0fc0 <taskqueue_thread_loop>,
        arg=0xffffffff82024dd0, frame=0xfffffe0077b969c0)
        at /usr/src/sys/kern/kern_fork.c:1039
    #13 0xffffffff8101f99e in fork_trampoline ()
        at /usr/src/sys/amd64/amd64/exception.S:971
    #14 0x0000000000000000 in ?? ()
    Current language:  auto; currently minimal
    (kgdb)

None of the panics described in this PR occurs any more (tested on base r334319); probably fixed in base r334311 through base r334314. |
Created attachment 193709 (panic_backtrace.txt)

I have a single jail running on my devbox, and when I stop the jail (either manually or through reboot), the kernel panics with the exact same message and stack trace every time.
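
For readers unfamiliar with the message: the kgdb trace shows kassert_panic being reached from ifa_free at refcount.h:65, which suggests a reference was released on an object whose count had already reached zero. The following is only a userspace sketch of that idea, not the kernel code. `struct obj`, `obj_init`, `obj_ref`, `obj_release` and the `release_result` values are hypothetical names made up for illustration, and the assumption is that the kernel check amounts to "the count before the decrement must be greater than zero".

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a reference-counted kernel object (e.g. an ifaddr). */
struct obj {
	atomic_uint refcount;
	bool freed;
};

enum release_result {
	RELEASE_OK,        /* reference dropped, object still in use */
	RELEASE_LAST,      /* last reference dropped, object freed */
	RELEASE_UNDERFLOW  /* released with no references left */
};

/* A new object starts with one reference held by its creator. */
static void
obj_init(struct obj *o)
{
	atomic_init(&o->refcount, 1);
	o->freed = false;
}

/* Take an additional reference. */
static void
obj_ref(struct obj *o)
{
	atomic_fetch_add(&o->refcount, 1);
}

/*
 * Drop one reference. Where this model reports RELEASE_UNDERFLOW,
 * an assertion-enabled kernel would panic instead (assumption).
 */
static enum release_result
obj_release(struct obj *o)
{
	unsigned int old = atomic_fetch_sub(&o->refcount, 1);

	if (old == 0) {
		/* The counter wrapped around; report it and reset to zero. */
		fprintf(stderr, "negative refcount %p\n", (void *)&o->refcount);
		atomic_store(&o->refcount, 0);
		return (RELEASE_UNDERFLOW);
	}
	if (old == 1) {
		o->freed = true;
		return (RELEASE_LAST);
	}
	return (RELEASE_OK);
}
```

In this model, one `obj_ref` followed by two `obj_release` calls frees the object, and a third `obj_release` returns `RELEASE_UNDERFLOW`. That mirrors an unbalanced release on one of the code paths, which is why the question about which of the three ifa_free calls in in_difaddr_ioctl is involved matters.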