Jails are managed through the native jail tools with the attached jail.conf file (so-called "thin jails", with VNET). I can reliably crash the system by issuing "service jail restart". The crash happens while the jails are being stopped.

[root@beastie ~]# uname -a
FreeBSD beastie 12.0-RELEASE-p4 FreeBSD 12.0-RELEASE-p4 GENERIC amd64
[root@beastie ~]# service jail restart
Stopping jails: dns db ldap web nextcloud imap smtp test
packet_write_wait: Connection to xxx.xxx.xxx.xxx port 22: Broken pipe

As a workaround I can avoid the crash by inserting an
exec.poststop = "sleep 2";
statement in jail.conf. 1 second was not enough to avoid the crash.

Also worth noting: the issue does not appear if I totally disable networking, hence my guess that this is somehow VIMAGE/VNET related.

I've managed to obtain a core dump, but my attempts at debugging didn't go far as I'm not experienced with this. Of course I'm happy to do more debugging if this can help identify the issue. The system is not yet live, so I can pretty much try anything on it.

[root@beastie /var/crash]# kgdb /boot/kernel/kernel /var/crash/vmcore.0
GNU gdb (GDB) 8.2.1 [GDB v8.2.1 for FreeBSD]
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-portbld-freebsd12.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /boot/kernel/kernel...(no debugging symbols found)...done.
0xffffffff80bcd0bd in sched_switch ()
(kgdb) bt
#0 0xffffffff80bcd0bd in sched_switch ()
#1 0xffffffff80ba6de1 in mi_switch ()
#2 0xffffffff80bf554c in sleepq_wait ()
#3 0xffffffff80ba6817 in _sleep ()
#4 0xffffffff80bfae71 in taskqueue_thread_loop ()
#5 0xffffffff80b5bf33 in fork_exit ()
#6 <signal handler called>

My jail.conf file is below. Note that the crash happens regardless of whether the jails with special permissions (db, builder) are configured. I also tried tweaking the mount/umount order without success.

$release="12.0-RELEASE"; # Release used to create the jail if not already existing

host.hostname = "${name}";
path = "/jails/${name}";
exec.consolelog = "/var/log/jail.${name}.console.log";
vnet = "new";
vnet.interface = "epair${jailnum}b";

### Create the jail if not already existing ###
exec.prestart = "if [ ! -d "/jails/THINJAILS/${name}" ]; then zfs clone zroot/jails/TEMPLATES/skeleton-${release}@skeleton zroot/jails/THINJAILS/${name}; fi";
exec.prestart += "if [ ! -d "/jails/${name}" ]; then mkdir /jails/${name}; fi";

### Mount filesystems for shared base (RO) and individual jail skeleton (RW)
exec.prestart += "mount_nullfs -o ro /jails/TEMPLATES/base-${release} /jails/${name}";
exec.prestart += "mount_nullfs -o rw /jails/THINJAILS/${name}/etc /jails/${name}/etc";
exec.prestart += "mount_nullfs -o rw /jails/THINJAILS/${name}/home /jails/${name}/home";
exec.prestart += "mount_nullfs -o rw /jails/THINJAILS/${name}/root /jails/${name}/root";
exec.prestart += "mount_nullfs -o rw /jails/THINJAILS/${name}/tmp /jails/${name}/tmp";
exec.prestart += "mount_nullfs -o rw /jails/THINJAILS/${name}/var /jails/${name}/var";
exec.prestart += "mount_nullfs -o rw /jails/THINJAILS/${name}/usr/local /jails/${name}/usr/local";

### Mount /usr/ports from hosts
exec.prestart += "mount_nullfs -o ro /usr/ports /jails/${name}/usr/ports";

### Mount data filesystem if exists
exec.prestart += "if [ -d "/jails/DATA/${name}" ]; then mount_nullfs -o rw /jails/DATA/${name} /jails/${name}/data; fi";

### Mount devfs with the default ruleset (11) now that the jail filesystems are mounted
exec.prestart += "mount -t devfs -o ruleset=4 devfs /jails/${name}/dev";

### Create an ethernet pair and add to bridge0 (internal bridge)
exec.prestart += "ifconfig epair${jailnum} create up";
exec.prestart += "ifconfig epair${jailnum}a description '${name} - host interface'";
exec.prestart += "ifconfig epair${jailnum}b description '${name} - jail interface'";
exec.prestart += "ifconfig bridge0 addm epair${jailnum}a";

exec.start = "ifconfig epair${jailnum}b inet 192.168.1.${jailnum}/24";
exec.start += "ifconfig epair${jailnum}b inet6 xxx:xxx:xxx:xxx::${jailnum}:1/64";
exec.start += "route add default 192.168.1.1";
exec.start += "route -6 add default xxx:xxx:xxx:xxx::1:1";

### Proceed with boot through rc
exec.start += "/bin/sh /etc/rc";

### Shutdown the jail
exec.stop = "/bin/sh /etc/rc.shutdown";

### Give the jail a couple of seconds to shutdown (avoid issues unmounting filesystems)
#exec.poststop = "sleep 2";

### Destroy jail network interface
exec.poststop += "ifconfig epair${jailnum}a destroy";

### Unmount filesystems
exec.poststop += "if [ -d "/jails/DATA/${name}" ]; then umount /jails/${name}/data; fi";
exec.poststop += "umount -f /jails/${name}/dev";
exec.poststop += "umount -f /jails/${name}/usr/ports";
exec.poststop += "umount -f /jails/${name}/etc";
exec.poststop += "umount -f /jails/${name}/home";
exec.poststop += "umount -f /jails/${name}/root";
exec.poststop += "umount -f /jails/${name}/tmp";
exec.poststop += "umount -f /jails/${name}/var";
exec.poststop += "umount -f /jails/${name}/usr/local";
exec.poststop += "umount -f /jails/${name}";

dns {
    $jailnum=2;
};
db {
    $jailnum=3;
    allow.sysvipc;
};
ldap {
    $jailnum=4;
};
web {
    $jailnum=5;
};
nextcloud {
    $jailnum=6;
};
imap {
    $jailnum=7;
};
smtp {
    $jailnum=8;
};
test {
    $jailnum=98;
};
builder {
    $jailnum=99;
    enforce_statfs=0;
    allow.mount;
    allow.mount.nullfs;
    allow.mount.tmpfs;
    allow.mount.devfs;
    allow.chflags;
};
The issue had somehow disappeared from 12.0-RELEASE with one of the subsequent patches (I think around -p3 or -p4). It is unfortunately back after upgrading to 12.1-RELEASE. Adding the 2 second sleep back to jail.conf still works as a workaround, though.
(In reply to paul.le.gauret from comment #1)

If you have a core dump, check whether you have the text files as well; a panic string etc. would be helpful; the 12.0 output above was not. See crashinfo(8), which might run automatically on the boot following the crash (/etc/rc.d/savecore). If you are running a release, please make sure debug symbols are installed in /usr/lib/debug/boot/kernel.

https://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-gdb.html also has some info.
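As a rough illustration of how to pull the requested panic string out of a crash directory, here is a minimal shell sketch. It assumes the dumps were saved by savecore(8) under /var/crash and that the accompanying info.N files carry "Version String:" and "Panic String:" lines; the function name panic_summary is made up for this example.

```sh
#!/bin/sh
# Print the version and panic strings recorded in the info.N files
# that savecore(8) writes next to each vmcore.N.
# Assumption: the crash directory is /var/crash unless given as $1.
panic_summary() {
    crashdir="${1:-/var/crash}"
    for info in "$crashdir"/info.[0-9]*; do
        # Skip the unexpanded glob when no info files exist.
        [ -f "$info" ] || continue
        echo "== $info"
        grep -E '^[[:space:]]*(Version String|Panic String):' "$info" ||
            echo "(no version or panic string found)"
    done
}
```

Calling `panic_summary` with no argument reads /var/crash. If a core.txt.N file is missing, running crashinfo(8) by hand against the dump should generate one, provided the kernel debug symbols are installed.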
A couple of times now (since July, I think) I have seen a similar phenomenon on a very new Fujitsu server running 13-CURRENT (FreeBSD 13.0-CURRENT #25 r354673: Wed Nov 13 06:47:48 CET 2019 amd64); we manage the jails with FreeBSD's native onboard tools and configure them via /etc/jail.conf. Stopping the jails brings down the box every time, and so does a shutdown, which I guess triggers a clean stop of the jails. In most cases I can circumvent the crash by rebooting via "reboot". The box is a dual-socket NUMA system, equipped with only 1 CPU and with only one RAM bank populated with DIMMs. I'll append the dmesg output afterwards.

Due to a toolchain corruption on that system, compiling a debugging kernel isn't possible, so the only information I have so far is the panic string from two core dumps:

Version String: FreeBSD 13.0-CURRENT #15 r354144: Tue Oct 29 06:21:38 CET 2019
Panic String: page fault

and

Version String: FreeBSD 13.0-CURRENT #11 r353877: Tue Oct 22 11:02:32 CEST 2019
Panic String: m_getzone: invalid cluster size 0

The cores are too old to compare with the kernel currently running, and at the moment I do not dare trigger a crash, because the box is needed for several purposes and the crashes cause harsh corruption of the UFS/FFS SSD holding the OS. Maybe these issues with 12-STABLE and 13-CURRENT are linked; I regret not having a machine running 12-STABLE on the same CPU type right now.
I too am receiving a kernel panic with options similar to the reporter's. I've used a screen recorder to capture the panic. If anyone is interested in the video file, I'll post it somewhere. If not, here is my transcription of the video.

The panic text:
Freed UMA keg (rentry) was not empty (17 items). Lost 1 pages of memory.

The stack trace looks as follows:
#0 0xffffffff80c1d967 at kdb_backtrace+0x67
#1 0xffffffff80bd0dcd at vpanic+0x19d
#2 0xffffffff80bd0c23 at panic+0x43
#3 0xffffffff810aab6c at trap_fatal+0x39c
#4 0xffffffff810aabbf at trap_pfault+0x4f
#5 0xffffffff810aa1f1 at trap+0x2a1
#6 0xffffffff8108373c at calltrap+0x8
#7 0xffffffff80bcb470 at _rm_rlock_hard+0x3b9
#8 0xffffffff80cfb5fe at rtinit+0x2ee
#9 0xffffffff80d4d39c at in_scrubprefix+0x23c
#10 0xffffffff80d64d7d at rip_ctlinput+0x9d
#11 0xffffffff80c5cb7c at pfctlinput+0x5c
#12 0xffffffff80cd0cea at if_down+0x13a
#13 0xffffffff80cce53a at if_detach_internal+0x87a
#14 0xffffffff80ccdcae at if_detach+0x2e
#15 0xffffffff82bc7c01 at epair_clone_destroy+0x81
#16 0xffffffff80cd64dd at if_clone_destroyif+0x10d
#17 0xffffffff80cd636e at if_clone_destroy+0x1be
For completeness, here are my jail.conf and the pertinent part of rc.conf.

jail.conf:
++++++++++++++++++++++++++++++++++
$bridge = "bridge${vlan}";
$epair = "epair${vlan}";

path = "/jails/hosts/$name";

exec.prestart = "ifconfig $bridge create up";
exec.prestart += "ifconfig $bridge addm $name";
exec.prestart += "ifconfig $epair create up";
exec.prestart += "ifconfig $bridge addm ${epair}a";

exec.clean;
exec.start = "/bin/sh /etc/rc";
exec.stop = "/bin/sh /etc/rc.shutdown";

exec.poststop = "ifconfig $bridge deletem ${epair}a";
exec.poststop = "ifconfig ${epair}a destroy";

vnet;
vnet.interface = "${epair}b";

resolver1 {
    $vlan = "50";
}
++++++++++++++++++++++++++++++++++

rc.conf:
++++++++++++++++++++++++++++++++++
vlans_igb1="resolver1"
create_args_resolver1="vlan 50"
ifconfig_resolver1="inet 192.168.50.1 netmask 255.255.255.252"
++++++++++++++++++++++++++++++++++
Also this seems to be related to https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=234985
I've fixed the problem with the following workaround: exec.prestop = "ifconfig ${epair}b -vnet $name"; This is taken nearly verbatim from the link I just posted. $name in the command above can be either the name of the jail or the jail id. This is a bug in the VNET cleanup code and it's necessary to remove the epair interface from the jail before stopping it.
(In reply to pprocacci from comment #7)

Hi,

Your backtrace looks very similar to mine at https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219901

Can you get it to fail consistently? I have been running a script that:
- brings the epair interfaces up
- attaches one end to a bridge
- brings a jail up
- adds the other epair interface to the jail
- kills the jail
- kills the epair interface

It only dies randomly in dev/prod boxes :(
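The script itself is not attached, so the following is only a guess at what one pass of such a cycle could look like. The function name, the jail name, the epair number and the bridge name are all placeholders, and `jail -c name=... persist vnet` is just one way to create a throwaway VNET jail. Setting DRY_RUN=1 prints the commands instead of running them.

```sh
#!/bin/sh
# One create/destroy cycle for a VNET jail with an epair (illustrative only).
# Set DRY_RUN=1 to print the commands instead of executing them.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# epair_cycle JAIL EPAIR_NUMBER BRIDGE  (all three arguments are placeholders)
epair_cycle() {
    jail_name="$1"
    ep="epair$2"
    bridge="$3"
    # Bring the epair up; this creates ${ep}a and ${ep}b.
    run ifconfig "$ep" create up || return 1
    # Attach the host end to the bridge.
    run ifconfig "$bridge" addm "${ep}a" || return 1
    # Bring up an empty, persistent VNET jail.
    run jail -c name="$jail_name" persist vnet || return 1
    # Move the other end of the epair into the jail.
    run ifconfig "${ep}b" vnet "$jail_name" || return 1
    # Kill the jail, then kill the epair.
    run jail -r "$jail_name" || return 1
    run ifconfig "${ep}a" destroy
}
```

For example, `DRY_RUN=1; epair_cycle testjail 42 bridge0` lists the six commands in order without touching the system. Running it for real needs root, and on an affected kernel it may panic the host.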
exec.prestop = "ifconfig ${epair}b -vnet $name";

Before adding the above, it would kernel panic every single time. The key is removing the vnet interface from the jail prior to shutting the jail down, so the VNET cleanup code essentially has no interface to worry about.

If you're working on some sort of shell script, on the host you'd:

# ifconfig interface_name_inside_of_jail -vnet $jail_name_or_id

.... and then proceed to kill off the jail. It shouldn't panic any more in relation to the VNET cleanup code.
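For anyone scripting this on the host, the ordering described above might be sketched as follows. The function name and arguments are hypothetical, and using `jail -r` to "kill off" the jail is an assumption about how the jail is removed in your setup. DRY_RUN=1 prints the commands so the order can be checked without touching a live system.

```sh
#!/bin/sh
# Tear down one VNET jail, reclaiming the jail-side epair first (sketch).
# Set DRY_RUN=1 to print the commands instead of executing them.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# stop_vnet_jail JAIL_NAME_OR_ID EPAIR_BASE  (e.g. "dns" and "epair2")
stop_vnet_jail() {
    jail_name="$1"
    epair="$2"
    # 1. Pull the jail-side interface back out of the jail.
    run ifconfig "${epair}b" -vnet "$jail_name" || return 1
    # 2. Only then remove the jail.
    run jail -r "$jail_name" || return 1
    # 3. Finally destroy the pair via its host-side end.
    run ifconfig "${epair}a" destroy
}
```

The point of the ordering is that the interface is already back on the host when the jail's vnet is torn down, so the epair is destroyed from the host context in the last step.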
This issue is persistent on recent CURRENT ( FreeBSD 13.0-CURRENT #26 r356437: Tue Jan 7 07:19:34 CET 2020 amd64). The only reliable way to reboot the host without violent and destructive crashes is to issue "reboot" on the shell/console as root.
The bug is very easy to reproduce in a virtual machine (e.g. VirtualBox, Hyper-V, VMware or ESXi), but not on a real machine.
For the record: I can easily replicate this issue on a physical server at work running 12.1-RELEASE-p5. This server is:

Intel(R) Xeon(R) Silver 4116 CPU @ 2.10GHz
FreeBSD/SMP: Multiprocessor System Detected: 48 CPUs
FreeBSD/SMP: 2 package(s) x 12 core(s) x 2 hardware threads

exec.prestop = "ifconfig ${epair}b -vnet $name";
Sorry, I lost a line:

exec.prestop = "ifconfig ${epair}b -vnet $name";

mitigates the issue.
I can also reliably reproduce this on a physical machine using the vnet_epair_test.sh script at bug #234985. Server: CPU: AMD Ryzen 5 3600 6-Core Processor (3593.32-MHz K8-class CPU) FreeBSD/SMP: Multiprocessor System Detected: 12 CPUs FreeBSD/SMP: 1 package(s) x 2 cache groups x 3 core(s) x 2 hardware threads Running 'ifconfig ${epair}b -vnet ${jid}' before removing the jail avoids the kernel panic. However, I would prefer to shut my jails down in a clean way rather than just pulling the (network) plug.
This problem is still present in 12-STABLE, CURRENT and 12.1-RELENG.

(In reply to pprocacci from comment #7)

On 12.1-RELENG (most recent), 12-STABLE and CURRENT (r362906), using the workaround suggested in comment #7 (see above), i.e.

exec.prestop = "ifconfig ${if_vnet}a -vnet ${name}";

where ${if_vnet} expands to my epair interface and its subinterface is "a" instead of "b" ("a" is the interface owned by the jail on the inside), I receive a

variable if_net not known

error. It seems that only the exec.poststop command is affected; all other commands, both the stop/start commands targeting the running jail and those targeting the non-running jail (poststop/prestart etc.), do not show the error.
Markus Stoff wrote:

> Running 'ifconfig ${epair}b -vnet ${jid}' before removing the jail avoids
> the kernel panic. However, I would prefer to shut my jails down in a
> clean way rather than just pulling the (network) plug.

While it's a little awkward-looking, you can do something like this to make sure you've cleanly shut down and detached:

exec.prestop = "/usr/sbin/jexec ${name} /bin/sh /etc/rc.shutdown";
exec.prestop += "/sbin/ifconfig epair${ep}b -vnet ${name}";
exec.poststop = "ifconfig $bridge deletem epair${ep}a";
exec.poststop += "ifconfig epair${ep}a destroy";

The notable thing is that exec.prestop and exec.poststop run in system context, not jail context, so you need the jexec to execute the clean shutdown - but it works.
(In reply to Mason Loring Bliss from comment #16) Yes, this will work. It still feels a bit hacky, though... ;-)
same problem here on FreeBSD 12.1 p10
The problem is still present on 12.2-RELEASE-p3.
(In reply to Zhenlei Huang from comment #19) A panic message would be helpful; some folks have noted a tangentially related use-after-free in similar circumstances. It'd be good to note if you're hitting the primary issue that kp fixed or a second UAF.
I have been experiencing these crashes for a while now, and they continue to happen even after migrating from 12-STABLE to 13-STABLE recently.

Note: I did try every recommendation regarding jail shutdown in /etc/jail.conf, and removing the vnet interface before the final shutdown or not doesn't prevent those random crashes.

Here is the relevant part of my /etc/jail.conf, followed by the panic message.

------------- /etc/jail.conf ------------------
#
# host dependent global settings
#
$ip4prefixLOCAL = "10.10.10";
$ip6prefixLOCAL = "fd00:e:e:e";

#
# global jail settings
#
$MTU = "mtu 1490";
host.hostname = "${name}";
path = "/usr/home/jails/${name}";
mount.fstab = "/etc/fstab.${name}";
exec.consolelog = "/var/log/jail_${name}_console.log";
vnet = "new";
vnet.interface = "epair${jailID}b";
exec.clean;
mount.devfs;
persist;

#
# network settings to apply/destroy during start/stop of every jail
#
exec.prestart = "sleep 2";
exec.prestart += "/sbin/ifconfig epair${jailID} create up ${MTU}";
exec.prestart += "/sbin/ifconfig bridge0 addm epair${jailID}a";
exec.prestart += "/sbin/ifconfig epair${jailID}a";
exec.start = "/sbin/sysctl net.inet6.ip6.dad_count=0";
exec.start += "/sbin/ifconfig lo0 127.0.0.1 up";
exec.start += "/sbin/ifconfig epair${jailID}b inet ${ip4_addr} ${MTU}";
exec.start += "/sbin/ifconfig epair${jailID}b inet6 ${ip6_addr} ${MTU}";
exec.start += "/sbin/route add default -gateway ${ip4prefixLOCAL}.254";
exec.start += "/sbin/route add -inet6 default -gateway ${ip6prefixLOCAL}::254";
exec.stop = "/sbin/route del default";
exec.stop += "/sbin/route del -inet6 default";
exec.stop += "/bin/sh /etc/rc.shutdown";
# testing: reported to prevent from crashing (BUT: will crash as well!)
#exec.poststop = "/sbin/ifconfig epair${jailID}a -vnet ${jailID}";
exec.poststop += "/sbin/ifconfig epair${jailID}a destroy";

#
# individual jail settings
#
[snip]

jail5 {
    $jailID = 5;
    $ip4_addr = ${ip4prefixLOCAL}.5;
    $ip6_addr = ${ip6prefixLOCAL}::5/64;
    exec.start += "/bin/sh /etc/rc";
}

jail6 {
    $jailID = 6;
    $ip4_addr = ${ip4prefixLOCAL}.6;
    $ip6_addr = ${ip6prefixLOCAL}::6/64;
    exec.start += "/bin/sh /etc/rc";
}

------------- /var/log/messages -------------------
Jan 30 20:02:42 <kern.info> mer-waases kernel: epair5a: link state changed to DOWN
Jan 30 20:02:42 <kern.info> mer-waases kernel: epair5b: link state changed to DOWN
Jan 30 20:02:42 <kern.info> mer-waases kernel: in6_purgeaddr: err=65, destination address delete failed
Jan 30 20:02:42 <kern.crit> mer-waases kernel: Freed UMA keg (rtentry) was not empty (1 items). Lost 1 pages of memory.
Jan 30 20:02:47 <kern.info> mer-waases kernel: epair6a: link state changed to DOWN
Jan 30 20:02:47 <kern.info> mer-waases kernel: epair6b: link state changed to DOWN
Jan 30 20:02:48 <kern.info> mer-waases kernel: in6_purgeaddr: err=65, destination address delete failed
Jan 30 20:02:48 <kern.crit> mer-waases kernel: Freed UMA keg (rtentry) was not empty (1 items). Lost 1 pages of memory.
Jan 30 20:03:33 <syslog.info> mer-waases syslogd: restart
Jan 30 20:03:33 <kern.info> mer-waases syslogd: kernel boot file is /boot/kernel/kernel
Jan 30 20:03:33 <kern.crit> mer-waases kernel:
Jan 30 20:03:33 <kern.crit> mer-waases syslogd: last message repeated 1 times
Jan 30 20:03:33 <kern.crit> mer-waases kernel: Fatal trap 12: page fault while in kernel mode
Jan 30 20:03:33 <kern.crit> mer-waases kernel: cpuid = 0; apic id = 00
Jan 30 20:03:33 <kern.crit> mer-waases kernel: fault virtual address = 0x0
Jan 30 20:03:33 <kern.crit> mer-waases kernel: fault code = supervisor write data, page not present
Jan 30 20:03:33 <kern.crit> mer-waases kernel: instruction pointer = 0x20:0xffffffff80c668be
Jan 30 20:03:33 <kern.crit> mer-waases kernel: stack pointer = 0x28:0xfffffe000e9e86c0
Jan 30 20:03:33 <kern.crit> mer-waases kernel: frame pointer = 0x28:0xfffffe000e9e8700
Jan 30 20:03:33 <kern.crit> mer-waases kernel: code segment = base rx0, limit 0xfffff, type 0x1b
Jan 30 20:03:33 <kern.crit> mer-waases kernel: = DPL 0, pres 1, long 1, def32 0, gran 1
Jan 30 20:03:33 <kern.crit> mer-waases kernel: processor eflags = interrupt enabled, resume, IOPL = 0
Jan 30 20:03:33 <kern.crit> mer-waases kernel: current process = 12 (swi1: netisr 0)
Jan 30 20:03:33 <kern.crit> mer-waases kernel: trap number = 12
Jan 30 20:03:33 <kern.crit> mer-waases kernel: panic: page fault
Jan 30 20:03:33 <kern.crit> mer-waases kernel: cpuid = 0
Jan 30 20:03:33 <kern.crit> mer-waases kernel: time = 1612033371
Jan 30 20:03:33 <kern.crit> mer-waases kernel: KDB: stack backtrace:
Jan 30 20:03:33 <kern.crit> mer-waases kernel: #0 0xffffffff80c44f65 at kdb_backtrace+0x65
Jan 30 20:03:33 <kern.crit> mer-waases kernel: #1 0xffffffff80bf7bf1 at vpanic+0x181
Jan 30 20:03:33 <kern.crit> mer-waases kernel: #2 0xffffffff80bf7a63 at panic+0x43
Jan 30 20:03:33 <kern.crit> mer-waases kernel: #3 0xffffffff8102b237 at trap_fatal+0x387
Jan 30 20:03:33 <kern.crit> mer-waases kernel: #4 0xffffffff8102b28f at trap_pfault+0x4f
Jan 30 20:03:33 <kern.crit> mer-waases kernel: #5 0xffffffff8102a8ed at trap+0x27d
Jan 30 20:03:33 <kern.crit> mer-waases kernel: #6 0xffffffff810019e8 at calltrap+0x8
Jan 30 20:03:33 <kern.crit> mer-waases kernel: #7 0xffffffff80c914fe at sowakeup+0x1e
Jan 30 20:03:33 <kern.crit> mer-waases kernel: #8 0xffffffff80dcc0f6 at udp_append+0x236
Jan 30 20:03:33 <kern.crit> mer-waases kernel: #9 0xffffffff80dcbc1c at udp_input+0x73c
Jan 30 20:03:33 <kern.crit> mer-waases kernel: #10 0xffffffff80d9c3c5 at ip_input+0x125
Jan 30 20:03:33 <kern.crit> mer-waases kernel: #11 0xffffffff80d2c27a at netisr_dispatch_src+0xca
Jan 30 20:03:33 <kern.crit> mer-waases kernel: #12 0xffffffff80d10c78 at ether_demux+0x138
Jan 30 20:03:33 <kern.crit> mer-waases kernel: #13 0xffffffff80d12011 at ether_nh_input+0x351
Jan 30 20:03:33 <kern.crit> mer-waases kernel: #14 0xffffffff80d2c27a at netisr_dispatch_src+0xca
Jan 30 20:03:33 <kern.crit> mer-waases kernel: #15 0xffffffff80d110c9 at ether_input+0x69
Jan 30 20:03:33 <kern.crit> mer-waases kernel: #16 0xffffffff80d2ca1b at swi_net+0x12b
Jan 30 20:03:33 <kern.crit> mer-waases kernel: #17 0xffffffff80bb8e6d at ithread_loop+0x24d
Jan 30 20:03:33 <kern.crit> mer-waases kernel: Uptime: 4h47m14s
Jan 30 20:03:33 <kern.crit> mer-waases kernel: Automatic reboot in 15 seconds - press a key on the console to abort
Jan 30 20:03:33 <kern.crit> mer-waases kernel: Rebooting...
Jan 30 20:03:33 <kern.crit> mer-waases kernel: ---<<BOOT>>---

HTH,
Michael
(In reply to Kyle Evans from comment #20)

Steps to reproduce the kernel panic:

Host environment: FreeBSD 12.2 guest, fresh install with kernel debug symbols, VMware Fusion 12.1.0, hardware configured with 4 processor cores and 1G memory, system updated to 12.2-RELEASE-p3.

Host's and jails' /etc/rc.conf:
------------- rc.conf ------------------
# The jails share this rc.conf, let's disable the syslog service
syslogd_enable="NO"
#syslogd_flags="-ss"
sendmail_enable="NONE"
hostname=""
ifconfig_em0="DHCP"
dumpdev="AUTO"
zfs_enable="YES"
----------------------------------------

Host's /etc/jail.conf:
------------ jail.conf -----------------
# template for all test jails
# it is convenient to share host's filesystem
path = "/";
exec.clean;

vnet = new;
vnet.interface = "epair${ifnum}b";

exec.prepare = "/sbin/ifconfig epair${ifnum} create";
exec.prepare += "/sbin/ifconfig epair${ifnum}a inet 192.168.${ifnum}.1/24 up";

exec.start = "/bin/sh /etc/rc";
# I've no idea why opening and binding a socket makes the kernel panic more likely :(
exec.start += "/usr/sbin/daemon /usr/bin/nc -l 0.0.0.0 9999";
exec.start += "/sbin/ifconfig epair${ifnum}b inet 192.168.${ifnum}.2/24";
exec.start += "/sbin/route add default 192.168.${ifnum}.1";

exec.stop = "/bin/sh /etc/rc.shutdown";
exec.poststop += "/sbin/ifconfig epair${ifnum}a destroy";

test1 {
    $ifnum = 10;
}

# with more jails the host seems more likely to crash
test2 {
    $ifnum = 20;
}
----------------------------------------

Then repeatedly start and stop the jail service; the host crashes about once in every 2 or 3 attempts.
# service jail onestart && service jail onestop
...
The kernel panic message:

Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 04
fault virtual address = 0x410
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80b9f237
stack pointer = 0x28:0xfffffe0015b55370
frame pointer = 0x28:0xfffffe0015b553f0
code segment = base rx0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 93087 (ifconfig)
trap number = 12
panic: page fault
cpuid = 2
time = 1612193992
KDB: stack backtrace:
#0 0xffffffff80c0aa85 at kdb_backtrace+0x65
#1 0xffffffff80bbed3b at vpanic+0x17b
#2 0xffffffff80bbebb3 at panic+0x43
#3 0xffffffff8108e911 at trap_fatal+0x391
#4 0xffffffff8108e96f at trap_pfault+0x4f
#5 0xffffffff8108dfb6 at trap+0x286
#6 0xffffffff81066938 at calltrap+0x8
#7 0xffffffff80bb9591 at _rm_rlock_hard+0x3c1
#8 0xffffffff80ce5ce6 at rtinit+0x2a6
#9 0xffffffff80d3873e at in_scrubprefix+0x29e
#10 0xffffffff80d5001d at rip_ctlinput+0x8d
#11 0xffffffff80c4922c at pfctlinput+0x5c
#12 0xffffffff80cbb4fa at if_down+0x12a
#13 0xffffffff80cb90d0 at if_detach_internal+0x150
#14 0xffffffff80cb8df0 at if_detach+0x50
#15 0xffffffff82b1ebb1 at epair_clone_destroy+0x81
#16 0xffffffff80cc0c4d at if_clone_destroyif+0xdd
#17 0xffffffff80cc0b12 at if_clone_destroy+0x1a2
Uptime: 1m22s
Dumping 160 out of 982 MB:..10%..20%..30%..40%..50%..60%..70%..80%..90%..100%

To be clear, after the update to 12.2-RELEASE-p3 it is difficult to crash the host without the following line in jail.conf:

exec.start += "/usr/sbin/daemon /usr/bin/nc -l 0.0.0.0 9999";

I'll attach the full core text dump later.
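Since the crash only shows up every few attempts, the start/stop cycle from the steps above can be wrapped in a small loop. This is only a sketch: the function name repeat_jail_cycle and the DRY_RUN switch are made up for illustration, and the cycle count is arbitrary.

```sh
#!/bin/sh
# Repeat "service jail onestart && service jail onestop" a fixed number of times.
# Set DRY_RUN=1 to print the commands instead of executing them.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# repeat_jail_cycle COUNT
repeat_jail_cycle() {
    count="$1"
    i=1
    while [ "$i" -le "$count" ]; do
        # Print the cycle number so the console shows how far the loop got.
        echo "cycle $i of $count"
        run service jail onestart || return 1
        run service jail onestop || return 1
        i=$((i + 1))
    done
}
```

Running `repeat_jail_cycle 10` as root on a test machine should be enough to hit the panic a few times, given the rate reported above; the last cycle number printed on the console shows how many iterations it took.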
Created attachment 222062 [details] Kernel panic core text dump
I use vnet with Netgraph and the problem is still present on:

# freebsd-version -k
13.0-RELEASE-p11
# freebsd-version -u
13.0-RELEASE-p11

My kernel panic:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x5110000004d8
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80d073e5
stack pointer = 0x28:0xfffffe00a1acb9d0
frame pointer = 0x28:0xfffffe00a1acb9d0
code segment = base rx0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 7470 (ifconfig)
trap number = 12
panic: page fault
cpuid = 0
time = 1649635952
KDB: stack backtrace:
#0 0xffffffff80c57535 at kdb_backtrace+0x65
#1 0xffffffff80c09f11 at vpanic+0x181
#2 0xffffffff80c09d83 at panic+0x43
#3 0xffffffff8108b1a7 at trap_fatal+0x387
#4 0xffffffff8108b1ff at trap_pfault+0x4f
#5 0xffffffff8108a85d at trap+0x27d
#6 0xffffffff81061f08 at calltrap+0x8
#7 0xffffffff80d1d1a9 at ifunit_ref+0x79
#8 0xffffffff80d1f5fb at ifioctl+0x4eb
#9 0xffffffff80c76edd at kern_ioctl+0x26d
#10 0xffffffff80c76bd6 at sys_ioctl+0xf6
#11 0xffffffff8108baac at amd64_syscall+0x10c
#12 0xffffffff8106282e at fast_syscall_common+0xf8
Uptime: 18m12s
Automatic reboot in 15 seconds - press a key on the console to abort
--> Press a key on the console to reboot,
--> or switch off the system now.
My global parameters in jail.conf:

##########
# GLOBAL #
##########
host.hostname = "$name.mydomain.com";
path = "/home/jails/$name";
exec.system_user = "root";
exec.jail_user = "root";
allow.raw_sockets = 1;
devfs_ruleset="11";
enforce_statfs = 1;
sysvshm = new;
sysvsem = new;
sysvmsg = new;
mount.devfs;

The particular parameters for the jail:

myjail {
    vnet;
    vnet.interface = ng0_myjail;
    exec.clean;
    exec.prestart += "jng bridge myjail vtnet1";
    exec.start = "/sbin/ifconfig ng0_myjail 192.168.1.1/24";
    exec.start += "/sbin/route add default 192.168.1.254";
    exec.start += "/bin/sh /etc/rc";
    exec.stop = "/bin/sh /etc/rc.shutdown jail";
    exec.poststop = "jng shutdown myjail";
}

The following combination of lines:

exec.poststop = "/bin/sleep 5";
exec.poststop += "jng shutdown myjail";

mitigates the issue: I ran 50 restarts of the jail and no kernel panic occurred.
To simplify the steps to repeat, I created a Github repository, https://github.com/gmshake/jail-crash.git
I'm able to reproduce this on 14.1-RELEASE
FreeBSD/amd64 EFI loader, Revision 1.1
Command line arguments: loader.efi
Image base: 0x76435000
EFI version: 2.40
EFI Firmware: American Megatrends (rev 5.11)
Console: efi (0x20000000)
Load Path: \EFI\FREEBSD\LOADER.EFI
Load Device: PciRoot(0x0)/Pci(0x1F,0x2)/Sata(0x0,0xFFFF,0x0)/HD(1,GPT,67C41637-685F-11ED-80D1-D05099C13F9B,0x28,0x82000)
BootCurrent: 0000
BootOrder: 0000[*] 0005 0004
BootInfo Path: HD(1,GPT,67C41637-685F-11ED-80D1-D05099C13F9B,0x28,0x82000)/\EFI\FREEBSD\LOADER.EFI
Ignoring Boot0000: Only one DP found
Trying ESP: PciRoot(0x0)/Pci(0x1F,0x2)/Sata(0x0,0xFFFF,0x0)/HD(1,GPT,67C41637-685F-11ED-80D1-D05099C13F9B,0x28,0x82000)
Setting currdev to disk1p1:
Trying: PciRoot(0x0)/Pci(0x1F,0x2)/Sata(0x0,0xFFFF,0x0)/HD(2,GPT,67CBD3F1-685F-11ED-80D1-D05099C13F9B,0x82800,0x1000000)
Setting currdev to disk1p2:
Loading /boot/defaults/loader.conf/Sata(0x0,0xFFFF,0x0)/HD(3,GPT,67D35623-685F-11ED-80D1-D05099C13F9B,0x1082800,0x36DC0800)
Loading /boot/defaults/loader.confdefault:
Loading /boot/device.hints
Loading /boot/loader.conf
Loading /boot/loader.conf.local

Welcome to FreeBSD
1. Boot Multi user [Enter]
2. Boot Single user
3. Escape to loader prompt
4. Reboot
5. Cons: Dual (Video primary)
Options:
6. Kernel: default/kernel (1 of 4)
7. Boot Options
8. Boot Environments

Autoboot in 0 seconds. [Space] to pause
Loading kernel...
/boot/kernel/kernel text=0x17baa0 text=0xd5efd8 text=0x425e1c data=0x180+0xe80 data=0x1868b0+0x479750 0x8+0x189d38+0x8+0x1ad39c Loading configured modules... /boot/firmware/intel-ucode.bin size=0xba2400 /boot/kernel/zfs.ko size 0x5cd608 at 0x2cda000 /boot/kernel/ipmi.ko size 0x13258 at 0x32a8000 loading required module 'smbus' /boot/kernel/smbus.ko size 0x3ca8 at 0x32bc000 /boot/kernel/mac_portacl.ko size 0x5328 at 0x32c0000 /boot/entropy size=0x1000 /etc/hostid size=0x25 /boot/kernel/cryptodev.ko size 0x77d8 at 0x32c7000 /boot/kernel/geom_mirror.ko size 0x21338 at 0x32cf000 /boot/kernel/geom_eli.ko size 0x1c3f0 at 0x32f1000 staging 0x6e600000 (not copying) tramp 0x6e434000 PT4 0x6e42b000 Start @ 0xffffffff8037c000 ... EFI framebuffer information: addr, size 0xfa000000, 0x1d4c00 dimensions 800 x 600 stride 800 masks 0x00ff0000, 0x0000ff00, 0x000000ff, 0xff000000 ---<<BOOT>>--- Copyright (c) 1992-2023 The FreeBSD Project. Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. FreeBSD is a registered trademark of The FreeBSD Foundation. 
FreeBSD 14.1-RELEASE releng/14.1-n267679-10e31f0946d8 GENERIC amd64 FreeBSD clang version 18.1.5 (https://github.com/llvm/llvm-project.git llvmorg-18.1.5-0-g617a15a9eac9) VT(efifb): resolution 800x600 CPU microcode: updated from 0x7000012 to 0x700001c CPU: Intel(R) Xeon(R) CPU D-1541 @ 2.10GHz (2100.14-MHz K8-class CPU) Origin="GenuineIntel" Id=0x50663 Family=0x6 Model=0x56 Stepping=3 Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE> Features2=0x7ffefbff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND> AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM> AMD Features2=0x121<LAHF,ABM,Prefetch> Structured Extended Features=0x21cbfbb<FSGSBASE,TSCADJ,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,PQM,NFPUSG,PQE,RDSEED,ADX,SMAP,PROCTRACE> Structured Extended Features3=0x9c000400<MD_CLEAR,IBPB,STIBP,L1DFL,SSBD> XSAVE Features=0x1<XSAVEOPT> VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID,VID,PostIntr TSC: P-state invariant, performance statistics real memory = 34358689792 (32767 MB) avail memory = 33208377344 (31669 MB) Event timer "LAPIC" quality 600 ACPI APIC Table: <ALASKA A M I > FreeBSD/SMP: Multiprocessor System Detected: 16 CPUs FreeBSD/SMP: 1 package(s) x 8 core(s) x 2 hardware threads random: registering fast source Intel Secure Key RNG random: fast provider: "Intel Secure Key RNG" random: unblocking device. 
Security policy loaded: TrustedBSD MAC/portacl (mac_portacl) ioapic0 <Version 2.0> irqs 0-23 ioapic1 <Version 2.0> irqs 24-47 Launching APs: 1 2 10 14 11 15 3 6 8 9 12 5 7 13 4 random: entropy device external interface kbd1 at kbdmux0 efirtc0: <EFI Realtime Clock> efirtc0: registered as a time-of-day clock, resolution 1.000000s smbios0: <System Management BIOS> at iomem 0xf05e0-0xf05fe smbios0: Version: 3.0, BCD Revision: 3.0 aesni0: <AES-CBC,AES-CCM,AES-GCM,AES-ICM,AES-XTS> acpi0: <ALASKA A M I > acpi0: Power Button (fixed) cpu0: <ACPI CPU> numa-domain 0 on acpi0 atrtc0: <AT realtime clock> port 0x70-0x71,0x74-0x77 irq 8 on acpi0 atrtc0: registered as a time-of-day clock, resolution 1.000000s Event timer "RTC" frequency 32768 Hz quality 0 attimer0: <AT timer> port 0x40-0x43,0x50-0x53 irq 0 on acpi0 Timecounter "i8254" frequency 1193182 Hz quality 0 Event timer "i8254" frequency 1193182 Hz quality 100 hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0 Timecounter "HPET" frequency 14318180 Hz quality 950 Event timer "HPET" frequency 14318180 Hz quality 350 Event timer "HPET1" frequency 14318180 Hz quality 340 Event timer "HPET2" frequency 14318180 Hz quality 340 Event timer "HPET3" frequency 14318180 Hz quality 340 Event timer "HPET4" frequency 14318180 Hz quality 340 Event timer "HPET5" frequency 14318180 Hz quality 340 Event timer "HPET6" frequency 14318180 Hz quality 340 Event timer "HPET7" frequency 14318180 Hz quality 340 Timecounter "ACPI-fast" frequency 3579545 Hz quality 900 acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0 pcib0: <ACPI Host-PCI bridge> on acpi0 pci0: <ACPI PCI bus> on pcib0 pci0: <dasp, performance counters> at device 11.1 (no driver attached) pci0: <dasp, performance counters> at device 11.2 (no driver attached) pci0: <dasp, performance counters> at device 16.1 (no driver attached) pci0: <dasp, performance counters> at device 16.6 (no driver attached) pci0: <dasp, performance counters> at device 
18.1 (no driver attached) acpi_syscontainer0: <System Container> on acpi0 apei0: <ACPI Platform Error Interface> on acpi0 pcib1: <ACPI Host-PCI bridge> port 0xcf8-0xcff numa-domain 0 on acpi0 pci1: <ACPI PCI bus> numa-domain 0 on pcib1 pcib2: <ACPI PCI-PCI bridge> irq 26 at device 1.0 numa-domain 0 on pci1 pci2: <ACPI PCI bus> numa-domain 0 on pcib2 nvme0: <Generic NVMe Device> mem 0xfb400000-0xfb403fff irq 26 at device 0.0 numa-domain 0 on pci2 pcib3: <ACPI PCI-PCI bridge> irq 32 at device 2.0 numa-domain 0 on pci1 pci3: <ACPI PCI bus> numa-domain 0 on pcib3 pcib4: <ACPI PCI-PCI bridge> irq 40 at device 3.0 numa-domain 0 on pci1 pci4: <ACPI PCI bus> numa-domain 0 on pcib4 xhci0: <Intel Lynx Point USB 3.0 controller> mem 0xfb500000-0xfb50ffff irq 19 at device 20.0 numa-domain 0 on pci1 xhci0: 32 bytes context size, 64-bit DMA usbus0: waiting for BIOS to give up control xhci0: Port routing mask set to 0xffffffff usbus0 numa-domain 0 on xhci0 usbus0: 5.0Gbps Super Speed USB v3.0 pci1: <simple comms> at device 22.0 (no driver attached) pci1: <simple comms> at device 22.1 (no driver attached) pcib5: <ACPI PCI-PCI bridge> irq 16 at device 28.0 numa-domain 0 on pci1 pci5: <ACPI PCI bus> numa-domain 0 on pcib5 pcib6: <ACPI PCI-PCI bridge> at device 0.0 numa-domain 0 on pci5 pci6: <ACPI PCI bus> numa-domain 0 on pcib6 vgapci0: <VGA-compatible display> port 0xe000-0xe07f mem 0xfa000000-0xfaffffff,0xfb000000-0xfb01ffff irq 16 at device 0.0 numa-domain 0 on pci6 vgapci0: Boot video device pcib7: <ACPI PCI-PCI bridge> irq 18 at device 28.2 numa-domain 0 on pci1 pci7: <ACPI PCI bus> numa-domain 0 on pcib7 igb0: <Intel(R) I210 (Copper)> port 0xd000-0xd01f mem 0xfb200000-0xfb27ffff,0xfb280000-0xfb283fff irq 18 at device 0.0 numa-domain 0 on pci7 igb0: PHY reset is blocked due to SOL/IDER session. 
igb0: EEPROM V3.16-0 eTrack 0x800004d6 igb0: Using 1024 TX descriptors and 1024 RX descriptors igb0: Using 4 RX queues 4 TX queues igb0: Using MSI-X interrupts with 5 vectors igb0: Ethernet address: d0:50:99:c1:3f:9b igb0: link state changed to UP igb0: netmap queues/slots: TX 4/1024, RX 4/1024 pcib8: <ACPI PCI-PCI bridge> irq 19 at device 28.3 numa-domain 0 on pci1 pci8: <ACPI PCI bus> numa-domain 0 on pcib8 igb1: <Intel(R) I210 (Copper)> port 0xc000-0xc01f mem 0xfb100000-0xfb17ffff,0xfb180000-0xfb183fff irq 19 at device 0.0 numa-domain 0 on pci8 igb1: EEPROM V3.16-0 eTrack 0x800004d6 igb1: Using 1024 TX descriptors and 1024 RX descriptors igb1: Using 4 RX queues 4 TX queues igb1: Using MSI-X interrupts with 5 vectors igb1: Ethernet address: d0:50:99:c1:3f:9c igb1: netmap queues/slots: TX 4/1024, RX 4/1024 ehci0: <Intel Lynx Point USB 2.0 controller USB-A> mem 0xfb513000-0xfb5133ff irq 18 at device 29.0 numa-domain 0 on pci1 usbus1: EHCI version 1.0 usbus1 numa-domain 0 on ehci0 usbus1: 480Mbps High Speed USB v2.0 isab0: <PCI-ISA bridge> at device 31.0 numa-domain 0 on pci1 isa0: <ISA bus> numa-domain 0 on isab0 ahci0: <Intel Lynx Point AHCI SATA controller> port 0xf070-0xf077,0xf060-0xf063,0xf050-0xf057,0xf040-0xf043,0xf020-0xf03f mem 0xfb512000-0xfb5127ff irq 16 at device 31.2 numa-domain 0 on pci1 ahci0: AHCI v1.30 with 6 6Gbps ports, Port Multiplier not supported ahcich0: <AHCI channel> at channel 0 on ahci0 ahcich1: <AHCI channel> at channel 1 on ahci0 ahcich2: <AHCI channel> at channel 2 on ahci0 ahcich3: <AHCI channel> at channel 3 on ahci0 ahcich4: <AHCI channel> at channel 4 on ahci0 ahcich5: <AHCI channel> at channel 5 on ahci0 ahciem0: <AHCI enclosure management bridge> on ahci0 acpi_button0: <Power Button> on acpi0 uart0: <16550 or compatible> port 0x2f8-0x2ff irq 3 flags 0x10 on acpi0 ns8250: UART FCR is broken uart0: console (115200,n,8,1) ipmi0: <IPMI System Interface> port 0xca2,0xca3 on acpi0 ipmi0: KCS mode found at io 0xca2 on acpi orm0: <ISA 
Option ROM> at iomem 0xc0000-0xc7fff pnpid ORM0000 on isa0 est0: <Enhanced SpeedStep Frequency Control> numa-domain 0 on cpu0 Timecounter "TSC" frequency 2099998062 Hz quality 1000 Timecounters tick every 1.000 msec ugen0.1: <Intel XHCI root HUB> at usbus0 ugen1.1: <Intel EHCI root HUB> at usbus1 uhub0 numa-domain 0 on usbus0 uhub0: <Intel XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus0 uhub1 numa-domain 0 on usbus1 uhub1: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus1 ZFS filesystem version: 5 ZFS storage pool version: features support (5000) ipmi0: IPMI device rev. 1, firmware rev. 0.17, version 2.0, device support mask 0xbf ipmi0: Number of channels 2 ipmi0: Attached watchdog ipmi0: Establishing power cycle handler nda0 at nvme0 bus 0 scbus7 target 0 lun 1 nda0: <INTEL SSDPEKKW256G7 PSF100C BTPY63560ETA256D> nda0: Serial Number BTPY63560ETA256D nda0: nvme version 1.2 nda0: 244198MB (500118192 512 byte sectors) ada0 at ahcich0 bus 0 scbus0 target 0 lun 0 ada0: <INTEL SSDSC2BB480G7 N2010101> ACS-3 ATA SATA 3.x device ada0: Serial Number PHDV637500CA480BGN ada0: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 512bytes) ada0: Command Queueing enabled ada0: 457862MB (937703088 512 byte sectors) ada1 at ahcich1 bus 0 scbus1 target 0 lun 0 ada1: <INTEL SSDSC2BB480G7 N2010101> ACS-3 ATA SATA 3.x device ada1: Serial Number PHDV63750058480BGN ada1: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 512bytes) ada1: Command Queueing enabled ada1: 457862MB (937703088 512 byte sectors) ses0 at ahciem0 bus 0 scbus6 target 0 lun 0 ses0: <AHCI SGPIO Enclosure 2.00 0001> SEMB S-E-S 2.00 device ses0: SEMB SES Device ses0: ada0,pass0 in 'Slot 00', SATA Slot: scbus0 target 0 ses0: ada1,pass1 in 'Slot 01', SATA Slot: scbus1 target 0 GEOM_ELI: Device ada0p3.eli created. GEOM_ELI: Encryption: AES-XTS 256 GEOM_ELI: Crypto: accelerated software GEOM_MIRROR: Device mirror/swap launched (2/2). GEOM_ELI: Device ada1p3.eli created. 
GEOM_ELI: Encryption: AES-XTS 256 GEOM_ELI: Crypto: accelerated software Trying to mount root from zfs:zroot/ROOT/default []... uhub1: 2 ports with 2 removable, self powered uhub0: 21 ports with 21 removable, self powered ugen1.2: <vendor 0x8087 product 0x8000> at usbus1 uhub2 numa-domain 0 on uhub1 uhub2: <vendor 0x8087 product 0x8000, class 9/0, rev 2.00/0.05, addr 2> on usbus1 uhub2: 4 ports with 4 removable, self powered Dual Console: Video Primary, Serial Secondary GEOM_ELI: Device mirror/swap.eli created. GEOM_ELI: Encryption: AES-XTS 128 GEOM_ELI: Crypto: accelerated software ichsmb0: <Intel Lynx Point SMBus controller> port 0xf000-0xf01f mem 0xfb511000-0xfb5110ff irq 18 at device 31.3 numa-domain 0 on pci1 smbus0: <System Management Bus> numa-domain 0 on ichsmb0 pchtherm0: <Haswell Thermal Subsystem> irq 18 at device 31.6 numa-domain 0 on pci1 ioat0: <BDXDE IOAT Ch0> mem 0xfb306000-0xfb307fff irq 32 at device 0.0 numa-domain 0 on pci3 ioat0: Capabilities: c2641<Completion_Timeout_Support,DMA_with_Multicasting_Support,Descriptor_Write_Back_Error_Support,DMA_with_DIF,PQ,Block_Fill,Page_Break> ioat1: <BDXDE IOAT Ch1> mem 0xfb304000-0xfb305fff irq 36 at device 0.1 numa-domain 0 on pci3 ioat1: Capabilities: c2641<Completion_Timeout_Support,DMA_with_Multicasting_Support,Descriptor_Write_Back_Error_Support,DMA_with_DIF,PQ,Block_Fill,Page_Break> ioat2: <BDXDE IOAT Ch2> mem 0xfb302000-0xfb303fff irq 37 at device 0.2 numa-domain 0 on pci3 ioat2: Capabilities: c2641<Completion_Timeout_Support,DMA_with_Multicasting_Support,Descriptor_Write_Back_Error_Support,DMA_with_DIF,PQ,Block_Fill,Page_Break> ioat3: <BDXDE IOAT Ch3> mem 0xfb300000-0xfb301fff irq 38 at device 0.3 numa-domain 0 on pci3 ioat3: Capabilities: c2641<Completion_Timeout_Support,DMA_with_Multicasting_Support,Descriptor_Write_Back_Error_Support,DMA_with_DIF,PQ,Block_Fill,Page_Break> acpi_wmi0: <ACPI-WMI mapping> on acpi0 acpi_wmi0: cannot find EC device acpi_wmi1: <ACPI-WMI mapping> on acpi0 acpi_wmi1: 
cannot find EC device igb1: link state changed to UP lo0: link state changed to UP igb1: link state changed to DOWN lagg0: link state changed to UP igb1: link state changed to UP Security policy loaded: MAC/ntpd (mac_ntpd) warning: total configured swap (64611925 pages) exceeds maximum recommended amount (32497168 pages). warning: increase kern.maxswzone or reduce amount of swap. FreeBSD/amd64 (node3) (ttyu0) login: Fatal trap 12: page fault while in kernel mode cpuid = 0; apic id = 00 fault virtual address = 0x10 fault code = supervisor read data, page not present instruction pointer = 0x20:0xffffffff80d3f676 stack pointer = 0x28:0xfffffe00e247bb90 frame pointer = 0x28:0xfffffe00e247bbc0 code segment = base rx0, limit 0xfffff, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 2 (clock (0)) rdi: fffff800292d7000 rsi: 000000000000001c rdx: fffff801c8076678 rcx: fffff800292d7000 r8: 00000000ffffffbd r9: 0000000000000018 rax: 0000000000000000 rbx: 0000000000000000 rbp: fffffe00e247bbc0 r10: fffff80307004200 r11: fffff80414c0c080 r12: 0000000000010180 r13: 00000000000000e2 r14: fffffe00e247bb9c r15: fffff802942bd098 trap number = 12 panic: page fault cpuid = 0 time = 1720158935 KDB: stack backtrace: #0 0xffffffff80b7fbfd at kdb_backtrace+0x5d #1 0xffffffff80b32961 at vpanic+0x131 #2 0xffffffff80b32823 at panic+0x43 #3 0xffffffff80fff91b at trap_fatal+0x40b #4 0xffffffff80fff966 at trap_pfault+0x46 #5 0xffffffff80fd6a48 at calltrap+0x8 #6 0xffffffff80d0c8ad at tcp_default_output+0x1d6d #7 0xffffffff80d1c104 at tcp_timer_rexmt+0x554 #8 0xffffffff80d1b87e at tcp_timer_enter+0xfe #9 0xffffffff80b5089c at softclock_call_cc+0x12c #10 0xffffffff80b520e5 at softclock_thread+0xe5 #11 0xffffffff80aecd1f at fork_exit+0x7f #12 0xffffffff80fd7aae at fork_trampoline+0xe Uptime: 58s Dumping 1402 out of 32611 MB:..2%..11%..21%..31%..42%..51%..61%..71%..81%..91% Dump complete Automatic reboot in 15 seconds - 
press a key on the console to abort
Fatal trap 12: page fault while in kernel mode cpuid = 12; apic id = 0c fault virtual address = 0x10 fault code = supervisor read data, page not present instruction pointer = 0x20:0xffffffff80d3f676 stack pointer = 0x28:0xfffffe0177ca4a70 frame pointer = 0x28:0xfffffe0177ca4aa0 code segment = base rx0, limit 0xfffff, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 28431 (tor) rdi: fffff8001b52a800 rsi: 000000000000001c rdx: fffff8004d820e78 rcx: fffff8001b52a800 r8: 00000000ffffffbd r9: 0000000000000018 rax: 0000000000000000 rbx: 0000000000000000 rbp: fffffe0177ca4aa0 r10: fffff801659cb6c0 r11: fffff80255cfa880 r12: 0000000000010480 r13: 0000000000000000 r14: fffffe0177ca4a7c r15: fffff8024c5d8998 trap number = 12 panic: page fault cpuid = 12 time = 1720160707 KDB: stack backtrace: #0 0xffffffff80b7fbfd at kdb_backtrace+0x5d #1 0xffffffff80b32961 at vpanic+0x131 #2 0xffffffff80b32823 at panic+0x43 #3 0xffffffff80fff91b at trap_fatal+0x40b #4 0xffffffff80fff966 at trap_pfault+0x46 #5 0xffffffff80fd6a48 at calltrap+0x8 #6 0xffffffff80d0c8ad at tcp_default_output+0x1d6d #7 0xffffffff80d1e243 at tcp_usr_disconnect+0x83 #8 0xffffffff80bd6885 at soclose+0x75 #9 0xffffffff80ad1221 at _fdrop+0x11 #10 0xffffffff80ad448a at closef+0x24a #11 0xffffffff80ad8338 at closefp_impl+0x58 #12 0xffffffff8100073b at amd64_syscall+0x67b #13 0xffffffff80fd735b at fast_syscall_common+0xf8 Uptime: 3m19s Dumping 1451 out of 32611 MB:..2%..12%..21%..31%..41%..51%..61%..71%..81%..91% Dump complete Automatic reboot in 15 seconds - press a key on the console to abort
Fatal trap 12: page fault while in kernel mode cpuid = 0; apic id = 00 fault virtual address = 0x10 fault code = supervisor read data, page not present instruction pointer = 0x20:0xffffffff80d3f676 stack pointer = 0x28:0xfffffe0179b00a70 frame pointer = 0x28:0xfffffe0179b00aa0 code segment = base rx0, limit 0xfffff, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 88125 (tor) rdi: fffff8001faf9800 rsi: 000000000000001c rdx: fffff8010d101e78 rcx: fffff8001faf9800 r8: 00000000ffffffbd r9: 0000000000000018 rax: 0000000000000000 rbx: 0000000000000000 rbp: fffffe0179b00aa0 r10: fffff8003a8291e0 r11: fffff8003aca3880 r12: 0000000000010480 r13: 0000000000000000 r14: fffffe0179b00a7c r15: fffff80179005798 trap number = 12 panic: page fault cpuid = 0 time = 1720160385 KDB: stack backtrace: #0 0xffffffff80b7fbfd at kdb_backtrace+0x5d #1 0xffffffff80b32961 at vpanic+0x131 #2 0xffffffff80b32823 at panic+0x43 #3 0xffffffff80fff91b at trap_fatal+0x40b #4 0xffffffff80fff966 at trap_pfault+0x46 #5 0xffffffff80fd6a48 at calltrap+0x8 #6 0xffffffff80d0c8ad at tcp_default_output+0x1d6d #7 0xffffffff80d1e243 at tcp_usr_disconnect+0x83 #8 0xffffffff80bd6885 at soclose+0x75 #9 0xffffffff80ad1221 at _fdrop+0x11 #10 0xffffffff80ad448a at closef+0x24a #11 0xffffffff80ad8338 at closefp_impl+0x58 #12 0xffffffff8100073b at amd64_syscall+0x67b #13 0xffffffff80fd735b at fast_syscall_common+0xf8 Uptime: 3m41s Dumping 1402 out of 32611 MB:..2%..11%..21%..31%..42%..51%..61%..71%..81%..91% Dump complete Automatic reboot in 15 seconds - press a key on the console to abort
Fatal trap 12: page fault while in kernel mode cpuid = 12; apic id = 0c fault virtual address = 0x10 fault code = supervisor read data, page not present instruction pointer = 0x20:0xffffffff80d3f676 stack pointer = 0x28:0xfffffe00e247bb90 frame pointer = 0x28:0xfffffe00e247bbc0 code segment = base rx0, limit 0xfffff, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 2 (clock (0)) rdi: fffff801aebe1000 rsi: 000000000000001c rdx: fffff80129c5b878 rcx: fffff801aebe1000 r8: 00000000ffffffbd r9: 0000000000000018 Fatal trap 12: page fault while in kernel mode cpuid = 3; apic id = 03 fault virtual address = 0x10 fault code = supervisor read data, page not present instruction pointer = 0x20:0xffffffff80d3f676 stack pointer = 0x28:0xfffffe0174e8da70 frame pointer = 0x28:0xfffffe0174e8daa0 code segment = base rx0, limit 0xfffff, type 0x1b rax: 0000000000000000 rbx: 0000000000000000 rbp: fffffe00e247bbc0 r10: fffff801e3788ec0 r11: fffff8001d979880 r12: 0000000000010480 r13: 0000000000000000 r14: fffffe00e247bb9c r15: fffff80032a99698 trap number = 12 panic: page fault cpuid = 12 time = 1720160925 KDB: stack backtrace: #0 0xffffffff80b7fbfd at kdb_backtrace+0x5d #1 0xffffffff80b32961 at vpanic+0x131 #2 0xffffffff80b32823 at panic+0x43 #3 0xffffffff80fff91b at trap_fatal+0x40b #4 0xffffffff80fff966 at trap_pfault+0x46 #5 0xffffffff80fd6a48 at calltrap+0x8 #6 0xffffffff80d0c8ad at tcp_default_output+0x1d6d #7 0xffffffff80d1c104 at tcp_timer_rexmt+0x554 #8 0xffffffff80d1b87e at tcp_timer_enter+0xfe #9 0xffffffff80b5089c at softclock_call_cc+0x12c #10 0xffffffff80b520e5 at softclock_thread+0xe5 #11 0xffffffff80aecd1f at fork_exit+0x7f #12 0xffffffff80fd7aae at fork_trampoline+0xe Uptime: 1m6s Dumping 1391 out of 32611 MB:..2%..11%..21%..32%..41%..51%..61%..71%..81%..91% Dump complete Automatic reboot in 15 seconds - press a key on the console to abort
Fatal trap 12: page fault while in kernel mode cpuid = 6; apic id = 06 fault virtual address = 0x10 fault code = supervisor read data, page not present instruction pointer = 0x20:0xffffffff80d3f676 stack pointer = 0x28:0xfffffe0177981a70 frame pointer = 0x28:0xfffffe0177981aa0 code segment = base rx0, limit 0xfffff, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 43495 (tor) rdi: fffff8006962e800 rsi: 000000000000001c rdx: fffff80069a9ec78 rcx: fffff8006962e800 r8: 00000000ffffffbd r9: 0000000000000018 rax: 0000000000000000 rbx: 0000000000000000 rbp: fffffe0177981aa0 r10: fffff8053f13a7a0 r11: fffff800697b0080 r12: 0000000000010480 r13: 00000000000004b8 r14: fffffe0177981a7c r15: fffff8001ada6b98 trap number = 12 panic: page fault cpuid = 6 time = 1720161073 KDB: stack backtrace: #0 0xffffffff80b7fbfd at kdb_backtrace+0x5d #1 0xffffffff80b32961 at vpanic+0x131 #2 0xffffffff80b32823 at panic+0x43 #3 0xffffffff80fff91b at trap_fatal+0x40b #4 0xffffffff80fff966 at trap_pfault+0x46 #5 0xffffffff80fd6a48 at calltrap+0x8 #6 0xffffffff80d0c8ad at tcp_default_output+0x1d6d #7 0xffffffff80d1e243 at tcp_usr_disconnect+0x83 #8 0xffffffff80bd6885 at soclose+0x75 #9 0xffffffff80ad1221 at _fdrop+0x11 #10 0xffffffff80ad448a at closef+0x24a #11 0xffffffff80ad8338 at closefp_impl+0x58 #12 0xffffffff8100073b at amd64_syscall+0x67b #13 0xffffffff80fd735b at fast_syscall_common+0xf8 Uptime: 44s Dumping 1381 out of 32611 MB:..2%..11%..21%..31%..41%..51%..61%..71%..82%..91% Dump complete Automatic reboot in 15 seconds - press a key on the console to abort
Fatal trap 12: page fault while in kernel mode cpuid = 3; apic id = 03 fault virtual address = 0x10 fault code = supervisor read data, page not present instruction pointer = 0x20:0xffffffff80d3f676 stack pointer = 0x28:0xfffffe01747fca70 frame pointer = 0x28:0xfffffe01747fcaa0 code segment = base rx0, limit 0xfffff, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 23735 (tor) rdi: fffff800283bf800 rsi: 000000000000001c rdx: fffff802051e6e78 rcx: fffff800283bf800 r8: 00000000ffffffbd r9: 0000000000000018 rax: 0000000000000000 rbx: 0000000000000000 rbp: fffffe01747fcaa0 r10: fffff8001a4cfe60 r11: fffff8004a311880 r12: 0000000000010480 r13: 0000000000000218 r14: fffffe01747fca7c r15: fffff80028066898 trap number = 12 panic: page fault cpuid = 3 time = 1720161236 KDB: stack backtrace: #0 0xffffffff80b7fbfd at kdb_backtrace+0x5d #1 0xffffffff80b32961 at vpanic+0x131 #2 0xffffffff80b32823 at panic+0x43 #3 0xffffffff80fff91b at trap_fatal+0x40b #4 0xffffffff80fff966 at trap_pfault+0x46 #5 0xffffffff80fd6a48 at calltrap+0x8 #6 0xffffffff80d0c8ad at tcp_default_output+0x1d6d #7 0xffffffff80d1e243 at tcp_usr_disconnect+0x83 #8 0xffffffff80bd6885 at soclose+0x75 #9 0xffffffff80ad1221 at _fdrop+0x11 #10 0xffffffff80ad448a at closef+0x24a #11 0xffffffff80ad8338 at closefp_impl+0x58 #12 0xffffffff8100073b at amd64_syscall+0x67b #13 0xffffffff80fd735b at fast_syscall_common+0xf8 Uptime: 42s Dumping 1381 out of 32611 MB:..2%..11%..21%..31%..41%..51%..61%..71%..82%..91% Dump complete Automatic reboot in 15 seconds - press a key on the console to abort
(In reply to tom+fbsdbugzilla from comment #27) The stack trace looks the same as the one in bug 279653: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=279653
Is the root cause of this known yet, or is more information needed? I am happy to provide more information if it would help isolate this bug.