Created attachment 201308 [details] screenshot of debugger output 13.0-CURRENT drops to the debugger on shutdown with IPNAT enabled. It is running in VirtualBox 6.0.2. The identical configuration running 12.0-RELEASE-p2, 12.0-STABLE, 11.2-RELEASE-p8 and 11.2-STABLE does not exhibit this behavior. This VM is a test machine and I can do anything that will help identify the root of this problem. I have included screenshots of the debugger output and a backtrace. Thanks. David Boyd.
Created attachment 201309 [details] screenshot of backtrace
Created attachment 201380 [details] same deal I've also been having this problem, pretty much since I first pulled the source for 13.0-CURRENT, moving from 12.0-CURRENT.
(In reply to waitman from comment #2) This panic is totally unrelated and probably suggests that your panics are not ipfilter related but some other deeper cause.
I'll only look at the ipfilter issue. The waitman@waitman.net issue needs to have its own PR. Some questions:
  1. What is the host that VirtualBox is running on? Windows? Solaris?
  1a. Are VirtualBox additions installed on the VM?
  2. uname -a please.
  3. kldstat output.
  4. ifconfig -a output.
  5. A listing of your ipf.conf and ipnat.conf rules.
  6. Is ippool in use? If yes, ippool.conf please.
  7. It would help a lot if you could get a dump.
More questions (still need the first seven answered though).
  8. Which ipfilter services are enabled? ipf, ipnat, ipfs? (ipfilter isn't called during shutdown except when ipfs is enabled.)
  9. What flags are enabled for each service? (Specifically, is -p specified for ipnat? But others too.)
Created attachment 201416 [details] uname -a output
  1. Host is CentOS EL7 (7.6.1810).
  1a. VirtualBox Guest Additions are installed in the VM.
  2. See attachment. (uname -a output)
  3. See attachment. (kldstat output)
  4. See attachment. (ifconfig output)
  5. See (2) attachments. (ipf.rules and ipnat.rules)
  6. ippool is not in use.
  7. See attachment. (core.txt.0)
  8. ipf, ipfs, ipmon, ipnat.
  9. See attachment. (ipf-ipfs-ipmon-ipnat excerpt from /etc/rc.conf.local)
Created attachment 201417 [details] kldstat output
Created attachment 201418 [details] ifconfig -a output
Created attachment 201419 [details] ipf.rules
Created attachment 201420 [details] ipnat.rules
Created attachment 201421 [details] dump output
Created attachment 201422 [details] excerpt from /etc/rc.conf.local with _flags
As suspected, ipfs is involved. It would help if I could get a copy of the dump itself, but we'll try having you be my hands and eyes first. Judging by the output you posted, either devel/gdb is installed or you're using the deprecated copy in base. Go into kgdb as you did before and enter:
  frame 17
  p ipn    <-- This should not be NULL as it's tested at line 1822 above.
  p ipn->ipn_ipnat.in_size
  p &ipn->ipn_ipnat
  p ipn->ipn_ipnat
Just out of curiosity, also:
  p nat->nat_ptr
Otherwise you might need to put the dump file somewhere (not here) so I can fetch it. I'll let you know if we need to do that, but for now the outputs above should hopefully give us the first hint of what might be happening.
All I get from "frame 17" is "no such command; use "help" to list available commands". Sorry if I gave the impression that I know what I'm doing ... not so much.
No worries. Become root with su - or sudo -i; either works. Then:
  cd /var/crash
If devel/gdb is installed:
  kgdb /boot/kernel/kernel vmcore.last
If devel/gdb is not installed, use /usr/libexec/kgdb instead. Then enter the frame command and the rest of the kgdb commands. To save time cutting and pasting here, run script(1) first, then upload the file called typescript to this PR.
If it would be easier, do you have a site where you can put vmcore.0 (0 could be any number)? I can download the vmcore file and build r343372 here (because it contains the correct offsets for the debugging symbols).
Created attachment 201456 [details] kgdb command output Cy, Attached is the output of the dump commands. Hope this helps. I'll do whatever I can to help with this issue. David.
Created attachment 201468 [details] Patch for PR 235110 Try the attached patch.
Created attachment 201511 [details] dump output after patch was applied The patch was applied but shutdown resulted in another crash. Let me know what you want me to do next. Thanks. David.
Looks like the patch fixed one panic and we hit another. ipf_nat_getent() is trying to obtain a lock it already obtained earlier. I'll send you a patch for this later.
Created attachment 201516 [details] Fix for second panic Can you please also apply this patch? Use the other patch too. This fixes the subsequent witness panic.
Cy, That seems to have fixed this problem. Thanks. David
Originally, I was attempting to check whether 13.0-CURRENT still exhibited the symptoms we once worked on in PR 191343, which was closed last year after 4 years of effort. It does (sort of), with a slightly different error message. Should I open a new PR, email current@freebsd.org, or something else? What do you think? Thanks, again. David.
That's good to hear. Just reopen the old PR and attach the new messages. Did you try the fix I posted? The PR was closed because you never replied.
A commit references this bug:
Author: cy
Date: Wed Jan 30 20:22:34 UTC 2019
New revision: 343590
URL: https://svnweb.freebsd.org/changeset/base/343590
Log:
  When copying a NAT rule struct to userland for save by ipfs, use the length of the struct in memmove() rather than an uninitialized variable. This fixes the first of two kernel page faults when ipfs is invoked.
  PR: 235110
  Reported by: David.Boyd49@twc.com
  MFC after: 2 weeks
Changes:
  head/sys/contrib/ipfilter/netinet/ip_nat.c
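The idea in r343590 can be illustrated with a minimal sketch. This is not the committed ipfilter code: the struct layouts and the `demo_` names below are hypothetical, and only the field names `ipn_ipnat` and `in_size` are borrowed from the kgdb commands earlier in this PR. The point is that the copy length is taken from the type of the destination rather than from a variable that may never have been set.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a NAT rule; NOT the real ipnat_t layout. */
struct demo_rule {
    int  in_size;       /* size field that may not be filled in yet */
    char in_name[16];
};

/* Hypothetical save record that embeds a copy of the rule. */
struct demo_save {
    struct demo_rule ipn_ipnat;
};

/*
 * Copy a rule into the save record. The length is sizeof() of the
 * destination member, so it is fixed at compile time and cannot be
 * an uninitialized or stale value.
 */
void
demo_save_rule(struct demo_save *dst, const struct demo_rule *src)
{
    memmove(&dst->ipn_ipnat, src, sizeof(dst->ipn_ipnat));
}
```

A caller would fill in a `struct demo_rule`, pass it to `demo_save_rule()`, and get a byte-for-byte copy in the save record regardless of what `in_size` contains.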
A commit references this bug:
Author: cy
Date: Wed Jan 30 20:23:16 UTC 2019
New revision: 343591
URL: https://svnweb.freebsd.org/changeset/base/343591
Log:
  Do not obtain an already held read lock. This causes a witness panic when ipfs is invoked. This is the second of two panics resolving PR 235110.
  PR: 235110
  Reported by: David.Boyd49@twc.com
  MFC after: 2 weeks
Changes:
  head/sys/contrib/ipfilter/netinet/ip_nat.c
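The class of bug addressed by r343591 can be sketched as follows. This is a single-threaded toy model, not the ipfilter locking code: the `demo_rwlock` type and all `demo_` functions are hypothetical, and the lock only records whether it was acquired while already held, which is the condition a lock-order checker such as witness reports. The fixed shape has the outer path take the read lock once and call a helper that assumes the lock is held.

```c
#include <assert.h>

/* Hypothetical non-recursive read lock that records misuse. */
struct demo_rwlock {
    int readers_held;   /* read holds taken by the single demo thread */
    int recursed;       /* set when a held lock is acquired again */
};

void demo_read_enter(struct demo_rwlock *l)
{
    if (l->readers_held > 0)
        l->recursed = 1;    /* a kernel lock checker would complain here */
    l->readers_held++;
}

void demo_read_exit(struct demo_rwlock *l)
{
    l->readers_held--;
}

/* Fixed helper: the caller must already hold the read lock. */
int demo_getent_locked(const int *table, int idx)
{
    return table[idx];
}

/* Buggy helper: takes the read lock itself, even if the caller holds it. */
int demo_getent_buggy(struct demo_rwlock *l, const int *table, int idx)
{
    int v;

    demo_read_enter(l);
    v = table[idx];
    demo_read_exit(l);
    return v;
}

/* Outer path: takes the read lock once, then calls one of the helpers. */
int demo_save_entry(struct demo_rwlock *l, const int *table, int idx,
    int use_buggy)
{
    int v;

    demo_read_enter(l);
    v = use_buggy ? demo_getent_buggy(l, table, idx)
                  : demo_getent_locked(table, idx);
    demo_read_exit(l);
    return v;
}
```

Calling `demo_save_entry()` with `use_buggy` set leaves `recursed` at 1, while the fixed path leaves it at 0; both return the same table entry, which is why this kind of bug tends to surface only when lock checking is enabled.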
This fix is incorrect. PR 191343 will require a complete assessment of the ipfs feature.
A commit references this bug:
Author: cy
Date: Thu Feb 14 00:52:04 UTC 2019
New revision: 344113
URL: https://svnweb.freebsd.org/changeset/base/344113
Log:
  MFC r343591: Do not obtain an already held read lock. This causes a witness panic when ipfs is invoked. This is the second of two panics resolving PR 235110.
  PR: 235110
  Reported by: David.Boyd49@twc.com
Changes:
  _U stable/10/
  stable/10/sys/contrib/ipfilter/netinet/ip_nat.c
  _U stable/11/
  stable/11/sys/contrib/ipfilter/netinet/ip_nat.c
  _U stable/12/
  stable/12/sys/contrib/ipfilter/netinet/ip_nat.c
^Triage: committed back in 2019.