Bug 259645

Summary: crash in in_cksumdata (sys/amd64/amd64/in_cksum.c:113) via in4_cksum (sys/netpfil/pf/in4_cksum.c:117)
Product: Base System
Reporter: doctor
Component: kern
Assignee: Mark Johnston <markj>
Status: Closed FIXED
Severity: Affects Only Me
CC: arnaud, chris, emaste, gallatin, jhb, kp, markj, mjg, net
Priority: ---
Keywords: crash, regression
Version: 13.0-RELEASE
Flags: koobs: mfc-stable13+, koobs: mfc-stable12-
Hardware: Any
OS: Any
URL: https://reviews.freebsd.org/D33096
See Also: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=254419
Attachments (description, flags):
 - crash text (flags: none)
 - 2nd crash tesxt (flags: none)
 - file containing technical details upon request (flags: none)
 - Yet another crash from this morning (flags: none)
 - 4th crash (flags: none)
 - 5th crash (flags: none)
 - apn PF Firewall configuration (flags: none)
 - apn pciconf -lv (flags: none)
 - apn dmesg.boot (flags: none)

Description doctor 2021-11-04 15:34:40 UTC
Created attachment 229267 [details]
crash text

Attached are the details
Comment 1 Kubilay Kocak freebsd_committer freebsd_triage 2021-11-05 00:32:38 UTC
Thank you for your report. Could you include additional information, including:

 - /var/run/dmesg.boot output (as an attachment)
 - /etc/rc.conf network configuration (as an attachment)
 - pciconf -lv output (as an attachment)
 - firewall (pf) configuration (as an attachment, sanitized where necessary)
Comment 2 Kristof Provost freebsd_committer freebsd_triage 2021-11-05 08:28:46 UTC
This smells a lot like bug #254419. Possibly we need to call mb_unmapped_to_ext() in pf_check_proto_cksum() as well.
Comment 3 Ed Maste freebsd_committer freebsd_triage 2021-11-05 15:08:37 UTC
> after FreeBSD 13.0 p5 update

update from which version?
Comment 4 doctor 2021-11-05 15:53:16 UTC
FreeBSD 13.0 p4 to 13.0 p5
Comment 5 doctor 2021-11-05 15:53:36 UTC
It did happen a second time.
Comment 6 doctor 2021-11-05 15:56:59 UTC
Created attachment 229295 [details]
2nd crash tesxt
Comment 7 doctor 2021-11-05 16:03:55 UTC
Kubilay Kocak, I am adding what you requested in a file called crash20211105.
Comment 8 doctor 2021-11-05 16:04:56 UTC
Created attachment 229296 [details]
file containing technical details upon request
Comment 9 Mark Johnston freebsd_committer freebsd_triage 2021-11-05 21:51:16 UTC
(In reply to Kristof Provost from comment #2)
This path is somewhat tricky to fix, it seems, since we'd have to modify a few functions to return a new mbuf, and handle out-of-memory conditions.

So maybe this (WIP) approach is better instead: https://reviews.freebsd.org/D32859
Comment 10 doctor 2021-11-07 01:09:06 UTC
Created attachment 229338 [details]
Yet another crash from this morning

3rd crash since updating to 13.0 p5
Comment 11 doctor 2021-11-07 17:22:17 UTC
This is getting ridiculous.  I am getting crashes rather regularly.  Should I turn pf off until this is fixed?
Comment 12 Mark Johnston freebsd_committer freebsd_triage 2021-11-07 20:17:54 UTC
(In reply to doctor from comment #11)
Could you please instead try setting

# sysctl kern.ipc.mb_use_ext_pgs=0

and confirm whether or not the panics go away?  I am working on a proper solution in the meantime.
Comment 13 doctor 2021-11-08 03:08:49 UTC
Just set it to 0.

Still, this should be treated as a DoS-style takedown. I am adding 2 more core.txt files.
Comment 14 doctor 2021-11-08 03:11:53 UTC
Created attachment 229351 [details]
4th crash

4th crash text
Comment 15 doctor 2021-11-08 03:13:02 UTC
Created attachment 229352 [details]
5th crash

5th crash report
Comment 16 Mark Johnston freebsd_committer freebsd_triage 2021-11-22 22:43:00 UTC
*** Bug 257627 has been marked as a duplicate of this bug. ***
Comment 17 commit-hook freebsd_committer freebsd_triage 2021-11-24 18:39:50 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=0d9c3423f59bb305301f5a5bc7c8f5daf7b7aa52

commit 0d9c3423f59bb305301f5a5bc7c8f5daf7b7aa52
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2021-11-24 18:19:54 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2021-11-24 18:31:16 +0000

    netinet: Implement in_cksum_skip() using m_apply()

    This allows it to work with unmapped mbufs.  In particular,
    in_cksum_skip() calls no longer need to be preceded by calls to
    mb_unmapped_to_ext() to avoid a page fault.

    PR:             259645
    Reviewed by:    gallatin, glebius, jhb
    MFC after:      1 week
    Sponsored by:   The FreeBSD Foundation
    Differential Revision:  https://reviews.freebsd.org/D33096

 sys/netinet/in_cksum.c | 63 +++++++++++++++++++++++++-------------------------
 1 file changed, 32 insertions(+), 31 deletions(-)
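
For readers unfamiliar with the approach named in the commit message, the following is a rough user-space sketch of the idea, not the committed kernel code. The checksum routine never dereferences the chain as one flat buffer; it is handed each contiguous piece in turn by an iterator and carries the byte parity across piece boundaries. All names here (struct frag, frag_apply, cksum_piece, cksum_skip) are hypothetical stand-ins and do not correspond to the real mbuf or m_apply() definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an mbuf chain: a linked list of byte ranges. */
struct frag {
	const uint8_t *data;
	size_t len;
	const struct frag *next;
};

/* State carried from one contiguous piece to the next. */
struct cksum_state {
	uint64_t sum;	/* unfolded one's-complement accumulator */
	int odd;	/* nonzero when the next byte is the low half of a word */
};

typedef int (*frag_fn)(void *arg, const uint8_t *data, size_t len);

/*
 * Call fn on every contiguous piece of the byte range [off, off + len).
 * Returns -1 if the chain is too short, otherwise the first nonzero
 * result from fn, otherwise 0.
 */
static int
frag_apply(const struct frag *f, size_t off, size_t len, frag_fn fn, void *arg)
{
	/* Skip whole fragments that lie before the starting offset. */
	while (f != NULL && off >= f->len) {
		off -= f->len;
		f = f->next;
	}
	while (len > 0) {
		size_t n;
		int rv;

		if (f == NULL)
			return (-1);
		n = f->len - off;
		if (n > len)
			n = len;
		if (n > 0 && (rv = fn(arg, f->data + off, n)) != 0)
			return (rv);
		len -= n;
		off = 0;
		f = f->next;
	}
	return (0);
}

/* Add one contiguous piece to the running sum, byte by byte. */
static int
cksum_piece(void *arg, const uint8_t *p, size_t len)
{
	struct cksum_state *st = arg;

	for (size_t i = 0; i < len; i++) {
		/* Bytes at even offsets are the high half of a 16-bit word. */
		st->sum += st->odd ? (uint64_t)p[i] : (uint64_t)p[i] << 8;
		st->odd = !st->odd;
	}
	return (0);
}

/* Internet checksum over bytes [skip, len) of the chain. */
static uint16_t
cksum_skip(const struct frag *f, size_t len, size_t skip)
{
	struct cksum_state st = { 0, 0 };
	int rv;

	assert(skip <= len);
	rv = frag_apply(f, skip, len - skip, cksum_piece, &st);
	assert(rv == 0);	/* the sketch treats a short chain as a bug */
	(void)rv;
	/* Fold the carries back into the low 16 bits. */
	while (st.sum >> 16)
		st.sum = (st.sum & 0xffff) + (st.sum >> 16);
	return ((uint16_t)~st.sum);
}
```

Because the callback only ever sees a pointer and a length for one piece at a time, an iterator that knows how to present each piece of an unmapped buffer can feed it without first converting the whole chain, which is the property the commit message describes. The byte-at-a-time loop is chosen for clarity; a real implementation would sum wider words.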
Comment 18 commit-hook freebsd_committer freebsd_triage 2021-12-01 12:49:25 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=dfd5240189ca024b268e53df2f0a3076df57b240

commit dfd5240189ca024b268e53df2f0a3076df57b240
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2021-11-24 18:19:54 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2021-12-01 12:43:03 +0000

    netinet: Implement in_cksum_skip() using m_apply()

    This allows it to work with unmapped mbufs.  In particular,
    in_cksum_skip() calls no longer need to be preceded by calls to
    mb_unmapped_to_ext() to avoid a page fault.

    PR:             259645
    Reviewed by:    gallatin, glebius, jhb
    Sponsored by:   The FreeBSD Foundation

    (cherry picked from commit 0d9c3423f59bb305301f5a5bc7c8f5daf7b7aa52)

 sys/netinet/in_cksum.c | 63 +++++++++++++++++++++++++-------------------------
 1 file changed, 32 insertions(+), 31 deletions(-)
Comment 19 Mark Johnston freebsd_committer freebsd_triage 2021-12-01 13:13:20 UTC
Fixed in stable/13 now.  The patches will appear in 13.1, as we have a workaround (kern.ipc.mb_use_ext_pgs=0) in the meantime.  With 13.1 that workaround will not be necessary.
Comment 20 Kubilay Kocak freebsd_committer freebsd_triage 2021-12-01 22:55:22 UTC
(In reply to Mark Johnston from comment #19)

Could you confirm stable/12 is not affected and doesn't need the merge?
Comment 21 Mark Johnston freebsd_committer freebsd_triage 2021-12-03 14:56:13 UTC
(In reply to Kubilay Kocak from comment #20)
Yes, the bug is in 13.0 and later.
Comment 22 Arnaud de Prelle 2021-12-07 07:51:45 UTC
Hi,

I patched yesterday at around 8AM CET from 13.0-p4 to 13.0-p5 and experienced two crashes within a few minutes at around 11PM CET.
I guess it's related to this bug and will apply the workaround.

I'm running on a Bare Metal server (i5-3570S).
More server details in attachments (prefixed by apn).
Comment 23 Arnaud de Prelle 2021-12-07 07:56:26 UTC
Created attachment 229950 [details]
apn PF Firewall configuration
Comment 24 Arnaud de Prelle 2021-12-07 07:57:31 UTC
Created attachment 229951 [details]
apn pciconf -lv
Comment 25 Arnaud de Prelle 2021-12-07 07:58:18 UTC
Created attachment 229952 [details]
apn dmesg.boot
Comment 26 Mark Johnston freebsd_committer freebsd_triage 2021-12-07 13:45:35 UTC
(In reply to Arnaud de Prelle from comment #22)
Can you share the panic string and backtrace, or a copy of /var/crash/core.txt.<N> from the crash?  The files that you attached do not tell me anything.  The bug has nothing to do with 13.0-p5 specifically; the title is misleading.
Comment 27 Arnaud de Prelle 2021-12-07 13:55:55 UTC
(In reply to Mark Johnston from comment #26)
Hi Mark,

I don't have a core file in /var/crash:

# ls -ltra /var/crash
total 12
-rw-r--r--   1 root  wheel    5 Jan 16  2014 minfree
drwxr-x---   2 root  wheel  512 Jan 16  2014 .
drwxr-xr-x  30 root  wheel  512 Dec  6 22:54 ..
#

Note that it's the first and only crash in years that I have experienced on my FreeBSD server, so I immediately linked this crash to the update performed a few hours before and to this ticket, which I was already following.

Now, could it be a coincidence? A power cut at OVH (my bare-metal host)? A hardware issue?
Comment 28 Mark Johnston freebsd_committer freebsd_triage 2021-12-07 14:27:53 UTC
(In reply to Arnaud de Prelle from comment #27)
Do you have kernel dumps configured?  What does "dumpon -l" print?

Without some console logs from the time of the reset it's impossible to say what happened.  There is nothing that can be diagnosed from what you have shared here.  If there is more info available, then I suggest opening a new PR since I'm skeptical that this one is related.
Comment 29 Arnaud de Prelle 2021-12-07 19:26:39 UTC
(In reply to Mark Johnston from comment #28)

Yes, kernel dumps are enabled, as far as I can tell:

BSD:~ # grep dumpdev /etc/rc.conf | grep -v "^#"
dumpdev="AUTO"
BSD:~ # dumpon -l
ada0s1b
# grep ada0s1b /etc/fstab 
/dev/ada0s1b	swap		swap		sw	0	0

I issued a "savecore /var/crash /dev/ada0s1b" which didn't result in files being generated in /var/crash/.

I'll open a dedicated PR should I have more valuable logs/outputs to share.