Bug 238520 - [sctp] Fatal trap 9: general protection fault while in kernel mode
Summary: [sctp] Fatal trap 9: general protection fault while in kernel mode
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: CURRENT
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: Michael Tuexen
URL:
Keywords: crash, stress2
Depends on:
Blocks:
 
Reported: 2019-06-12 12:27 UTC by Peter Holm
Modified: 2020-07-10 16:29 UTC (History)
CC List: 3 users

See Also:
koobs: mfc-stable12+
tuexen: mfc-stable11?


Attachments
sctp panic debug (304.05 KB, text/plain)
2019-06-12 13:39 UTC, Kubilay Kocak

Description Peter Holm freebsd_committer freebsd_triage 2019-06-12 12:27:11 UTC
20190612 11:51:26 all (1/1): sctp.sh
Fatal trap 9: general protection fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer	= 0x20:0xffffffff80ba77a8
stack pointer	        = 0x28:0xfffffe00ae6275f0
frame pointer	        = 0x28:0xfffffe00ae627670
code segment		= base rx0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= 69451 (server)
trap number		= 9
panic: general protection fault
cpuid = 2
time = 1560333091
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00ae627300
vpanic() at vpanic+0x19d/frame 0xfffffe00ae627350
panic() at panic+0x43/frame 0xfffffe00ae6273b0
trap_fatal() at trap_fatal+0x39c/frame 0xfffffe00ae627410
trap() at trap+0x6c/frame 0xfffffe00ae627520
calltrap() at calltrap+0x8/frame 0xfffffe00ae627520
--- trap 0x9, rip = 0xffffffff80ba77a8, rsp = 0xfffffe00ae6275f0, rbp = 0xfffffe00ae627670 ---
__mtx_lock_sleep() at __mtx_lock_sleep+0xf8/frame 0xfffffe00ae627670
__mtx_lock_flags() at __mtx_lock_flags+0xee/frame 0xfffffe00ae6276c0
sctp_accept() at sctp_accept+0xa6e/frame 0xfffffe00ae627840
soaccept() at soaccept+0x174/frame 0xfffffe00ae627890
kern_accept4() at kern_accept4+0x26e/frame 0xfffffe00ae627930
accept1() at accept1+0xe8/frame 0xfffffe00ae627990
amd64_syscall() at amd64_syscall+0x291/frame 0xfffffe00ae627ab0
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe00ae627ab0

Details @ https://people.freebsd.org/~pho/stress/log/sctp.txt
Comment 1 Kubilay Kocak freebsd_committer freebsd_triage 2019-06-12 13:39:57 UTC
Created attachment 205007
sctp panic debug

Attaching the backtrace to the issue, since external URL references tend to go stale or missing over time.
Comment 2 Michael Tuexen freebsd_committer freebsd_triage 2019-08-04 21:12:17 UTC
Is there a way to reproduce the issue locally?
Comment 3 Peter Holm freebsd_committer freebsd_triage 2019-08-05 06:58:00 UTC
To reproduce the problem, run the https://people.freebsd.org/~pho/setup.sh script like this:

root@t2:~pho/stress2/tools # ./setup.sh
Enter non-root test user name: stress
Extracting stress2 to /tmp/work
Tests to run are in /tmp/work/stress2/misc.
To run all tests, type ./all.sh -on
To run for example all tmpfs tests, type ./all.sh -on `grep -l tmpfs *.sh`
To run fdatasync.sh for one hour, type ./all.sh -m 60 fdatasync.sh
root@t2:~pho/stress2/tools # cd /tmp/work/stress2/misc
root@t2:/tmp/work/stress2/misc # ./all.sh -a sctp.sh
Note: including known problem tests.

20190805 07:18:26 all: sctp.sh
20190805 07:18:26 all (1/1): sctp.sh
    swap: run time  0+00:01:00, incarnations  16, load 100, verbose 1
20190805 07:19:40 all: sctp.sh
:
20190805 08:35:17 all: sctp.sh
20190805 08:35:17 all (1/1): sctp.sh
    swap: run time  0+00:01:00, incarnations  17, load 100, verbose 1


Fatal trap 9: general protection fault while in kernel mode
cpuid = 22; apic id = 2a
instruction pointer     = 0x20:0xffffffff80bae898
stack pointer           = 0x28:0xfffffe00cd308710
frame pointer           = 0x28:0xfffffe00cd308790
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 42445 (server)
trap number             = 9
panic: general protection fault
cpuid = 22
time = 1564986950
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00cd308420
vpanic() at vpanic+0x19d/frame 0xfffffe00cd308470
panic() at panic+0x43/frame 0xfffffe00cd3084d0
trap_fatal() at trap_fatal+0x39c/frame 0xfffffe00cd308530
trap() at trap+0x6c/frame 0xfffffe00cd308640
calltrap() at calltrap+0x8/frame 0xfffffe00cd308640
--- trap 0x9, rip = 0xffffffff80bae898, rsp = 0xfffffe00cd308710, rbp = 0xfffffe00cd308790 ---
__mtx_lock_sleep() at __mtx_lock_sleep+0xf8/frame 0xfffffe00cd308790
__mtx_lock_flags() at __mtx_lock_flags+0xee/frame 0xfffffe00cd3087e0
sctp_accept() at sctp_accept+0x517/frame 0xfffffe00cd308850
soaccept() at soaccept+0xa3/frame 0xfffffe00cd308880
kern_accept4() at kern_accept4+0x27a/frame 0xfffffe00cd308920
accept1() at accept1+0xf1/frame 0xfffffe00cd308980
amd64_syscall() at amd64_syscall+0x2d6/frame 0xfffffe00cd308ab0
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe00cd308ab0
--- syscall (30, FreeBSD ELF64, sys_accept), rip = 0x8003a4b0a, rsp = 0x7fffffffe318, rbp = 0x7fffffffe800 ---
KDB: enter: panic
[ thread pid 42445 tid 100596 ]
Stopped at      kdb_enter+0x3b: movq    $0,kdb_why
db> x/s version
version:        FreeBSD 13.0-CURRENT #1 r350555: Sat Aug  3 15:50:17 CEST 2019\012    pho@t2.osted.lan:/usr/src/sys/amd64/compile/PHO\012
db>
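
For readers who want to see which userland calls reach the code path in the backtraces above (accept1 -> kern_accept4 -> soaccept -> sctp_accept), here is a minimal sketch of a one-to-one style SCTP server. It is not the stress2 sctp.sh test and is not claimed to trigger the panic; the function names sctp_listener and accept_and_close, the loopback address and the backlog of 128 are assumptions made for illustration only.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a one-to-one style SCTP listening socket
 * bound to 127.0.0.1:port (port 0 lets the kernel pick one).
 * Returns the descriptor, or -1 if SCTP is unavailable or setup fails.
 */
int
sctp_listener(unsigned short port)
{
	struct sockaddr_in sin;
	int fd;

	fd = socket(AF_INET, SOCK_STREAM, IPPROTO_SCTP);
	if (fd == -1)
		return (-1);

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_port = htons(port);
	sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

	if (bind(fd, (struct sockaddr *)&sin, sizeof(sin)) == -1 ||
	    listen(fd, 128) == -1) {
		close(fd);
		return (-1);
	}
	return (fd);
}

/*
 * Hypothetical helper: accept up to `count` associations and close each
 * one immediately. Each accept(2) call on the SCTP listener is what
 * enters the kernel accept path shown in the backtraces.
 * Blocks until a peer connects; stops early if accept(2) fails.
 * Returns the number of associations accepted.
 */
int
accept_and_close(int lfd, int count)
{
	int accepted = 0;

	while (accepted < count) {
		int fd = accept(lfd, NULL, NULL);

		if (fd == -1)
			break;
		close(fd);
		accepted++;
	}
	return (accepted);
}
```

A caller would obtain a descriptor with sctp_listener(), pass it to accept_and_close() while one or more clients connect and disconnect concurrently, and close the listener afterwards.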
Comment 4 Michael Tuexen freebsd_committer freebsd_triage 2019-08-05 07:52:33 UTC
Thanks for the script. I was able to reproduce it.
Comment 5 commit-hook freebsd_committer freebsd_triage 2019-08-06 10:30:13 UTC
A commit references this bug:

Author: tuexen
Date: Tue Aug  6 10:29:20 UTC 2019
New revision: 350626
URL: https://svnweb.freebsd.org/changeset/base/350626

Log:
  Fix a locking issue in sctp_accept.

  PR:			238520
  Reported by:		pho@
  MFC after:		1 week

Changes:
  head/sys/netinet/sctp_usrreq.c
Comment 6 Michael Tuexen freebsd_committer freebsd_triage 2019-08-06 10:32:25 UTC
@pho: Could you retest with the fix from base r350626 included? I think it should fix the issue; I was not able to reproduce it anymore.
Comment 7 Peter Holm freebsd_committer freebsd_triage 2019-08-06 14:41:05 UTC
r350626 LGTM.
Comment 8 Michael Tuexen freebsd_committer freebsd_triage 2019-08-06 14:43:58 UTC
(In reply to Peter Holm from comment #7)
Thanks!
Comment 9 commit-hook freebsd_committer freebsd_triage 2019-09-07 11:58:38 UTC
A commit references this bug:

Author: tuexen
Date: Sat Sep  7 11:58:32 UTC 2019
New revision: 352001
URL: https://svnweb.freebsd.org/changeset/base/352001

Log:
  MFC r350626:

  Fix a locking issue in sctp_accept.

  PR:			238520
  Reported by:		pho@

Changes:
_U  stable/12/
  stable/12/sys/netinet/sctp_usrreq.c
Comment 10 Kubilay Kocak freebsd_committer freebsd_triage 2019-09-07 12:12:37 UTC
@Michael Does that mean stable/11 isn't affected or shouldn't get a merge?
Comment 11 Michael Tuexen freebsd_committer freebsd_triage 2019-09-07 12:25:28 UTC
(In reply to Kubilay Kocak from comment #10)

I haven't changed anything. Not sure how the flags were changed. I only MFCed the fix to stable/12. I'm still planning to MFC it to stable/11.
Comment 12 Kubilay Kocak freebsd_committer freebsd_triage 2019-09-07 12:36:44 UTC
(In reply to Michael Tuexen from comment #11)

Gotcha! The Closed->FIXED threw me off :)
Comment 13 commit-hook freebsd_committer freebsd_triage 2020-05-07 00:51:51 UTC
A commit references this bug:

Author: tuexen
Date: Thu May  7 00:50:51 UTC 2020
New revision: 360727
URL: https://svnweb.freebsd.org/changeset/base/360727

Log:
  MFC r350626: Fix a locking issue in SCTP

  Fix a locking issue in sctp_accept.

  PR:		238520
  Reported by:	pho

Changes:
_U  stable/11/
  stable/11/sys/netinet/sctp_usrreq.c