204802 – [dtrace] panic: fatal double fault in fbt/nfs

Status:	Closed Not A Bug

Alias:	None

Product:	Base System
Classification:	Unclassified
Component:	kern
Version:	CURRENT
Hardware:	Any Any

Importance:	--- Affects Some People
Assignee:	Mark Johnston

URL:
Keywords:

Depends on:
Blocks:

Reported:	2015-11-25 08:40 UTC by Enji Cooper
Modified:	2015-12-01 17:51 UTC
CC List:	2 users

See Also:

Description Enji Cooper freebsd_committer

2015-11-25 08:40:28 UTC

I was running the dtrace test suite on ~r290924 and ran into the following double-fault panic:

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:

Fatal double fault
rip = 0xffffffff82262310
rsp = 0xfffffe00dd45fc60
rbp = 0xfffffe00dd460f80
cpuid = 2; apic id = 02
panic: double fault
cpuid = 2
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00d7ea0e30
vpanic() at vpanic+0x182/frame 0xfffffe00d7ea0eb0
panic() at panic+0x43/frame 0xfffffe00d7ea0f10
dblfault_handler() at dblfault_handler+0xa2/frame 0xfffffe00d7ea0f30
Xdblfault() at Xdblfault+0xac/frame 0xfffffe00d7ea0f30
--- trap 0x17, rip = 0xffffffff82262310, rsp = 0xfffffe00dd45fc60, rbp = 0xfffffe00dd460f80 ---
dtrace_disx86() at dtrace_disx86+0x170/frame 0xfffffe00dd460f80
dtrace_dis_isize() at dtrace_dis_isize+0x76/frame 0xfffffe00dd461590
dtrace_instr_size() at dtrace_instr_size+0x32/frame 0xfffffe00dd4615c0
dtrace_trap() at dtrace_trap+0x138/frame 0xfffffe00dd461630
trap_check() at trap_check+0x1e/frame 0xfffffe00dd461650
calltrap() at calltrap+0x8/frame 0xfffffe00dd461650
--- trap 0xc, rip = 0xffffffff8223b1ab, rsp = 0xfffffe00dd461720, rbp = 0xfffffe00dd461768 ---
dtrace_load8() at dtrace_load8+0x10b/frame 0xfffffe00dd461768
dtrace_dif_subr() at dtrace_dif_subr+0x2385/frame 0xfffffe00dd461ec8
dtrace_dif_emulate() at dtrace_dif_emulate+0x1df4/frame 0xfffffe00dd462488
dtrace_probe() at dtrace_probe+0xbfb/frame 0xfffffe00dd462878
fbt_invop() at fbt_invop+0x1a5/frame 0xfffffe00dd462948
dtrace_invop() at dtrace_invop+0x40/frame 0xfffffe00dd462988
dtrace_invop_start() at dtrace_invop_start+0x22/frame 0xfffffe00dd462af0
clnt_vc_call() at clnt_vc_call+0x7e7/frame 0xfffffe00dd462c60
clnt_reconnect_call() at clnt_reconnect_call+0x676/frame 0xfffffe00dd462d20
newnfs_request() at newnfs_request+0xa53/frame 0xfffffe00dd462e80
nfscl_request() at nfscl_request+0x72/frame 0xfffffe00dd462ed0
nfsrpc_getattr() at nfsrpc_getattr+0xb8/frame 0xfffffe00dd463020
nfs_getattr() at nfs_getattr+0x176/frame 0xfffffe00dd4631d0
VOP_GETATTR_APV() at VOP_GETATTR_APV+0xa0/frame 0xfffffe00dd463200
nfs_lookup() at nfs_lookup+0x483/frame 0xfffffe00dd463530
VOP_LOOKUP_APV() at VOP_LOOKUP_APV+0xa0/frame 0xfffffe00dd463560
lookup() at lookup+0x59e/frame 0xfffffe00dd4635e0
namei() at namei+0x5a1/frame 0xfffffe00dd4636a0
vn_open_cred() at vn_open_cred+0x21c/frame 0xfffffe00dd463810
kern_openat() at kern_openat+0x25f/frame 0xfffffe00dd463990
amd64_syscall() at amd64_syscall+0x50b/frame 0xfffffe00dd463ab0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe00dd463ab0
--- syscall (5, FreeBSD ELF64, sys_open), rip = 0x801385eaa, rsp = 0x7fffffff9938, rbp = 0x7fffffff9970 ---

Comment 1 Mark Johnston freebsd_committer

2015-11-25 18:55:54 UTC

Could you provide the core file? Which test was running when the crash occurred?

Comment 2 Mark Johnston freebsd_committer

2015-11-26 00:08:59 UTC

(For context, ngie provided me with the core file offline.)

This appears to have been caused by a stack overflow.
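
As a rough sanity check of the overflow diagnosis, the addresses in the backtrace from the description can be subtracted to estimate how much kernel stack was in use at the fault. This is a sketch only: the three addresses are copied from the backtrace, but the 16 KB stack size (4 pages of 4 KB) is an assumption about the kernel configuration, not something stated in this report.

```python
# Addresses copied from the backtrace in the description.
SYSCALL_FRAME = 0xfffffe00dd463ab0  # frame of amd64_syscall / Xfast_syscall
FAULT_RSP = 0xfffffe00dd45fc60      # rsp at the double fault
FAULT_RBP = 0xfffffe00dd460f80      # rbp at the double fault (dtrace_disx86)

# ASSUMPTION: 4 pages of 4 KB per kernel thread stack. Check the
# KSTACK_PAGES setting of the kernel in question before relying on this.
ASSUMED_KSTACK_BYTES = 4 * 4096


def stack_bytes_between(outer, inner):
    """The stack grows downward, so usage is outer address minus inner."""
    return outer - inner


# Stack consumed between syscall entry and the faulting instruction.
used = stack_bytes_between(SYSCALL_FRAME, FAULT_RSP)

# Stack consumed by the innermost frame (dtrace_disx86) alone.
last_frame = stack_bytes_between(FAULT_RBP, FAULT_RSP)

print(f"used since syscall entry: {used} bytes")
print(f"dtrace_disx86 frame:      {last_frame} bytes")
print(f"assumed stack size:       {ASSUMED_KSTACK_BYTES} bytes")
```

Under that assumption, the thread had consumed nearly the whole stack by the time dtrace_disx86 ran, and that one frame accounts for almost 5 KB of it.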

There's something strange about the way your kernel was compiled. In particular,
stack frames are using far more space than they should. For instance, your
kernel has:

(kgdb) disas dtrace_disx86
Dump of assembler code for function dtrace_disx86:
0xffffffff822621a0 <dtrace_disx86+0>:   push   %rbp
0xffffffff822621a1 <dtrace_disx86+1>:   mov    %rsp,%rbp
0xffffffff822621a4 <dtrace_disx86+4>:   push   %rbx
0xffffffff822621a5 <dtrace_disx86+5>:   sub    $0x1318,%rsp <--- over 4KB!
0xffffffff822621ac <dtrace_disx86+12>:  mov    %rdi,-0x18(%rbp)
0xffffffff822621b0 <dtrace_disx86+16>:  mov    %esi,-0x1c(%rbp)
0xffffffff822621b3 <dtrace_disx86+19>:  movl   $0x0,-0x2c(%rbp)
0xffffffff822621ba <dtrace_disx86+26>:  movl   $0x0,-0x44(%rbp)
0xffffffff822621c1 <dtrace_disx86+33>:  movl   $0x1,-0x6c(%rbp)

For comparison, my kernel has:

(kgdb) disas dtrace_disx86
Dump of assembler code for function dtrace_disx86:
0xffffffff81504930 <dtrace_disx86+0>:   push   %rbp
0xffffffff81504931 <dtrace_disx86+1>:   mov    %rsp,%rbp
0xffffffff81504934 <dtrace_disx86+4>:   push   %r15
0xffffffff81504936 <dtrace_disx86+6>:   push   %r14
0xffffffff81504938 <dtrace_disx86+8>:   push   %r13
0xffffffff8150493a <dtrace_disx86+10>:  push   %r12
0xffffffff8150493c <dtrace_disx86+12>:  push   %rbx
0xffffffff8150493d <dtrace_disx86+13>:  sub    $0x88,%rsp <--- more reasonable
0xffffffff81504944 <dtrace_disx86+20>:  mov    %esi,%r15d
0xffffffff81504947 <dtrace_disx86+23>:  mov    %rdi,%rax
0xffffffff8150494a <dtrace_disx86+26>:  movl   $0x0,-0x2c(%rbp)
0xffffffff81504951 <dtrace_disx86+33>:  movl   $0x0,-0x30(%rbp)

Both kernels were compiled with clang 3.7.0.

Make sure you don't have any local changes or settings that might be causing
this.
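
To put numbers on the two prologues above, here is a minimal sketch that adds up the stack space each one reserves: 8 bytes per push plus the immediate in the "sub ...,%rsp" instruction. The prologue text is copied from the two kgdb listings with addresses stripped; the helper name frame_bytes is made up for this illustration.

```python
import re

# Prologue instructions copied from the two kgdb listings above.
PROLOGUE_REPORTED = """
push   %rbp
mov    %rsp,%rbp
push   %rbx
sub    $0x1318,%rsp
"""

PROLOGUE_COMPARISON = """
push   %rbp
mov    %rsp,%rbp
push   %r15
push   %r14
push   %r13
push   %r12
push   %rbx
sub    $0x88,%rsp
"""

SUB_RSP = re.compile(r"sub\s+\$(0x[0-9a-fA-F]+),%rsp")


def frame_bytes(prologue):
    """Bytes a prologue takes from the stack: pushes plus the sub immediate."""
    total = 0
    for line in prologue.splitlines():
        line = line.strip()
        if line.startswith("push"):
            total += 8  # each 64-bit push moves rsp down by 8 bytes
        match = SUB_RSP.match(line)
        if match:
            total += int(match.group(1), 16)
    return total


reported = frame_bytes(PROLOGUE_REPORTED)
comparison = frame_bytes(PROLOGUE_COMPARISON)
print(f"reported kernel:   {reported} bytes")
print(f"comparison kernel: {comparison} bytes")
```

The count excludes the 8-byte return address pushed by the caller, so it is a lower bound on what each call to dtrace_disx86 costs.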

Comment 3 Mark Johnston freebsd_committer

2015-12-01 17:51:51 UTC

This is the result of a problem with the build environment that produced the
kernel, not a bug in the kernel itself.