Bug 185827 - [nfs] [panic] Kernel Panic after upgrading NFS server from FreeBSD 9.1 to 9.2
Summary: [nfs] [panic] Kernel Panic after upgrading NFS server from FreeBSD 9.1 to 9.2
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 9.2-RELEASE
Hardware: Any Any
Importance: Normal Affects Only Me
Assignee: freebsd-fs (Nobody)
Reported: 2014-01-16 19:10 UTC by John Hickey
Modified: 2014-04-20 13:58 UTC (History)

Description John Hickey 2014-01-16 19:10:00 UTC
After upgrading a heavily used NFS server from FreeBSD 9.1 i386 to 9.2 i386, we get a repeatable kernel panic around the time the NFS server starts during boot.  This happened at every boot with the same trace until I went back to FreeBSD 9.1:




panic: stack overflow detected; backtrace may be corrupted
cpuid = 5
KDB: stack backtrace:
#0 0xc0b180ef at kdb_backtrace+0x4f
#1 0xc0adf36f at panic+0x16f
#2 0xc0b0a482 at __stack_chk_fail+0x12
#3 0xc0cc9e83 at fha_assign+0x433
#4 0xc0a09ee0 at fhanew_assign+0x20
#5 0xc0ce2237 at svc_run_internal+0x767
#6 0xc0ce25f0 at svc_thread_start+0x10
#7 0xc0aaad9f at fork_exit+0xcf
#8 0xc0f36664 at fork_trampoline+0x8
Uptime: 7m22s
Physical memory: 3558 MB
Dumping 265 MB: 250 234 218 202 186 170 154 138 122 106 90 74 58 42 26 10

#0  doadump (textdump=1) at pcpu.h:249
249     pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) #0  doadump (textdump=1) at pcpu.h:249
#1  0xc0adf0b5 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:449
#2  0xc0adf3b2 in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:637
#3  0xc0b0a482 in __stack_chk_fail ()
    at /usr/src/sys/kern/stack_protector.c:17
#4  0xc0cc9e83 in fha_assign (this_thread=0xcf274500, req=0xcf328000,
    softc=0xc12cd860) at /usr/src/sys/nfs/nfs_fha.c:463
#5  0xc0a09ee0 in fhanew_assign (this_thread=0xcf274500, req=0xcf328000)
    at /usr/src/sys/fs/nfsserver/nfs_fha_new.c:271
#6  0xc0ce2237 in svc_run_internal (pool=0xcf272980, ismaster=0)
    at /usr/src/sys/rpc/svc.c:1109
#7  0xc0ce25f0 in svc_thread_start (arg=0xcf272980)
    at /usr/src/sys/rpc/svc.c:1200
#8  0xc0aaad9f in fork_exit (callout=0xc0ce25e0 <svc_thread_start>,
    arg=0xcf272980, frame=0xe876dd08) at /usr/src/sys/kern/kern_fork.c:992
#9  0xc0f36664 in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:279
(kgdb)

How-To-Repeat: I am not sure how to repeat this outside of our environment.  We have a large number of NFS clients served by this machine.  It could be a race condition exposed by so many connections arriving at boot time.
Comment 1 John Hickey 2014-01-16 19:53:21 UTC
This appears to be the same panic as 184771, which has a low priority.
This panic is a show stopper for our site and we can reproduce it.

John
Comment 2 Mark Linimon freebsd_committer freebsd_triage 2014-04-20 02:48:45 UTC
State Changed
From-To: open->open

Over to maintainer(s). 


Comment 3 Mark Linimon freebsd_committer freebsd_triage 2014-04-20 02:48:45 UTC
Responsible Changed
From-To: freebsd-bugs->freebsd-fs
Comment 4 Rick Macklem freebsd_committer freebsd_triage 2014-04-20 13:55:44 UTC
State Changed
From-To: open->closed


I'm pretty sure this crash was fixed by r259765, which was 
MFC'd to stable/9 as r261061. It was caused by an NFSv2 mount 
and the stack corruption only seemed to affect i386 systems. 
Thanks go to John for his help with isolating the problem. 

If anyone experiences a similar crash with an up-to-date stable/9
or 9.3 (when it is released) system, please submit another PR.
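
For readers unfamiliar with this class of failure: the comment above does not describe what r259765 changed, so the sketch below is not that fix. It only illustrates, under stated assumptions, how copying a fixed-size wire file handle into a smaller on-stack buffer can be bounded so that a stack protector such as __stack_chk_fail never has cause to fire. The 32-byte NFSv2 handle size comes from RFC 1094 (FHSIZE); the struct local_fh type, its 28-byte size, and the copy_fh_bounded helper are hypothetical names chosen for illustration and are not the kernel's fhandle_t or any function in nfs_fha.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* NFSv2 file handles are a fixed 32 bytes on the wire (RFC 1094, FHSIZE). */
#define WIRE_FH_V2_SIZE 32

/*
 * Hypothetical local handle type, deliberately smaller than the wire size.
 * This is NOT the kernel's fhandle_t; the 28-byte size is an assumption
 * made only so the example has a size mismatch to guard against.
 */
struct local_fh {
	unsigned char data[28];
};

/*
 * Copy a wire handle into a local handle without writing past the end of
 * the destination. Bytes that do not fit are dropped; unused destination
 * bytes are zeroed. Returns the number of bytes actually copied.
 */
size_t
copy_fh_bounded(struct local_fh *dst, const unsigned char *wire,
    size_t wire_len)
{
	size_t n;

	assert(dst != NULL && wire != NULL);

	/* Clamp to the destination size instead of trusting wire_len. */
	n = wire_len < sizeof(dst->data) ? wire_len : sizeof(dst->data);
	memset(dst->data, 0, sizeof(dst->data));
	memcpy(dst->data, wire, n);
	return (n);
}
```

An unbounded memcpy of WIRE_FH_V2_SIZE bytes into the 28-byte buffer would write 4 bytes past its end, which is the kind of overrun a stack canary check reports. Whether it is noticed can depend on how the compiler lays out and pads the stack frame on a given architecture.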