Bug 129238

Summary: [panic] System randomly panics
Product: Base System
Reporter: Jurriaan Nijkamp <alias>
Component: amd64
Assignee: freebsd-amd64 (Nobody) <amd64>
Status: Closed FIXED    
Severity: Affects Only Me    
Priority: Normal    
Version: 7.0-RELEASE   
Hardware: Any   
OS: Any   

Description Jurriaan Nijkamp 2008-11-27 17:20:01 UTC
My home server suffers from random kernel panics. The panic string
always says "Page fault". I started a thread on the FreeBSD forums
hoping for answers, but got none. Even a seemingly very
knowledgeable forum member was unable to determine the cause, let alone
a solution.

The forum thread can be read here:
http://forums.freebsd.org/showthread.php?p=3201

It contains a dmesg of my system, backtraces of the vmcore.* files and
all the things we tried in our quest to solve this mystery. In the end,
it was suggested I should submit a problem report in the hopes of finding
a solution to this.

If any further information is needed, I would be happy to oblige.

How-To-Repeat: Just leave the server running and it will panic at some point. Panics
seem to be random. At first they occurred frequently when loading a
torrent into my torrent daemon, but this ceased after I reformatted my
ext3 disks to UFS.

The most recent occurrence was when I was streaming a video to my
PlayStation using MediaTomb. Beyond a general association with disk and
network I/O, there is nothing I can pinpoint.
Comment 1 Jurriaan Nijkamp 2008-12-02 07:13:22 UTC
After another crash yesterday, I got a new dump which, according to
someone who knows what he's talking about, is actually useful. He also
said the problem may lie with faulty memory, but I must say I did a
memtest86+ test last week which said my memory was working perfectly.

I am pasting the output of kgdb below.

---- output of kgdb kernel.debug /var/crash/vmcore.9 ----

[GDB will not be able to debug user-mode threads:
/usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd".

Unread portion of the kernel message buffer:


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0xffff800004003908
fault code              = supervisor read data, page not present
instruction pointer     = 0x8:0xffffffff8072436c
stack pointer           = 0x10:0xffffffffae6d2a00
frame pointer           = 0x10:0x4000000000
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 5551 (sed)
trap number             = 12
panic: page fault
cpuid = 0
Uptime: 4h42m45s
Physical memory: 2035 MB
Dumping 311 MB: 296 280 264 248 232 216 200 184 168 152 136 120 104 88 72 56 40 24 8

#0  doadump () at pcpu.h:194
194             __asm __volatile("movq %%gs:0,%0" : "=r" (td));
(kgdb) bt
#0  doadump () at pcpu.h:194
#1  0x0000000000000004 in ?? ()
#2  0xffffffff804776c9 in boot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:409
#3  0xffffffff80477acd in panic (fmt=0x104 <Address 0x104 out of bounds>)
    at /usr/src/sys/kern/kern_shutdown.c:563
#4  0xffffffff8072edd4 in trap_fatal (frame=0xffffff00018bc9c0,
    eva=18446742974225594576) at /usr/src/sys/amd64/amd64/trap.c:724
#5  0xffffffff8072f1a5 in trap_pfault (frame=0xffffffffae6d2950, usermode=0)
    at /usr/src/sys/amd64/amd64/trap.c:641
#6  0xffffffff8072fae8 in trap (frame=0xffffffffae6d2950)
    at /usr/src/sys/amd64/amd64/trap.c:410
#7  0xffffffff8071575e in calltrap ()
    at /usr/src/sys/amd64/amd64/exception.S:169
#8  0xffffffff8072436c in pmap_remove_pages (pmap=0xffffff0001ab7778)
    at /usr/src/sys/amd64/amd64/pmap.c:388
#9  0xffffffff80688238 in vmspace_exit (td=0xffffff00018bc9c0)
    at /usr/src/sys/vm/vm_map.c:404
#10 0xffffffff804577ac in exit1 (td=0xffffff00018bc9c0, rv=0)
    at /usr/src/sys/kern/kern_exit.c:294
#11 0xffffffff80458b5e in sys_exit (td=Variable "td" is not available.
) at /usr/src/sys/kern/kern_exit.c:98
#12 0xffffffff8072f427 in syscall (frame=0xffffffffae6d2c70)
    at /usr/src/sys/amd64/amd64/trap.c:852
#13 0xffffffff8071596b in Xfast_syscall ()
    at /usr/src/sys/amd64/amd64/exception.S:290
#14 0x00000008006a8b3c in ?? ()
Previous frame inner to this frame (corrupt stack?)

---- end kgdb output ----

After this, following instructions given to me, I selected frame #8 and
ran the 'list' command:

---- followup output ----

(kgdb) frame 8
#8  0xffffffff8072436c in pmap_remove_pages (pmap=0xffffff0001ab7778)
    at /usr/src/sys/amd64/amd64/pmap.c:388
388             return (PTmap + ((va >> PAGE_SHIFT) & mask));
(kgdb) list
383     PMAP_INLINE pt_entry_t *
384     vtopte(vm_offset_t va)
385     {
386             u_int64_t mask = ((1ul << (NPTEPGSHIFT + NPDEPGSHIFT + NPDPEPGSHIFT + NPML4EPGSHIFT)) - 1);
387
388             return (PTmap + ((va >> PAGE_SHIFT) & mask));
389     }
390
391     static __inline pd_entry_t *
392     vtopde(vm_offset_t va)

---- end followup output ----
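
For reference, the arithmetic in vtopte() can be replayed outside the
kernel to see which virtual address the faulting page-table entry would
belong to. The following is only a rough sketch, not kernel code. It
assumes the usual amd64 values (PAGE_SHIFT of 12, each of the four
shift constants equal to 9, 8-byte page-table entries) and assumes that
PTmap sits at 0xffff800000000000; none of these values appear in the
output above, so treat them as assumptions. The helper pte_to_va() is a
hypothetical name introduced here for illustration.

```python
# Assumed amd64 paging constants (not shown in the kgdb output above).
PAGE_SHIFT = 12
NPTEPGSHIFT = NPDEPGSHIFT = NPDPEPGSHIFT = NPML4EPGSHIFT = 9
PTE_SIZE = 8                     # assumed sizeof(pt_entry_t)
PTMAP_BASE = 0xFFFF800000000000  # assumed address of PTmap

# Same mask expression as line 386 of the listing: 36 index bits.
MASK = (1 << (NPTEPGSHIFT + NPDEPGSHIFT + NPDPEPGSHIFT + NPML4EPGSHIFT)) - 1


def vtopte(va):
    """Mirror of line 388; C pointer arithmetic scales the index by PTE_SIZE."""
    return PTMAP_BASE + ((va >> PAGE_SHIFT) & MASK) * PTE_SIZE


def pte_to_va(pte_addr):
    """Hypothetical inverse of vtopte(), valid for lower-half addresses only."""
    return ((pte_addr - PTMAP_BASE) // PTE_SIZE) << PAGE_SHIFT


fault = 0xFFFF800004003908       # "fault virtual address" from the trap output
va = pte_to_va(fault)
print(hex(va))                   # 0x800721000
print(hex(vtopte(va)))           # 0xffff800004003908
```

Under those assumptions, the fault address corresponds to the page-table
entry for virtual address 0x800721000, which lies in the same range as
the user-mode address in frame #14 (0x00000008006a8b3c). That would be
consistent with the fault happening while the exiting process's own
mappings were being torn down, but it does not by itself say why the
page-table page was missing.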
Comment 2 Jurriaan Nijkamp 2008-12-21 08:14:42 UTC
I swapped the RAM in the server with some RAM from my desktop and the problem hasn't occurred since (a little over two weeks). The RAM itself was fine, as is the main board, but apparently they became unstable when working together. A case of bad luck, I'd say.

This report can be closed, as far as I'm concerned.
Comment 3 Mark Linimon freebsd_committer freebsd_triage 2008-12-21 17:25:03 UTC
State Changed
From-To: open->closed

Closed at submitter's request.