Bug 19726

Summary: fatal trap 12 / page fault
Product: Base System
Reporter: jblaine <jblaine>
Component: kern
Assignee: Bill Paul <wpaul>
Status: Closed FIXED    
Severity: Affects Only Me    
Priority: Normal    
Version: 4.0-RELEASE   
Hardware: Any   
OS: Any   

Description jblaine 2000-07-06 04:50:01 UTC
Machine is a gateway/firewall/NAT box and is crashing and rebooting every 3 or 4 days,
which makes me very grumpy.  I wish I had the knowledge to fix it.  Feel free to
steer me through kgdb goop to help you, please.

(kgdb) symbol-file kernel
Reading symbols from kernel...(no debugging symbols found)...done.
(kgdb) exec-file /var/crash/kernel.0
(kgdb) core-file /var/crash/vmcore.0
IdlePTD 2617344
initial pcb at 2184c0
panicstr: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x8
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc0171894
stack pointer           = 0x10:0xc01fed04
frame pointer           = 0x10:0xc01fed0c
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = Idle
interrupt mask          = 
trap number             = 12
panic: page fault

syncing disks... 8 8 4 
done
Uptime: 16h1m14s

dumping to dev #ad/0x20001, offset 196608
dump ata0: resetting devices .. done
32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 
---
#0  0xc0132ec8 in boot ()
(kgdb) where
#0  0xc0132ec8 in boot ()
#1  0xc013324c in poweroff_wait ()
#2  0xc01ccda5 in trap_fatal ()
#3  0xc01cca7d in trap_pfault ()
#4  0xc01cc647 in trap ()
#5  0xc0171894 in arpintr ()
(kgdb) quit

Copyright (c) 1992-2000 The FreeBSD Project.
Copyright (c) 1982, 1986, 1989, 1991, 1993
        The Regents of the University of California. All rights reserved.
FreeBSD 4.0-RELEASE #4: Sun Jul  2 12:37:52 EDT 2000
    root@kickflop:/usr/src/sys/compile/BUNK
Timecounter "i8254"  frequency 1193182 Hz
Timecounter "TSC"  frequency 132631961 Hz
CPU: Pentium/P54C (132.63-MHz 586-class CPU)
  Origin = "GenuineIntel"  Id = 0x52c  Stepping = 12
  Features=0x1bf<FPU,VME,DE,PSE,TSC,MSR,MCE,CX8>
real memory  = 33554432 (32768K bytes)
config> di pcic0
No such device: pcic0
Invalid command or syntax.  Type `?' for help.
config> di lnc0
No such device: lnc0
Invalid command or syntax.  Type `?' for help.
config> di le0
No such device: le0
Invalid command or syntax.  Type `?' for help.
config> di ie0
No such device: ie0
Invalid command or syntax.  Type `?' for help.
config> di fe0
No such device: fe0
Invalid command or syntax.  Type `?' for help.
config> di ed0
No such device: ed0
Invalid command or syntax.  Type `?' for help.
config> di cs0
No such device: cs0
Invalid command or syntax.  Type `?' for help.
config> di bt0
No such device: bt0
Invalid command or syntax.  Type `?' for help.
config> di aic0
No such device: aic0
Invalid command or syntax.  Type `?' for help.
config> di aha0
No such device: aha0
Invalid command or syntax.  Type `?' for help.
config> di adv0
No such device: adv0
Invalid command or syntax.  Type `?' for help.
config> en sn0
No such device: sn0
Invalid command or syntax.  Type `?' for help.
config> po sn0 0x400
No such device: sn0
Invalid command or syntax.  Type `?' for help.
config> ir sn0 10
No such device: sn0
Invalid command or syntax.  Type `?' for help.
config> f sn0 0
No such device: sn0
Invalid command or syntax.  Type `?' for help.
config> q
avail memory = 30208000 (29500K bytes)
Preloaded elf kernel "kernel" at 0xc026d000.
Preloaded userconfig_script "/boot/kernel.conf" at 0xc026d09c.
Intel Pentium detected, installing workaround for F00F bug
md0: Malloc disk
npx0: <math processor> on motherboard
npx0: INT 16 interface
pcib0: <Host to PCI bridge> on motherboard
pci0: <PCI bus> on pcib0
isab0: <Intel 82371FB PCI to ISA bridge> at device 7.0 on pci0
isa0: <ISA bus> on isab0
pci0: <ATI Mach64-CT graphics accelerator> at 8.0
rl0: <Accton MPX 5030/5038 10/100BaseTX> port 0xf400-0xf4ff mem 0xfffbf800-0xfffbf8ff irq 10 at device 13.0 on pci0
rl0: Ethernet address: 00:e0:29:5f:ab:88
miibus0: <MII bus> on rl0
rlphy0: <RealTek internal media interface> on miibus0
rlphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
rl1: <Accton MPX 5030/5038 10/100BaseTX> port 0xf800-0xf8ff mem 0xfffbfc00-0xfffbfcff irq 11 at device 16.0 on pci0
rl1: Ethernet address: 00:e0:29:5f:ab:ff
miibus1: <MII bus> on rl1
rlphy1: <RealTek internal media interface> on miibus1
rlphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
fdc0: <NEC 72065B or clone> at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
ata0 at port 0x1f0-0x1f7,0x3f6 irq 14 on isa0
atkbdc0: <keyboard controller (i8042)> at port 0x60-0x6f on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
sc0: <System console> on isa0
sc0: VGA <16 virtual consoles, flags=0x200>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
ppc0: <Parallel port> at port 0x378-0x37f irq 7 on isa0
ppc0: Generic chipset (NIBBLE-only) in COMPATIBLE mode
ppi0: <Parallel I/O> on ppbus0
unknown0: <WSS/SB> at port 0x534-0x537,0x388-0x38b,0x220-0x22f irq 5 drq 1,0 on isa0
unknown1: <Game> at port 0x200-0x207 on isa0
unknown2: <Ctrl> at port 0xf00-0xf07 on isa0
unknown3: <MPU> at port 0x330-0x331 irq 9 on isa0
IP packet filtering initialized, divert enabled, rule-based forwarding enabled, default to accept, logging disabled
ad0: 2015MB <ST32140A> [4095/16/63] at ata0-master using BIOSPIO
Mounting root from ufs:/dev/ad0s1a

How-To-Repeat: This crash is from 3PM today.  I was running tinyfugue connected to 1 MUD.
I may have had tinyfugue backgrounded and may have been running PINE.  Very
light network traffic and just myself on the machine (which is the norm
unless I am playing some network games).
Comment 1 billf 2000-07-06 04:52:47 UTC
On Wed, Jul 05, 2000 at 08:47:27PM -0700, jblaine@mitre.org wrote:

> (kgdb) symbol-file kernel
> Reading symbols from kernel...(no debugging symbols found)...done.
> (kgdb) exec-file /var/crash/kernel.0
> (kgdb) core-file /var/crash/vmcore.0

You need to use symbol-file kernel.debug, and things will start to look
a lot more interesting. :->

-- 
Bill Fumerola - Network Architect / Computer Horizons Corp - CHIMES
e-mail: billf@chimesnet.com / billf@FreeBSD.org
Comment 2 Sheldon Hearn 2000-07-06 12:08:58 UTC
On Wed, 05 Jul 2000 20:47:27 MST, jblaine@mitre.org wrote:

> Machine gateway/firewall/NAT box and is crashing and rebooting every 3
> or 4 days, which makes me very grumpy.  I wish I had the knowledge to
> fix it.  Feel free to steer me through kgdb goop to help you, please.

I don't remember having seen this kind of problem reported by anyone
else (a panic in arpintr(), that is), and 4.0-RELEASE has been around
for a while.  Is there anything you can do to rule out hardware (e.g.
confirm that you're not overclocking, re-seat RAM, swap RAM, try an
entirely different box)?

Ciao,
Sheldon.
Comment 3 jblaine 2000-07-06 14:44:13 UTC
> You need to use symbol-file kernel.debug, and things will start to look
> a lot more interesting. :->

Doh.  But hmm...this doesn't look much (if any) different than the
previous info I sent other than it doesn't say 'no debugging symbols
found' or whatever that notice was.

(kgdb) symbol-file kernel.debug
Reading symbols from kernel.debug...done.
(kgdb) exec-file /var/crash/kernel.0
(kgdb) core-file /var/crash/vmcore.0
IdlePTD 2617344
initial pcb at 2184c0
panicstr: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x8
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc0171894
stack pointer           = 0x10:0xc01fed04
frame pointer           = 0x10:0xc01fed0c
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = Idle
interrupt mask          = 
trap number             = 12
panic: page fault

syncing disks... 8 8 4 
done
Uptime: 16h1m14s

dumping to dev #ad/0x20001, offset 196608
dump ata0: resetting devices .. done
32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8
7 6 5 4 3 2 1 
---
#0  0xc0132ec8 in boot ()
(kgdb) where
#0  0xc0132ec8 in boot ()
#1  0xc013324c in poweroff_wait ()
#2  0xc01ccda5 in trap_fatal ()
#3  0xc01cca7d in trap_pfault ()
#4  0xc01cc647 in trap ()
#5  0xc0171894 in arpintr ()
(kgdb)
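
A note on reading the banner above, offered as an assumption rather than a diagnosis: a fault virtual address as small as 0x8 is what one would expect if code read a member located 8 bytes into a structure through a pointer whose value was 0. The sketch below illustrates only that arithmetic. `HypotheticalHeader` and its member names are invented for the illustration and are not the real kernel structure touched by arpintr().

```python
import ctypes


class HypotheticalHeader(ctypes.Structure):
    # NOT a real kernel structure: three 32-bit members standing in for
    # pointer-sized fields on a 32-bit i386 machine.
    _fields_ = [
        ("first", ctypes.c_uint32),   # offset 0
        ("second", ctypes.c_uint32),  # offset 4
        ("third", ctypes.c_uint32),   # offset 8
    ]


def fault_address(base_pointer, field):
    """Address the CPU would touch when reading `field` through `base_pointer`."""
    return base_pointer + getattr(HypotheticalHeader, field).offset


# Reading the third member through a NULL (0) pointer touches address 0x8.
print(hex(fault_address(0, "third")))  # 0x8
```

Which structure and which member are actually involved cannot be told from a backtrace without debugging symbols.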
Comment 4 jblaine 2000-07-06 14:49:44 UTC
> I don't remember having seen this kind of problem reported by anyone
> else (a panic in arpintr(), that is), and 4.0-RELEASE has been around
> for a while.  Is there anything you can do to rule out hardware (e.g.
> confirm that you're not overclocking, re-seat RAM, swap RAM, try an
> entirely different box)?

I'm not overclocked.  I can start by reseating the RAM tonight.  If it
crashes again the same way, I'll take out 16 of the 32MB and try it that
way for a bit.  If it crashes again, I'll do a flip-flop of the 16MB
RAM and try that other 16.  Trying a different box is not an option
for me.
Comment 5 jblaine 2000-07-08 17:02:19 UTC
Got another one.  I came home at 2AM, checked CNN's web site for
Wimbledon results for the day, and went to bed.  It crashed shortly
after I left my PC to head to bed and I found it this morning.
Nobody else was on the machine, and the load had to have been
tiny.  The memory had been re-seated a day ago.  I guess now
I will remove 16MB from the machine, but...

(kgdb) symbol-file kernel.debug
Reading symbols from kernel.debug...done.
(kgdb) exec-file /var/crash/kernel.2
(kgdb) core-file /var/crash/vmcore.2
IdlePTD 2617344
initial pcb at 2184c0
panicstr: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x8
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc0171894
stack pointer           = 0x10:0xc01fed04
frame pointer           = 0x10:0xc01fed0c
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = Idle
interrupt mask          = 
trap number             = 12
panic: page fault

syncing disks... 6 6 1 
done
Uptime: 1d6h8m13s

dumping to dev #ad/0x20001, offset 196608
dump ata0: resetting devices .. done
32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8
7 6 5 4 3 2 1 
---
#0  0xc0132ec8 in boot ()
(kgdb) where
#0  0xc0132ec8 in boot ()
#1  0xc013324c in poweroff_wait ()
#2  0xc01ccda5 in trap_fatal ()
#3  0xc01cca7d in trap_pfault ()
#4  0xc01cc647 in trap ()
#5  0xc0171894 in arpintr ()
(kgdb)
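
The dumps in this thread can be compared mechanically instead of by eye. The following is a minimal sketch, assuming the banner is pasted as plain text; `panic_signature` and the variable names are hypothetical helpers, not part of kgdb or any FreeBSD tool.

```python
import re

# Banner fields worth comparing between two crashes.
FIELDS = ("fault virtual address", "instruction pointer",
          "stack pointer", "frame pointer")


def panic_signature(text):
    """Return {field: value} for the banner fields found in a pasted panic message."""
    signature = {}
    for line in text.splitlines():
        # Drop mail-style quoting ("> ") and surrounding whitespace.
        line = line.lstrip("> ").strip()
        match = re.match(r"^([a-z ]+?)\s*=\s*(\S.*)$", line)
        if match and match.group(1) in FIELDS:
            signature[match.group(1)] = match.group(2).strip()
    return signature


# Values copied from the banners in comments 3 and 5.
crash_comment_3 = """\
fault virtual address   = 0x8
instruction pointer     = 0x8:0xc0171894
stack pointer           = 0x10:0xc01fed04
frame pointer           = 0x10:0xc01fed0c
"""
crash_comment_5 = """\
fault virtual address   = 0x8
instruction pointer     = 0x8:0xc0171894
stack pointer           = 0x10:0xc01fed04
frame pointer           = 0x10:0xc01fed0c
"""

print(panic_signature(crash_comment_3) == panic_signature(crash_comment_5))  # True
```

Both crashes fault at the same instruction with the same fault address. That is worth noting, though on its own it does not rule hardware in or out.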
Comment 6 Sheldon Hearn freebsd_committer freebsd_triage 2000-07-10 13:40:14 UTC
State Changed
From-To: open->feedback

Still waiting for the originator to try swapping out memory, 
since (as I suggested from the start) we haven't seen this 
as a common complaint about 4.0-RELEASE.
Comment 7 jblaine 2000-07-10 15:19:16 UTC
Actually, the first suggestion was for me to re-seat the memory.
I did that, and the box still crashed.  The next suggestion was
to do some memory swapping to see if there were any bad SIMMs.
I took 16 of the 32MB out last night.
Comment 8 jblaine 2000-07-18 19:38:44 UTC
Haven't had any trouble yet with the first 16MB removed, but the
machine has been shutdown cleanly by myself several times.  The current
uptime is 3 days 16 hours and I will do my best to not bring it down
on purpose over the next few days.  Figured I'd update with a status
report since it's been 8 days.

I'm still watching it closely.
Comment 9 Sheldon Hearn 2000-07-18 19:45:01 UTC
On Tue, 18 Jul 2000 11:40:03 MST, Jeff Blaine wrote:

> Haven't had any trouble yet with the first 16MB removed, but the
> machine has been shutdown cleanly by myself several times.  The current
> uptime is 3 days 16 hours and I will do my best to not bring it down
> on purpose over the next few days.  Figured I'd update with a status
> report since it's been 8 days.

Thanks.  I'll be very surprised if it isn't the hardware. :-)

Just one thing you should be aware of: bad hardware at install time can
lead to a corrupt installation.  I recently installed 4.0-RELEASE on a
box with 1 dud DIMM.  After replacing the DIMM, certain programs still
dumped core reproducibly.  I had to re-install those binaries.  Of
course, I figured why waste time saving time and re-installed.  Probably
not the worst of ideas. :-)

Something to beware of.

Ciao,
Sheldon.
Comment 10 jblaine 2000-07-20 21:42:41 UTC
Got another one :<  I'll swap the existing 16MB out tonight with the
other 16MB I used to have in there (which made 32).

(kgdb) symbol-file kernel.debug
Reading symbols from kernel.debug...done.
(kgdb) exec-file /var/crash/kernel.1
(kgdb) core-file /var/crash/vmcore.1
IdlePTD 2617344
initial pcb at 2184c0
panicstr: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x8
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc0171894
stack pointer           = 0x10:0xc01fed04
frame pointer           = 0x10:0xc01fed0c
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = Idle
interrupt mask          = 
trap number             = 12
panic: page fault

syncing disks... 
done
Uptime: 5d18h8m19s

dumping to dev #ad/0x20001, offset 229376
dump ata0: resetting devices .. done
16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 
---
#0  0xc0132ec8 in boot ()
(kgdb) where
#0  0xc0132ec8 in boot ()
#1  0xc013324c in poweroff_wait ()
#2  0xc01ccda5 in trap_fatal ()
#3  0xc01cca7d in trap_pfault ()
#4  0xc01cc647 in trap ()
#5  0xc0171894 in arpintr ()
(kgdb)
Comment 11 jeffblaine 2000-07-23 07:47:00 UTC
Another one, with completely different 16MB of memory this time
(replaced this afternoon).  Now what do you suggest?  I've
exhausted my 16MB memory hunks.

IdlePTD 2617344
initial pcb at 2184c0
panicstr: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x8
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc0171894
stack pointer           = 0x10:0xc01fed04
frame pointer           = 0x10:0xc01fed0c
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = Idle
interrupt mask          =
trap number             = 12
panic: page fault

syncing disks...
done
Uptime: 5d18h8m19s

dumping to dev #ad/0x20001, offset 229376
dump ata0: resetting devices .. done
16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1
---
#0  0xc0132ec8 in boot ()
(kgdb) where
#0  0xc0132ec8 in boot ()
#1  0xc013324c in poweroff_wait ()
#2  0xc01ccda5 in trap_fatal ()
#3  0xc01cca7d in trap_pfault ()
#4  0xc01cc647 in trap ()
#5  0xc0171894 in arpintr ()
(kgdb)
Comment 12 jblaine 2000-07-26 16:00:53 UTC
Another crash at 6AM this morning.  Mentioning this one only because
it was while the machine was completely idle (nobody logged in, nobody
hitting the web server, nobody with an open IMAP connection, no traffic
going through it at all).

I'll spare you the gdb -k info.  It's the same thing as before.
Comment 13 Sheldon Hearn 2000-07-27 12:59:40 UTC
Just a quick note for the audit trail to note that I've asked Jeff to
supply a backtrace with debugging symbols.  The ``feedback'' state is
still appropriate at this time.

Ciao,
Sheldon.
Comment 14 Sheldon Hearn 2000-07-27 16:24:54 UTC
On Thu, 27 Jul 2000 17:57:02 +0300, Stas Kisel wrote:

> Unfortunately, I don't have backtrace, probably because I've failed to select
> correct swap size at setup time, or because I've failed to run dumpon
> correctly.

Waahoo.  Your dmesg(8) output looks interesting.  Check this out:

> pci0: <unknown card> (vendor=0x1050, dev=0x0940) at 15.0 irq 11
> rl0: <RealTek 8139 10/100BaseTX> port 0xe400-0xe4ff mem 0xfebeff00-0xfebeffff irq 9 at device 16.0 on pci0
> rl1: <RealTek 8139 10/100BaseTX> port 0xe000-0xe0ff mem 0xfebefe00-0xfebefeff irq 11 at device 17.0 on pci0

Notice in particular the IRQ reserved for both pci0 and rl1, both with
_different_ PCI device IDs. :-)

Jeff doesn't seem to have this problem, however.  Nevertheless,
something weird is definitely going on.  Somebody give me a backtrace
with debugging symbols and we'll send this to Bill. :-)

Ciao,
Sheldon.
Comment 15 Stas Kisel 2000-07-27 19:44:36 UTC
> From: Sheldon Hearn <sheldonh@uunet.co.za>

> Waahoo.  Your dmesg(8) output looks interesting.  Check this out:
>
> > pci0: <unknown card> (vendor=0x1050, dev=0x0940) at 15.0 irq 11
> > rl0: <RealTek 8139 10/100BaseTX> port 0xe400-0xe4ff mem 0xfebeff00-0xfebeffff irq 9 at device 16.0 on pci0
> > rl1: <RealTek 8139 10/100BaseTX> port 0xe000-0xe0ff mem 0xfebefe00-0xfebefeff irq 11 at device 17.0 on pci0
>
> Notice in particular the IRQ reserved for both pci0 and rl1, both with
> _different_ PCI device IDs. :-)

It was BIOS trouble. I've fixed it. Unfortunately, this did not help the machine
recognize the WinBond card.
Let's see if I'll catch another page fault.

\bye
Stas

> Jeff doesn't seem to have this problem, however.  Nevertheless,
> something weird is definitely going on.  Somebody give me a backtrace
> with debugging symbols and we'll send this to Bill. :-)
>
> Ciao,
> Sheldon.
>
Comment 16 Stas Kisel 2000-07-28 10:56:52 UTC
> From: Sheldon Hearn <sheldonh@uunet.co.za>
> On Thu, 27 Jul 2000 17:57:02 +0300, Stas Kisel wrote:
> Waahoo.  Your dmesg(8) output looks interesting.  Check this out:
>
> > pci0: <unknown card> (vendor=0x1050, dev=0x0940) at 15.0 irq 11
> > rl0: <RealTek 8139 10/100BaseTX> port 0xe400-0xe4ff mem 0xfebeff00-0xfebeffff irq 9 at device 16.0 on pci0
> > rl1: <RealTek 8139 10/100BaseTX> port 0xe000-0xe0ff mem 0xfebefe00-0xfebefeff irq 11 at device 17.0 on pci0
>
> Notice in particular the IRQ reserved for both pci0 and rl1, both with
> _different_ PCI device IDs. :-)
>
> Jeff doesn't seem to have this problem, however.  Nevertheless,
> something weird is definitely going on.  Somebody give me a backtrace
> with debugging symbols and we'll send this to Bill. :-)

Our hardware guru says that it is normal for pci2 devices to share an irq.
Anyway, after I assigned irq's with BIOS setup and swapped cards
I got a page fault. So I have backtrace and new dmesg.

\bye
Stas

IdlePTD 3317760
initial pcb at 2ac8e0
panicstr: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x8
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc018d36c
stack pointer           = 0x10:0xc028e964
frame pointer           = 0x10:0xc028e96c
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = Idle
interrupt mask          =
trap number             = 12
panic: page fault
syncing disks... 9 2
done
Uptime: 13h57m22s

dumping to dev #ad/0x20001, offset 380928
dump ata0: resetting devices .. done
64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1
---
#0  boot (howto=256) at ../../kern/kern_shutdown.c:304
304                     dumppcb.pcb_cr3 = rcr3();
(kgdb) bt
#0  boot (howto=256) at ../../kern/kern_shutdown.c:304
#1  0xc0138bac in poweroff_wait (junk=0xc0283eef, howto=0)
    at ../../kern/kern_shutdown.c:554
#2  0xc0251cd9 in trap_fatal (frame=0xc028e924, eva=8)
    at ../../i386/i386/trap.c:924
#3  0xc02519b1 in trap_pfault (frame=0xc028e924, usermode=0, eva=8)
    at ../../i386/i386/trap.c:817
#4  0xc02515a7 in trap (frame={tf_fs = -1071775728, tf_es = -1067712496,
      tf_ds = -1067778032, tf_edi = -1, tf_esi = 0, tf_ebp = -1071060628,
      tf_isp = -1071060656, tf_ebx = 0, tf_edx = 40, tf_ecx = 0,
      tf_eax = -1067746560, tf_trapno = 12, tf_err = 0, tf_eip = -1072114836,
      tf_cs = 8, tf_eflags = 66118, tf_esp = 0, tf_ss = 0})
    at ../../i386/i386/trap.c:423
#5  0xc018d36c in arpintr () at ../../netinet/if_ether.c:447
(kgdb)
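
kgdb prints the trap frame members as signed decimal integers, which makes them hard to match against the hexadecimal values in the panic banner. A minimal sketch of the conversion follows, assuming 32-bit registers as on this i386 kernel; `as_u32_hex` is a hypothetical helper name.

```python
def as_u32_hex(value):
    """Render a signed 32-bit integer, as kgdb prints it, as unsigned hex."""
    return "0x%08x" % (value & 0xFFFFFFFF)


# Values copied from frame #4 of the backtrace above.
frame = {
    "tf_eip": -1072114836,
    "tf_ebp": -1071060628,
    "tf_isp": -1071060656,
}

for name, value in frame.items():
    print(name, as_u32_hex(value))
# tf_eip 0xc018d36c
# tf_ebp 0xc028e96c
# tf_isp 0xc028e950
```

The converted tf_eip and tf_ebp agree with the "instruction pointer" and "frame pointer" lines of the banner, and tf_eip is the address reported for arpintr() in frame #5.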

Copyright (c) 1992-2000 The FreeBSD Project.
Copyright (c) 1982, 1986, 1989, 1991, 1993
	The Regents of the University of California. All rights reserved.
FreeBSD 4.0-RELEASE #4: Thu Jul 27 14:56:23 EEST 2000
    root@btr.thukraine.com:/usr/src/sys/compile/btr
Timecounter "i8254"  frequency 1193182 Hz
CPU: Pentium II/Pentium II Xeon/Celeron (501.14-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x665  Stepping = 5
  Features=0x183fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR>
real memory  = 67108864 (65536K bytes)
config> di ppc0
config> q
avail memory = 61677568 (60232K bytes)
Preloaded elf kernel "kernel" at 0xc0318000.
Preloaded userconfig_script "/boot/kernel.conf" at 0xc031809c.
Pentium Pro MTRR support enabled
npx0: <math processor> on motherboard
npx0: INT 16 interface
pcib0: <Intel 82443BX (440 BX) host to PCI bridge> on motherboard
pci0: <PCI bus> on pcib0
pcib1: <Intel 82443BX (440 BX) PCI-PCI (AGP) bridge> at device 1.0 on pci0
pci1: <PCI bus> on pcib1
isab0: <Intel 82371AB PCI to ISA bridge> at device 7.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel PIIX4 ATA33 controller> port 0xffa0-0xffaf at device 7.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
pci0: <Intel 82371AB/EB (PIIX4) USB controller> at 7.2 irq 10
chip1: <Intel 82371AB Power management controller> port 0x440-0x44f at device 7.3 on pci0
pci0: <ATI Mach64-VT graphics accelerator> at 15.0
rl0: <RealTek 8139 10/100BaseTX> port 0xe400-0xe4ff mem 0xfebeff00-0xfebeffff irq 9 at device 16.0 on pci0
rl0: Ethernet address: 00:50:ba:83:7a:09
miibus0: <MII bus> on rl0
rlphy0: <RealTek internal media interface> on miibus0
rlphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
rl1: <RealTek 8139 10/100BaseTX> port 0xe000-0xe0ff mem 0xfebefe00-0xfebefeff irq 7 at device 17.0 on pci0
rl1: Ethernet address: 00:50:ba:83:99:c7
miibus1: <MII bus> on rl1
rlphy1: <RealTek internal media interface> on miibus1
rlphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
pci0: <unknown card> (vendor=0x1050, dev=0x0940) at 18.0 irq 11
fdc0: <NEC 72065B or clone> at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
atkbdc0: <keyboard controller (i8042)> at port 0x60-0x6f on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: model Generic PS/2 mouse, device ID 0
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
sc0: <System console> on isa0
sc0: VGA <16 virtual consoles, flags=0x200>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
IP packet filtering initialized, divert enabled, rule-based forwarding enabled, default to accept, logging limited to 100 packets/entry by default
DUMMYNET initialized (000106)
BRIDGE 990810, have 8 interfaces
-- index 1  type 6 phy 0 addrl 6 addr 00.50.ba.83.7a.09
-- index 2  type 6 phy 0 addrl 6 addr 00.50.ba.83.99.c7
IP Filter: initialized.  Default = pass all, Logging = enabled
IP Filter: v3.3.8
ad0: 6149MB <WDC WD64AA> [13328/15/63] at ata0-master using UDMA33
Mounting root from ufs:/dev/ad0s1a
WARNING: / was not properly dismounted
cd9660: RockRidge Extension
cd9660: RockRidge Extension
cd9660: RockRidge Extension
ppp0: promiscuous mode enabled
ppp0: promiscuous mode disabled
Comment 17 Sheldon Hearn 2000-07-28 11:08:28 UTC
On Fri, 28 Jul 2000 12:56:52 +0300, Stas Kisel wrote:

> So I have backtrace and new dmesg.

Excellent.

Thanks,
Sheldon.
Comment 18 Sheldon Hearn freebsd_committer freebsd_triage 2000-07-28 11:08:34 UTC
State Changed
From-To: feedback->open

Bill, I held back on assigning this to you until we got 
a backtrace from a kernel with debugging symbols. 

Could you take a look? 


Comment 19 Sheldon Hearn freebsd_committer freebsd_triage 2000-07-28 11:08:34 UTC
Responsible Changed
From-To: freebsd-bugs->wpaul

Bill's the man. :-)
Comment 20 jeffblaine 2000-08-01 03:04:57 UTC
For the sake of thoroughness on my part, I'm following through with
a backtrace and dmesg output as well even though someone else
already has:

(kgdb) symbol-file kernel.debug
Reading symbols from kernel.debug...done.
(kgdb) exec-file /var/crash/kernel.0
(kgdb) core-file /var/crash/vmcore.0
IdlePTD 2613248
initial pcb at 2184a0
panicstr: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x8
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc0171884
stack pointer           = 0x10:0xc01fece4
frame pointer           = 0x10:0xc01fecec
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = Idle
interrupt mask          =
trap number             = 12
panic: page fault

syncing disks...
done
Uptime: 6h44m41s

dumping to dev #ad/0x20001, offset 229376
dump ata0: resetting devices .. done
16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1
---
#0  boot (howto=256) at ../../kern/kern_shutdown.c:304
304                     dumppcb.pcb_cr3 = rcr3();
(kgdb) where
#0  boot (howto=256) at ../../kern/kern_shutdown.c:304
#1  0xc013324c in poweroff_wait (junk=0xc01f6c2f, howto=0)
    at ../../kern/kern_shutdown.c:554
#2  0xc01ccd65 in trap_fatal (frame=0xc01feca4, eva=8)
    at ../../i386/i386/trap.c:924
#3  0xc01cca3d in trap_pfault (frame=0xc01feca4, usermode=0, eva=8)
    at ../../i386/i386/trap.c:817
#4  0xc01cc607 in trap (frame={tf_fs = 16, tf_es = 16, tf_ds = 16,
      tf_edi = -1, tf_esi = 0, tf_ebp = -1071649556, tf_isp = -1071649584,
      tf_ebx = 0, tf_edx = 0, tf_ecx = 0, tf_eax = -1069020928,
      tf_trapno = 12, tf_err = 0, tf_eip = -1072228220, tf_cs = 8,
      tf_eflags = 66118, tf_esp = 0, tf_ss = 0}) at ../../i386/i386/trap.c:423
#5  0xc0171884 in arpintr () at ../../netinet/if_ether.c:447
(kgdb)
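
The tf_err and tf_eflags members of the trap frame carry the same information as the "fault code" and "processor eflags" lines of the banner. The sketch below decodes them using the i386 page-fault error code bits and the EFLAGS bit positions. The constant names follow the kernel's PGEX_* naming but should be treated as assumptions here, and the output strings are simply modelled on the banner text.

```python
# i386 page-fault error code bits.
PGEX_P = 0x01  # set: protection violation; clear: page not present
PGEX_W = 0x02  # set: write access; clear: read access
PGEX_U = 0x04  # set: user mode; clear: supervisor mode

# i386 EFLAGS bits used in the banner.
EFLAGS_IF = 0x00000200  # interrupt enable flag
EFLAGS_RF = 0x00010000  # resume flag


def describe_fault_code(err):
    """Turn a page-fault error code into banner-style text."""
    mode = "user" if err & PGEX_U else "supervisor"
    access = "write" if err & PGEX_W else "read"
    reason = "protection violation" if err & PGEX_P else "page not present"
    return "%s %s, %s" % (mode, access, reason)


def describe_eflags(eflags):
    """List the EFLAGS bits that the banner reports."""
    parts = []
    if eflags & EFLAGS_IF:
        parts.append("interrupt enabled")
    if eflags & EFLAGS_RF:
        parts.append("resume")
    parts.append("IOPL = %d" % ((eflags >> 12) & 3))  # IOPL is bits 12-13
    return ", ".join(parts)


# tf_err = 0 and tf_eflags = 66118 are taken from frame #4 above.
print(describe_fault_code(0))   # supervisor read, page not present
print(describe_eflags(66118))   # interrupt enabled, resume, IOPL = 0
```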

======================================================================
# dmesg
Copyright (c) 1992-2000 The FreeBSD Project.
Copyright (c) 1982, 1986, 1989, 1991, 1993
        The Regents of the University of California. All rights reserved.
FreeBSD 4.0-RELEASE #0: Fri Jul 28 09:39:54 EDT 2000
    root@kickflop:/usr/src/sys/compile/BUNK
Timecounter "i8254"  frequency 1193182 Hz
Timecounter "TSC"  frequency 132632175 Hz
CPU: Pentium/P54C (132.63-MHz 586-class CPU)
  Origin = "GenuineIntel"  Id = 0x52c  Stepping = 12
  Features=0x1bf<FPU,VME,DE,PSE,TSC,MSR,MCE,CX8>
real memory  = 16777216 (16384K bytes)
config> di pcic0
No such device: pcic0
Invalid command or syntax.  Type `?' for help.
config> di lnc0
No such device: lnc0
Invalid command or syntax.  Type `?' for help.
config> di le0
No such device: le0
Invalid command or syntax.  Type `?' for help.
config> di ie0
No such device: ie0
Invalid command or syntax.  Type `?' for help.
config> di fe0
No such device: fe0
Invalid command or syntax.  Type `?' for help.
config> di ed0
No such device: ed0
Invalid command or syntax.  Type `?' for help.
config> di cs0
No such device: cs0
Invalid command or syntax.  Type `?' for help.
config> di bt0
No such device: bt0
Invalid command or syntax.  Type `?' for help.
config> di aic0
No such device: aic0
Invalid command or syntax.  Type `?' for help.
config> di aha0
No such device: aha0
Invalid command or syntax.  Type `?' for help.
config> di adv0
No such device: adv0
Invalid command or syntax.  Type `?' for help.
config> en sn0
No such device: sn0
Invalid command or syntax.  Type `?' for help.
config> po sn0 0x400
No such device: sn0
Invalid command or syntax.  Type `?' for help.
config> ir sn0 10
No such device: sn0
Invalid command or syntax.  Type `?' for help.
config> f sn0 0
No such device: sn0
Invalid command or syntax.  Type `?' for help.
config> q
avail memory = 14057472 (13728K bytes)
Preloaded elf kernel "kernel" at 0xc026c000.
Preloaded userconfig_script "/boot/kernel.conf" at 0xc026c09c.
Intel Pentium detected, installing workaround for F00F bug
md0: Malloc disk
npx0: <math processor> on motherboard
npx0: INT 16 interface
pcib0: <Host to PCI bridge> on motherboard
pci0: <PCI bus> on pcib0
isab0: <Intel 82371FB PCI to ISA bridge> at device 7.0 on pci0
isa0: <ISA bus> on isab0
pci0: <ATI Mach64-CT graphics accelerator> at 8.0
rl0: <Accton MPX 5030/5038 10/100BaseTX> port 0xf400-0xf4ff mem 0xfffbf800-0xfffbf8ff irq 10 at device 13.0 on pci0
rl0: Ethernet address: 00:e0:29:5f:ab:88
miibus0: <MII bus> on rl0
rlphy0: <RealTek internal media interface> on miibus0
rlphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
rl1: <Accton MPX 5030/5038 10/100BaseTX> port 0xf800-0xf8ff mem 0xfffbfc00-0xfffbfcff irq 11 at device 16.0 on pci0
rl1: Ethernet address: 00:e0:29:5f:ab:ff
miibus1: <MII bus> on rl1
rlphy1: <RealTek internal media interface> on miibus1
rlphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
fdc0: <NEC 72065B or clone> at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
ata0 at port 0x1f0-0x1f7,0x3f6 irq 14 on isa0
atkbdc0: <keyboard controller (i8042)> at port 0x60-0x6f on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
sc0: <System console> on isa0
sc0: VGA <16 virtual consoles, flags=0x200>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
ppc0: <Parallel port> at port 0x378-0x37f irq 7 on isa0
ppc0: Generic chipset (NIBBLE-only) in COMPATIBLE mode
ppi0: <Parallel I/O> on ppbus0
unknown0: <WSS/SB> at port 0x534-0x537,0x388-0x38b,0x220-0x22f irq 5 drq 1,0 on isa0
unknown1: <Game> at port 0x200-0x207 on isa0
unknown2: <Ctrl> at port 0xf00-0xf07 on isa0
unknown3: <MPU> at port 0x330-0x331 irq 9 on isa0
IP packet filtering initialized, divert enabled, rule-based forwarding enabled, default to accept, logging disabled
ad0: 2015MB <ST32140A> [4095/16/63] at ata0-master using BIOSPIO
Mounting root from ufs:/dev/ad0s1a
WARNING: / was not properly dismounted
#
Comment 21 jeffblaine 2000-08-01 03:05:00 UTC
For the sake of thoroughness on my part, I'm following through with
a backtrace and dmesg output as well even though someone else
already has:

(kgdb) symbol-file kernel.debug
Reading symbols from kernel.debug...done.
(kgdb) exec-file /var/crash/kernel.0
(kgdb) core-file /var/crash/vmcore.0
IdlePTD 2613248
initial pcb at 2184a0
panicstr: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x8
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc0171884
stack pointer           = 0x10:0xc01fece4
frame pointer           = 0x10:0xc01fecec
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = Idle
interrupt mask          =
trap number             = 12
panic: page fault

syncing disks...
done
Uptime: 6h44m41s

dumping to dev #ad/0x20001, offset 229376
dump ata0: resetting devices .. done
16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1
---
#0  boot (howto=256) at ../../kern/kern_shutdown.c:304
304                     dumppcb.pcb_cr3 = rcr3();
(kgdb) where
#0  boot (howto=256) at ../../kern/kern_shutdown.c:304
#1  0xc013324c in poweroff_wait (junk=0xc01f6c2f, howto=0)
    at ../../kern/kern_shutdown.c:554
#2  0xc01ccd65 in trap_fatal (frame=0xc01feca4, eva=8)
    at ../../i386/i386/trap.c:924
#3  0xc01cca3d in trap_pfault (frame=0xc01feca4, usermode=0, eva=8)
    at ../../i386/i386/trap.c:817
#4  0xc01cc607 in trap (frame={tf_fs = 16, tf_es = 16, tf_ds = 16,
      tf_edi = -1, tf_esi = 0, tf_ebp = -1071649556, tf_isp = -1071649584,
      tf_ebx = 0, tf_edx = 0, tf_ecx = 0, tf_eax = -1069020928,
      tf_trapno = 12, tf_err = 0, tf_eip = -1072228220, tf_cs = 8,
      tf_eflags = 66118, tf_esp = 0, tf_ss = 0}) at ../../i386/i386/trap.c:423
#5  0xc0171884 in arpintr () at ../../netinet/if_ether.c:447
(kgdb)

======================================================================
# dmesg
Copyright (c) 1992-2000 The FreeBSD Project.
Copyright (c) 1982, 1986, 1989, 1991, 1993
        The Regents of the University of California. All rights reserved.
FreeBSD 4.0-RELEASE #0: Fri Jul 28 09:39:54 EDT 2000
    root@kickflop:/usr/src/sys/compile/BUNK
Timecounter "i8254"  frequency 1193182 Hz
Timecounter "TSC"  frequency 132632175 Hz
CPU: Pentium/P54C (132.63-MHz 586-class CPU)
  Origin = "GenuineIntel"  Id = 0x52c  Stepping = 12
  Features=0x1bf<FPU,VME,DE,PSE,TSC,MSR,MCE,CX8>
real memory  = 16777216 (16384K bytes)
config> di pcic0
No such device: pcic0
Invalid command or syntax.  Type `?' for help.
config> di lnc0
No such device: lnc0
Invalid command or syntax.  Type `?' for help.
config> di le0
No such device: le0
Invalid command or syntax.  Type `?' for help.
config> di ie0
No such device: ie0
Invalid command or syntax.  Type `?' for help.
config> di fe0
No such device: fe0
Invalid command or syntax.  Type `?' for help.
config> di ed0
No such device: ed0
Invalid command or syntax.  Type `?' for help.
config> di cs0
No such device: cs0
Invalid command or syntax.  Type `?' for help.
config> di bt0
No such device: bt0
Invalid command or syntax.  Type `?' for help.
config> di aic0
No such device: aic0
Invalid command or syntax.  Type `?' for help.
config> di aha0
No such device: aha0
Invalid command or syntax.  Type `?' for help.
config> di adv0
No such device: adv0
Invalid command or syntax.  Type `?' for help.
config> en sn0
No such device: sn0
Invalid command or syntax.  Type `?' for help.
config> po sn0 0x400
No such device: sn0
Invalid command or syntax.  Type `?' for help.
config> ir sn0 10
No such device: sn0
Invalid command or syntax.  Type `?' for help.
config> f sn0 0
No such device: sn0
Invalid command or syntax.  Type `?' for help.
config> q
avail memory = 14057472 (13728K bytes)
Preloaded elf kernel "kernel" at 0xc026c000.
Preloaded userconfig_script "/boot/kernel.conf" at 0xc026c09c.
Intel Pentium detected, installing workaround for F00F bug
md0: Malloc disk
npx0: <math processor> on motherboard
npx0: INT 16 interface
pcib0: <Host to PCI bridge> on motherboard
pci0: <PCI bus> on pcib0
isab0: <Intel 82371FB PCI to ISA bridge> at device 7.0 on pci0
isa0: <ISA bus> on isab0
pci0: <ATI Mach64-CT graphics accelerator> at 8.0
rl0: <Accton MPX 5030/5038 10/100BaseTX> port 0xf400-0xf4ff mem 0xfffbf800-0xfffbf8ff irq 10 at device 13.0 on pci0
rl0: Ethernet address: 00:e0:29:5f:ab:88
miibus0: <MII bus> on rl0
rlphy0: <RealTek internal media interface> on miibus0
rlphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
rl1: <Accton MPX 5030/5038 10/100BaseTX> port 0xf800-0xf8ff mem 0xfffbfc00-0xfffbfcff irq 11 at device 16.0 on pci0
rl1: Ethernet address: 00:e0:29:5f:ab:ff
miibus1: <MII bus> on rl1
rlphy1: <RealTek internal media interface> on miibus1
rlphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
fdc0: <NEC 72065B or clone> at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
ata0 at port 0x1f0-0x1f7,0x3f6 irq 14 on isa0
atkbdc0: <keyboard controller (i8042)> at port 0x60-0x6f on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
sc0: <System console> on isa0
sc0: VGA <16 virtual consoles, flags=0x200>
npx0: <math processor> on motherboard
npx0: INT 16 interface
pcib0: <Host to PCI bridge> on motherboard
pci0: <PCI bus> on pcib0
isab0: <Intel 82371FB PCI to ISA bridge> at device 7.0 on pci0
isa0: <ISA bus> on isab0
pci0: <ATI Mach64-CT graphics accelerator> at 8.0
rl0: <Accton MPX 5030/5038 10/100BaseTX> port 0xf400-0xf4ff mem 0xfffbf800-0xfffbf8ff irq 10 at device 13.0 on pci0
rl0: Ethernet address: 00:e0:29:5f:ab:88
miibus0: <MII bus> on rl0
rlphy0: <RealTek internal media interface> on miibus0
rlphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
rl1: <Accton MPX 5030/5038 10/100BaseTX> port 0xf800-0xf8ff mem 0xfffbfc00-0xfffbfcff irq 11 at device 16.0 on pci0
rl1: Ethernet address: 00:e0:29:5f:ab:ff
miibus1: <MII bus> on rl1
rlphy1: <RealTek internal media interface> on miibus1
rlphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
fdc0: <NEC 72065B or clone> at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
ata0 at port 0x1f0-0x1f7,0x3f6 irq 14 on isa0
atkbdc0: <keyboard controller (i8042)> at port 0x60-0x6f on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
sc0: <System console> on isa0
sc0: VGA <16 virtual consoles, flags=0x200>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
ppc0: <Parallel port> at port 0x378-0x37f irq 7 on isa0
ppc0: Generic chipset (NIBBLE-only) in COMPATIBLE mode
ppi0: <Parallel I/O> on ppbus0
unknown0: <WSS/SB> at port 0x534-0x537,0x388-0x38b,0x220-0x22f irq 5 drq 1,0 on isa0
unknown1: <Game> at port 0x200-0x207 on isa0
unknown2: <Ctrl> at port 0xf00-0xf07 on isa0
unknown3: <MPU> at port 0x330-0x331 irq 9 on isa0
IP packet filtering initialized, divert enabled, rule-based forwarding enabled, default to accept, logging disabled
ad0: 2015MB <ST32140A> [4095/16/63] at ata0-master using BIOSPIO
Mounting root from ufs:/dev/ad0s1a
WARNING: / was not properly dismounted
#
Comment 22 jeffblaine 2000-08-04 23:03:36 UTC
I've tried twice now, and after 4+ hours of grinding my P133 to
a standstill with 4.0-STABLE building, the machine of course crashes
due to the bug I am trying to FIX.  Sigh.

Since the builds do not pick up where they left off, I give up unless
someone has a way around that.  If there's no easy way, I'll just
install 4.1-RELEASE when I set up the new machine I'm replacing this
one with (Monday or so).  It will be using the same network cards, so
the test will still be pretty valid I think.
Comment 23 Stas Kisel 2000-08-08 10:11:37 UTC
> From sheldonh@axl.ops.uunet.co.za  Fri Jul 28 15:40:12 2000
> From: Sheldon Hearn <sheldonh@uunet.co.za>
> cc: freebsd-gnats-submit@FreeBSD.org, jblaine@linus.mitre.org
> Subject: Re: kern/19726: fatal trap 12 / page fault 
>
>
>
> On Fri, 28 Jul 2000 12:56:52 +0300, Stas Kisel wrote:
>
> > So I have backtrace and new dmesg.
>
> Excellent.
>
> Thanks,
> Sheldon.
>

Hi.

It seems that I've found something interesting.
I looked at the value of the mbuf pointer: in all my cores (18 now) it is equal
to 0x40020000, and I cannot print the contents of the mbuf:

(kgdb) p m
$1 = (struct mbuf *) 0x40020000
(kgdb) p *m
cannot read proc at 0

I changed the code a little to store the pointer value before m_pullup and to
print some values.
And (as a temporary measure) I tried to drop buffer pointers with
a value of 0x40020000. I suspect this is a bad workaround, as it does not
free the buffer and should lead to a memory leak, but I have to do something
to stop my machine crashing every day! Anyway, my attempt failed.
The machine still crashes and I don't see any "arpintr:" message in
/var/log/messages :(

But the stored mbuf pointer (before m_pullup) looks more legitimate: I can even
print part of its contents.

(kgdb) p m0
$2 = (struct mbuf *) 0xc05b3600
(kgdb) p *m0
$3 = {m_hdr = {mh_next = 0xc05b3c00, mh_nextpkt = 0x0,
    mh_data = 0xc05b3640 "", mh_len = 42, mh_type = 1, mh_flags = 258},
  M_dat = {MH = {MH_pkthdr = {rcvif = 0xc0bf0a00, len = 46, header = 0x306c72
cannot read proc at 0

The only thing that stops me from concluding that the trouble is in mbufs
is that the condition on line 452 (see modified source below) was not
triggered, though at first glance it seems that it was.

It is evident from the assembler dump below that the crash happened _after_
the continue statement, which was compiled to an unconditional jmp
(arpintr+188). The crash happened at address 0xc018a7a4 (arpintr+196). gdb
shows that the crash happened at the "continue" statement, probably due to
code optimisation.

So here I've encountered something I cannot understand: gdb shows me
m equal to 0x40020000, but the C condition ( m == 0x40020000 ) is not
satisfied.
And one more thing I cannot understand: what is the instruction at
<arpintr+193>: leal   0x0(%esi),%esi    for, immediately after the
unconditional jmp, when there is no jump to it? Alignment?
I'd be very grateful if someone could explain these two things to me.

And, of course, I'd be happy if this little investigation helps
to fix the trouble.

Thank you for your attention.

\bye
Stas

(kgdb) up 5
#5  0xc018a7a4 in arpintr () at ../../netinet/if_ether.c:455
455                             continue;
(kgdb) l
450                             continue;
451                     }
452                     if ( m == 0x40020000 ) { /* test */
453                             printf("arpintr: m == 0x40020000\n");
454                             printf("arpintr: m0 == 0x%x\n",m0);
455                             continue;
456                             /*printf("arpintr: m->m_len == %d\n",m->m_len);
457                             printf("arpintr: m0->m_len == %d\n",m0len);*/
458                     }
459                     ar = mtod(m, struct arphdr *);


(kgdb) l arpintr
433             register struct mbuf *m, *m0;
434             register struct arphdr *ar;
435             int s, ml;
436             int m0len; /* test */
437
438             while (arpintrq.ifq_head) {
439                     s = splimp();
440                     IF_DEQUEUE(&arpintrq, m);
441                     splx(s);
442                     if (m == 0 || (m->m_flags & M_PKTHDR) == 0)
(kgdb)
443                             panic("arpintr");
444
445                     m0 = m; /* test */
446                     m0len = m0->m_len; /* test */
447                     if (m->m_len < sizeof(struct arphdr) &&
448                         (m = m_pullup(m, sizeof(struct arphdr)) == NULL)) {
449                             log(LOG_ERR, "arp: runt packet -- m_pullup failed.");
450                             continue;
451                     }
452                     if ( m == 0x40020000 ) { /* test */
(kgdb)
453                             printf("arpintr: m == 0x40020000\n");
454                             printf("arpintr: m0 == 0x%x\n",m0);
455                             continue;
456                             /*printf("arpintr: m->m_len == %d\n",m->m_len);
457                             printf("arpintr: m0->m_len == %d\n",m0len);*/
458                     }
459                     ar = mtod(m, struct arphdr *);
460
461                     if (ntohs(ar->ar_hrd) != ARPHRD_ETHER
462                         && ntohs(ar->ar_hrd) != ARPHRD_IEEE802) {
(kgdb) bt
#0  boot (howto=256) at ../../kern/kern_shutdown.c:304
#1  0xc0136ec0 in poweroff_wait (junk=0xc025a1cf, howto=0)
    at ../../kern/kern_shutdown.c:554
#2  0xc022c729 in trap_fatal (frame=0xc0264c00, eva=8)
    at ../../i386/i386/trap.c:924
#3  0xc022c401 in trap_pfault (frame=0xc0264c00, usermode=0, eva=8)
    at ../../i386/i386/trap.c:817
#4  0xc022bff7 in trap (frame={tf_fs = -1071251440, tf_es = -1067778032,
      tf_ds = -1067778032, tf_edi = -1, tf_esi = -1067764224,
      tf_ebp = -1071231924, tf_isp = -1071231956, tf_ebx = 0, tf_edx = 40,
      tf_ecx = 0, tf_eax = -1067764224, tf_trapno = 12, tf_err = 0,
      tf_eip = -1072126044, tf_cs = 8, tf_eflags = 66183, tf_esp = 0,
      tf_ss = 0}) at ../../i386/i386/trap.c:423
#5  0xc018a7a4 in arpintr () at ../../netinet/if_ether.c:455
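
kgdb prints the trap-frame registers in frame #4 as signed decimal integers,
which makes them hard to compare with the addresses elsewhere in this report.
The following is a minimal sketch of the conversion; the helper name
tf_to_addr is hypothetical and not part of any FreeBSD source. Applied to the
frame above, tf_eip decodes to 0xc018a7a4 (the address of frame #5) and tf_esi
decodes to 0xc05b3600 (the value printed for m0), while tf_ebx is simply 0.

```c
#include <stdint.h>

/*
 * kgdb shows trap-frame registers as signed 32-bit integers.
 * Reinterpret one as the unsigned 32-bit address it represents.
 * tf_to_addr is a hypothetical helper name used only in this sketch.
 */
static uint32_t tf_to_addr(int32_t v)
{
    return (uint32_t)v;
}
```

For example, tf_to_addr(-1072126044) gives 0xc018a7a4 and
tf_to_addr(-1067764224) gives 0xc05b3600.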


(kgdb) disass arpintr
Dump of assembler code for function arpintr:
0xc018a6e0 <arpintr>:	pushl  %ebp
0xc018a6e1 <arpintr+1>:	movl   %esp,%ebp
0xc018a6e3 <arpintr+3>:	pushl  %edi
0xc018a6e4 <arpintr+4>:	pushl  %esi
0xc018a6e5 <arpintr+5>:	pushl  %ebx
0xc018a6e6 <arpintr+6>:	cmpl   $0x0,0xc026f884
0xc018a6ed <arpintr+13>:	je     0xc018a848 <arpintr+360>
0xc018a6f3 <arpintr+19>:	call   0xc0234ac8 <splimp>
0xc018a6f8 <arpintr+24>:	movl   %eax,%edx
0xc018a6fa <arpintr+26>:	movl   0xc026f884,%ebx
0xc018a700 <arpintr+32>:	testl  %ebx,%ebx
0xc018a702 <arpintr+34>:	je     0xc018a729 <arpintr+73>
0xc018a704 <arpintr+36>:	movl   0x4(%ebx),%eax
0xc018a707 <arpintr+39>:	movl   %eax,0xc026f884
0xc018a70c <arpintr+44>:	testl  %eax,%eax
0xc018a70e <arpintr+46>:	jne    0xc018a71c <arpintr+60>
0xc018a710 <arpintr+48>:	movl   $0x0,0xc026f888
0xc018a71a <arpintr+58>:	movl   %esi,%esi
0xc018a71c <arpintr+60>:	movl   $0x0,0x4(%ebx)
0xc018a723 <arpintr+67>:	decl   0xc026f88c
0xc018a729 <arpintr+73>:	pushl  %edx
0xc018a72a <arpintr+74>:	call   0xc0234a54 <splx>
0xc018a72f <arpintr+79>:	addl   $0x4,%esp
0xc018a732 <arpintr+82>:	testl  %ebx,%ebx
0xc018a734 <arpintr+84>:	je     0xc018a73c <arpintr+92>
0xc018a736 <arpintr+86>:	testb  $0x2,0x12(%ebx)
0xc018a73a <arpintr+90>:	jne    0xc018a748 <arpintr+104>
0xc018a73c <arpintr+92>:	pushl  $0xc024846e
0xc018a741 <arpintr+97>:	call   0xc0136e60 <panic>
0xc018a746 <arpintr+102>:	movl   %esi,%esi
0xc018a748 <arpintr+104>:	movl   %ebx,%esi
0xc018a74a <arpintr+106>:	cmpl   $0x7,0xc(%ebx)
0xc018a74e <arpintr+110>:	ja     0xc018a77c <arpintr+156>
0xc018a750 <arpintr+112>:	pushl  $0x8
0xc018a752 <arpintr+114>:	pushl  %ebx
0xc018a753 <arpintr+115>:	call   0xc01507e0 <m_pullup>
0xc018a758 <arpintr+120>:	addl   $0x8,%esp
0xc018a75b <arpintr+123>:	testl  %eax,%eax
0xc018a75d <arpintr+125>:	sete   %al
0xc018a760 <arpintr+128>:	movzbl %al,%ebx
0xc018a763 <arpintr+131>:	testl  %ebx,%ebx
0xc018a765 <arpintr+133>:	je     0xc018a77c <arpintr+156>
0xc018a767 <arpintr+135>:	pushl  $0xc0248480
0xc018a76c <arpintr+140>:	pushl  $0x3
0xc018a76e <arpintr+142>:	call   0xc0141938 <log>
0xc018a773 <arpintr+147>:	addl   $0x8,%esp
0xc018a776 <arpintr+150>:	jmp    0xc018a6e6 <arpintr+6>
0xc018a77b <arpintr+155>:	nop    
0xc018a77c <arpintr+156>:	cmpl   $0x40020000,%ebx
0xc018a782 <arpintr+162>:	jne    0xc018a7a4 <arpintr+196>
0xc018a784 <arpintr+164>:	pushl  $0xc02484a5
0xc018a789 <arpintr+169>:	call   0xc0141a78 <printf>
0xc018a78e <arpintr+174>:	pushl  %esi
0xc018a78f <arpintr+175>:	pushl  $0xc02484bf
0xc018a794 <arpintr+180>:	call   0xc0141a78 <printf>
0xc018a799 <arpintr+185>:	addl   $0xc,%esp
0xc018a79c <arpintr+188>:	jmp    0xc018a6e6 <arpintr+6> ; continue
0xc018a7a1 <arpintr+193>:	leal   0x0(%esi),%esi

0xc018a7a4 <arpintr+196>:	movl   0x8(%ebx),%ecx	; !!!
0xc018a7a7 <arpintr+199>:	movzwl (%ecx),%eax
0xc018a7aa <arpintr+202>:	xchgb  %ah,%al
0xc018a7ac <arpintr+204>:	cmpw   $0x1,%ax
0xc018a7b0 <arpintr+208>:	je     0xc018a7e0 <arpintr+256>
0xc018a7b2 <arpintr+210>:	movzwl (%ecx),%eax
0xc018a7b5 <arpintr+213>:	xchgb  %ah,%al
0xc018a7b7 <arpintr+215>:	cmpw   $0x6,%ax
0xc018a7bb <arpintr+219>:	je     0xc018a7e0 <arpintr+256>
0xc018a7bd <arpintr+221>:	pushl  $0xc024842e
0xc018a7c2 <arpintr+226>:	pushl  %ecx
0xc018a7c3 <arpintr+227>:	pushl  $0xc02484e0
0xc018a7c8 <arpintr+232>:	pushl  $0x3
0xc018a7ca <arpintr+234>:	call   0xc0141938 <log>
0xc018a7cf <arpintr+239>:	pushl  %ebx
0xc018a7d0 <arpintr+240>:	call   0xc014fcc4 <m_freem>
0xc018a7d5 <arpintr+245>:	addl   $0x14,%esp
0xc018a7d8 <arpintr+248>:	jmp    0xc018a6e6 <arpintr+6>
0xc018a7dd <arpintr+253>:	leal   0x0(%esi),%esi
0xc018a7e0 <arpintr+256>:	movl   %ebx,%esi
0xc018a7e2 <arpintr+258>:	xorl   %edi,%edi
0xc018a7e4 <arpintr+260>:	testl  %ebx,%ebx
0xc018a7e6 <arpintr+262>:	je     0xc018a7f1 <arpintr+273>
0xc018a7e8 <arpintr+264>:	addl   0xc(%esi),%edi
0xc018a7eb <arpintr+267>:	movl   (%esi),%esi
0xc018a7ed <arpintr+269>:	testl  %esi,%esi
0xc018a7ef <arpintr+271>:	jne    0xc018a7e8 <arpintr+264>
0xc018a7f1 <arpintr+273>:	movzbl 0x4(%ecx),%edx
0xc018a7f5 <arpintr+277>:	movzbl 0x5(%ecx),%eax
0xc018a7f9 <arpintr+281>:	leal   0x8(,%eax,2),%eax
0xc018a800 <arpintr+288>:	leal   (%eax,%edx,2),%edx
0xc018a803 <arpintr+291>:	cmpl   %edx,%edi
0xc018a805 <arpintr+293>:	jae    0xc018a824 <arpintr+324>
0xc018a807 <arpintr+295>:	pushl  $0xc024850b
0xc018a80c <arpintr+300>:	pushl  $0x3
0xc018a80e <arpintr+302>:	call   0xc0141938 <log>
0xc018a813 <arpintr+307>:	pushl  %ebx
0xc018a814 <arpintr+308>:	call   0xc014fcc4 <m_freem>
0xc018a819 <arpintr+313>:	addl   $0xc,%esp
0xc018a81c <arpintr+316>:	jmp    0xc018a6e6 <arpintr+6>
0xc018a821 <arpintr+321>:	leal   0x0(%esi),%esi
0xc018a824 <arpintr+324>:	movzwl 0x2(%ecx),%eax
0xc018a828 <arpintr+328>:	xchgb  %ah,%al
0xc018a82a <arpintr+330>:	cmpw   $0x800,%ax
0xc018a82e <arpintr+334>:	jne    0xc018a838 <arpintr+344>
0xc018a830 <arpintr+336>:	pushl  %ebx
0xc018a831 <arpintr+337>:	call   0xc018a850 <in_arpinput>
0xc018a836 <arpintr+342>:	jmp    0xc018a83e <arpintr+350>
0xc018a838 <arpintr+344>:	pushl  %ebx
0xc018a839 <arpintr+345>:	call   0xc014fcc4 <m_freem>
0xc018a83e <arpintr+350>:	addl   $0x4,%esp
0xc018a841 <arpintr+353>:	jmp    0xc018a6e6 <arpintr+6>
0xc018a846 <arpintr+358>:	movl   %esi,%esi
0xc018a848 <arpintr+360>:	leal   0xfffffff4(%ebp),%esp
0xc018a84b <arpintr+363>:	popl   %ebx
0xc018a84c <arpintr+364>:	popl   %esi
0xc018a84d <arpintr+365>:	popl   %edi
0xc018a84e <arpintr+366>:	leave  
0xc018a84f <arpintr+367>:	ret    
End of assembler dump.
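
The offsets used in the dump line up with the m_hdr fields that kgdb printed
for m0. The sketch below models that header with fixed-width integers standing
in for 32-bit kernel pointers. The struct name m_hdr32 and the field widths are
assumptions inferred from the disassembly (0x4(%ebx) in the dequeue, 0xc(%ebx)
compared against 7, 0x12(%ebx) tested against 0x2), not a copy of the real
header. Under that assumption, mtod() reads the word at m + 8, so a base
register of 0 would give a fault address of 0x8.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Assumed model of the start of the i386 mbuf header. Field names come
 * from the kgdb output for m0; the widths are inferred, not confirmed.
 */
struct m_hdr32 {
    uint32_t mh_next;     /* 0x00 */
    uint32_t mh_nextpkt;  /* 0x04: read by IF_DEQUEUE as 0x4(%ebx) */
    uint32_t mh_data;     /* 0x08: read by mtod() as 0x8(%ebx) */
    int32_t  mh_len;      /* 0x0c: compared with 7 at arpintr+106 */
    int16_t  mh_type;     /* 0x10 */
    int16_t  mh_flags;    /* 0x12: tested against 0x2 at arpintr+86 */
};

/* Address that the m_data load touches for an mbuf located at address m. */
static uint32_t mtod_load_address(uint32_t m)
{
    return m + (uint32_t)offsetof(struct m_hdr32, mh_data);
}
```

With m equal to 0, mtod_load_address returns 8, which matches the
"fault virtual address = 0x8" line in the panic message.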
Comment 24 jblaine 2000-08-08 14:47:48 UTC
Meanwhile...in another part of the country...I completely replaced
my P133 and motherboard tonight with a Celeron 300A and Abit BH6
and went to install 4.1-RELEASE from CD (because I couldn't even
build world under 4.0 without crashing the P133) and what do you
know... I can't even install the OS (4.0 or 4.1 RELEASE) because
of that wonderful 'ad0/ata0 WRITE command timeout' bug. Consider
me completely unable to provide you with any information on this
bug anymore.
Comment 25 Sheldon Hearn 2000-08-23 17:39:10 UTC
For the record...

------- Forwarded Message

Date: Tue, 22 Aug 2000 19:20:19 +0300
From: Stas Kisel <stask@tiger.thukraine.com>
Message-Id: <200008221620.TAA30766@tiger.thukraine.com>
To: stask@tiger.thukraine.com, wpaul@FreeBSD.ORG
Subject: Re: arpintr
Cc: freebsd-gnats-submit@FreeBSD.ORG, jblaine@linus.mitre.org,
        sheldonh@uunet.co.za

> From wpaul@FreeBSD.ORG  Tue Aug 15 20:07:26 2000

> You are making the unwarranted assumption that the problem you're seeing
> now is related to the problem you were seeing before. Your reasoning is:
> "Well, something is still wrong, therefore it must be caused by the same
> thing as the previous problem." This reasoning is faulty. It is not the
> same bug.
>
> If you want to try to debug this, compile your kernel with options DDB
> and try to break into the kernel debugger next time it wedges.

I could not reproduce it. It is probably a good idea to close the PR. Maybe
someone has something against it, but I don't. I could not reproduce
the problem during the last week. If anyone encounters this problem again,
it is a good idea to try upgrading to 4.1 or later.

Thank you.

\bye
Stas



------- End of Forwarded Message
Comment 26 stask 2001-02-11 16:21:54 UTC
Hi.

I am probably making the same sort of wrong assumption again, but after
moving from 4.1 to 4.2 I see a similar bug again.
I thought this PR, kern/19726, was closed, so I added my complaints
to kern/24608.

\bye
Stas
Comment 27 rvm 2001-04-20 05:10:40 UTC
Hello,

I'm running 4.0-RELEASE, and have identified the problem in
netinet/if_ether.c; there is an operator precedence bug in arpintr().
I haven't looked at the entire history of this file, but the bug got
introduced some time after 2.0 and is fixed in the latest (4.2-STABLE I
believe). It might have been fixed earlier.  You can patch it by moving a
paren as shown below (or by upgrading, of course :-)

Cheers!
   Rolf

--- if_ether.c.orig Fri Apr 13 19:55:52 2001
+++ if_ether.c Fri Apr 13 19:56:34 2001
@@ -442,7 +442,7 @@
                         panic("arpintr");
 
                 if (m->m_len < sizeof(struct arphdr) &&
-                    (m = m_pullup(m, sizeof(struct arphdr)) == NULL)) {
+                    (m = m_pullup(m, sizeof(struct arphdr))) == NULL) {
                         log(LOG_ERR, "arp: runt packet -- m_pullup failed.");
                         continue;
                 }
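
To make the effect of the misplaced paren concrete, here is a minimal
user-space sketch, not kernel code. The names fake_pullup, pullup_buggy and
pullup_fixed are hypothetical, and addresses are modelled as integers so that
the buggy form compiles cleanly with current compilers. Because "==" binds
tighter than "=", the buggy form stores the result of the comparison in m:
0 when the pullup succeeds, 1 when it fails. That is consistent with the
sete / movzbl %al,%ebx sequence in the disassembly in Comment 23.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for m_pullup(): hands back the buffer address
 * on success and 0 on failure.
 */
static uintptr_t fake_pullup(uintptr_t addr, int fail)
{
    return fail ? 0 : addr;
}

/*
 * Grouping from the "-" line: m receives the result of the comparison
 * (0 or 1), not the address returned by the pullup.
 */
static uintptr_t pullup_buggy(uintptr_t m, int fail, int *error_path)
{
    *error_path = 0;
    if ((m = fake_pullup(m, fail) == 0))
        *error_path = 1;
    return m;
}

/*
 * Grouping from the "+" line: m receives the returned address, and
 * only then is it compared against 0.
 */
static uintptr_t pullup_fixed(uintptr_t m, int fail, int *error_path)
{
    *error_path = 0;
    if ((m = fake_pullup(m, fail)) == 0)
        *error_path = 1;
    return m;
}
```

With a successful pullup, pullup_buggy returns 0 without taking the error
path, so the caller would go on to dereference a null pointer, while
pullup_fixed returns the original address.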
Comment 28 Jeroen Ruigrok van der Werven freebsd_committer freebsd_triage 2001-11-15 19:55:05 UTC
State Changed
From-To: open->closed

This has been fixed aeons ago. 

Thanks!