Bug 16740

Summary: The kernel panics with "ffs_clusteralloc: map mismatch"
Product: Base System
Reporter: Joakim Henriksson <murduth>
Component: kern
Assignee: Kirk McKusick <mckusick>
Status: Closed FIXED    
Severity: Affects Only Me    
Priority: Normal    
Version: Unspecified   
Hardware: Any   
OS: Any   

Description Joakim Henriksson 2000-02-16 06:40:01 UTC
When writing to disk the kernel sometimes panics. Either the kernel is in
error or newfs creates a bad filesystem.

Fix:
Unknown

How-To-Repeat:
Write to the disk and you'll sooner or later trigger the bug.
Comment 1 Joakim Henriksson 2000-03-01 12:10:10 UTC
Another data point. These crashes often cause the softupdates-enabled
file system to get duplicate inodes. This is losing me files.

I'm scheduled to get a new disk sometime next week and will be unable to test
anything with regard to this PR after that. I have crash dumps and a 10Mbps
network connection if anyone is willing to take a look before then.
-- 
regards/ Joakim
Comment 2 Sheldon Hearn 2000-07-19 14:34:22 UTC
------- Forwarded Message

Date: Wed, 19 Jul 2000 15:16:12 +0200
From: Joakim Henriksson <murduth@ludd.luth.se>
To: Sheldon Hearn <sheldonh@uunet.co.za>
Subject: Re: misc/20031: kernel randomly panics with ffs_clusteralloc: map 
 mismatch
Message-Id: <200007191316.PAA08680@rmstar.campus.luth.se>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii

> > kern/16740. I can make available a crashdump from today if you
> > wish. Just tell if ftp or http or scp is most convenient. I will need
> > the hostname you wish to access from also. My host access file is
> > pretty restrictive.
> 
> I won't be able to solve this myself, but I might know the right person
> to approach for help.

This would be appreciated; I have had my machine panic twice(!) today due to
this bug. And all I've got are these lousy t-sh^D^D^D^Dcrashdumps.

> The Environment section of PR 16740 indicates that the panics you
> received were on a 4.0-CURRENT box in February.  I think you said you're
> now using 4.1-RC, which means you're tracking the RELENG_4 stable
> branch, right?

You bet!

> If that's the case, one thing I can think of is that you may have
> _copied_ the softupdates source files instead of creating symbolic links
> for them.  Of course, now we don't use symlinks, because the soft
> updates license has changed.

Neg on that. I did have the links and got bitten by the old "cvsup doesn't
know what to do with symlinks" trick.

> Could you please verify that the following files on your filesystem are
> regular files?  Also, please check the versions that you have, as per
> the version numbers in brackets.  You can use the ident(1) utility to
> find the RCS/CVS version of a file.
> 
> 	/usr/src/sys/ufs/ffs/ffs_softdep.c		[1.57.2.1]
> 	/usr/src/sys/ufs/ffs/ffs_softdep_stub.c		[1.7]
> 	/usr/src/sys/ufs/ffs/softdep.h			[1.7.2.1]

murduth@rmstar /usr/src/sys/ufs/ffs >ident ffs_softdep.c
ffs_softdep.c:
     $FreeBSD: src/sys/ufs/ffs/ffs_softdep.c,v 1.57.2.1 2000/06/22 19:27:42 peter Exp $
murduth@rmstar /usr/src/sys/ufs/ffs >ident ffs_softdep_stub.c
ffs_softdep_stub.c:
     $FreeBSD: src/sys/ufs/ffs/ffs_softdep_stub.c,v 1.7 2000/01/10 00:24:22 mckusick Exp $
murduth@rmstar /usr/src/sys/ufs/ffs >ident softdep.h
softdep.h:
     $FreeBSD: src/sys/ufs/ffs/softdep.h,v 1.7.2.1 2000/06/22 19:27:42 peter Exp $
murduth@rmstar /usr/src/sys/ufs/ffs >

So nothing deviant about this...
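
The check requested above (regular file versus symlink, plus the RCS version) could be scripted roughly as follows. This is only a sketch for a POSIX shell: the helper name classify_src is made up for illustration, the paths are the three quoted above, and ident(1) is called only if it is installed.

```shell
#!/bin/sh
# classify_src is a hypothetical helper, not part of FreeBSD.
# It reports whether a path is a symlink, a regular file, or neither.
classify_src() {
    if [ -L "$1" ]; then
        echo "symlink"            # test -L first, because -f follows symlinks
    elif [ -f "$1" ]; then
        echo "regular"
    else
        echo "missing-or-other"
    fi
}

for f in ffs_softdep.c ffs_softdep_stub.c softdep.h; do
    path=/usr/src/sys/ufs/ffs/$f
    printf '%s: %s\n' "$path" "$(classify_src "$path")"
    # Print the RCS keyword line only when ident(1) is available.
    if command -v ident >/dev/null 2>&1 && [ -f "$path" ]; then
        ident "$path"
    fi
done
```

A "symlink" result for any of the three files would point at the copied-versus-linked problem described in the quoted question.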

> I've made the assumption that you are actually using soft updates.  Are
> you? :-)

Yep, but I seem to remember turning them off to see if that helped. And it
didn't, if I remember correctly. It's been a couple of months since this test.

> I'm not too interested in PR 20031, since it relates to an old release
> on the RELENG_3 branch.  I'm much more interested in your PR, especially
> if you can confirm that the panics occur on a recent RELENG_4 system.

I would think that it occurs in 5.0-CURRENT too, but I'm too much of a wuss to
try it out. What with all the controversy about the interrupt threads ;)



Other mail question

> Oops, I forgot another thing.  Some mail I saw in the archives suggested
> that this might be caused by weird compiler optimizations.  Are you using
> COPTFLAGS other than "-O -pipe"?

murduth@rmstar /usr/src/sys/compile/RMSTAR #cat /etc/make.conf 
RSAREF=NO
USA_RESIDENT=NO
MASTER_SITE_OVERRIDE=ftp:/ftp.se.freebsd.org/pub/FreeBSD/ports/distfiles/${DIST_SUBDIR}/
murduth@rmstar /usr/src/sys/compile/RMSTAR #


I always build my kernel with "make depend all" so nothing strange here either.
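
Rather than reading make.conf by eye, the COPTFLAGS question could be answered with a small filter. This is a sketch under assumptions: list_opt_flags is a hypothetical helper name, and it only looks at the single file it is given, so flags set in the environment or in the kernel config would not show up.

```shell
#!/bin/sh
# list_opt_flags is a hypothetical helper: it prints any COPTFLAGS or CFLAGS
# assignments found in a make.conf-style file, or a note when there are none.
list_opt_flags() {
    conf=$1
    if [ -r "$conf" ] &&
       grep -E '^[[:space:]]*(COPTFLAGS|CFLAGS)[[:space:]]*[?+:!]?=' "$conf"; then
        return 0
    fi
    echo "no COPTFLAGS/CFLAGS override found in $conf"
}

list_opt_flags /etc/make.conf
```

For the make.conf shown above this would report that no override was found, which matches the "nothing strange here" conclusion.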

John Baldwin mailed freebsd-fs about this a week or so ago, and he has the
same chipset on his motherboard. My guess is that there is a race or some
other bug in the ata driver, but I can't substantiate it. We'll see what the
originator of 20031 says; I mailed him about his chipset.
- -- 
regards/ Joakim




------- End of Forwarded Message
Comment 3 chris 2000-07-24 05:39:43 UTC
Hi folks.  I'm the originator of PR#20031, which was closed because of its
similarity to 16740, and have an update that I hope might be useful.

Our system continued to crash (3 times in the last 24 hours!) with the
same bug, so today I upgraded to 4.0R.  The upgrade went smoothly, with no
notable disk errors, etc.  The only odd thing was this set of messages
when sysinstall first started up:

  (null): MODE_SENSE_BIG command timeout - resetting
  ata0: resetting devices .. done

Anyway, after about 30 minutes when there wasn't really any significant
disk activity, she panicked:

  start=0, len=2, fs=/
  panic: ffs_alloccg: map corrupted

I'm sorry, I didn't have kernel debugging set up at that time under the
new OS version.  However, I've since installed a kernel with debugging
symbols, so I'm ready for it the next time (at 3:30 AM, sigh).

Note that this is different from my original error, "ffs_clusteralloc: map
mismatch".

I saw something in the mail archives about disabling a burst setting in
the PCI BIOS fixing map mismatch problems with SCSI.  I don't really know
what that might mean, but thought I'd mention it.

Sheldon had asked for some output from boot -v.  It's below, in all its
expansiveness (sorry if that's more than you wanted).  

I hope this helps - I'm buying beer/pizza/both for whomever can help me
make this go away :)

Chris

Jul 23 23:16:46 nollie /kernel: Copyright (c) 1992-2000 The FreeBSD Project.
Jul 23 23:16:46 nollie /kernel: Copyright (c) 1982, 1986, 1989, 1991, 1993
Jul 23 23:16:46 nollie /kernel: The Regents of the University of California. All rights reserved.
Jul 23 23:16:46 nollie /kernel: FreeBSD 4.0-RELEASE #0: Sun Jul 23 23:12:20 EST 2000
Jul 23 23:16:46 nollie /kernel: root@nollie.summersault.com:/usr/src/sys/compile/NOLLIE.072300debug
Jul 23 23:16:46 nollie /kernel: Calibrating clock(s) ... TSC clock: 651536490 Hz, i8254 clock: 1193291 Hz
Jul 23 23:16:46 nollie /kernel: CLK_USE_I8254_CALIBRATION not specified - using default frequency
Jul 23 23:16:46 nollie /kernel: Timecounter "i8254"  frequency 1193182 Hz
Jul 23 23:16:46 nollie /kernel: CLK_USE_TSC_CALIBRATION not specified - using old calibration method
Jul 23 23:16:46 nollie /kernel: CPU: Pentium III/Pentium III Xeon (651.48-MHz 686-class CPU)
Jul 23 23:16:46 nollie /kernel: Origin = "GenuineIntel"  Id = 0x681  Stepping = 1
Jul 23 23:16:46 nollie /kernel: Features=0x383f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,XMM>
Jul 23 23:16:46 nollie /kernel: real memory  = 267321344 (261056K bytes)
Jul 23 23:16:46 nollie /kernel: Physical memory chunk(s):
Jul 23 23:16:46 nollie /kernel: 0x00001000 - 0x0009ffff, 651264 bytes (159 pages)
Jul 23 23:16:46 nollie /kernel: 0x00394000 - 0x0fee7fff, 263536640 bytes (64340 pages)
Jul 23 23:16:46 nollie /kernel: avail memory = 255741952 (249748K bytes)
Jul 23 23:16:46 nollie /kernel: bios32: Found BIOS32 Service Directory header at 0xc00fae30
Jul 23 23:16:46 nollie /kernel: bios32: Entry = 0xfb2a0 (c00fb2a0)  Rev = 0  Len = 1
Jul 23 23:16:46 nollie /kernel: pcibios: PCI BIOS entry at 0xb2d0
Jul 23 23:16:46 nollie /kernel: pnpbios: Found PnP BIOS data at 0xc00fbc50
Jul 23 23:16:46 nollie /kernel: pnpbios: Entry = f0000:bc80  Rev = 1.0
Jul 23 23:16:46 nollie /kernel: Other BIOS signatures found:
Jul 23 23:16:46 nollie /kernel: ACPI: 000f60d0
Jul 23 23:16:46 nollie /kernel: Preloaded elf kernel "kernel" at 0xc037b000.
Jul 23 23:16:46 nollie /kernel: Pentium Pro MTRR support enabled
Jul 23 23:16:46 nollie /kernel: md0: Malloc disk
Jul 23 23:16:46 nollie /kernel: Creating DISK md0
Jul 23 23:16:46 nollie /kernel: Math emulator present
Jul 23 23:16:46 nollie /kernel: pci_open(1):    mode 1 addr port (0x0cf8) is 0x80000050
Jul 23 23:16:46 nollie /kernel: pci_open(1a):   mode1res=0x80000000 (0x80000000)
Jul 23 23:16:46 nollie /kernel: pci_cfgcheck:   device 0 [class=060000] [hdr=00] is there (id=71208086)
Jul 23 23:16:46 nollie /kernel: npx0: <math processor> on motherboard
Jul 23 23:16:46 nollie /kernel: npx0: INT 16 interface
Jul 23 23:16:46 nollie /kernel: pci_open(1):    mode 1 addr port (0x0cf8) is 0x00000000
Jul 23 23:16:47 nollie /kernel: pci_open(1a):   mode1res=0x80000000 (0x80000000)
Jul 23 23:16:47 nollie /kernel: pci_cfgcheck:   device 0 [class=060000] [hdr=00] is there (id=71208086)
Jul 23 23:16:47 nollie /kernel: pcib0: <Intel 82810 (i810 GMCH) Host To Hub bridge> on motherboard
Jul 23 23:16:47 nollie /kernel: found-> vendor=0x8086, dev=0x7120, revid=0x03
Jul 23 23:16:47 nollie /kernel: class=06-00-00, hdrtype=0x00, mfdev=0
Jul 23 23:16:47 nollie /kernel: subordinatebus=0        secondarybus=0
Jul 23 23:16:47 nollie /kernel: found-> vendor=0x8086, dev=0x7121, revid=0x03
Jul 23 23:16:47 nollie /kernel: class=03-00-00, hdrtype=0x00, mfdev=0
Jul 23 23:16:47 nollie /kernel: subordinatebus=0        secondarybus=0
Jul 23 23:16:47 nollie /kernel: intpin=a, irq=9
Jul 23 23:16:47 nollie /kernel: map[10]: type 1, range 32, base d8000000, size 26
Jul 23 23:16:47 nollie /kernel: map[14]: type 1, range 32, base e0000000, size 19
Jul 23 23:16:47 nollie /kernel: found-> vendor=0x8086, dev=0x2418, revid=0x02
Jul 23 23:16:47 nollie /kernel: class=06-04-00, hdrtype=0x01, mfdev=0
Jul 23 23:16:47 nollie /kernel: subordinatebus=1        secondarybus=1
Jul 23 23:16:47 nollie /kernel: found-> vendor=0x8086, dev=0x2410, revid=0x02
Jul 23 23:16:47 nollie /kernel: class=06-01-00, hdrtype=0x00, mfdev=1
Jul 23 23:16:47 nollie /kernel: subordinatebus=0        secondarybus=0
Jul 23 23:16:47 nollie /kernel: found-> vendor=0x8086, dev=0x2411, revid=0x02
Jul 23 23:16:47 nollie /kernel: class=01-01-80, hdrtype=0x00, mfdev=0
Jul 23 23:16:47 nollie /kernel: subordinatebus=0        secondarybus=0
Jul 23 23:16:47 nollie /kernel: map[20]: type 1, range 32, base 0000f000, size  4
Jul 23 23:16:47 nollie /kernel: found-> vendor=0x8086, dev=0x2412, revid=0x02
Jul 23 23:16:47 nollie /kernel: class=0c-03-00, hdrtype=0x00, mfdev=0
Jul 23 23:16:47 nollie /kernel: subordinatebus=0        secondarybus=0
Jul 23 23:16:47 nollie /kernel: intpin=d, irq=11
Jul 23 23:16:47 nollie /kernel: map[20]: type 1, range 32, base 0000e000, size  5
Jul 23 23:16:47 nollie /kernel: found-> vendor=0x8086, dev=0x2415, revid=0x02
Jul 23 23:16:47 nollie /kernel: class=04-01-00, hdrtype=0x00, mfdev=0
Jul 23 23:16:47 nollie /kernel: subordinatebus=0        secondarybus=0
Jul 23 23:16:47 nollie /kernel: intpin=b, irq=10
Jul 23 23:16:47 nollie /kernel: map[10]: type 1, range 32, base 0000e800, size  8
Jul 23 23:16:47 nollie /kernel: map[14]: type 1, range 32, base 0000ec00, size  6
Jul 23 23:16:47 nollie /kernel: pci0: <PCI bus> on pcib0
Jul 23 23:16:47 nollie /kernel: pci0: <Intel 82810 (i810 GMCH) SVGA controller> (vendor=0x8086, dev=0x7121) at 1.0 irq 9
Jul 23 23:16:47 nollie /kernel: pcib1: <Intel 82801AA (ICH) Hub to PCI bridge> at device 30.0 on pci0
Jul 23 23:16:47 nollie /kernel: found-> vendor=0x10b7, dev=0x9055, revid=0x00
Jul 23 23:16:47 nollie /kernel: class=02-00-00, hdrtype=0x00, mfdev=0
Jul 23 23:16:47 nollie /kernel: subordinatebus=0        secondarybus=0
Jul 23 23:16:47 nollie /kernel: intpin=a, irq=10
Jul 23 23:16:47 nollie /kernel: map[10]: type 1, range 32, base 0000d000, size  7
Jul 23 23:16:47 nollie /kernel: map[14]: type 1, range 32, base de001000, size  7
Jul 23 23:16:47 nollie /kernel: found-> vendor=0x9004, dev=0x8778, revid=0x01
Jul 23 23:16:47 nollie /kernel: class=01-00-00, hdrtype=0x00, mfdev=0
Jul 23 23:16:47 nollie /kernel: subordinatebus=0        secondarybus=0
Jul 23 23:16:47 nollie /kernel: intpin=a, irq=11
Jul 23 23:16:47 nollie /kernel: map[10]: type 1, range 32, base 0000d400, size  8
Jul 23 23:16:47 nollie /kernel: map[14]: type 1, range 32, base de000000, size 12
Jul 23 23:16:47 nollie /kernel: pci1: <PCI bus> on pcib1
Jul 23 23:16:47 nollie /kernel: xl0: <3Com 3c905B-TX Fast Etherlink XL> port 0xd000-0xd07f mem 0xde001000-0xde00107f irq 10 at device 5.0 on pci1
Jul 23 23:16:47 nollie /kernel: xl0: Ethernet address: 00:10:4b:34:4f:55
Jul 23 23:16:47 nollie /kernel: xl0: media options word: a
Jul 23 23:16:47 nollie /kernel: xl0: found MII/AUTO
Jul 23 23:16:47 nollie /kernel: miibus0: <MII bus> on xl0
Jul 23 23:16:47 nollie /kernel: xlphy0: <3Com internal media interface> on miibus0
Jul 23 23:16:47 nollie /kernel: xlphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
Jul 23 23:16:47 nollie /kernel: xl0: supplying EUI64: 00:10:4b:ff:fe:34:4f:55
Jul 23 23:16:47 nollie /kernel: bpf: xl0 attached
Jul 23 23:16:47 nollie /kernel: ahc0: <Adaptec 2940 Pro Ultra SCSI adapter> port 0xd400-0xd4ff mem 0xde000000-0xde000fff irq 11 at device 11.0 on pci1
Jul 23 23:16:47 nollie /kernel: ahc0: Reading SEEPROM...done.
Jul 23 23:16:47 nollie /kernel: ahc0: internal 50 cable not present, internal 68 cable is present
Jul 23 23:16:47 nollie /kernel: ahc0: external cable not present
Jul 23 23:16:47 nollie /kernel: ahc0: BIOS eeprom is present
Jul 23 23:16:47 nollie /kernel: ahc0: 68 pin termination Enabled
Jul 23 23:16:47 nollie /kernel: ahc0: 50 pin termination Enabled
Jul 23 23:16:47 nollie /kernel: ahc0: aic7880 Wide Channel A, SCSI Id=7, 16/255 SCBs
Jul 23 23:16:47 nollie /kernel: ahc0: Downloading Sequencer Program... 415 instructions downloaded
Jul 23 23:16:47 nollie /kernel: isab0: <Intel 82801AA (ICH) PCI to LPC bridge> at device 31.0 on pci0
Jul 23 23:16:47 nollie /kernel: isa0: <ISA bus> on isab0
Jul 23 23:16:47 nollie /kernel: atapci0: <Intel ICH ATA66 controller> port 0xf000-0xf00f at device 31.1 on pci0
Jul 23 23:16:47 nollie /kernel: ata0: iobase=0x01f0 altiobase=0x03f6 bmaddr=0xf000
Jul 23 23:16:47 nollie /kernel: ata0: mask=03 status0=41 status1=00
Jul 23 23:16:47 nollie /kernel: ata0: mask=03 status0=00 status1=00
Jul 23 23:16:47 nollie /kernel: ata0: devices = 0x4
Jul 23 23:16:47 nollie /kernel: ata0: at 0x1f0 irq 14 on atapci0
Jul 23 23:16:47 nollie /kernel: ata1: iobase=0x0170 altiobase=0x0376 bmaddr=0xf008
Jul 23 23:16:47 nollie /kernel: ata1: mask=03 status0=00 status1=00
Jul 23 23:16:47 nollie /kernel: ata1: mask=03 status0=00 status1=00
Jul 23 23:16:47 nollie /kernel: ata1: devices = 0x0
Jul 23 23:16:47 nollie /kernel: ata1: probe allocation failed
Jul 23 23:16:47 nollie /kernel: pci0: <Intel 82801AA (ICH) USB controller> (vendor=0x8086, dev=0x2412) at 31.2 irq 11
Jul 23 23:16:47 nollie /kernel: chip1: <Intel 82801AA (ICH) AC'97 Audio Controller> port 0xec00-0xec3f,0xe800-0xe8ff irq 10 at device 31.5 on pci0
Jul 23 23:16:47 nollie /kernel: Trying Read_Port at 203
Jul 23 23:16:47 nollie /kernel: Trying Read_Port at 243
Jul 23 23:16:47 nollie /kernel: Trying Read_Port at 283
Jul 23 23:16:47 nollie /kernel: Trying Read_Port at 2c3
Jul 23 23:16:47 nollie /kernel: Trying Read_Port at 303
Jul 23 23:16:47 nollie /kernel: Trying Read_Port at 343
Jul 23 23:16:47 nollie /kernel: Trying Read_Port at 383
Jul 23 23:16:47 nollie /kernel: Trying Read_Port at 3c3
Jul 23 23:16:47 nollie /kernel: devclass_alloc_unit: ata0 already exists, using next available unit number
Jul 23 23:16:47 nollie /kernel: devclass_alloc_unit: ata1 already exists, using next available unit number
Jul 23 23:16:47 nollie /kernel: isa_probe_children: disabling PnP devices
Jul 23 23:16:47 nollie /kernel: isa_probe_children: probing non-PnP devices
Jul 23 23:16:47 nollie /kernel: fdc0: <NEC 72065B or clone> at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
Jul 23 23:16:47 nollie /kernel: fdc0: FIFO enabled, 8 bytes threshold
Jul 23 23:16:47 nollie /kernel: fd0: <1440-KB 3.5" drive> on fdc0 drive 0
Jul 23 23:16:47 nollie /kernel: ata3: iobase=0x0170 altiobase=0x0376 bmaddr=0x0000
Jul 23 23:16:47 nollie /kernel: ata3: mask=03 status0=00 status1=00
Jul 23 23:16:47 nollie /kernel: ata3: mask=03 status0=00 status1=00
Jul 23 23:16:47 nollie /kernel: ata3: devices = 0x0
Jul 23 23:16:47 nollie /kernel: ata3: probe allocation failed
Jul 23 23:16:47 nollie /kernel: bt0: Failed Status Reg Test - ff
Jul 23 23:16:47 nollie /kernel: bt_isa_probe: Probe failed at 0x330
Jul 23 23:16:47 nollie /kernel: bt0: Failed Status Reg Test - ff
Jul 23 23:16:47 nollie /kernel: bt_isa_probe: Probe failed at 0x334
Jul 23 23:16:47 nollie /kernel: bt0: Failed Status Reg Test - ff
Jul 23 23:16:47 nollie /kernel: bt_isa_probe: Probe failed at 0x230
Jul 23 23:16:47 nollie /kernel: bt0: Failed Status Reg Test - ff
Jul 23 23:16:47 nollie /kernel: bt_isa_probe: Probe failed at 0x234
Jul 23 23:16:47 nollie /kernel: bt0: Failed Status Reg Test - ff
Jul 23 23:16:47 nollie /kernel: bt_isa_probe: Probe failed at 0x130
Jul 23 23:16:47 nollie /kernel: bt0: Failed Status Reg Test - ff
Jul 23 23:16:47 nollie /kernel: bt_isa_probe: Probe failed at 0x134
Jul 23 23:16:47 nollie /kernel: aha0: status reg test failed ff
Jul 23 23:16:47 nollie last message repeated 5 times
Jul 23 23:16:47 nollie /kernel: atkbdc0: <keyboard controller (i8042)> at port 0x60-0x6f on isa0
Jul 23 23:16:47 nollie /kernel: atkbd0: <AT Keyboard> irq 1 on atkbdc0
Jul 23 23:16:47 nollie /kernel: atkbd: the current kbd controller command byte 0067
Jul 23 23:16:47 nollie /kernel: atkbd: keyboard ID 0x41ab (2)
Jul 23 23:16:47 nollie /kernel: kbdc: RESET_KBD return code:00fa
Jul 23 23:16:47 nollie /kernel: kbdc: RESET_KBD status:00aa
Jul 23 23:16:47 nollie /kernel: kbd0: atkbd0, AT 101/102 (2), config:0x0, flags:0x3d0000
Jul 23 23:16:47 nollie /kernel: psm0: current command byte:0067
Jul 23 23:16:47 nollie /kernel: kbdc: TEST_AUX_PORT status:0000
Jul 23 23:16:48 nollie /kernel: kbdc: RESET_AUX return code:00fe
Jul 23 23:16:48 nollie last message repeated 2 times
Jul 23 23:16:48 nollie /kernel: kbdc: DIAGNOSE status:0055
Jul 23 23:16:48 nollie /kernel: kbdc: TEST_KBD_PORT status:0000
Jul 23 23:16:48 nollie /kernel: psm0: failed to reset the aux device.
Jul 23 23:16:48 nollie /kernel: vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Jul 23 23:16:48 nollie /kernel: fb0: vga0, vga, type:VGA (5), flags:0x7007f
Jul 23 23:16:48 nollie /kernel: fb0: port:0x3c0-0x3df, crtc:0x3d4, mem:0xa0000 0x20000
Jul 23 23:16:48 nollie /kernel: fb0: init mode:24, bios mode:3, current mode:24
Jul 23 23:16:48 nollie /kernel: fb0: window:0xc00b8000 size:32k gran:32k, buf:0 size:32k
Jul 23 23:16:48 nollie /kernel: VGA parameters upon power-up
Jul 23 23:16:48 nollie /kernel: 50 18 10 00 00 00 03 00 02 67 5f 4f 50 82 55 81 
Jul 23 23:16:48 nollie /kernel: bf 1f 00 4f 0e 0f 00 00 07 80 9c 8e 8f 28 1f 96 
Jul 23 23:16:48 nollie /kernel: b9 a3 ff 00 01 02 03 04 05 14 07 38 39 3a 3b 3c 
Jul 23 23:16:48 nollie /kernel: 3d 3e 3f 0c 00 0f 08 00 00 00 00 00 10 0e 00 ff 
Jul 23 23:16:48 nollie /kernel: VGA parameters in BIOS for mode 24
Jul 23 23:16:48 nollie /kernel: 50 18 10 00 10 00 03 00 02 67 5f 4f 50 82 55 81 
Jul 23 23:16:48 nollie /kernel: bf 1f 00 4f 0d 0e 00 00 00 00 9c 8e 8f 28 1f 96 
Jul 23 23:16:48 nollie /kernel: b9 a3 ff 00 01 02 03 04 05 14 07 38 39 3a 3b 3c 
Jul 23 23:16:48 nollie /kernel: 3d 3e 3f 0c 00 0f 08 00 00 00 00 00 10 0e 00 ff 
Jul 23 23:16:48 nollie /kernel: EGA/VGA parameters to be used for mode 24
Jul 23 23:16:48 nollie /kernel: 50 18 10 00 10 00 03 00 02 67 5f 4f 50 82 55 81 
Jul 23 23:16:48 nollie /kernel: bf 1f 00 4f 0d 0e 00 00 00 00 9c 8e 8f 28 1f 96 
Jul 23 23:16:48 nollie /kernel: b9 a3 ff 00 01 02 03 04 05 14 07 38 39 3a 3b 3c 
Jul 23 23:16:48 nollie /kernel: 3d 3e 3f 0c 00 0f 08 00 00 00 00 00 10 0e 00 ff 
Jul 23 23:16:48 nollie /kernel: sc0: <System console> on isa0
Jul 23 23:16:48 nollie /kernel: sc0: VGA <16 virtual consoles, flags=0x200>
Jul 23 23:16:48 nollie /kernel: sc0: fb0, kbd0, terminal emulator: sc (syscons terminal)
Jul 23 23:16:48 nollie /kernel: pcic1: not probed (disabled)
Jul 23 23:16:48 nollie /kernel: sio0: irq maps: 0x41 0x51 0x41 0x41
Jul 23 23:16:48 nollie /kernel: sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
Jul 23 23:16:48 nollie /kernel: sio0: type 16550A
Jul 23 23:16:48 nollie /kernel: sio1: irq maps: 0x41 0x49 0x41 0x41
Jul 23 23:16:48 nollie /kernel: sio1 at port 0x2f8-0x2ff irq 3 on isa0
Jul 23 23:16:48 nollie /kernel: sio1: type 16550A
Jul 23 23:16:48 nollie /kernel: sio2: not probed (disabled)
Jul 23 23:16:48 nollie /kernel: sio3: not probed (disabled)
Jul 23 23:16:48 nollie /kernel: ppc0: parallel port found at 0x378
Jul 23 23:16:48 nollie /kernel: ppc0: ECP SPP ECP+EPP SPP
Jul 23 23:16:48 nollie /kernel: ppc0: <Parallel port> at port 0x378-0x37f irq 7 on isa0
Jul 23 23:16:48 nollie /kernel: ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
Jul 23 23:16:48 nollie /kernel: ppc0: FIFO with 16/16/16 bytes threshold
Jul 23 23:16:48 nollie /kernel: ppi0: <Parallel I/O> on ppbus0
Jul 23 23:16:48 nollie /kernel: lpt0: <Printer> on ppbus0
Jul 23 23:16:48 nollie /kernel: lpt0: Interrupt-driven port
Jul 23 23:16:48 nollie /kernel: plip0: <PLIP network interface> on ppbus0
Jul 23 23:16:48 nollie /kernel: bpf: lp0 attached
Jul 23 23:16:48 nollie /kernel: isa_probe_children: probing PnP devices
Jul 23 23:16:48 nollie /kernel: BIOS Geometries:
Jul 23 23:16:48 nollie /kernel: 0:03fefe3f 0..1022=1023 cylinders, 0..254=255 heads, 1..63=63 sectors
Jul 23 23:16:48 nollie /kernel: 0 accounted for
Jul 23 23:16:48 nollie /kernel: Device configuration finished.
Jul 23 23:16:48 nollie /kernel: bpf: sl0 attached
Jul 23 23:16:48 nollie /kernel: bpf: ppp0 attached
Jul 23 23:16:48 nollie /kernel: new masks: bio 40084040, tty 4003009a, net 4007049a
Jul 23 23:16:48 nollie /kernel: bpf: lo0 attached
Jul 23 23:16:48 nollie /kernel: bpf: gif0 attached
Jul 23 23:16:48 nollie /kernel: bpf: gif1 attached
Jul 23 23:16:48 nollie /kernel: bpf: gif2 attached
Jul 23 23:16:48 nollie /kernel: bpf: gif3 attached
Jul 23 23:16:48 nollie /kernel: bpf: stf0 attached
Jul 23 23:16:48 nollie /kernel: bpf: faith0 attached
Jul 23 23:16:48 nollie /kernel: ata0-master: piomode=2 dmamode=-1 udmamode=-1 dmaflag=0
Jul 23 23:16:48 nollie /kernel: ata0-master: timeout waiting for command=ef s=00 e=00
Jul 23 23:16:48 nollie /kernel: ata0-master: failed setting up PIO2 mode on generic chip
Jul 23 23:16:48 nollie /kernel: ata0-master: using PIO mode set by BIOS
Jul 23 23:16:48 nollie /kernel: (null): MODE_SENSE_BIG command timeout - resetting
Jul 23 23:16:48 nollie /kernel: ata0: resetting devices .. ata0: mask=01 status0=00 status1=00
Jul 23 23:16:48 nollie /kernel: done
Jul 23 23:16:48 nollie /kernel: (null): MODE_SENSE_BIG command timeout - resetting
Jul 23 23:16:48 nollie /kernel: ata0: resetting devices .. ata0: mask=01 status0=00 status1=00
Jul 23 23:16:48 nollie /kernel: done
Jul 23 23:16:48 nollie /kernel: (null): MODE_SENSE_BIG command timeout - resetting
Jul 23 23:16:48 nollie /kernel: ata0: resetting devices .. ata0: mask=01 status0=00 status1=00
Jul 23 23:16:48 nollie /kernel: done
Jul 23 23:16:48 nollie /kernel: (null): MODE_SENSE_BIG command timeout - resetting
Jul 23 23:16:48 nollie /kernel: ata0: resetting devices .. ata0: mask=01 status0=00 status1=00
Jul 23 23:16:48 nollie /kernel: done
Jul 23 23:16:48 nollie /kernel: acd0: <MATSHITA CR-574/1.06> CDROM drive at ata0 as master
Jul 23 23:16:48 nollie /kernel: acd0: read 344KB/s (689KB/s), 211KB buffer, BIOSPIO
Jul 23 23:16:48 nollie /kernel: acd0: Reads: CD-DA
Jul 23 23:16:48 nollie /kernel: acd0: Audio: play, 256 volume levels
Jul 23 23:16:48 nollie /kernel: acd0: Mechanism: ejectable tray
Jul 23 23:16:48 nollie /kernel: acd0: Medium: no/blank disc inside, unlocked
Jul 23 23:16:48 nollie /kernel: Waiting 15 seconds for SCSI devices to settle
Jul 23 23:16:48 nollie /kernel: (noperiph:ahc0:0:-1:-1): SCSI bus reset delivered. 0 SCBs aborted.
Jul 23 23:16:48 nollie /kernel: ahc0: target 0 using 16bit transfers
Jul 23 23:16:48 nollie /kernel: ahc0: target 0 synchronous at 20.0MHz, offset = 0x8
Jul 23 23:16:48 nollie /kernel: Creating DISK da0
Jul 23 23:16:48 nollie /kernel: pass0 at ahc0 bus 0 target 0 lun 0
Jul 23 23:16:48 nollie /kernel: pass0: <IBM DPSS-318350N S80D> Fixed Direct Access SCSI-3 device 
Jul 23 23:16:48 nollie /kernel: pass0: Serial Number         ZE0A4425
Jul 23 23:16:48 nollie /kernel: pass0: 40.000MB/s transfers (20.000MHz, offset 8, 16bit), Tagged Queueing Enabled
Jul 23 23:16:48 nollie /kernel: da0 at ahc0 bus 0 target 0 lun 0
Jul 23 23:16:48 nollie /kernel: da0: <IBM DPSS-318350N S80D> Fixed Direct Access SCSI-3 device 
Jul 23 23:16:48 nollie /kernel: da0: Serial Number         ZE0A4425
Jul 23 23:16:48 nollie /kernel: da0: 40.000MB/s transfers (20.000MHz, offset 8, 16bit), Tagged Queueing Enabled
Jul 23 23:16:48 nollie /kernel: da0: 17501MB (35843670 512 byte sectors: 255H 63S/T 2231C)
Jul 23 23:16:48 nollie /kernel: Mounting root from ufs:/dev/da0s1a
Jul 23 23:16:48 nollie /kernel: da0s1: type 0xa5, start 63, end = 35841014, size 35840952 : OK
Jul 23 23:16:48 nollie /kernel: start_init: trying /sbin/init
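
For readers who want only the suspicious ATA lines out of a verbose boot log like the one above, a filter along these lines may help. It is a sketch: ata_trouble is a hypothetical helper name, the pattern is guessed from the messages in this log, and not every hit indicates a real fault (the "probe allocation failed" lines, for instance, may be harmless).

```shell
#!/bin/sh
# ata_trouble is a hypothetical helper: it keeps only the ata/ATAPI lines of a
# boot log (read from stdin) that mention a timeout, a failure, or a reset.
ata_trouble() {
    grep -E '(ata[0-9]+(-(master|slave))?|\(null\)): .*(timeout|failed|resetting)'
}

# Example with two lines taken from the log above; only the second is kept.
printf '%s\n' \
    'Jul 23 23:16:47 nollie /kernel: ata0: at 0x1f0 irq 14 on atapci0' \
    'Jul 23 23:16:48 nollie /kernel: (null): MODE_SENSE_BIG command timeout - resetting' |
    ata_trouble
```

On a live system the input would more likely come from a saved log file, for example `ata_trouble < /var/log/messages`.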



-- Chris Hardie -----------------------------
----- mailto:chris@summersault.com ----------
-------- http://www.summersault.com/chris/ --
Comment 4 Joakim Henriksson 2000-07-24 08:15:03 UTC
> I'm sorry, I didn't have kernel debugging set up at that time under the
> new OS version.  However, I've since installed a kernel with debugging
> symbols, so I'm ready for it the next time (at 3:30 AM, sigh).
> 
> Note that this is different from my original error, "ffs_clusteralloc: map
> mismatch".
> 
> I saw something in the mail archives about disabling a burst setting in
> the PCI BIOS fixing map mismatch problems with SCSI.  I don't really know
> what that might mean, but thought I'd mention it.

They might mean turning off the PCI burst transfer mode. This will reduce
the performance of the PCI bus.

> Sheldon had asked for some output from boot -v.  It's below, in all its
> expansiveness (sorry if that's more than you wanted).  

It's easier to trim if there is too much than to reconstruct if there is too
little ;)

> I hope this helps - I'm buying beer/pizza/both for whomever can help me
> make this go away :)

Mr Ian Dowse looked at my crashdumps and saw that the codepath was through the
reallocation code. He suggested trying to turn that off by doing (as superuser):

sysctl -w vfs.ffs.doreallocblks=0

I haven't seen any crashes since I turned it off. But that really doesn't say
anything, since this problem is so hard to catch. Try it and mail the list
whether it works or not, since both data points are valuable.
-- 
regards/ Joakim
Comment 5 Sheldon Hearn freebsd_committer freebsd_triage 2000-07-24 10:06:52 UTC
Responsible Changed
From-To: freebsd-bugs->sos

This is looking more and more like something Soren should take 
a look at.
Comment 6 chris 2000-07-24 15:28:24 UTC
On Mon, 24 Jul 2000, Joakim Henriksson wrote:

> Mr Ian Dowse looked at my crashdumps and saw that the codepath was through the
> reallocation code. He suggested trying to turn that off by doing (as superuser):
> 
> sysctl -w vfs.ffs.doreallocblks=0
> 
> I haven't seen any crashes since i turned it off. But that really doesn't say 
> anything since this problem is so hard to catch. Try it and mail to the list 
> whether it works or not, since both datapoints are valuable.

I've executed that command and will let you know of any noticeable results
one way or the other.

The box crashed again this morning (before I ran the above command!), and
I was able to get a crash dump run through gdb, printed below.

I'm happy to let anyone who wants to poke around, if need be.

Chris

bash-2.03# gdb -k
...
(kgdb) symbol-file kernel.debug
Reading symbols from kernel.debug...done.
(kgdb) exec-file /var/crash/kernel.5
(kgdb) core-file /var/crash/vmcore.5
IdlePTD 3723264
initial pcb at 306580
panicstr: ffs_clusteralloc: map mismatch
panic messages:
---
panic: ffs_clusteralloc: map mismatch

syncing disks... 146 94 62 32 11 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6
giving up on 1 buffers
Uptime: 2h50m11s

dumping to dev #da/0x20001, offset 542848
dump 254 253 252 251 250 249 248 247 246 245 244 243 242 241 240 239 238 237 236 235 234 233 232 231 230 229 228 227 226 225 224 223 222 221 220 219 218 217 216 215 214 213 212 211 210 209 208 207 206 205 204 203 202 201 200 199 198 197 196 195 194 193 192 191 190 189 188 187 186 185 184 183 182 181 180 179 178 177 176 175 174 173 172 171 170 169 168 167 166 165 164 163 162 161 160 159 158 157 156 155 154 153 152 151 150 149 148 147 146 145 144 143 142 141 140 139 138 137 136 135 134 133 132 131 130 129 128 127 126 125 124 123 122 121 120 119 118 117 116 115 114 113 112 111 110 109 108 107 106 105 104 103 102 101 100 99 98 97 96 95 94 93 92 91 90 89 88 87 86 85 84 83 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
---
#0  boot (howto=256) at ../../kern/kern_shutdown.c:304
304                     dumppcb.pcb_cr3 = rcr3();
(kgdb) where
#0  boot (howto=256) at ../../kern/kern_shutdown.c:304
#1  0xc01765a0 in poweroff_wait (junk=0xc02cc9e0, howto=228)
    at ../../kern/kern_shutdown.c:554
#2  0xc023c855 in ffs_clusteralloc (ip=0xc1112500, cg=228, bpref=7471112, len=5)
    at ../../ufs/ffs/ffs_alloc.c:1182
#3  0xc023bafb in ffs_hashalloc (ip=0xc1112500, cg=228, pref=7471112, size=5, 
    allocator=0xc023c634 <ffs_clusteralloc>) at ../../ufs/ffs/ffs_alloc.c:768
#4  0xc023b4b3 in ffs_reallocblks (ap=0xccdace04) at ../../ufs/ffs/ffs_alloc.c:442
#5  0xc019bdea in cluster_write (bp=0xc60c5320, filesize=40960) at vnode_if.h:1056
#6  0xc0241cc2 in ffs_write (ap=0xccdacea0) at ../../ufs/ufs/ufs_readwrite.c:495
#7  0xc01a6492 in vn_write (fp=0xc110f540, uio=0xccdaceec, cred=0xc0f6e400, 
    flags=0, p=0xcccf9ba0) at vnode_if.h:363
#8  0xc0183517 in dofilewrite (p=0xcccf9ba0, fp=0xc110f540, fd=1, buf=0x805a000, 
    nbyte=8192, offset=-1, flags=0) at ../../sys/file.h:156
#9  0xc018341b in write (p=0xcccf9ba0, uap=0xccdacf80)
    at ../../kern/sys_generic.c:298
#10 0xc02867ae in syscall (frame={tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = 1, 
      tf_esi = 0, tf_ebp = -1077936936, tf_isp = -858075180, tf_ebx = 8192, 
      tf_edx = 2, tf_ecx = 13, tf_eax = 4, tf_trapno = 12, tf_err = 2, 
      tf_eip = 134556056, tf_cs = 31, tf_eflags = 518, tf_esp = -1077937076, 
      tf_ss = 47}) at ../../i386/i386/trap.c:1073
#11 0xc02787b6 in Xint0x80_syscall ()
#12 0x8048836 in ?? ()
#13 0x80482ad in ?? ()
#14 0x80480f9 in ?? ()


-- Chris Hardie -----------------------------
----- mailto:chris@summersault.com ----------
-------- http://www.summersault.com/chris/ --
Comment 7 chris 2000-07-25 14:23:56 UTC
On Mon, 24 Jul 2000, Joakim Henriksson wrote:

> Mr Ian Dowse looked at my crashdumps and saw that the codepath was through the 
> reallocation code. He suggested trying to turn that off by doing (as superuser):
> 
> sysctl -w vfs.ffs.doreallocblks=0
> 
> I haven't seen any crashes since i turned it off. But that really doesn't say 
> anything since this problem is so hard to catch. Try it and mail to the list 
> whether it works or not, since both datapoints are valuable.

So our machine crashed again this morning, despite the execution of the
above command earlier yesterday.  However, the panic message was
different:

Jul 25 02:10:43 nollie savecore: reboot after panic: ffs_blkfree: freeing free frag

Does this lead us anywhere?

The other thing to note is that the crash happened at almost exactly the
same moment as it did yesterday:

bash-2.03# last -2 reboot
reboot           ~                         Tue Jul 25 02:11 
reboot           ~                         Mon Jul 24 02:12 

It's as if one of the periodic scripts (which start running at 01:59) is
running some command or visiting some node that set off the fireworks.

I hope this helps. (If any of you want off this thread, let me know).
Chris

-- Chris Hardie -----------------------------
----- mailto:chris@summersault.com ----------
-------- http://www.summersault.com/chris/ --
Comment 8 chris 2000-07-29 07:04:25 UTC
Hi folks.  I'm interested in getting some sort of prognosis for this bug
report.  I haven't participated in this process before to this extent, and
so I don't have a good sense of what I should expect in terms of response
times, focused attention on the problem, and actual fixes.

While I do not mean to imply that my problem is necessarily your problem,
I do know that I need to take some sort of action soon - I just can't keep
getting up in the middle of the night to come in and reboot our production
server.  If I have to replace the hardware, I obviously need to do that
soon (it was bought new almost 30 days ago).  If there's some sort of
patch or alternative debugging technique to try, you have my full
cooperation.

I consider you folks to be the best qualified so far to say whether or not
there's a chance of finding a solution in a reasonable amount of time, and
so I'm looking to you for advice -- if not based on your knowledge of this
particular problem, then at least based on your previous experiences with
solving high-priority, critical-severity bugs submitted to gnats.

Thanks so much!
Chris

-- Chris Hardie -----------------------------
----- mailto:chris@summersault.com ----------
-------- http://www.summersault.com/chris/ --
Comment 9 Sheldon Hearn 2000-07-31 10:56:37 UTC
On Sat, 29 Jul 2000 01:04:25 EST, Chris Hardie wrote:

> Hi folks.  I'm interested in getting some sort of prognosis for this
> bug report.  I haven't participated in this process before to this
> extent, and so I don't have a good sense of what I should expect in
> terms of response times, focused attention on the problem, and actual
> fixes.

These things are very hard to pinpoint generally.  They depend very much
on the problem, the area of FreeBSD that it affects, the time available
to the folks responsible for that area, and often even the phase of the
moon.

Because both you and Joakim use the same ATA chipset, and because this
panic() seems otherwise uncommon, I assigned the PR to Soren Schmidt,
the ATA maintainer.  Soren is usually quite quick to look into
well-documented problem reports.

Because the panic() occurs in the ffs code, I've also asked Brian
Feldman to take a look.  Brian is often very quick to take a look at
things like this.

The only thing you can really do to speed things up is to make sure that
you answer any questions that they may have as quickly as possible.

Ciao,
Sheldon.
Comment 10 sos 2000-07-31 12:06:10 UTC
It seems Sheldon Hearn wrote:
> 
> 
> On Sat, 29 Jul 2000 01:04:25 EST, Chris Hardie wrote:
> 
> > Hi folks.  I'm interested in getting some sort of prognosis for this
> > bug report.  I haven't participated in this process before to this
> > extent, and so I don't have a good sense of what I should expect in
> > terms of response times, focused attention on the problem, and actual
> > fixes.
> 
> These things are very hard to pinpoint generally.  They depend very much
> on the problem, the area of FreeBSD that it affects, the time available
> to the folks responsible for that area, and often even the phase of the
> moon.
> 
> Because both you and Joakim use the same ATA chipset, and because this
> panic() seems otherwise uncommon, I assigned the PR to Soren Schmidt,
> the ATA maintainer.  Soren is usually quite quick to look into
> well-documented problem reports.

I just read the PR, and one of the systems is SCSI based from what I
can read from the dmesg, so it's hardly ATA related...
At any rate I can't reproduce it on any of the ATA HW I have here...

Please assign this to the proper persons....

-Søren
Comment 11 Sheldon Hearn freebsd_committer freebsd_triage 2000-07-31 12:47:14 UTC
Responsible Changed
From-To: sos->freebsd-bugs

Soren is confident that this isn't ATA-related.  Brian Feldman 
has been asked to take a look.  Depending on what he finds, 
this'll probably end up assigned to green, mckusick or dillon.
Comment 12 chris 2000-08-02 06:04:33 UTC
Given the recent prognosis of "this could take a while", we need to take
action to make our system more stable, so it is my intention to move from
our SCSI drive to an IDE drive sometime in the next few days.

I realize this limits opportunities to debug the problem on our system, so
I wanted to make sure you didn't need anything else from us before we made
the switch.

I also wanted to note that we've been able to isolate the panics/crashes
to happening 99% of the time during the run of a particular software
package which does produce significant disk activity.  If examining
anything about this software would be useful, please let me know (again,
sooner rather than later).

Thanks for all your help thus far.
Chris
Comment 13 Joakim Henriksson 2000-08-16 15:27:29 UTC
Hello, I have some further datapoints on the bugs in the fs code that several
persons have been bitten by.

I have a new panic that is also shared by another person. Mr Ian Dowse identified
that the reallocation code was in the code path of the crash. I've since then
tried to run with the reallocation code turned off.

Instead of the "ffs_clusteralloc: map mismatch" I now get this panic message:
"panic: ffs_blkfree: freeing free frag". It would seem that the bug is somewhere
other than the reallocation code.

Here is some info from the crash dump (available on request, together with the
older one). Hopefully it will help someone and hopefully someone will look into
it, please?

Oh yeah, I'm running STABLE as of 2000-08-15.

IdlePTD 3665920
initial pcb at 2f7060
panicstr: ffs_blkfree: freeing free frag
panic messages:
---
panic: ffs_blkfree: freeing free frag


#0  boot (howto=256) at ../../kern/kern_shutdown.c:302
302                     dumppcb.pcb_cr3 = rcr3();
(kgdb) bt
#0  boot (howto=256) at ../../kern/kern_shutdown.c:302
#1  0xc014aafc in poweroff_wait (junk=0xc0297320, howto=-1071025440)
    at ../../kern/kern_shutdown.c:552
#2  0xc01fdfc7 in ffs_blkfree (ip=0xc87c3c48, bno=3239, size=1024)
    at ../../ufs/ffs/ffs_alloc.c:1375
#3  0xc0202132 in handle_workitem_freeblocks (freeblks=0xc1004400)
    at ../../ufs/ffs/ffs_softdep.c:1981
#4  0xc0201b58 in softdep_setup_freeblocks (ip=0xc103d400, length=0)
    at ../../ufs/ffs/ffs_softdep.c:1677
#5  0xc01ff72e in ffs_truncate (vp=0xc9389c80, length=0, flags=0, cred=0x0, 
    p=0xc87b5780) at ../../ufs/ffs/ffs_inode.c:195
#6  0xc0209d2e in ufs_inactive (ap=0xc87c3f04) at ../../ufs/ufs/ufs_inode.c:84
#7  0xc020ee19 in ufs_vnoperate (ap=0xc87c3f04)
    at ../../ufs/ufs/ufs_vnops.c:2285
#8  0xc01756d2 in vput (vp=0xc9389c80) at vnode_if.h:794
#9  0xc0202ec8 in handle_workitem_remove (dirrem=0xc0f2dc00)
    at ../../ufs/ffs/ffs_softdep.c:2668
#10 0xc02005b9 in softdep_process_worklist (matchmnt=0x0)
    at ../../ufs/ffs/ffs_softdep.c:557
#11 0xc0175003 in sched_sync () at ../../kern/vfs_subr.c:1034
#12 0xc026152c in fork_trampoline ()
Cannot access memory at address 0x8000.
(kgdb) up 2
#2  0xc01fdfc7 in ffs_blkfree (ip=0xc87c3c48, bno=3239, size=1024)
    at ../../ufs/ffs/ffs_alloc.c:1375
1375                                    panic("ffs_blkfree: freeing free frag");
(kgdb) print frags
$2 = 0
(kgdb) print blksfree[(bno+i)/8]
$5 = 255 'ÿ'
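
For anyone reading the two values above: the panic fires when a fragment being freed is already marked free in the cylinder group's free map, where a set bit means "free". A byte value of 255 would then mean that all eight fragments covered by that byte were already free. Below is a minimal sketch of that kind of check, not the code from ffs_alloc.c. The names `frag_isfree`, `frag_setfree` and `free_frags` are hypothetical, and the sketch returns an error where the kernel would panic.

```c
#include <assert.h>

#define NBBY 8  /* bits per byte */

/* Hypothetical helper: a set bit in the map means the fragment is free. */
static int frag_isfree(const unsigned char *map, long n)
{
    return (map[n / NBBY] >> (n % NBBY)) & 1;
}

/* Hypothetical helper: mark fragment n as free. */
static void frag_setfree(unsigned char *map, long n)
{
    map[n / NBBY] |= (unsigned char)(1u << (n % NBBY));
}

/*
 * Mark `frags` fragments starting at `bno` as free.
 * Returns 0 on success, or -1 as soon as it meets a fragment that is
 * already free (the condition reported as "freeing free frag").
 * Fragments handled before the failure stay marked free; the kernel
 * panics at that point, so it never has to undo them.
 */
static int free_frags(unsigned char *map, long bno, int frags)
{
    assert(bno >= 0 && frags >= 0);
    for (int i = 0; i < frags; i++) {
        if (frag_isfree(map, bno + i))
            return -1;
        frag_setfree(map, bno + i);
    }
    return 0;
}
```

Calling `free_frags(map, 3239, 1)` twice on a zeroed map returns 0 the first time and -1 the second time, which is the double-free pattern the panic guards against.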


-- 
regards/ Joakim




Comment 14 Sheldon Hearn freebsd_committer freebsd_triage 2000-08-16 16:13:04 UTC
Responsible Changed
From-To: freebsd-bugs->mckusick

The last backtrace in the audit trail (the one with reallocation 
turned off) makes this look very much like a soft updates 
problem. 

Kirk, I realize that the audit trail is long, but the  
contributors have tried hard to provide debugging info for you. 
Could you take a look? :-)
Comment 15 chris 2001-04-01 14:02:09 UTC
Greetings.  We just had another one of these crashes, this time with an
IDE drive, which hadn't happened to us before.  I saw that this PR was
still open and thought I'd submit the datapoints I got from the crash.
The low uptime was from a manual reboot; this machine has been stable
otherwise.  Please advise.

Chris

chris@nollie chris> uname -a
FreeBSD nollie.summersault.com 4.2-RELEASE FreeBSD 4.2-RELEASE #0: Sat Jan 20 07:45:58 EST 2001 root@nollie.summersault.com:/usr/src/sys/compile/NOLLIE.012001  i386


IdlePTD 4485120
initial pcb at 3a5340
panicstr: ffs_clusteralloc: map mismatch
panic messages:
---
panic: ffs_clusteralloc: map mismatch

syncing disks... 151 136 97 65 28 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
giving up on 1 buffers
Uptime: 4d21h9m8s

dumping to dev #ad/0x20001, offset 542720
dump ata0: resetting devices .. done
255 254 253 252 251 250 249 248 247 246 245 244 243 242 241 240 239 238
237 236 235 234 233 232 231 230 229 228 227 226 225 224 223 222 221 220
219 218 217 216 215 214 213 212 211 210 209 208 207 206 205 204 203 202
201 200 199 198 197 196 195 194 193 192 191 190 189 188 187 186 185 184
183 182 181 180 179 178 177 176 175 174 173 172 171 170 169 168 167 166
165 164 163 162 161 160 159 158 157 156 155 154 153 152 151 150 149 148
147 146 145 144 143 142 141 140 139 138 137 136 135 134 133 132 131 130
129 128 127 126 125 124 123 122 121 120 119 118 117 116 115 114 113 112
111 110 109 108 107 106 105 104 103 102 101 100 99 98 97 96 95 94 93 92 91
90 89 88 87 86 85 84 83 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68 67 66
65 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41
40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1
---
#0  dumpsys () at ../../kern/kern_shutdown.c:469
469             if (dumping++) {
(kgdb) where
#0  dumpsys () at ../../kern/kern_shutdown.c:469
#1  0xc01a440b in boot (howto=256) at ../../kern/kern_shutdown.c:309
#2  0xc01a4788 in poweroff_wait (junk=0xc0357480, howto=89) at ../../kern/kern_shutdown.c:556
#3  0xc0298ac5 in ffs_clusteralloc (ip=0xc1771000, cg=89, bpref=2916360, len=2) at ../../ufs/ffs/ffs_alloc.c:1190
#4  0xc0297e03 in ffs_hashalloc (ip=0xc1771000, cg=89, pref=2916360, size=2, allocator=0xc02988a0 <ffs_clusteralloc>)
    at ../../ufs/ffs/ffs_alloc.c:768
#5  0xc02977bb in ffs_reallocblks (ap=0xce144dc0) at ../../ufs/ffs/ffs_alloc.c:442
#6  0xc01cbe02 in cluster_write (bp=0xc6731958, filesize=16384, seqcount=3) at vnode_if.h:1056
#7  0xc02a31ea in ffs_write (ap=0xce144e6c) at ../../ufs/ufs/ufs_readwrite.c:500
#8  0xc01d6584 in vn_write (fp=0xc176bf40, uio=0xce144edc, cred=0xc1c26600, flags=0, p=0xce141e00) at vnode_if.h:363
#9  0xc01b2051 in dofilewrite (p=0xce141e00, fp=0xc176bf40, fd=8, buf=0x81d000, nbyte=8192, offset=-1, flags=0)
    at ../../sys/file.h:159
#10 0xc01b1f37 in write (p=0xce141e00, uap=0xce144f80) at ../../kern/sys_generic.c:310
#11 0xc0306ff5 in syscall2 (frame={tf_fs = 7995439, tf_es = 8192047, tf_ds = -1078001617, tf_edi = 1653676,
      tf_esi = 8507392, tf_ebp = -1077998688, tf_isp = -837529644, tf_ebx = 1653676, tf_edx = 2777024, tf_ecx = 0,
      tf_eax = 4, tf_trapno = 0, tf_err = 7, tf_eip = 653365, tf_cs = 31, tf_eflags = 518, tf_esp = -1077998708, tf_ss = 47})
    at ../../i386/i386/trap.c:1150
#12 0x9f835 in ?? ()
#13 0x9d8d6 in ?? ()
#14 0x9d86a in ?? ()
#15 0x9aa10 in ?? ()
#16 0x797bd in ?? ()
#17 0x49f6f in ?? ()
#18 0x4a72a in ?? ()
#19 0x4a595 in ?? ()
#20 0x4a502 in ?? ()
#21 0x4afa3 in ?? ()
#22 0x4a9f7 in ?? ()
#23 0x4a567 in ?? ()
#24 0x4a502 in ?? ()
#25 0x4ac5c in ?? ()
#26 0x4a935 in ?? ()
#27 0x4a567 in ?? ()
#28 0x4a502 in ?? ()
#29 0x4afa3 in ?? ()
#30 0x4a9f7 in ?? ()
#31 0x4a567 in ?? ()
#32 0x4a502 in ?? ()
#33 0x4afa3 in ?? ()
#34 0x4a9f7 in ?? ()
#35 0x4a567 in ?? ()
#36 0x4a502 in ?? ()
#37 0x4afa3 in ?? ()
#38 0x4a9f7 in ?? ()
#39 0x4a567 in ?? ()
#40 0x4a502 in ?? ()
#41 0x4afa3 in ?? ()
#42 0x4a9f7 in ?? ()
#43 0x4a567 in ?? ()
#44 0x4a502 in ?? ()
#45 0x4b234 in ?? ()
#46 0x4a97d in ?? ()
#47 0x4a567 in ?? ()
#48 0x4a502 in ?? ()
#49 0x4ac5c in ?? ()
#50 0x4a935 in ?? ()
#51 0x4a567 in ?? ()
#52 0x4a502 in ?? ()
#53 0x4b234 in ?? ()
#54 0x4a97d in ?? ()
#55 0x4a567 in ?? ()
#56 0x4a502 in ?? ()
#57 0x4ac5c in ?? ()
#58 0x4a935 in ?? ()
#59 0x4a567 in ?? ()
#60 0x4a502 in ?? ()
#61 0x4a3fc in ?? ()
#62 0x4a0aa in ?? ()
#63 0x7f28 in ?? ()
#64 0x6a1f in ?? ()
#65 0x4338 in ?? ()
#66 0x3df6 in ?? ()
#67 0x2efe in ?? ()
#68 0x278a in ?? ()
#69 0x110b in ?? ()
#70 0x107e in ?? ()
Comment 16 Kirk McKusick freebsd_committer freebsd_triage 2002-02-11 00:20:05 UTC
State Changed
From-To: open->closed

This problem is believed to be fixed by changes in the buffer cache 
code in recent months.
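
For readers who land on this PR because of the panic string: as far as can be told from the traces, "ffs_clusteralloc: map mismatch" is a consistency check between two views of the same free space, a per-block cluster map and the per-fragment free map. The sketch below only illustrates what such a check detects; it does not show the root cause, which the closing note attributes to the buffer cache. It is not the code from ffs_alloc.c. The names `bit_isset`, `block_isfree` and `find_cluster`, the return codes, and the layout of both maps are assumptions made for illustration.

```c
#include <assert.h>

#define NBBY 8  /* bits per byte */

/* Hypothetical helper: test bit n of a bitmap. */
static int bit_isset(const unsigned char *map, long n)
{
    return (map[n / NBBY] >> (n % NBBY)) & 1;
}

/* A block counts as free only if every one of its fpb fragments is free. */
static int block_isfree(const unsigned char *fragmap, long blk, int fpb)
{
    for (int i = 0; i < fpb; i++)
        if (!bit_isset(fragmap, blk * fpb + i))
            return 0;
    return 1;
}

/*
 * Find the first run of `len` blocks that the cluster map marks free,
 * then cross-check every block of that run against the fragment map.
 * Returns the first block of the run, -1 if no run exists, or -2 if the
 * two maps disagree (where the kernel would panic with "map mismatch").
 */
static long find_cluster(const unsigned char *clustermap,
                         const unsigned char *fragmap,
                         long nblocks, int len, int fpb)
{
    int run = 0;

    assert(len > 0 && fpb > 0);
    for (long blk = 0; blk < nblocks; blk++) {
        run = bit_isset(clustermap, blk) ? run + 1 : 0;
        if (run == len) {
            long start = blk - len + 1;
            for (long b = start; b <= blk; b++)
                if (!block_isfree(fragmap, b, fpb))
                    return -2;
            return start;
        }
    }
    return -1;
}
```

With 8 fragments per block, a cluster map that marks blocks 2 and 3 free, and a fragment map in which one fragment of block 3 is still allocated, `find_cluster` returns -2: the summary claims a cluster that the detailed map does not back up.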