Bug 21354

Summary: FreeBSD4.1 + SMP + vinum = crashes
Product: Base System Reporter: jason-freebsd-bugs <jason-freebsd-bugs>
Component: miscAssignee: Greg Lehey <grog>
Status: Closed FIXED    
Severity: Affects Only Me    
Priority: Normal    
Version: 4.1-RELEASE   
Hardware: Any   
OS: Any   

Description jason-freebsd-bugs 2000-09-18 13:40:01 UTC
I'm running FreeBSD 4.1-RELEASE and a kernel compiled with SMP support.
I'm also using vinum - I have four IDE disks on two Promise UDMA66 IDE
cards in a Raid-5 array.  When rsync'ing /usr to the raid array, the
machine consistently locks up and reboots.  When I boot a non-SMP
(but otherwise identical) kernel, the problem goes away.

Fix: 

Disabling SMP seems to fix the problem.
How-To-Repeat: The problem is seen with any large transfer (or many small transfers)
to a vinum raid5 array of IDE disks under an SMP kernel.
Comment 1 jason-freebsd-bugs 2000-09-27 13:16:29 UTC
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


Supporting info:

Again, I'm running stock FreeBSD 4.1-RELEASE with a recompiled kernel (the
only major changes from GENERIC are the removal of unnecesary hardware
drivers and the addition of SMP).  The hardware is an Intel N440BX
motherboard, two PIII's, 256M RAM, two Promise UDMA66 IDE cards and four
46G UDMA66 IDE drives.

I can reliably create, init, and newfs the raid array, and then reliably
crash it while trying to rsync /usr (~600M) to the array.

When I run a UP kernel, I can usually finish the rsync, but not always.  
(Ie, in contrast to the original report, I have seen some problems even
with a uni-processor kernel.)

The failure mode is a kernel panic followed by a reboot.


 From `vinum list`:
- ------------------------------------------------------------------
4 drives:
D c                     State: up       Device /dev/ad4s1e      Avail: 1321/42942 MB (3%)
D d                     State: up       Device /dev/ad6s1f      Avail: 1321/42942 MB (3%)
D e                     State: up       Device /dev/ad8s1e      Avail: 1321/42942 MB (3%)
D f                     State: up       Device /dev/ad10s1e     Avail: 1321/42942 MB (3%)

1 volumes:
V cavern                State: up       Plexes:       1 Size:        121 GB

1 plexes:
P cavern.p0          R5 State: up       Subdisks:     4 Size:        121 GB

4 subdisks:
S cavern.p0.s0          State: up       PO:        0  B Size:         40 GB
S cavern.p0.s1          State: up       PO:      256 kB Size:         40 GB
S cavern.p0.s2          State: up       PO:      512 kB Size:         40 GB
S cavern.p0.s3          State: up       PO:      768 kB Size:         40 GB
- ------------------------------------------------------------------


 From /var/log/vinum_history:
- ------------------------------------------------------------------
14 Sep 2000 22:04:43.092292 *** vinum started ***
14 Sep 2000 22:04:43.101914 resetconfig 
14 Sep 2000 22:04:46.717013 *** Created devices ***
14 Sep 2000 22:04:43.101914 resetconfig 
14 Sep 2000 22:04:46.717013 *** Created devices ***
14 Sep 2000 22:05:02.215521 *** vinum started ***
14 Sep 2000 22:05:02.216099 create -f /etc/vinum.conf 
drive c device /dev/ad4s1e
drive d device /dev/ad6s1f
drive e device /dev/ad8s1e
drive f device /dev/ad10s1e
volume cavern
        plex org raid5 256k
                sd length 42620782k drive c
                sd length 42620782k drive d
                sd length 42620782k drive e
                sd length 42620782k drive f
14 Sep 2000 22:05:02.251302 *** Created devices ***
14 Sep 2000 22:05:07.970622 *** vinum started ***
14 Sep 2000 22:05:07.971189 init -w 
14 Sep 2000 22:05:11.535856 *** vinum started ***
14 Sep 2000 22:05:13.088690 l 
14 Sep 2000 22:05:36.347080 *** vinum started ***
14 Sep 2000 22:05:36.347669 init -w cavern 
14 Sep 2000 22:05:56.197179 *** vinum started ***
14 Sep 2000 22:05:56.197755 init -w cavern.p0 
23 Sep 2000 03:47:35.715201 *** vinum started ***
23 Sep 2000 03:47:35.746544 resetconfig 
23 Sep 2000 03:47:40.124410 *** Created devices ***
23 Sep 2000 03:47:35.746544 resetconfig 
23 Sep 2000 03:47:40.124410 *** Created devices ***
23 Sep 2000 03:47:47.220455 *** vinum started ***
23 Sep 2000 03:47:47.221278 create -f /etc/vinum.conf 
drive c device /dev/ad4s1e
drive d device /dev/ad6s1f
drive e device /dev/ad8s1e
drive f device /dev/ad10s1e
volume cavern
        plex org raid5 256k
                sd length 42620782k drive c
                sd length 42620782k drive d
                sd length 42620782k drive e
                sd length 42620782k drive f
23 Sep 2000 03:47:47.266729 *** Created devices ***
23 Sep 2000 03:47:51.677338 *** vinum started ***
23 Sep 2000 03:47:57.945374 init -w cavern.p0 
24 Sep 2000 05:32:33.074229 *** vinum started ***
24 Sep 2000 05:32:33.101023 list 
- ------------------------------------------------------------------

 From /var/log/messages:
- ------------------------------------------------------------------
Sep 23 03:20:07 walter /kernel: ad4: 43967MB <Maxtor 94610U6> [89331/16/63] at ata2-master using UDMA66
Sep 23 03:20:07 walter /kernel: ad6: 43967MB <Maxtor 94610U6> [89331/16/63] at ata3-master using UDMA66
Sep 23 03:20:07 walter /kernel: ad8: 43967MB <Maxtor 94610U6> [89331/16/63] at ata4-master using UDMA66
Sep 23 03:20:07 walter /kernel: ad10: 43967MB <Maxtor 94610U6> [89331/16/63] at ata5-master using UDMA66
Sep 23 03:20:07 walter /kernel: Mounting root from ufs:/dev/ad4s1a
Sep 23 03:20:07 walter /kernel: WARNING: / was not properly dismounted
Sep 23 03:20:07 walter /kernel: vinum: loaded
Sep 23 03:20:07 walter /kernel: vinum: reading configuration from /dev/ad8s1e
Sep 23 03:20:07 walter /kernel: vinum: updating configuration from /dev/ad10s1e
Sep 23 03:20:07 walter /kernel: vinum: updating configuration from /dev/ad6s1f
Sep 23 03:20:07 walter /kernel: vinum: updating configuration from /dev/ad4s1e
Sep 23 03:20:07 walter savecore: reboot after panic: page fault
Sep 23 03:20:07 walter savecore: no dump, not enough free space on device (233854 available, need 264560)
Sep 23 03:47:40 walter /kernel: vinum: CONFIGURATION OBLITERATED
Sep 23 03:47:47 walter /kernel: vinum: drive c is up
Sep 23 03:47:47 walter /kernel: vinum: drive d is up
Sep 23 03:47:47 walter /kernel: vinum: drive e is up
Sep 23 03:47:47 walter /kernel: vinum: drive f is up
Sep 23 03:47:47 walter /kernel: vinum: removing 660 blocks of partial stripe at the end of cavern.p0
Sep 23 03:47:57 walter /kernel: vinum: cavern.p0.s1 is initializing by force
Sep 23 03:47:57 walter /kernel: vinum: cavern.p0 is initializing
Sep 23 03:47:57 walter /kernel: vinum: cavern.p0.s0 is initializing by force
Sep 23 03:47:57 walter /kernel: vinum: cavern.p0.s2 is initializing by force
Sep 23 03:47:58 walter /kernel: vinum: cavern.p0.s3 is initializing by force
Sep 23 04:50:10 walter /kernel: vinum: cavern.p0.s2 is initialized by force
Sep 23 04:50:10 walter /kernel: vinum: cavern.p0.s2 is initialized
Sep 23 04:50:20 walter /kernel: vinum: cavern.p0.s0 is initialized by force
Sep 23 04:50:20 walter /kernel: vinum: cavern.p0.s0 is initialized
Sep 23 04:58:52 walter /kernel: vinum: cavern.p0.s1 is initialized by force
Sep 23 04:58:52 walter /kernel: vinum: cavern.p0.s1 is initialized
Sep 23 05:08:15 walter /kernel: vinum: cavern.p0.s3 is initialized by force
Sep 23 05:08:15 walter /kernel: vinum: cavern.p0.s0 is up
Sep 23 05:08:15 walter /kernel: vinum: cavern.p0.s1 is up
Sep 23 05:08:15 walter /kernel: vinum: cavern.p0.s2 is up
Sep 23 05:08:15 walter /kernel: vinum: cavern.p0.s3 is up
Sep 23 05:08:15 walter /kernel: vinum: cavern.p0 is up
Sep 23 05:08:15 walter /kernel: vinum: cavern is up
Sep 23 05:08:15 walter /kernel: vinum: cavern.p0.s3 is up
- ------------------------------------------------------------------
Sep 24 05:24:48 walter /kernel: Copyright (c) 1992-2000 The FreeBSD Project.
Sep 24 05:24:48 walter /kernel: Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
Sep 24 05:24:48 walter /kernel: ad4: 43967MB <Maxtor 94610U6> [89331/16/63] at ata2-master using UDMA66
Sep 24 05:24:48 walter /kernel: ad6: 43967MB <Maxtor 94610U6> [89331/16/63] at ata3-master using UDMA66
Sep 24 05:24:48 walter /kernel: ad8: 43967MB <Maxtor 94610U6> [89331/16/63] at ata4-master using UDMA66
Sep 24 05:24:48 walter /kernel: ad10: 43967MB <Maxtor 94610U6> [89331/16/63] at ata5-master using UDMA66
Sep 24 05:24:48 walter /kernel: Mounting root from ufs:/dev/ad4s1a
Sep 24 05:24:48 walter /kernel: WARNING: / was not properly dismounted
Sep 24 05:24:48 walter /kernel: vinum: loaded
Sep 24 05:24:48 walter /kernel: vinum: reading configuration from /dev/ad10s1e
Sep 24 05:24:48 walter /kernel: vinum: updating configuration from /dev/ad8s1e
Sep 24 05:24:48 walter /kernel: vinum: updating configuration from /dev/ad4s1e
Sep 24 05:24:48 walter /kernel: vinum: updating configuration from /dev/ad6s1f
Sep 24 05:24:48 walter savecore: reboot after panic: page fault
Sep 24 05:24:48 walter savecore: no dump, not enough free space on device (233668 available, need 264560)
Sep 24 05:38:53 walter /kernel: WARNING: R/W mount of /f1 denied.  Filesystem is not clean - run fsck
- ------------------------------------------------------------------

- From my attempt to debug the kernel dump (a buffer overrun?!):
- ------------------------------------------------------------------
(kgdb) where
#0  boot (howto=0x104) at ../../kern/kern_shutdown.c:302
#1  0xc015c9b8 in poweroff_wait (junk=0xc02d610f, howto=0x0) at ../../kern/kern_shutdown.c:552
#2  0xc0293e1f in trap_fatal (frame=0xff806ccc, eva=0x30) at ../../i386/i386/trap.c:927
#3  0xc0293ab5 in trap_pfault (frame=0xff806ccc, usermode=0x0, eva=0x30) at ../../i386/i386/trap.c:820
#4  0xc0293663 in trap (frame={tf_fs = 0xff800018, tf_es = 0xccc90010, tf_ds = 0x10, tf_edi = 0x0, 
      tf_esi = 0xccc94700, tf_ebp = 0xff806d10, tf_isp = 0xff806cf8, tf_ebx = 0xc02f9a1c, tf_edx = 0x68030040, 
      tf_ecx = 0xe3af2980, tf_eax = 0x0, tf_trapno = 0xc, tf_err = 0x0, tf_eip = 0xc02463dd, tf_cs = 0x8, 
      tf_eflags = 0x10202, tf_esp = 0xccc94700, tf_ss = 0xff806d30}) at ../../i386/i386/trap.c:426
#5  0xc02463dd in acquire_lock (lk=0xc02f9a1c) at machine/globals.h:114
#6  0xc024a0cc in softdep_update_inodeblock (ip=0xccc94700, bp=0xd1e97ba8, waitfor=0x0)
    at ../../ufs/ffs/ffs_softdep.c:3585
#7  0xc02456c9 in ffs_update (vp=0xe3af2980, waitfor=0x0) at ../../ufs/ffs/ffs_inode.c:105
#8  0xc024d39d in ffs_sync (mp=0xccb6c000, waitfor=0x2, cred=0xc0a99900, p=0xc0324240)
    at ../../ufs/ffs/ffs_vfsops.c:987
#9  0xc0189cbf in sync (p=0xc0324240, uap=0x0) at ../../kern/vfs_syscalls.c:544
#10 0xc015c3d3 in boot (howto=0x100) at ../../kern/kern_shutdown.c:224
#11 0xc015c9b8 in poweroff_wait (junk=0xc02d610f, howto=0x0) at ../../kern/kern_shutdown.c:552
#12 0xc0293e1f in trap_fatal (frame=0xff806ea8, eva=0xce4f601c) at ../../i386/i386/trap.c:927
#13 0xc0293ab5 in trap_pfault (frame=0xff806ea8, usermode=0x0, eva=0xce4f601c) at ../../i386/i386/trap.c:820
#14 0xc0293663 in trap (frame={tf_fs = 0xc0160018, tf_es = 0xccd70010, tf_ds = 0xccd70010, tf_edi = 0xccd7cd88, 
      tf_esi = 0x0, tf_ebp = 0xff806ef0, tf_isp = 0xff806ed4, tf_ebx = 0xce4f601c, tf_edx = 0x2000, 
      tf_ecx = 0xccc5b800, tf_eax = 0x0, tf_trapno = 0xc, tf_err = 0x0, tf_eip = 0xccba2112, tf_cs = 0x8, 
      tf_eflags = 0x10246, tf_esp = 0xccd7cc00, tf_ss = 0xccd7cc00}) at ../../i386/i386/trap.c:426
#15 0xccba2112 in unlockrange () at /usr/src/sys/modules/vinum/../../dev/vinum/vinumlock.c:284
#16 0xccb9ebfc in complete_rqe () at /usr/src/sys/modules/vinum/../../dev/vinum/vinuminterrupt.c:199
#17 0xc0181c33 in biodone (bp=0xccd7cd88) at ../../kern/vfs_bio.c:2637
#18 0xc026c378 in ad_interrupt (request=0xcdd4b140) at ../../dev/ata/ata-disk.c:571
#19 0xc0268cf6 in ata_intr (data=0xccabcd80) at ../../dev/ata/ata-all.c:1126
#20 0xc029c2e1 in intr_mux (arg=0xc0a927a0) at ../../i386/isa/intr_machdep.c:569
(kgdb) f 16
#16 0xccb9ebfc in complete_rqe () at /usr/src/sys/modules/vinum/../../dev/vinum/vinuminterrupt.c:199
199	}
(kgdb) l
194			VOL[rq->volplex.volno].active--;	    /* another request finished */
195		    biodone(ubp);				    /* top level buffer completed */
196		    freerq(rq);					    /* return the request storage */
197		}
198	    }
199	}
200	
201	/* Free a request block and anything hanging off it */
202	void
203	freerq(struct request *rq)
(kgdb) f 15
#15 0xccba2112 in unlockrange () at /usr/src/sys/modules/vinum/../../dev/vinum/vinumlock.c:284
284	}
(kgdb) l
279	#ifdef VINUMDEBUG
280	    if (debug & DEBUG_LASTREQS)
281		logrq(loginfo_lock, (union rqinfou) pos, bp);
282	#endif
283	    return pos;
284	}
285	
286	/* Unlock a volume and let the next one at it */
287	void
288	unlockrange(int plexno, struct rangelock *lock)
(kgdb)
- ------------------------------------------------------------------


 -Jason

 ---------
 I myself manage to hold large numbers of wholly irreconcilable views
 simultaneously, without the least difficulty.  I do not think others 
 are less versatile.			--Salman Rushdie




-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.2 (GNU/Linux)
Comment: See https://private.idealab.com/public/jason/jason.gpg

iD8DBQE50eUjswXMWWtptckRAqWLAKD5jWp2Qj5rZnWF0Mu4mN83se8rmQCgvGro
Nu1aNnNg+nCSp6FKF/27y5A=
=OJMh
-----END PGP SIGNATURE-----
Comment 2 jason-freebsd-bugs 2001-05-25 01:30:08 UTC
This turned out to be an ata problem in 4.1 and was fixed in 4.2

Please close this pr.


 -Jason
Comment 3 Greg Lehey freebsd_committer freebsd_triage 2001-05-25 02:41:17 UTC
State Changed
From-To: open->closed

Submitter requested closure. 


Comment 4 Greg Lehey freebsd_committer freebsd_triage 2001-05-25 02:41:17 UTC
Responsible Changed
From-To: freebsd-bugs->grog

grog handles Vinum PRs.