Bug 15611 - EIDE Large Disk Support, Newfs problem, File system corruption,IBM-DPTA-353750
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: Unspecified
Hardware: Any Any
Importance: Normal Affects Only Me
Assignee: freebsd-bugs (Nobody)
Reported: 1999-12-21 19:40 UTC by rjbubon
Modified: 2001-03-13 20:20 UTC
Description rjbubon 1999-12-21 19:40:01 UTC
1) Using the whole disk as one filesystem, newfs exits with the following:

 72548384, 72613920, 72679456, 72744992, 72810528, 72876064, 72941600, 73007136, 73072672, 73138208,
 73203744,
write error: 0
newfs: wtfs: Read-only file system

I have been fighting this problem for a while. I even RMA'd the drive.
With the first drive, newfs would panic the system. I have run IBM's
diagnostics, low-level formatted and verified the drive.

If I split the drive down the middle into 2 partitions, strange things happen.
I can load the first partition down with data. If I start writing to the 2nd
partition, I corrupt the first. It's as if the sector indexing in the OS
breaks at some large number. Maybe an overflow.

BTW, I have an IBM 16.5GB drive on the same system, and it works fine.

nomad# disklabel -r /dev/wd1
# /dev/wd1:
type: ESDI
disk: wd1s1
label: 
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 80
sectors/cylinder: 5040
cylinders: 14536
sectors/unit: 73261440
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0           # milliseconds
track-to-track seek: 0  # milliseconds
drivedata: 0 

8 partitions:
#        size   offset    fstype   [fsize bsize bps/cpg]
  c: 73261440        0    unused        0     0         # (Cyl.    0 - 14535)
  e: 36630720        0    4.2BSD     1024  8192    16   # (Cyl.    0 - 7267)
  f: 36630720 36630720    4.2BSD     1024  8192    16   # (Cyl. 7268 - 14535)

How-To-Repeat: Run newfs on a really large EIDE drive.
Comment 1 iedowse 1999-12-21 20:12:03 UTC
In message <19991221193630.A786B154E2@hub.freebsd.org>, rjbubon@bigi.com writes:

>1) Using whole disk as one filesystems. Newfs exits with the following:
>
> 72548384, 72613920, 72679456, 72744992, 72810528, 72876064, 72941600, 73007136, 73072672, 73138208,
> 73203744,
>write error: 0
>newfs: wtfs: Read-only file system

Try turning on LBA mode by OR-ing 0x1000 into the flags for wdc0, e.g. in
your kernel config file use:

controller      wdc0    at isa? port "IO_WD1" bio irq 14 flags 0xb0ffb0ff

or alternatively in /boot/kernel.rc, add:

flags wdc0 0xb0ffb0ff
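
The flag arithmetic can be sketched in C. This is a minimal sketch, not the driver's code: only the 0x1000 LBA bit and the name WDOPT_LBA appear in this report. The split of the 32-bit value into a low half for drive 0 and a high half for drive 1 is an assumption, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define WDOPT_LBA 0x1000u	/* LBA bit named in the comment above */

/*
 * Hypothetical helper: extract the 16-bit flag word for one drive.
 * Assumption: low half = drive 0, high half = drive 1.
 */
static uint16_t
drive_flags(uint32_t ctrl_flags, int unit)
{
	assert(unit == 0 || unit == 1);
	return (uint16_t)(unit == 0 ? ctrl_flags & 0xffffu : ctrl_flags >> 16);
}

/* Hypothetical helper: nonzero if the LBA bit is set for this drive. */
static int
lba_enabled(uint32_t ctrl_flags, int unit)
{
	return (drive_flags(ctrl_flags, unit) & WDOPT_LBA) != 0;
}

/* Hypothetical helper: OR the LBA bit into both drives' flag words. */
static uint32_t
enable_lba_both(uint32_t ctrl_flags)
{
	return ctrl_flags | WDOPT_LBA | ((uint32_t)WDOPT_LBA << 16);
}
```

Under these assumptions, lba_enabled(0xb0ffb0ff, 1) is true, and enable_lba_both(0xa0ffa0ff) yields 0xb0ffb0ff.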

We have been using an IBM 36GB disk like this (one filesystem covering
the whole disk) without any problems. We're using 3.4, but I know that it
worked at least as far back as 3.2. The dmesg output gives:

wdc0 at 0x1f0-0x1f7 irq 14 flags 0xb0ffb0ff on isa
...
wdc0: unit 1 (wd1): <IBM-DPTA-353750>, LBA, DMA, 32-bit, multi-block-16
wd1: 35772MB (73261440 sectors), 4560 cyls, 255 heads, 63 S/T, 512 B/S

Ian
Comment 2 wsb 1999-12-24 04:10:06 UTC
I believe I'm running into a similar problem. I have a
Maxtor 40GB IDE disk split into 2 slices. The first slice
is 1GB with swap and root(/) in it and the second has a
single 38GB file system. I created everything when I
installed 3.3 on the system (Athlon, all ide devices)
and the install worked without errors. After reboot,
the large FS showed mounted but I couldn't write to it.
I got "bad file descriptor". So, I unmounted and tried
to remount. First try, mount segfaulted. Second try,
I got back to "bad file descriptor".

I would rate this as severe since it basically puts
the machine out of commission. I had planned on adding
more 40GB drives hoping to take advantage of FreeBSD's
large file support/NFSV3. 

On a side note, the drive worked fine under Linux kernel
2.2.5 from RH6.0 so I know it is healthy. 

If someone has some ideas on how to debug/test a fix,
please let me know.

Thanks.


Wes Bauske
Comment 3 mjacob 1999-12-27 00:01:40 UTC
For what it's worth, I just attached a 37GB drive like Wes' to my
-current Tyan motherboard system:

ata1-master: success setting up UDMA2 mode on PIIX4 chip
ad1: piomode=4 dmamode=2 udmamode=4 cblid=1
ad1: <IBM-DPTA-353750/P51OA30A> ATA-4 disk at ata1 as master
ad1: 35772MB (73261440 sectors), 72680 cyls, 16 heads, 63 S/T, 512 B/S
ad1: 16 secs/int, 32 depth queue, UDMA33
Creating DISK ad1

I had no trouble whatsoever creating a filesystem that covered the whole
disk. sysinstall questioned the geometry, but all seemed well otherwise.

lorq.feral.com > root disklabel ad1
# /dev/rad1c:
type: ESDI
disk: ad1s1
label:
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 255
sectors/cylinder: 16065
cylinders: 4559
sectors/unit: 73256337
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0           # milliseconds
track-to-track seek: 0  # milliseconds
drivedata: 0

8 partitions:
#        size   offset    fstype   [fsize bsize bps/cpg]
  a: 10000000        0    4.2BSD     1024  8192    16   # (Cyl.    0 - 622*)
  c: 73256337        0    unused        0     0         # (Cyl.    0 - 4559*)
  d: 73256337        0    4.2BSD     1024  8192    16   # (Cyl.    0 - 4559*)
Comment 4 wsb 1999-12-27 00:37:11 UTC
Matt,

OK. I assume you also wrote some significant files onto it??
I wrote a 2GB file for testing.

The file system does create most of the time. It's only when
you start writing data to it that there's trouble. Also, if
your single file system works, then it may be something to
do with my layout. To recap, I have 2 slices: one is 1GB, with
128MB of swap and the rest as root (/); the other slice
contains a single FS of around 38GB as work space for my
application.

I'm running pure 3.3 from the Walnut Creek CD, no updates.

I haven't made any further progress at this point. I do know
that once I exceeded 32GB for the FS in the second slice, I
trashed the first slice. By that, I mean root was hosed
complaining about I-node problems on reboot and dropping to
a shell to "fix" it but I had no idea what would fix thousands
of I-node errors so I just re-installed.

As I mentioned, this is my first FreeBSD install so if there's
something I'm assuming that is incorrect, like the above slice
layout for example, let me know.

Interesting that FreeBSD questioned the geometry. I have Linux
on another partition and its fdisk complains about the geometry
of the 40GB drive on the second IDE channel but not the ones
on the first IDE channel. (I have 3 40GB drives on this box for
tests.) Did you put your test drive on the secondary IDE channel??

Also, what about forcing the driver to use LBA mode? I tried
that but might not have done it correctly and it still failed.
I used 0xf0ff for the flag. 

Gets tiring to reinstall the OS after each test.


Wes

Comment 5 mjacob 1999-12-27 01:05:42 UTC
On Sun, 26 Dec 1999, Wes Bauske wrote:

> Matt,
> 
> OK. I assume you also wrote some significant files onto it??
> I wrote a 2GB file for testing.

Nope, I'm a conehead- I only created/fsck'd it. I really didn't have time
to do exhaustive testing.

> 
> The file system does create most of the time. It's only when
> you start writing data to it that there's trouble. Also, if

What I had heard was that there was a problem > 32GB in creating the
filesystem.  I checked that this seemed to work in -current.

> your single file system works, then it may be something to
> do with my layout. To recap, I have 2 slices, one is 1GB
> with 128MB swap, and the rest root(/) and the other slice
> contains a single FS of around 38GB as work space for my
> application.
> 
> I'm running pure 3.3 from the Walnut Creek CD, no updates.

> Also, what about forcing the driver to use LBA mode? I tried

It does. It was just sysinstall that said, "Hmm- looks odd!"...


> Gets tiring to reinstall the OS after each test.

Well, yes. But all I did was give this a try under -current- seemed to
work with the newest ata driver (no special flags)- probably works even
for files on it.

I think I'm gently suggesting that you shouldn't hold FreeBSD to a higher
standard than Linux. I'm giving you a standard linux answer of- "try the
latest kernel"- in this case, do a net install of -current and see if this
solves your problems.

The problems may or may not be fixed in 3.4 (which is the latest -stable
release- just cut, too late to fix for this problem if it still is a
problem for 3.X), but you seem sophisticated enough to try the -current
netinstall to see if it solves your problems. If it's still a problem in
3.X FreeBSD it will probably get fixed, but everyone's pretty focussed on
closing off 4.0.

-matt
Comment 6 wsb 1999-12-27 03:16:55 UTC
Matthew Jacob wrote:
> 
> On Sun, 26 Dec 1999, Wes Bauske wrote:
> 
> > Matt,
> >
> > OK. I assume you also wrote some significant files onto it??
> > I wrote a 2GB file for testing.
> 
> Nope, I'm a conehead- I only created/fsck'd it. I really didn't have time
> to do exhaustive testing.
> 

OK. But it would be good to try putting a file or two on it.

> >
> > The file system does create most of the time. It's only when
> > you start writing data to it that there's trouble. Also, if
> 
> What I had heard was that there was a problem > 32GB in creating the
> filesystem.  I checked that this seemed to work in -current.
> 

Yes. That's what I meant by "most of the time". It will sometimes
just reboot on its own while creating the FS. That in itself
doesn't cause the root FS corruption though.

> > your single file system works, then it may be something to
> > do with my layout. To recap, I have 2 slices, one is 1GB
> > with 128MB swap, and the rest root(/) and the other slice
> > contains a single FS of around 38GB as work space for my
> > application.
> >
> > I'm running pure 3.3 from the Walnut Creek CD, no updates.
> 
> > Also, what about forcing the driver to use LBA mode? I tried
> 
> It does. It was just sysinstall that said, "Hmm- looks odd!"...
> 
> > Gets tiring to reinstall the OS after each test.
> 
> Well, yes. But all I did was give this a try under -current- seemed to
> work with the newest ata driver (no special flags)- probably works even
> for files on it.
> 
> I think I'm gently suggesting that you shouldn't hold FreeBSD to a higher
> standard than Linux. I'm giving you a standard linux answer of- "try the
> latest kernel"- in this case, do a net install of -current and see if this
> solves your problems.

Right now I can't due to heavy downloading of RH6.1 iso images.
I should be done with that in a day or two at which point I'll
see about continuing testing, including trying what's on the
net instead of the CD. I assume I'll have to create floppies
for that since I'm currently booting directly from the 3.3 CD.

> 
> The problems may or may not be fixed in 3.4 (which is the latest -stable
> release- just cut, too late to fix for this problem if it still is a
> problem for 3.X), but you seem sophisticated enough to try the -current
> netinstall to see if it solves your problems. If it's still a problem in
> 3.X FreeBSD it will probably get fixed, but everyone's pretty focussed on
> closing off 4.0.
> 
> -matt

I'll let you know what I find.


Wes
Comment 7 wsb 2000-01-05 09:13:32 UTC
Matthew Jacob wrote:
> 
> For what it's worth, I just attached a 37GB drive like Wes' to my
> -current Tyan mother board
> system:
> 
> ata1-master: success setting up UDMA2 mode on PIIX4 chip
> ad1: piomode=4 dmamode=2 udmamode=4 cblid=1
> ad1: <IBM-DPTA-353750/P51OA30A> ATA-4 disk at ata1 as master
> ad1: 35772MB (73261440 sectors), 72680 cyls, 16 heads, 63 S/T, 512 B/S
> ad1: 16 secs/int, 32 depth queue, UDMA33
> Creating DISK ad1
> 

I'm back looking at this problem. One thing I just noticed,
your disk is called ad1 where mine were called wd0/1/2. I
suspect you're using a different driver than I did.

I had to give up the Athlon for real work so I'm now using
an Intel PIII 600 w/SuperMicro PIIISCD motherboard. This
board uses the Intel 820 chip set but implements SDRAM instead
of using RDRAM. I downloaded 3.4 from the net and burned a
CD to try it out. For some reason, 3.4 can't find any of my
disks?? (Traced this to the 2nd 40GB drive just now. It's
unplugged for more tests)

So, do I need a different kernel to pick up this new ata1
driver?? Is this available in the 3.4 kernel config?

Any info appreciated.


Wes
Comment 8 Bruce Evans 2000-01-22 08:58:14 UTC
On Tue, 21 Dec 1999 rjbubon@bigi.com wrote:

> If I split the drive down the middle, 2 partitions, Strange things happen.
> I can load the first partition down with data. If I start writing to the 2nd
> partition, I corrupt the first. It's like the sector indexing in the OS
> is broke at some large number. Maybe an overflow.

Addressing is broken in CHS mode for cylinder numbers >= 65536, since
cylinder numbers are blindly truncated mod 65536.
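
A minimal sketch of that truncation, assuming the 16-head, 63-sectors-per-track geometry from the ad1 probe in comment 3 and a hypothetical helper name. It models the arithmetic only, not the wd driver's code. It shows how a request past cylinder 65535 lands back at the start of the disk, which would be consistent with the reported corruption of the first partition.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model: return the absolute sector a CHS request really
 * addresses when the cylinder number is truncated to 16 bits.
 */
static uint32_t
chs_wrapped_sector(uint32_t lba, uint32_t heads, uint32_t spt)
{
	uint32_t secpercyl, cyl, rest;

	assert(heads > 0 && spt > 0);
	secpercyl = heads * spt;
	cyl = lba / secpercyl;		/* true cylinder number */
	rest = lba % secpercyl;		/* head/sector position within it */
	cyl &= 0xffff;			/* blind truncation mod 65536 */
	return cyl * secpercyl + rest;
}
```

With 16 heads and 63 sectors per track, sector 66060288 (cylinder 65536) maps to sector 0, and the last sector of the 73261440-sector disk, 73261439, maps to sector 7201151.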

The following patches give proper brokenness for -current.  Large
disks are truncated to 65536 cylinders (normally 33.8GB).  This is
simple in wd.c.  Unfortunately, dsinit() "helpfully" enlarges the disk
if necessary to cover the slice entries in the MBR.  This was once
necessary to support MFM disks with > 1024 cylinders, but it is wrong
for disks that report their size.
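
The 33.8GB figure follows from the arithmetic. The sketch below works it out and mimics the clamp that the wd.c hunk of the patch applies. The function names are hypothetical, and this is a standalone model under those assumptions, not the kernel code.

```c
#include <assert.h>
#include <stdint.h>

#define CHS_MAX_CYLS 0x10000u	/* 65536 cylinders addressable in CHS mode */

/* Hypothetical helper: usable sectors after the CHS clamp. */
static uint32_t
clamp_secperunit(uint32_t secperunit, uint32_t secpercyl, int lba)
{
	assert(secpercyl > 0);
	if (!lba && secperunit / secpercyl > CHS_MAX_CYLS)
		return secpercyl * CHS_MAX_CYLS;
	return secperunit;
}

/* Hypothetical helper: capacity in bytes for 512-byte sectors. */
static uint64_t
sectors_to_bytes(uint32_t sectors)
{
	return (uint64_t)sectors * 512u;
}
```

For the 73261440-sector drive in CHS mode with 16 heads and 63 sectors per track, the clamp gives 66060288 sectors, or 33822867456 bytes (about 33.8GB). In LBA mode the size is left alone.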

Index: i386/isa/wd.c
===================================================================
RCS file: /home/ncvs/src/sys/i386/isa/wd.c,v
retrieving revision 1.217
diff -c -2 -r1.217 wd.c
*** i386/isa/wd.c	1999/12/10 09:40:29	1.217
--- i386/isa/wd.c	2000/01/22 08:22:07
***************
*** 1761,1764 ****
--- 1761,1772 ----
  		    du->dk_dd.d_secperunit / du->dk_dd.d_secpercyl;
  	}
+ 	if (du->dk_dd.d_ncylinders > 0x10000 && !(du->cfg_flags & WDOPT_LBA)) {
+ 		du->dk_dd.d_ncylinders = 0x10000;
+ 		du->dk_dd.d_secperunit = du->dk_dd.d_secpercyl *
+ 		    du->dk_dd.d_ncylinders;
+ 		printf(
+ 		    "wd%d: cannot handle %d total sectors; truncating to %lu\n",
+ 		    du->dk_lunit, wp->wdp_lbasize, du->dk_dd.d_secperunit);
+ 	}
  #if 0
  	du->dk_dd.d_partitions[RAW_PART].p_size = du->dk_dd.d_secperunit;
Index: kern/subr_diskmbr.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/subr_diskmbr.c,v
retrieving revision 1.42
diff -c -2 -r1.42 subr_diskmbr.c
*** kern/subr_diskmbr.c	1999/11/09 21:35:10	1.42
--- kern/subr_diskmbr.c	2000/01/22 08:21:43
***************
*** 73,76 ****
--- 73,79 ----
  			      u_long ext_size, u_long base_ext_offset,
  			      int nsectors, int ntracks, u_long mbr_offset));
+ static int dssetslice __P((char *sname, struct disklabel *lp,
+ 			   struct diskslice *sp, struct dos_partition *dp,
+ 			   u_long br_offset));
  
  static int
***************
*** 295,306 ****
--- 298,321 ----
  	secpercyl = (u_long)max_nsectors * max_ntracks;
  	if (secpercyl != 0) {
+ #if 0
  		u_long	secperunit;
+ #endif
  
  		lp->d_nsectors = max_nsectors;
  		lp->d_ntracks = max_ntracks;
  		lp->d_secpercyl = secpercyl;
+ 		/*
+ 		 * Temporarily, don't even consider adjusting the drive's
+ 		 * size, since the adjusted size may exceed the hardware's
+ 		 * addressing capabilities.  The adjustment helped mainly
+ 		 * for ancient MFM drives with > 1024 cylinders, but now
+ 		 * breaks at least IDE drives with 63*16*65536 sectors if
+ 		 * they are controlled by the wd driver in CHS mode.
+ 		 */
+ #if 0
  		secperunit = secpercyl * max_ncyls;
  		if (lp->d_secperunit < secperunit)
  			lp->d_secperunit = secperunit;
+ #endif
  		lp->d_ncylinders = lp->d_secperunit / secpercyl;
  	}
***************
*** 320,335 ****
  	sp = &ssp->dss_slices[BASE_SLICE];
  	for (dospart = 0, dp = dp0; dospart < NDOSPART; dospart++, dp++, sp++) {
! 		sp->ds_offset = mbr_offset + dp->dp_start;
! 		sp->ds_size = dp->dp_size;
! 		sp->ds_type = dp->dp_typ;
! #ifdef PC98_ATCOMPAT
! 		/* Fake FreeBSD(98). */
! 		if (sp->ds_type == DOSPTYP_386BSD)
! 			sp->ds_type = 0x94;
! #endif
! #if 0
! 		lp->d_subtype |= (lp->d_subtype & 3) | dospart
! 				 | DSTYPE_INDOSPART;
! #endif
  	}
  	ssp->dss_nslices = BASE_SLICE + NDOSPART;
--- 335,341 ----
  	sp = &ssp->dss_slices[BASE_SLICE];
  	for (dospart = 0, dp = dp0; dospart < NDOSPART; dospart++, dp++, sp++) {
! 		sname = dsname(dev, dkunit(dev), BASE_SLICE + dospart,
! 			       RAW_PART, partname);
! 		(void)dssetslice(sname, lp, sp, dp, mbr_offset);
  	}
  	ssp->dss_nslices = BASE_SLICE + NDOSPART;
***************
*** 435,446 ****
  				continue;
  			}
! 			sp->ds_offset = ext_offset + dp->dp_start;
! 			sp->ds_size = dp->dp_size;
! 			sp->ds_type = dp->dp_typ;
! #ifdef PC98_ATCOMPAT
! 			/* Fake FreeBSD(98). */
! 			if (sp->ds_type == DOSPTYP_386BSD)
! 				sp->ds_type = 0x94;
! #endif
  			ssp->dss_nslices++;
  			slice++;
--- 441,446 ----
  				continue;
  			}
! 			if (dssetslice(sname, lp, sp, dp, ext_offset) != 0)
! 				continue;
  			ssp->dss_nslices++;
  			slice++;
***************
*** 459,462 ****
--- 459,501 ----
  	bp->b_flags |= B_INVAL | B_AGE;
  	brelse(bp);
+ }
+ 
+ static int
+ dssetslice(sname, lp, sp, dp, br_offset)
+ 	char	*sname;
+ 	struct disklabel *lp;
+ 	struct diskslice *sp;
+ 	struct dos_partition *dp;
+ 	u_long	br_offset;
+ {
+ 	u_long	offset;
+ 	u_long	size;
+ 
+ 	offset = br_offset + dp->dp_start;
+ 	if (offset > lp->d_secperunit || offset < br_offset) {
+ 		printf(
+ 		"%s: slice starts beyond end of the disk: rejecting it\n",
+ 		       sname);
+ 		return (1);
+ 	}
+ 	size = lp->d_secperunit - offset;
+ 	if (size >= dp->dp_size)
+ 		size = dp->dp_size;
+ 	else
+ 		printf(
+ "%s: slice extends beyond end of disk: truncating from %lu to %lu sectors\n",
+ 		       sname, (u_long)dp->dp_size, size);
+ 	sp->ds_offset = offset;
+ 	sp->ds_size = size;
+ 	sp->ds_type = dp->dp_typ;
+ #ifdef PC98_ATCOMPAT
+ 	/* Fake FreeBSD(98). */
+ 	if (sp->ds_type == DOSPTYP_386BSD)
+ 		sp->ds_type = 0x94;
+ #endif
+ #if 0
+ 	lp->d_subtype |= (lp->d_subtype & 3) | dospart | DSTYPE_INDOSPART;
+ #endif
+ 	return (0);
  }
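
The bounds checks in dssetslice() can be exercised outside the kernel with a small model. This is a sketch under assumptions: the struct and function names are hypothetical, uint32_t stands in for u_long (32 bits on i386), and the kernel printf() warnings are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type for the model below. */
struct slice_fit {
	int		rejected;	/* 1 if the slice starts past the disk end */
	uint32_t	offset;		/* absolute start sector */
	uint32_t	size;		/* size after any truncation */
};

static struct slice_fit
fit_slice(uint32_t secperunit, uint32_t br_offset, uint32_t dp_start,
    uint32_t dp_size)
{
	struct slice_fit fit = { 0, 0, 0 };
	uint32_t offset, room;

	offset = br_offset + dp_start;	/* may wrap; caught just below */
	if (offset > secperunit || offset < br_offset) {
		fit.rejected = 1;
		return fit;
	}
	room = secperunit - offset;
	fit.offset = offset;
	fit.size = dp_size <= room ? dp_size : room;
	/* The accepted slice never extends past the end of the disk. */
	assert(fit.offset + fit.size <= secperunit);
	return fit;
}
```

On a disk truncated to 66060288 sectors, a slice that starts at sector 63 and claims 73256337 sectors comes out as 66060225 sectors, and a slice starting at sector 70000000 is rejected. The start sectors here are example values chosen for illustration.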
  
Bruce
Comment 9 Dag-Erling Smørgrav 2001-03-13 20:20:01 UTC
State Changed
From-To: open->closed

According to bde, this problem has been fixed.