Bug 141992 - [ufs] fsck cannot repair file system in which it finds an error [regression]
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 8.0-RELEASE
Hardware: Any Any
Importance: Normal Affects Only Me
Assignee: Kirk McKusick
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-12-24 23:20 UTC by Dan Strick
Modified: 2010-02-10 21:08 UTC

Description Dan Strick 2009-12-24 23:20:02 UTC
When I upgraded from FreeBSD release 6.1 to release 8.0 I discovered
that fsck complained about a problem in one of my old file systems
and claimed to have fixed the problem but did not really fix it.

The fsck on release 6.1 finds no problem at all in the file system.
I wonder if perhaps fsck could not fix the problem because there
was no real problem and fsck was confused by some difference in
file systems created by an older release of FreeBSD.

Fix: 

(workaround:)

1) Dump the file system.
2) Newfs the file system under FreeBSD release 8.0.
3) Restore the file system.

How-To-Repeat:

This is a sample fsck dialogue under FreeBSD release 8.0.  I do not claim
that the file system meets all release 8.0 standards for UFS file systems
(since it was created under an old FreeBSD release) but I do claim that
it is a bug for fsck to have claimed to have fixed a problem and then to
find the problem again when the fsck is repeated.

mist# fsck /dev/ad4s4a
** /dev/ad4s4a
** Last Mounted on /fs/u4
** Phase 1 - Check Blocks and Sizes
CYLINDER GROUP 48: BAD MAGIC NUMBER
UNEXPECTED SOFT UPDATE INCONSISTENCY

REBUILD CYLINDER GROUP? [yn] y

** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
SUMMARY INFORMATION BAD
SALVAGE? [yn] y

BLK(S) MISSING IN BIT MAPS
SALVAGE? [yn] y

46090 files, 2542520 used, 1766975 free (22607 frags, 218046 blocks, 0.5% fragmentation)

***** FILE SYSTEM IS CLEAN *****

***** FILE SYSTEM WAS MODIFIED *****
mist# !!
fsck /dev/ad4s4a
** /dev/ad4s4a
** Last Mounted on /fs/u4
** Phase 1 - Check Blocks and Sizes
CYLINDER GROUP 48: BAD MAGIC NUMBER
UNEXPECTED SOFT UPDATE INCONSISTENCY

REBUILD CYLINDER GROUP? [yn] y

** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
SUMMARY INFORMATION BAD
SALVAGE? [yn] y

BLK(S) MISSING IN BIT MAPS
SALVAGE? [yn] y

46090 files, 2542520 used, 1766975 free (22607 frags, 218046 blocks, 0.5% fragmentation)

***** FILE SYSTEM IS CLEAN *****

***** FILE SYSTEM WAS MODIFIED *****
mist#
Comment 1 Mark Linimon freebsd_committer freebsd_triage 2009-12-25 02:59:51 UTC
Responsible Changed
From-To: freebsd-bugs->freebsd-fs

Over to maintainer(s).
Comment 2 mckusick 2009-12-29 18:43:11 UTC
I am investigating your bug report about your problems with fsck
in going from 6.1 to 8.0. There have been only insignificant
changes in the filesystem during that time, so in theory you should
not have any problems going back and forth between these two
versions of FreeBSD. The fsck program has been changed to look
more critically at the cylinder group maps, so some errors that
would not previously have been detected will now be found. That
said, it should be able to completely fix those errors,
which it appears it is not able to do.

To help me track down your problem, please supply me with the following
information:

1) Do you have more than one 6.1 filesystem that you are running on
   this system? If so, is this the only filesystem having trouble in
   this way, or is more than one of them exhibiting similar problems?

2) Are you getting any disk errors reported for this drive in
   /var/log/messages? It really looks like there is a bad sector in
   cylinder group 48 and so even though fsck can calculate a correct
   one, it is not able to write it out. Even if errors are not being
   reported, I am inclined to write a small test program that attempts
   to read and write the affected sectors to see if the changes stick.

3) Please provide me with the output of `dumpfs /dev/ad4s4a'. I don't
   need the whole (huge) thing. Just the start up through cylinder group 0,
   then the output for cylinder groups 47, 48, and 49. I am looking
   for any obvious anomalies in cylinder group 48 versus its neighbors.

Hopefully we can sort out what is going wrong here.

	Kirk McKusick
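
The small test program mentioned in item 2 does not appear in this
thread. As a rough sketch only, such a check might look like the
following: it writes an inverted copy of a region, reads it back to see
whether the change sticks, and then restores the original bytes. The
function name write_sticks and its return convention are hypothetical.
It assumes the caller passes an offset and length that the device
accepts (raw disk devices generally require sector-aligned I/O), and it
should only ever be pointed at an unmounted scratch copy, since a
failure part-way through can leave the region modified.

```c
#define _XOPEN_SOURCE 700	/* for pread/pwrite/mkstemp declarations */
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: check whether a write to [off, off + len) sticks.
 * Returns 0 if an inverted pattern reads back intact and the original
 * contents are restored afterwards; returns -1 on any I/O error or
 * mismatch.
 */
static int
write_sticks(int fd, off_t off, size_t len)
{
	unsigned char *orig, *probe, *back;
	size_t i;
	int rc = -1;

	assert(len > 0);
	orig = malloc(len);
	probe = malloc(len);
	back = malloc(len);
	if (orig == NULL || probe == NULL || back == NULL)
		goto out;

	/* Save the current contents so they can be put back. */
	if (pread(fd, orig, len, off) != (ssize_t)len)
		goto out;

	/* Write a pattern that differs from the original in every byte. */
	for (i = 0; i < len; i++)
		probe[i] = (unsigned char)~orig[i];
	if (pwrite(fd, probe, len, off) == (ssize_t)len &&
	    pread(fd, back, len, off) == (ssize_t)len &&
	    memcmp(probe, back, len) == 0)
		rc = 0;

	/* Always try to restore the original data and verify the restore. */
	if (pwrite(fd, orig, len, off) != (ssize_t)len ||
	    pread(fd, back, len, off) != (ssize_t)len ||
	    memcmp(orig, back, len) != 0)
		rc = -1;
out:
	free(orig);
	free(probe);
	free(back);
	return (rc);
}
```

A caller would open the device or image read-write and call
write_sticks(fd, offset, length) for each region of interest. Note that
on a regular file or a cached device a successful read-back may be
served from memory rather than from the medium, so a result of 0 does
not by itself prove the sector is good.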
Comment 3 Kirk McKusick freebsd_committer freebsd_triage 2009-12-29 18:47:02 UTC
Responsible Changed
From-To: freebsd-fs->mckusick

I'll take this one.
Comment 4 Dan Strick 2009-12-30 23:00:16 UTC
> From mckusick@mckusick.com Wed Dec 30 09:45:41 2009
> To: Dan Strick <mla_strick@att.net>
> Subject: Re: kern/141992: fsck cannot repair file system in ...
> Date: Wed, 30 Dec 2009 09:31:07 -0800
>
> Thanks. The thing that immediately jumps out at me is that the filesystem
> is UFS1 format (which would make it 4.X vintage unless you specifically
> requested UFS1 when you built it). The new code to fix cylinder groups
> in fsck can only fix UFS2 filesystems. So the problem is that it is
> unhappy with cylinder group 48, then tries to fix it and finds that
> it cannot do anything. At a minimum it should just not do its check
> for bad cylinder groups in UFS1 filesystems.

The file system was probably created by FreeBSD 4.10, the first version
of FreeBSD that would run on my machine.

Perhaps when the new fsck finds a bad UFS1 cylinder group and cannot
fix it, the new fsck should simply explain its limitation and recommend
copying the UFS1 file system contents to a new UFS2 file system.

> Is there a chance that I can either (1) get a copy of the mdconfig
> disk image so I can work out a solution here, or (2) get a login
> on your machine so I can poke at things there.
>
> 	Kirk McKusick

I have prepared a new file system by deleting all files in the old file
system, creating a huge file containing only zeros, and deleting the
huge file.  The new file system gzips nicely and still has the bad
cylinder group.  The new file system is about 16 MB compressed (almost
9 GB uncompressed).  How should I send it to you?

Dan Strick
Comment 5 mckusick 2009-12-31 00:11:41 UTC
> Date: Wed, 30 Dec 2009 15:00:16 -0800 (PST)
> From: Dan Strick <mla_strick@att.net>
> To: mckusick@mckusick.com
> 
> > From mckusick@mckusick.com Wed Dec 30 09:45:41 2009
> > To: Dan Strick <mla_strick@att.net>
> > Subject: Re: kern/141992: fsck cannot repair file system in ...
> > Date: Wed, 30 Dec 2009 09:31:07 -0800
> >
> > Thanks. The thing that immediately jumps out at me is that the filesystem
> > is UFS1 format (which would make it 4.X vintage unless you specifically
> > requested UFS1 when you built it). The new code to fix cylinder groups
> > in fsck can only fix UFS2 filesystems. So the problem is that it is
> > unhappy with cylinder group 48, then tries to fix it and finds that
> > it cannot do anything. At a minimum it should just not do its check
> > for bad cylinder groups in UFS1 filesystems.
> 
> The file system was probably created by FreeBSD 4.10, the first version
> of FreeBSD that would run on my machine.

Yes, quite believable.

> Perhaps when the new fsck finds a bad UFS1 cylinder group and cannot
> fix it, the new fsck should simply explain its limitation and recommend
> copying the UFS1 file system contents to a new UFS2 file system.

That is what I will probably do unless an easy fix for UFS1 presents
itself.

> > Is there a chance that I can either (1) get a copy of the mdconfig
> > disk image so I can work out a solution here, or (2) get a login
> > on your machine so I can poke at things there.
> >
> > 	Kirk McKusick
> 
> I have prepared a new file system by deleting all files in the old file
> system, creating a huge file containing only zeros, and deleting the
> huge file.  The new file system gzips nicely and still has the bad
> cylinder group.  The new file system is about 16 MB compressed (almost
> 9 GB uncompressed).  How should I send it to you?
> 
> Dan Strick

Excellent. I wish all folks that reported bugs could provide such
help as you have :-)

Do you have a web server on which you could put it from which I could
do a download? Or an ftp server? If not, I'll create a guest account
for you on my machine and let you download to that.

	~Kirk
Comment 6 Dan Strick 2009-12-31 01:34:15 UTC
> Do you have a web server on which you could put it from which I could
> do a download? Or an ftp server? If not, I'll create a guest account
> for you on my machine and let you download to that.

I don't have a web or ftp server.  I am pretty sure my ISP does not
provide me with a place to store a web page.  I should be able to
anonymous ftp or do scp to a guest account.  Send instructions.

Thanks,
	Dan Strick
Comment 7 mckusick 2010-01-07 00:21:52 UTC
So, I have (finally) gotten a chance to try out the disk image that
you sent me using fsck_ffs from a FreeBSD 8.0 system. Here is the
correction report and patch that I made:

Author: mckusick
Date: Thu Jan  7 00:17:36 2010
New Revision: 201700
URL: http://svn.freebsd.org/changeset/base/201700

Log:
  This corrects a bug that manifested itself as identifying the last
  cylinder group of a UFS1 filesystem as bad. The error was in the check
  and not in the cylinder group itself. So even though fsck fixed the
  cylinder group correctly, it was still endlessly reported as bad.
  
  PR:		141992
  MFC after:	2 weeks
  Reported by:	Dan Strick

Modified:
  head/sbin/fsck_ffs/fsutil.c

Modified: head/sbin/fsck_ffs/fsutil.c
==============================================================================
--- head/sbin/fsck_ffs/fsutil.c	Thu Jan  7 00:04:29 2010	(r201699)
+++ head/sbin/fsck_ffs/fsutil.c	Thu Jan  7 00:17:36 2010	(r201700)
@@ -436,7 +436,7 @@ check_cgmagic(int cg, struct cg *cgp)
 	    ((sblock.fs_magic == FS_UFS1_MAGIC &&
 	      cgp->cg_old_niblk == sblock.fs_ipg &&
 	      cgp->cg_ndblk <= sblock.fs_fpg &&
-	      cgp->cg_old_ncyl == sblock.fs_old_cpg) ||
+	      cgp->cg_old_ncyl <= sblock.fs_old_cpg) ||
 	     (sblock.fs_magic == FS_UFS2_MAGIC &&
 	      cgp->cg_niblk == sblock.fs_ipg &&
 	      cgp->cg_ndblk <= sblock.fs_fpg &&

Please apply this patch to your fsck_ffs and confirm for me that it
corrects your problem. Thanks for all your help.

	Kirk McKusick
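
To illustrate why the one-character change in the patch matters, here
is a small self-contained model. It is not the fsck_ffs code. It assumes
that every cylinder group except possibly the last holds exactly
fs_old_cpg cylinders and that the last group holds whatever remainder is
left; the struct, its fields, and the function names (model_sb,
model_cg_ncyl, check_strict, check_relaxed) are hypothetical stand-ins
for the real superblock and cylinder group fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the relevant superblock fields. */
struct model_sb {
	int32_t ncyl;	/* total cylinders in the file system (model only) */
	int32_t cpg;	/* cylinders per group, playing the role of fs_old_cpg */
};

/*
 * Cylinders held by group cg under the model's assumption: full groups
 * hold cpg cylinders, the last group holds the remainder.
 */
static int32_t
model_cg_ncyl(const struct model_sb *sb, int32_t cg)
{
	int32_t ncg = (sb->ncyl + sb->cpg - 1) / sb->cpg;

	assert(cg >= 0 && cg < ncg);
	if (cg < ncg - 1)
		return (sb->cpg);
	return (sb->ncyl - (ncg - 1) * sb->cpg);
}

/* The pre-patch comparison: the group must hold exactly cpg cylinders. */
static bool
check_strict(const struct model_sb *sb, int32_t cg_ncyl)
{
	return (cg_ncyl == sb->cpg);
}

/* The post-patch comparison: the group may hold fewer than cpg cylinders. */
static bool
check_relaxed(const struct model_sb *sb, int32_t cg_ncyl)
{
	return (cg_ncyl <= sb->cpg);
}
```

With a made-up layout of 773 cylinders at 16 per group, groups 0
through 47 pass both checks, while group 48 (5 cylinders) fails
check_strict and passes check_relaxed. That is consistent with the
symptom in this report: a final cylinder group that is flagged on every
run no matter how correctly it is rebuilt.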
Comment 9 Kirk McKusick freebsd_committer freebsd_triage 2010-01-07 00:23:19 UTC
State Changed
From-To: open->patched

A fix has been found and is being tested.
Comment 10 Dan Strick 2010-01-07 09:12:28 UTC
> From mckusick@mckusick.com Thu Jan  7 00:55:35 2010
> To: Dan Strick <mla_strick@att.net>
> Subject: Re: kern/141992: fsck cannot repair file system in ...
>
	...
>
> Please apply this patch to your fsck_ffs and confirm for me that it
> corrects your problem. Thanks for all your help.

This patch does indeed make the problem I had with that particular
file system go away.

Thanks,
	Dan Strick
Comment 11 dfilter service freebsd_committer freebsd_triage 2010-02-10 20:35:38 UTC
Author: mckusick
Date: Wed Feb 10 20:35:20 2010
New Revision: 203765
URL: http://svn.freebsd.org/changeset/base/203765

Log:
  MFC of r201700 | mckusick | 2010-01-06
  
  This corrects a bug that manifested itself as identifying the last
  cylinder group of a UFS1 filesystem as bad. The error was in the check
  and not in the cylinder group itself. So even though fsck fixed the
  cylinder group correctly, it was still endlessly reported as bad.
  
  This bug first appeared in 8.0 so does not apply to earlier releases.
  
  PR:             141992
  Reported by:    Dan Strick

Modified:
  stable/8/sbin/fsck_ffs/fsutil.c
Directory Properties:
  stable/8/sbin/fsck_ffs/   (props changed)

Modified: stable/8/sbin/fsck_ffs/fsutil.c
==============================================================================
--- stable/8/sbin/fsck_ffs/fsutil.c	Wed Feb 10 20:17:46 2010	(r203764)
+++ stable/8/sbin/fsck_ffs/fsutil.c	Wed Feb 10 20:35:20 2010	(r203765)
@@ -436,7 +436,7 @@ check_cgmagic(int cg, struct cg *cgp)
 	    ((sblock.fs_magic == FS_UFS1_MAGIC &&
 	      cgp->cg_old_niblk == sblock.fs_ipg &&
 	      cgp->cg_ndblk <= sblock.fs_fpg &&
-	      cgp->cg_old_ncyl == sblock.fs_old_cpg) ||
+	      cgp->cg_old_ncyl <= sblock.fs_old_cpg) ||
 	     (sblock.fs_magic == FS_UFS2_MAGIC &&
 	      cgp->cg_niblk == sblock.fs_ipg &&
 	      cgp->cg_ndblk <= sblock.fs_fpg &&
Comment 12 Kirk McKusick freebsd_committer freebsd_triage 2010-02-10 21:07:17 UTC
State Changed
From-To: patched->closed

The fix has been confirmed and MFC'ed to stable/8.
The bug did not appear in releases before 8.0,
so no further MFCs are needed.