fsck_ffs trusts value of used inodes in cylinder group header on UFS2 filesystems. Unfortunately, a disk/memory corruption, for whatever reason, may corrupt that particular value. If the corrupt value is too large, this is easy to detect, by comparing it to superblock max value. If it is too low, bad things may still happen, as not all inodes are checked, possibly causing loss of files? The patch below works around the too large case. I think that the whole optimization of trusting cylinder group header is too optimistic, and the fsck_ffs should probably be returned to UFS1 way, even if there would be performance penalty. Fix: This is partial fix as it only fixes the detectable situation when cgrp.cg_initediblk is larger than number of inodes per cylinder group. Possibly better alternative is to return to UFS1 code in this case. Should this mode be triggered on any strange things, such as redo the pass1 in case number of inodes used mismatches in some way? It would catch too small value case? Anyway, fsck should never ever exit nor it should make optimistic assumptions of the disk state. I did not analyze softupdates cases, its already 3:45am... The below fixes a mildly confusing error message as well. How-To-Repeat: Have your 3ware 9500S RAID controller go nuts when hotswapping a disk in the pack, and with apparent failure of the BBU which may or may not be related. Alternatively, use any other flaky hardware and wait. The server this happened on has ECC memory so this leaves the controller or the disks as most likely source. This is probably not a very frequent event, if I am the first person ever to stumble upon this. The problem shows up with error "bad inode number xxx to nextinode", as getnextinode gets called with inumber beyond that particular cylinder group. fsck_ffs exits in this case, making the filesystem inaccessible. Forcibly mounting the said filesystem read-only generated immediate panic. There was lots of corruption, most of it seemed to concentrate in a area around this cylinder group, with little damage elsewhere.
Responsible Changed From-To: freebsd-bugs->freebsd-fs Over to maintainer(s)
Author: mckusick Date: Thu Jan 7 01:10:49 2010 New Revision: 201708 URL: http://svn.freebsd.org/changeset/base/201708 Log: Add some error messages suggested in PR bin/138043. The code to correct the problem was added in r176575 by delphij on 2008-02-25. PR: 138043 Reported by: Heikki Suonsivu Modified: head/sbin/fsck_ffs/main.c head/sbin/fsck_ffs/pass1.c Modified: head/sbin/fsck_ffs/main.c ============================================================================== --- head/sbin/fsck_ffs/main.c Thu Jan 7 00:57:40 2010 (r201707) +++ head/sbin/fsck_ffs/main.c Thu Jan 7 01:10:49 2010 (r201708) @@ -406,7 +406,10 @@ checkfilesys(char *filesys) */ if (duplist) { if (preen || usedsoftdep) - pfatal("INTERNAL ERROR: dups with -p"); + pfatal("INTERNAL ERROR: dups with %s%s%s", + preen ? "-p" : "", + (preen && usedsoftdep) ? " and " : "", + usedsoftdep ? "softupdates" : ""); printf("** Phase 1b - Rescan For More DUPS\n"); pass1b(); } Modified: head/sbin/fsck_ffs/pass1.c ============================================================================== --- head/sbin/fsck_ffs/pass1.c Thu Jan 7 00:57:40 2010 (r201707) +++ head/sbin/fsck_ffs/pass1.c Thu Jan 7 01:10:49 2010 (r201708) @@ -98,10 +98,16 @@ pass1(void) rebuildcg = 1; if (!rebuildcg && sblock.fs_magic == FS_UFS2_MAGIC) { inosused = cgrp.cg_initediblk; - if (inosused > sblock.fs_ipg) + if (inosused > sblock.fs_ipg) { + pfatal("%s (%d > %d) %s %d\nReset to %d\n", + "Too many initialized inodes", inosused, + sblock.fs_ipg, "in cylinder group", c, + sblock.fs_ipg); inosused = sblock.fs_ipg; - } else + } + } else { inosused = sblock.fs_ipg; + } if (got_siginfo) { printf("%s: phase 1: cyl group %d of %d (%d%%)\n", cdevname, c, sblock.fs_ncg, _______________________________________________ svn-src-all@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
State Changed From-To: open->closed The suggested fix was added in r176575 by delphij on 2008-02-25. The more detailed error messages were added by me in r201708.
Author: delphij Date: Thu Jan 7 01:55:34 2010 New Revision: 201711 URL: http://svn.freebsd.org/changeset/base/201711 Log: MFC r176575: In pass1(), cap inosused to fs_ipg rather than allowing arbitrary number read from cylinder group. Chances that we read a smarshed cylinder group, and we can not 100% trust information it has supplied. fsck_ffs(8) will crash otherwise for some cases. PR: bin/138043 Reminded by: mckusick Modified: stable/7/sbin/fsck_ffs/pass1.c Directory Properties: stable/7/sbin/fsck_ffs/ (props changed) Modified: stable/7/sbin/fsck_ffs/pass1.c ============================================================================== --- stable/7/sbin/fsck_ffs/pass1.c Thu Jan 7 01:24:09 2010 (r201710) +++ stable/7/sbin/fsck_ffs/pass1.c Thu Jan 7 01:55:34 2010 (r201711) @@ -93,9 +93,11 @@ pass1(void) inumber = c * sblock.fs_ipg; setinodebuf(inumber); getblk(&cgblk, cgtod(&sblock, c), sblock.fs_cgsize); - if (sblock.fs_magic == FS_UFS2_MAGIC) + if (sblock.fs_magic == FS_UFS2_MAGIC) { inosused = cgrp.cg_initediblk; - else + if (inosused > sblock.fs_ipg) + inosused = sblock.fs_ipg; + } else inosused = sblock.fs_ipg; if (got_siginfo) { printf("%s: phase 1: cyl group %d of %d (%d%%)\n", _______________________________________________ svn-src-all@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
Author: delphij Date: Thu Jan 7 01:56:35 2010 New Revision: 201712 URL: http://svn.freebsd.org/changeset/base/201712 Log: MFC r176575: In pass1(), cap inosused to fs_ipg rather than allowing arbitrary number read from cylinder group. Chances that we read a smarshed cylinder group, and we can not 100% trust information it has supplied. fsck_ffs(8) will crash otherwise for some cases. PR: bin/138043 Reminded by: mckusick Modified: stable/6/sbin/fsck_ffs/pass1.c Directory Properties: stable/6/sbin/fsck_ffs/ (props changed) Modified: stable/6/sbin/fsck_ffs/pass1.c ============================================================================== --- stable/6/sbin/fsck_ffs/pass1.c Thu Jan 7 01:55:34 2010 (r201711) +++ stable/6/sbin/fsck_ffs/pass1.c Thu Jan 7 01:56:35 2010 (r201712) @@ -93,9 +93,11 @@ pass1(void) inumber = c * sblock.fs_ipg; setinodebuf(inumber); getblk(&cgblk, cgtod(&sblock, c), sblock.fs_cgsize); - if (sblock.fs_magic == FS_UFS2_MAGIC) + if (sblock.fs_magic == FS_UFS2_MAGIC) { inosused = cgrp.cg_initediblk; - else + if (inosused > sblock.fs_ipg) + inosused = sblock.fs_ipg; + } else inosused = sblock.fs_ipg; if (got_siginfo) { printf("%s: phase 1: cyl group %d of %d (%d%%)\n", _______________________________________________ svn-src-all@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
Author: delphij Date: Thu Jan 7 01:57:13 2010 New Revision: 201713 URL: http://svn.freebsd.org/changeset/base/201713 Log: MFC r176575: In pass1(), cap inosused to fs_ipg rather than allowing arbitrary number read from cylinder group. Chances that we read a smarshed cylinder group, and we can not 100% trust information it has supplied. fsck_ffs(8) will crash otherwise for some cases. PR: bin/138043 Reminded by: mckusick Modified: stable/5/sbin/fsck_ffs/pass1.c Directory Properties: stable/5/sbin/fsck_ffs/ (props changed) Modified: stable/5/sbin/fsck_ffs/pass1.c ============================================================================== --- stable/5/sbin/fsck_ffs/pass1.c Thu Jan 7 01:56:35 2010 (r201712) +++ stable/5/sbin/fsck_ffs/pass1.c Thu Jan 7 01:57:13 2010 (r201713) @@ -93,9 +93,11 @@ pass1(void) inumber = c * sblock.fs_ipg; setinodebuf(inumber); getblk(&cgblk, cgtod(&sblock, c), sblock.fs_cgsize); - if (sblock.fs_magic == FS_UFS2_MAGIC) + if (sblock.fs_magic == FS_UFS2_MAGIC) { inosused = cgrp.cg_initediblk; - else + if (inosused > sblock.fs_ipg) + inosused = sblock.fs_ipg; + } else inosused = sblock.fs_ipg; if (got_siginfo) { printf("%s: phase 1: cyl group %d of %d (%d%%)\n", _______________________________________________ svn-src-all@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"