Bug 255979 - fsck bad inode number 2 (256) to nextinode
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: bin
Version: 13.0-STABLE
Hardware: amd64 Any
Importance: --- Affects Some People
Assignee: freebsd-fs (Nobody)
Reported: 2021-05-18 18:19 UTC by Px
Modified: 2021-10-20 07:05 UTC

Attachments
Proposed fix for bug. (652 bytes, patch)
2021-05-18 21:26 UTC, Kirk McKusick
mckusick: maintainer-approval+

Description Px 2021-05-18 18:19:48 UTC
Hi.

Yesterday I noticed the following messages in /var/log/messages:

May 16 13:24:13 bsd-route fsck[1606]: /dev/ufs/pxstore: NO WRITE ACCESS
May 16 13:24:13 bsd-route fsck[1606]: /dev/ufs/pxstore: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
May 16 13:24:13 bsd-route fsck[1606]: /dev/ufs/pxstore: CANNOT SET FS_NEEDSFSCK FLAG

The underlying HDD holds a single jail running a torrent client, so I shut down the client (I don't remember whether I also shut down the jail itself), unmounted the drive with the -f switch, and ran fsck on it. fsck found a lot of errors, such as:

129516299 DUP I=64686337
UNEXPECTED SOFT UPDATE INCONSISTENCY

2661355758115450807 BAD I=23756482
UNEXPECTED SOFT UPDATE INCONSISTENCY

-8999617908267060188 BAD I=23756482
UNEXPECTED SOFT UPDATE INCONSISTENCY

CYLINDER GROUP 3944: INTEGRITY CHECK FAILED
UNEXPECTED SOFT UPDATE INCONSISTENCY

INCORRECT BLOCK COUNT I=486994392 (56264 should be 49112)

INODE 486994392: FILE SIZE 28746675 BEYOND END OF ALLOCATED FILE, SIZE SHOULD BE 25133056

and some others (you can see 512KB of the log here - https://pastebin.com/Xcm1iKCc), but in the end to my surprise fsck exited with the following error:

INTERNAL ERROR: dups with softupdates
UNEXPECTED SOFT UPDATE INCONSISTENCY
** Phase 1b - Rescan For More DUPS
fsck_ufs: bad inode number 2 to nextinode

After some searching I was lucky to find the following commit:
https://cgit.freebsd.org/src/commit/?id=bc444e2ec6e6cc9d96d35ab7ce3c02c0da952fad
Fix fsck_ffs Pass 1b error exit "bad inode number 2 to nextinode".

As I'm running 13-RELEASE, I went to https://download.freebsd.org/ftp/snapshots/amd64/13.0-STABLE/, downloaded the base.txz file, took the fsck* files from it, and put them in /sbin in place of the existing ones, but the only change I got is that the number changed from 2 to 256:

INTERNAL ERROR: dups with softupdates
UNEXPECTED SOFT UPDATE INCONSISTENCY
** Phase 1b - Rescan For More DUPS
fsck_ufs: bad inode number 256 to nextinode
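
The extraction step described above (taking only the fsck files out of base.txz) might look roughly like the sketch below. The function name `extract_members`, the staging directory, and the member paths `./sbin/fsck` and `./sbin/fsck_ffs` are assumptions for illustration; the actual layout of the archive should be checked with `tar -tf` first.

```shell
#!/bin/sh
# Sketch: extract selected members from a base.txz-style archive into a
# staging directory, rather than unpacking over the live system.
# The member names are passed in by the caller; they are not verified here.
extract_members() {
    archive=$1
    staging=$2
    shift 2
    mkdir -p "$staging" || return 1
    # Remaining arguments are the member names to extract.
    tar -xf "$archive" -C "$staging" "$@"
}

# Example (hypothetical paths):
#   extract_members base.txz /tmp/stable13 ./sbin/fsck ./sbin/fsck_ffs
```

Extracting into a staging directory keeps the original binaries in /sbin untouched until the new ones have been inspected.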

When I re-run fsck I get the same list of errors, and the check ends with the same result.

I've mounted the disk read-only and can see the folder structure just fine. A check of random files showed that they are fine: text, images, and video render without errors. SMART for the disk is fine: the general status is PASSED and there are no reallocation events.

Am I correct in assuming that the fix above is incomplete, and that fsck should be able to recover the file system in my case too? Is there any other way to fix it without moving all the data elsewhere and recreating the file system?
Comment 1 Kirk McKusick freebsd_committer 2021-05-18 21:26:33 UTC
Created attachment 225075 [details]
Proposed fix for bug.

Please try making this change and rerunning fsck on your broken filesystem.
Comment 2 Px 2021-05-19 20:53:52 UTC
(In reply to Kirk McKusick from comment #1)

Hi Kirk,

Thank you. With this patch applied, fsck was able to complete Phase 1b and the following phases.
Comment 3 commit-hook freebsd_committer 2021-05-19 21:36:01 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=fe815b88b553667c40353c46b58f9779efa3570e

commit fe815b88b553667c40353c46b58f9779efa3570e
Author:     Kirk McKusick <mckusick@FreeBSD.org>
AuthorDate: 2021-05-19 21:38:21 +0000
Commit:     Kirk McKusick <mckusick@FreeBSD.org>
CommitDate: 2021-05-19 21:39:24 +0000

    Fix fsck_ffs Pass 1b error exit "bad inode number 256 to nextinode".

    Pass 1b of fsck_ffs runs only when Pass 1 has found duplicate blocks.
    Pass 1 only knows that a block is duplicate when it finds the second
    instance of its use. The role of Pass 1b is to find the first use
    of all the duplicate blocks. It makes a pass over the cylinder groups
    looking for these blocks. When moving to the next cylinder group,
    Pass 1b failed to properly calculate the starting inode number for
    the cylinder group resulting in the above error message when it
    tried to read the first inode in the cylinder group.

    Reported by:  Px
    Tested by:    Px
    PR:           255979
    MFC after:    3 days
    Sponsored by: Netflix

 sbin/fsck_ffs/pass1b.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
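
The calculation that the commit message refers to can be illustrated with a small arithmetic sketch. This is not the fsck_ffs source or the actual patch; the function name `cg_first_inode` and the value of 256 inodes per group are made up for the example. It only shows the relationship a per-group scan relies on: the first inode of a cylinder group is the group index times the number of inodes per group.

```shell
#!/bin/sh
# Illustration only: first inode number of cylinder group `group` on a
# file system with `per_group` inodes in each group.
cg_first_inode() {
    group=$1
    per_group=$2
    echo $((group * per_group))
}

# A scan that walks the groups in order should request exactly this inode
# when it enters each group. The inodes-per-group value is hypothetical.
ipg=256
for cg in 0 1 2 3; do
    echo "cg $cg starts at inode $(cg_first_inode "$cg" "$ipg")"
done
```

If the scan's running inode counter disagrees with this value when it moves to the next group, the first read in that group asks for the wrong inode, which is the kind of mismatch the commit message describes.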
Comment 4 Kirk McKusick freebsd_committer 2021-05-19 21:46:01 UTC
(In reply to Px from comment #2)
Thanks for testing and confirming the fix.

I will MFC it to stable/13 and stable/12 after it has had the required 3-day minimum waiting period in the development head.

Your comprehensive bug report made it easy to find and fix.
Comment 5 Px 2021-05-20 07:10:25 UTC
Thanks again for the quick fix :)
Comment 6 commit-hook freebsd_committer 2021-05-21 20:39:36 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=f190f9193bc10a8193c87e0a02fa91400e4eb159

commit f190f9193bc10a8193c87e0a02fa91400e4eb159
Author:     Kirk McKusick <mckusick@FreeBSD.org>
AuthorDate: 2021-05-21 20:41:40 +0000
Commit:     Kirk McKusick <mckusick@FreeBSD.org>
CommitDate: 2021-05-21 20:42:37 +0000

    Fix fsck_ufs segfaults with gjournal (SU+J)

    The segfault was being hit in ckfini() (sbin/fsck_ffs/fsutil.c)
    while attempting to traverse the buffer cache to flush dirty buffers.
    The tail queue used for the buffer cache was not initialized before
    dropping into gjournal_check(). Move the buffer initialization earlier
    so that it has been done before calling gjournal_check().

    Reported by:  crypt47, nvass
    Fix by:       Robert Wing
    Tested by:    Robert Wing
    PR:           255030
    PR:           255979
    MFC after:    3 days
    Sponsored by: Netflix

 sbin/fsck_ffs/main.c  | 1 +
 sbin/fsck_ffs/setup.c | 1 -
 2 files changed, 1 insertion(+), 1 deletion(-)
Comment 7 Kirk McKusick freebsd_committer 2021-05-22 16:25:37 UTC
(In reply to commit-hook from comment #6)
Commit f190f919 applies to bug 254709 and I cited it here by mistake.
Comment 8 Kirk McKusick freebsd_committer 2021-05-22 16:27:55 UTC
(In reply to Kirk McKusick from comment #7)
Geez, I cannot even get the correction correct!
Commit f190f919 applies to bug 245709 and I cited it here by mistake.
Comment 9 Kirk McKusick freebsd_committer 2021-05-22 16:36:33 UTC
(In reply to commit-hook from comment #6)
Third time (plus an additional cup of coffee) is the charm.
Commit f190f919 applies to bug 245907 and I cited it here by mistake.
Comment 10 commit-hook freebsd_committer 2021-05-22 21:01:07 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=e198c1dc8f6faaa85bd20990d15e3bcb9d081873

commit e198c1dc8f6faaa85bd20990d15e3bcb9d081873
Author:     Kirk McKusick <mckusick@FreeBSD.org>
AuthorDate: 2021-05-19 21:38:21 +0000
Commit:     Kirk McKusick <mckusick@FreeBSD.org>
CommitDate: 2021-05-22 21:03:37 +0000

    Fix fsck_ffs Pass 1b error exit "bad inode number 256 to nextinode".

    (cherry picked from commit fe815b88b553667c40353c46b58f9779efa3570e)

    PR:           255979
    Sponsored by: Netflix

 sbin/fsck_ffs/pass1b.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
Comment 11 Kirk McKusick freebsd_committer 2021-05-22 21:02:39 UTC
This fix has been MFC'ed to 13-stable.
It does not apply to 12-stable.
Comment 12 Gordon Dickens 2021-07-04 12:36:17 UTC
Has the fix for this bug been merged into FreeBSD 13? I ask because I am running stock FreeBSD 13 and I believe I am seeing this bug: fsck quits with the following final output:

UNKNOWN FILE TYPE I=300955998
UNEXPECTED SOFT UPDATE INCONSISTENCY

CLEAR? yes

UNKNOWN FILE TYPE I=300955999
UNEXPECTED SOFT UPDATE INCONSISTENCY

CLEAR? yes

CYLINDER GROUP 374: FOUND 80256 VALID INODES
INTERNAL ERROR: dups with softupdates
UNEXPECTED SOFT UPDATE INCONSISTENCY
** Phase 1b - Rescan For More DUPS
fsck_ufs: bad inode number 2 to next inode

#

So, what do I do to get fsck to finish?

Thanks,

Gordon Dickens
Comment 13 Kirk McKusick freebsd_committer 2021-07-04 16:48:41 UTC
(In reply to Gordon Dickens from comment #12)
This was MFC'ed to 13 on April 2, 2021. If you are running an fsck from after that time, you should not experience this bug.
Comment 14 odhiambo@gmail.com 2021-10-10 07:02:16 UTC
I am on FreeBSD 13-RELEASE (fully updated). I encountered this problem just yesterday and it won't go away:

root@svr:/usr/local/SRC/Exim/exim-4.95 # date
Sun Oct 10 09:42:24 EAT 2021
root@svr:/usr/local/SRC/Exim/exim-4.95 # freebsd-update fetch
Looking up update.FreeBSD.org mirrors... 2 mirrors found.
Fetching metadata signature for 13.0-RELEASE from update1.freebsd.org... done.
Fetching metadata index... done.
Inspecting system... done.
Preparing to download files... done.

No updates needed to update system to 13.0-RELEASE-p4.
root@svr:/usr/local/SRC/Exim/exim-4.95 # freebsd-update install
No updates are available to install.
Run '/usr/sbin/freebsd-update fetch' first.
root@svr:/usr/local/SRC/Exim/exim-4.95 #
root@svr:/usr/local/SRC/Exim/exim-4.95 #
root@svr:/usr/local/SRC/Exim/exim-4.95 # mount
/dev/ada0p2 on / (ufs, local, journaled soft-updates)
devfs on /dev (devfs)
fdescfs on /dev/fd (fdescfs)
procfs on /proc (procfs, local)
root@svr:/usr/local/SRC/Exim/exim-4.95 # df -h
Filesystem     Size    Used   Avail Capacity  Mounted on
/dev/ada0p2    899G    103G    723G    13%    /
devfs          1.0K    1.0K      0B   100%    /dev
fdescfs        1.0K    1.0K      0B   100%    /dev/fd
procfs         4.0K    4.0K      0B   100%    /proc
root@svr:/usr/local/SRC/Exim/exim-4.95 # fsck -y /dev/ada1p2
** /dev/ada1p2
** SU+J Recovering /dev/ada1p2
Invalid flags 0x1 for journal inode 11959
** Skipping journal, falling through to full fsck

** Last Mounted on /disk2
** Phase 1 - Check Blocks and Sizes
237836 BAD I=11004
UNEXPECTED SOFT UPDATE INCONSISTENCY

237626 BAD I=11004
UNEXPECTED SOFT UPDATE INCONSISTENCY

1633111304 BAD I=11004
UNEXPECTED SOFT UPDATE INCONSISTENCY

80659953970315267 BAD I=11004
UNEXPECTED SOFT UPDATE INCONSISTENCY

9683 BAD I=11004
UNEXPECTED SOFT UPDATE INCONSISTENCY

3630573539351855112 BAD I=11004
UNEXPECTED SOFT UPDATE INCONSISTENCY

80659953970315267 BAD I=11004
UNEXPECTED SOFT UPDATE INCONSISTENCY

9684 BAD I=11004
UNEXPECTED SOFT UPDATE INCONSISTENCY

9684 DUP I=11004
UNEXPECTED SOFT UPDATE INCONSISTENCY

9685 DUP I=11004
UNEXPECTED SOFT UPDATE INCONSISTENCY

9686 DUP I=11004
UNEXPECTED SOFT UPDATE INCONSISTENCY

9687 DUP I=11004
UNEXPECTED SOFT UPDATE INCONSISTENCY

9688 DUP I=11004
UNEXPECTED SOFT UPDATE INCONSISTENCY

9689 DUP I=11004
UNEXPECTED SOFT UPDATE INCONSISTENCY

9690 DUP I=11004
UNEXPECTED SOFT UPDATE INCONSISTENCY

237860 BAD I=11005
UNEXPECTED SOFT UPDATE INCONSISTENCY

237626 BAD I=11005
UNEXPECTED SOFT UPDATE INCONSISTENCY

237626 DUP I=11005
UNEXPECTED SOFT UPDATE INCONSISTENCY

237627 DUP I=11005
UNEXPECTED SOFT UPDATE INCONSISTENCY

237628 DUP I=11005
UNEXPECTED SOFT UPDATE INCONSISTENCY

237629 DUP I=11005
UNEXPECTED SOFT UPDATE INCONSISTENCY

237630 DUP I=11005
UNEXPECTED SOFT UPDATE INCONSISTENCY

237631 DUP I=11005
UNEXPECTED SOFT UPDATE INCONSISTENCY

237632 DUP I=11005
UNEXPECTED SOFT UPDATE INCONSISTENCY

237633 DUP I=11005
UNEXPECTED SOFT UPDATE INCONSISTENCY

238133 BAD I=11798
UNEXPECTED SOFT UPDATE INCONSISTENCY

237825 BAD I=11798
UNEXPECTED SOFT UPDATE INCONSISTENCY

2556482 BAD I=11798
UNEXPECTED SOFT UPDATE INCONSISTENCY

1633111304 BAD I=11798
UNEXPECTED SOFT UPDATE INCONSISTENCY

83761345559789569 BAD I=11798
UNEXPECTED SOFT UPDATE INCONSISTENCY

4750453985124128242 BAD I=11798
UNEXPECTED SOFT UPDATE INCONSISTENCY

12 BAD I=11798
UNEXPECTED SOFT UPDATE INCONSISTENCY

12 DUP I=11798
UNEXPECTED SOFT UPDATE INCONSISTENCY

13 DUP I=11798
UNEXPECTED SOFT UPDATE INCONSISTENCY

14 DUP I=11798
UNEXPECTED SOFT UPDATE INCONSISTENCY

15 DUP I=11798
UNEXPECTED SOFT UPDATE INCONSISTENCY

16 DUP I=11798
UNEXPECTED SOFT UPDATE INCONSISTENCY

17 DUP I=11798
UNEXPECTED SOFT UPDATE INCONSISTENCY

18 DUP I=11798
UNEXPECTED SOFT UPDATE INCONSISTENCY

19 DUP I=11798
UNEXPECTED SOFT UPDATE INCONSISTENCY

83762934697689089 BAD I=11798
UNEXPECTED SOFT UPDATE INCONSISTENCY

4750453186260211186 BAD I=11798
UNEXPECTED SOFT UPDATE INCONSISTENCY

83762938992656385 BAD I=11798
UNEXPECTED SOFT UPDATE INCONSISTENCY

4195264 DUP I=561803
UNEXPECTED SOFT UPDATE INCONSISTENCY

4195265 DUP I=561803
UNEXPECTED SOFT UPDATE INCONSISTENCY

4195266 DUP I=561803
UNEXPECTED SOFT UPDATE INCONSISTENCY

4195267 DUP I=561803
UNEXPECTED SOFT UPDATE INCONSISTENCY

4195268 DUP I=561803
UNEXPECTED SOFT UPDATE INCONSISTENCY

4195269 DUP I=561803
UNEXPECTED SOFT UPDATE INCONSISTENCY

4195270 DUP I=561803
UNEXPECTED SOFT UPDATE INCONSISTENCY

4195271 DUP I=561803
UNEXPECTED SOFT UPDATE INCONSISTENCY

46023656 DUP I=18780109
UNEXPECTED SOFT UPDATE INCONSISTENCY

46023657 DUP I=18780109
UNEXPECTED SOFT UPDATE INCONSISTENCY

46023658 DUP I=18780109
UNEXPECTED SOFT UPDATE INCONSISTENCY

46023659 DUP I=18780109
UNEXPECTED SOFT UPDATE INCONSISTENCY

46023660 DUP I=18780109
UNEXPECTED SOFT UPDATE INCONSISTENCY

46023661 DUP I=18780109
UNEXPECTED SOFT UPDATE INCONSISTENCY

46023662 DUP I=18780109
UNEXPECTED SOFT UPDATE INCONSISTENCY

46023663 DUP I=18780109
UNEXPECTED SOFT UPDATE INCONSISTENCY

46023664 DUP I=18780109
UNEXPECTED SOFT UPDATE INCONSISTENCY

46023665 DUP I=18780109
UNEXPECTED SOFT UPDATE INCONSISTENCY

46023666 DUP I=18780109
UNEXPECTED SOFT UPDATE INCONSISTENCY

EXCESSIVE DUP BLKS I=18780109
CONTINUE? yes

INCORRECT BLOCK COUNT I=18780109 (1721600 should be 620048)
CORRECT? yes

INODE 18780109: FILE SIZE 881182827 BEYOND END OF ALLOCATED FILE, SIZE SHOULD BE 317325312
ADJUST? yes

INTERNAL ERROR: dups with softupdates
UNEXPECTED SOFT UPDATE INCONSISTENCY
** Phase 1b - Rescan For More DUPS
fsck_ffs: bad inode number 2 to nextinode
root@svr:/usr/local/SRC/Exim/exim-4.95 #
Comment 15 Kirk McKusick freebsd_committer 2021-10-10 18:57:20 UTC
(In reply to odhiambo@gmail.com from comment #14)
You have updated to the original 13.0 release, which does indeed have the problem that you note. You need to update to the stable/13 release, which has the fix in it. You can either check out the stable/13 branch from git and build it yourself, or download the latest snapshot of stable/13 from https://download.freebsd.org/ftp/snapshots/VM-IMAGES/13.0-STABLE/amd64/Latest
Comment 16 Gordon Dickens 2021-10-11 17:26:47 UTC
(In reply to Kirk McKusick from comment #15)

Hi Kirk,

The command line output from "freebsd-version" on my machine is:

13.0-RELEASE-p4

Is this the correct stable/13 version to be running?

Thanks,

Gordon Dickens
Comment 17 Kirk McKusick freebsd_committer 2021-10-11 17:35:31 UTC
(In reply to Gordon Dickens from comment #16)
The 13.0-RELEASE-p4 release is 13.0 as it was originally released. It will not be updated until 13.1 is released some time in early 2022. The stable/13 release changes day-by-day as the fixes that will eventually become 13.1 are being made. They do not go through the rigorous testing that will be applied to them as part of putting together the 13.1 release, but they do give you access to the fixes that have been made to the 13.0 release without waiting for the 13.1 release to be made.
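
As a rough aid for telling the two apart, the sketch below classifies a version string such as the one printed by freebsd-version(1). The function name `branch_kind` is hypothetical, and the pattern matching is an assumption based only on the strings seen in this thread (13.0-RELEASE-p4, 13.0-STABLE); other suffixes fall through to "unknown".

```shell
#!/bin/sh
# Sketch: classify a FreeBSD version string by its branch suffix.
branch_kind() {
    case $1 in
        *-RELEASE*) echo release ;;
        *-STABLE*)  echo stable ;;
        *-CURRENT*) echo current ;;
        *)          echo unknown ;;
    esac
}

# Example: branch_kind "$(freebsd-version)"
```

A result of "release" for 13.0-RELEASE-p4 means the system is on the release branch with patches applied, not on stable/13.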
Comment 18 odhiambo@gmail.com 2021-10-12 09:13:38 UTC
(In reply to Kirk McKusick from comment #17)
The suggestion below has sent shivers down my spine.

I generally decided against running the -STABLE version of FreeBSD because it involved compiling userland and kernel.

And your advice to me was:

1. You have updated to the original 13.0 release which does indeed have the problem that you note.
2. You need to update to the stable/13 release which has the fix in it.
    (a) You can either check out the stable/13 branch from git and build it yourself
    (b) Or you can download the latest snapshot of stable/13 from https://download.freebsd.org/ftp/snapshots/VM-IMAGES/13.0-STABLE/amd64/Latest

It seems that this is taking me back to the days of buildworld/buildkernel, installworld/installkernel, no??

Or did I just misunderstand the recommendation? And what it entails??

Can I use freebsd-update to upgrade to stable/13 release? And still be able to upgrade to subsequent releases?
Comment 19 Graham Perrin 2021-10-13 00:53:07 UTC
(In reply to odhiambo@gmail.com from comment #18)

> … Can I use freebsd-update to upgrade to stable/13 release? …

No.
Comment 20 Kirk McKusick freebsd_committer 2021-10-13 04:56:00 UTC
(In reply to odhiambo@gmail.com from comment #18)
As Graham Perrin noted, freebsd-update only works between point releases.

As you (understandably) do not want to do buildworld, I suggest you consider my other option, which is to download one of the stable/13 releases that matches your architecture. It is an image of what you would get if you had downloaded stable/13 and done a buildworld / installworld. Then just copy out the binaries that are of interest to you. In your case that is /sbin/fsck_ffs, which will have the fixes that you need. The kernel ABI is kept unchanged throughout a major release, so binaries from later versions will run on earlier versions of that release.
Comment 21 odhiambo@gmail.com 2021-10-13 08:29:21 UTC
(In reply to Kirk McKusick from comment #20)

Hi,

Thanks for clarifying this. I will download the snapshot and copy the file.
Comment 22 Kirk McKusick freebsd_committer 2021-10-14 17:17:00 UTC
(In reply to odhiambo@gmail.com from comment #21)
Please post to this thread about your success in getting the updated fsck_ffs,
for the benefit of others in your position who need an update and are reading this bug report.
Comment 23 odhiambo@gmail.com 2021-10-15 14:26:53 UTC
(In reply to Kirk McKusick from comment #22)

My issue is now resolved following the advice from McKusick.

1. I downloaded https://download.freebsd.org/ftp/snapshots/VM-IMAGES/13.0-STABLE/amd64/Latest
2. Deployed the VM
3. Copied /sbin/{fsck,fsck_ffs} from the VM to the affected server
4. cd /sbin; mv fsck fsck.old && mv fsck_ffs fsck_ffs.old. I then replaced the two files with the ones from 13.0-STABLE.

Finally I ran fsck -y /dev/ada1p2, which ran to completion, and ultimately I got the magical FILE SYSTEM MARKET CLEAN.
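
The backup-and-replace part of the procedure above could be scripted roughly as follows. This is a sketch, not a tested recovery tool: the function name `replace_binaries` and the staging directory in the example are assumptions, and the directories are parameters so the steps can be rehearsed in a scratch directory before touching /sbin.

```shell
#!/bin/sh
# Sketch: back up the existing binaries in `dest` as NAME.old, then
# install the replacements from `src`.
replace_binaries() {
    src=$1
    dest=$2
    shift 2
    for name in "$@"; do
        # Refuse to proceed if any replacement is missing.
        [ -f "$src/$name" ] || { echo "missing $src/$name" >&2; return 1; }
    done
    for name in "$@"; do
        # Keep the original as NAME.old, as in the steps above.
        if [ -e "$dest/$name" ]; then
            mv "$dest/$name" "$dest/$name.old" || return 1
        fi
        cp -p "$src/$name" "$dest/$name" || return 1
    done
}

# Example (run as root; /tmp/stable13/sbin is a hypothetical staging dir):
#   replace_binaries /tmp/stable13/sbin /sbin fsck fsck_ffs
#   fsck -y /dev/ada1p2
```

Checking that every replacement exists before moving anything avoids ending up with the old binary renamed and no new one in its place.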
Comment 24 odhiambo@gmail.com 2021-10-15 14:28:35 UTC
(In reply to odhiambo@gmail.com from comment #23)

*FILE SYSTEM MARKED CLEAN.*
Comment 25 Gordon Dickens 2021-10-15 17:37:19 UTC
(In reply to odhiambo@gmail.com from comment #23)

This procedure worked for me too!

*FILE SYSTEM MARKED CLEAN.*
Comment 26 Kirk McKusick freebsd_committer 2021-10-15 17:56:38 UTC
(In reply to odhiambo@gmail.com from comment #24)
(In reply to Gordon Dickens from comment #25)
Thanks for verifying that this solved your problems with fsck.
As a side benefit you got several other bug fixes for obscure problems.
Sorry for the long and tortuous path to get this resolved, but at least I now have a solution for future problems of this sort.
Comment 27 Mateusz Kwiatkowski 2021-10-20 07:05:37 UTC
Hello,

Since this bug has apparently hit several users, maybe the fix could be backported to 13.0 and issued as an erratum? I think that would be a more user-friendly approach in this case.