Bug 271352

Summary: "live" dump -- with a snapshot -- broken by recent upgrade
Product: Base System
Reporter: Mikhail T. <freebsd-2024>
Component: bin
Assignee: freebsd-bugs (Nobody) <bugs>
Status: Closed FIXED    
Severity: Affects Only Me
CC: emaste, mckusick
Priority: ---    
Version: 13.2-STABLE   
Hardware: Any   
OS: Any   
Attachments:
Description          Flags
Output of dumpfs -s  none

Description Mikhail T. 2023-05-10 15:10:05 UTC
For years now, the nightly dumps of /home have been run here from cron:

exec lockf -t 0 /tmp/home-dump-lock dump 0uaCLf 16 - /home | .....

After upgrading to 13-STABLE as of May 5th (after 265 days of uptime), that job started failing with:

  DUMP: Date of this level 0 dump: Wed May 10 10:58:00 2023
  DUMP: Date of last level 0 dump: the epoch
  DUMP: Dumping snapshot of /dev/ada0h (/home) to standard output
dump: Cannot find file system superblock: No such file or directory

(Note that the file system is happily mounted and in use.)

Rerunning the same command without the "L" switch works (with a warning):

  DUMP: WARNING: should use -L when dumping live read-write filesystems!
  DUMP: Date of this level 0 dump: Wed May 10 10:58:07 2023
  DUMP: Date of last level 0 dump: the epoch
  DUMP: Dumping /dev/ada0h (/home) to standard output
  DUMP: mapping (Pass I) [regular files]
  DUMP: mapping (Pass II) [directories]
  DUMP: estimated 57211585 tape blocks.
  DUMP: dumping (Pass III) [directories]
  DUMP: dumping (Pass IV) [regular files]
...

Something seems to be broken in snapshot creation. The /home/.snap directory exists and is empty. The machine has 32 GB of RAM, and /home is only about 13% full at present:

root@symbion:~ # df -m /home
Filesystem 1M-blocks  Used  Avail Capacity  Mounted on
/dev/ada0h    458392 56009 365711    13%    /home
root@symbion:~ # tunefs -p /dev/ada0h
tunefs: POSIX.1e ACLs: (-a)                                disabled
tunefs: NFSv4 ACLs: (-N)                                   disabled
tunefs: MAC multilabel: (-l)                               disabled
tunefs: soft updates: (-n)                                 enabled
tunefs: soft update journaling: (-j)                       disabled
tunefs: gjournal: (-J)                                     disabled
tunefs: trim: (-t)                                         enabled
tunefs: maximum blocks per file in a cylinder group: (-e)  4096
tunefs: average file size: (-f)                            16384
tunefs: average number of files in a directory: (-s)       64
tunefs: minimum percentage of free space: (-m)             8%
tunefs: space to hold for metadata blocks: (-k)            2600
tunefs: optimization preference: (-o)                      time
tunefs: volume label: (-L)                                 home

I recall it being formatted as UFS1, but I don't know how to verify that.
Comment 1 Ed Maste freebsd_committer freebsd_triage 2023-05-10 19:45:55 UTC
file(1) should report the fs type, for example

# file -s /dev/md0 
/dev/md0: Unix Fast File system [v2] (little-endian)...

or

/dev/md0: Unix Fast File system [v1] (little-endian)...

Do you have the git references available for the previously-working and now-broken cases?
Comment 2 Mikhail T. 2023-05-11 04:00:57 UTC
> file(1) should report the fs type, for example

Awesome, thanks! Here it is:

root@symbion:~ # file -s /dev/ada0h
/dev/ada0h: Unix Fast File system [v1] (little-endian), last mounted on /home, last written at Thu May 11 03:57:49 2023, clean flag 0, number of blocks 119241301, number of data blocks 117348597, number of cylinder groups 1834, block size 32768, fragment size 4096, minimum percentage of free blocks 8, rotational delay 0ms, disk rotational speed 60rps, TIME optimization

> Do you have the git references available for the previously-working and now-broken cases?

The current version is stable/13-8ba9384727. The kernel.old/kernel has this string: FreeBSD 13.1-STABLE #3 stable/13-c9f9dc96d9.

Hope this helps.
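
Editorial note: the `file -s' output above carries the UFS version in a bracketed marker ("[v1]" or "[v2]"). A minimal sketch of pulling that marker out in a script follows. The helper name ufs_version is hypothetical, and the sample line is shortened from the output in this report.

```sh
#!/bin/sh
# Print "1" or "2" for the UFS version found in one line of `file -s` output,
# or return non-zero if no "[vN]" marker is present.
# ufs_version is a hypothetical helper name, not an existing utility.
ufs_version() {
    case "$1" in
        *"Unix Fast File system [v1]"*) echo 1 ;;
        *"Unix Fast File system [v2]"*) echo 2 ;;
        *) return 1 ;;
    esac
}

# Sample line shortened from the output in this report.
sample='/dev/ada0h: Unix Fast File system [v1] (little-endian), last mounted on /home'
ufs_version "$sample"
```

On a live system the same helper could presumably be fed directly, as in: ufs_version "$(file -s /dev/ada0h)".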
Comment 3 commit-hook freebsd_committer freebsd_triage 2023-05-26 01:59:57 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=4b08a62ed441668c103f834f5fe756ece5a8d9b3

commit 4b08a62ed441668c103f834f5fe756ece5a8d9b3
Author:     Kirk McKusick <mckusick@FreeBSD.org>
AuthorDate: 2023-05-26 01:56:22 +0000
Commit:     Kirk McKusick <mckusick@FreeBSD.org>
CommitDate: 2023-05-26 01:59:16 +0000

    When running fsck_ffs(8) in background ensure that a superblock has been read.

    Reported by:  Mikhail T.
    PR:           271352
    MFC after:    1 week
    Sponsored by: The FreeBSD Foundation

 sbin/fsck_ffs/main.c  | 5 ++++-
 sbin/fsck_ffs/setup.c | 8 ++++++--
 2 files changed, 10 insertions(+), 3 deletions(-)
Comment 4 Kirk McKusick freebsd_committer freebsd_triage 2023-05-26 02:03:15 UTC
The above commit should fix the problem. I will close this bug once it has been MFC'ed to 13-STABLE, provided no problems are reported.
Comment 5 Mikhail T. 2023-05-26 02:06:40 UTC
(In reply to commit-hook from comment #3)
> When running fsck_ffs(8) in background ensure that a superblock has been read.

So, the filesystem was not repaired quite right, when I last rebooted? But the reboot was clean: `shutdown -r .....`

Oh, well, I'll certainly try it as soon as the MFC takes place. Thank you.
Comment 6 Kirk McKusick freebsd_committer freebsd_triage 2023-06-08 17:15:18 UTC
Just went to do an MFC to 13 and found that this change is not relevant to 13. Notably the change that broke background fsck in 14 was not present in 13. So please check that you are still having the issue in 13 (and that it was not just a transient error). If it is still relevant to 13 please do a `dumpfs -s /path/to/filesystem' of the filesystem in question and post it here. Also (as root) try taking a snapshot of the filesystem using `mksnap_ffs /path/to/filesystem/.snap/snap1' and let me know if it is successful. If not, let me know what error it returns and its exit status.
Comment 7 Mikhail T. 2023-06-08 17:46:29 UTC
Created attachment 242692 [details]
Output of dumpfs -s

Yes, the problem is still here. Output of dumpfs -s is attached. Yes, the snapshots can be created easily:

root@symbion:/home/mi # mksnap_ffs /home/.snap/meow
root@symbion:/home/mi # echo $status
0
root@symbion:/home/mi # snapinfo -v /home
/dev/ada0h mounted on /home
        snapshot /home/.snap/meow (inode 4)

Hope this helps... (How do I get rid of the snapshot? mksnap_ffs(8) has no relevant information...)
Comment 8 Kirk McKusick freebsd_committer freebsd_triage 2023-06-09 06:19:55 UTC
(In reply to Mikhail T. from comment #7)
You remove the snapshot using the `rm' command just as you would any other file.
The mksnap_ffs(8) manual page got updated to note this just recently.

It is UFS1 as noted at the top of your dumpfs(8) output.

The fact that you are able to create a snapshot means that the problem is with the dump command being unsuccessful at doing that. There have been no changes in dump(8) that should affect that.

Another useful test would be to have you run `fsck_ffs -B -d /home', which needs to be run with /home mounted R/W. The -B option runs a background fsck in which fsck takes a snapshot and then checks the snapshot. This mostly just checks that fsck is able to take and use a snapshot.
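
Editorial note: the checks requested so far in this thread (take a snapshot, list it, remove it, run a background fsck) can be collected into one script. This is only a sketch under stated assumptions: the function names run and check_snapshots and the snapshot name snaptest are hypothetical, and with DRY_RUN=1 the commands are printed instead of executed.

```sh
#!/bin/sh
# Sketch of the snapshot checks from this thread, for a UFS filesystem that is
# mounted read-write. run, check_snapshots and the snapshot name are
# hypothetical. With DRY_RUN=1 the commands are only printed, not executed.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

check_snapshots() {
    mnt=$1
    snap="$mnt/.snap/snaptest"
    run mksnap_ffs "$snap" || { echo "mksnap_ffs failed: exit $?"; return 1; }
    run snapinfo -v "$mnt"
    run rm "$snap"               # a snapshot is removed like any other file
    run fsck_ffs -B -d "$mnt"    # background check: takes and uses a snapshot
}

# Print the command sequence for /home without touching the filesystem.
DRY_RUN=1 check_snapshots /home
```

Note that `echo $status' in comment 7 is csh syntax. In sh the exit status of the last command is `$?', which is what the sketch reports when mksnap_ffs fails.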
Comment 9 Mikhail T. 2023-06-09 12:04:45 UTC
(In reply to Kirk McKusick from comment #8)
> fsck_ffs -B -d /home

UFS1 superblock failed: fs->fs_maxfilesize (70368744177663) != maxfilesize (18016597801566207)
Cannot find file system superblock
/home: CAN'T CHECK FILE SYSTEM.
/home: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.

:-/
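
Editorial note: both numbers in the error above can be reproduced with plain arithmetic from the 32768-byte block size shown in the `file -s' output. The sketch below assumes the usual UFS1 layout (12 direct block pointers, 3 levels of indirection, 4-byte block addresses). The cap of 2^31 blocks used for the smaller value is an assumption that happens to reproduce the reported number. It is not stated anywhere in this thread.

```sh
#!/bin/sh
# Reproduce the two values in the fsck_ffs error for a UFS1 filesystem with
# 32768-byte blocks (block size taken from the `file -s` output in this report).
# Assumed UFS1 layout: 12 direct pointers, 3 indirect levels, 4-byte block
# addresses. Needs a shell with 64-bit arithmetic.
bsize=32768
nindir=$((bsize / 4))            # block pointers per indirect block

# Largest file the on-disk layout can address.
layout_max=$((bsize * 12 - 1))
span=$bsize
for level in 1 2 3; do
    span=$((span * nindir))      # bytes covered by one pointer at this level
    layout_max=$((layout_max + span))
done

# Assumption: the trimmed value corresponds to a cap of 2^31 blocks.
trimmed_max=$((2147483648 * bsize - 1))

echo "layout maximum:  $layout_max"
echo "trimmed maximum: $trimmed_max"
```

The first value matches the "maxfilesize" that the check expects (18016597801566207), and the second matches the fs_maxfilesize found in the snapshot's superblock (70368744177663). That is consistent with a reduced in-memory value having been written out unchanged.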
Comment 10 Kirk McKusick freebsd_committer freebsd_triage 2023-06-11 05:30:26 UTC
Actually that is very helpful. I now know what I need to track down.
Comment 11 commit-hook freebsd_committer freebsd_triage 2023-06-13 07:23:15 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=f1549d7d522995bf5d821ae08cc2f500ba545285

commit f1549d7d522995bf5d821ae08cc2f500ba545285
Author:     Kirk McKusick <mckusick@FreeBSD.org>
AuthorDate: 2023-06-13 07:21:43 +0000
Commit:     Kirk McKusick <mckusick@FreeBSD.org>
CommitDate: 2023-06-13 07:22:13 +0000

    Write out corrected superblock when creating a UFS/FFS snapshot.

    When taking a snapshot on a UFS version 1 filesystem we need to
    call ffs_oldfscompat_write() to unwind any in-memory changes that
    were made to the superblock before writing it. The cause of this bug
    was that the trimmed down maximum file size was not being reverted.

    PR:           271352
    Tested-by:    Peter Holm
    MFC-after:    1 week
    Sponsored-by: The FreeBSD Foundation

 sys/ufs/ffs/ffs_snapshot.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)
Comment 12 Kirk McKusick freebsd_committer freebsd_triage 2023-06-13 07:28:48 UTC
If the fix is confirmed to work, will MFC to 13.
Comment 13 Mikhail T. 2023-07-21 15:33:33 UTC
(In reply to Kirk McKusick from comment #12)
> If the fix is confirmed to work, will MFC to 13.

Should I plan a rebuild/reboot of this system, or is the fix not in 13 yet?
Comment 14 commit-hook freebsd_committer freebsd_triage 2023-07-22 00:22:54 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=17207eae668b80561c98fe18a9a5e18409d47c58

commit 17207eae668b80561c98fe18a9a5e18409d47c58
Author:     Kirk McKusick <mckusick@FreeBSD.org>
AuthorDate: 2023-06-13 07:21:43 +0000
Commit:     Kirk McKusick <mckusick@FreeBSD.org>
CommitDate: 2023-07-22 00:22:10 +0000

    Write out corrected superblock when creating a UFS/FFS snapshot.

    PR:           271352
    Tested-by:    Peter Holm
    Sponsored-by: The FreeBSD Foundation
    (cherry picked from commit f1549d7d522995bf5d821ae08cc2f500ba545285)

 sys/ufs/ffs/ffs_snapshot.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)
Comment 15 Kirk McKusick freebsd_committer freebsd_triage 2023-07-22 00:25:43 UTC
(In reply to Mikhail T. from comment #13)
Thanks for the reminder (for some reason I never got the system MFC reminder). I have just now done the MFC to 13. Please let me know if it fixes your issue. If so, I will close this report.
Comment 16 Mikhail T. 2023-08-01 15:53:01 UTC
Live dumps work again now. Thanks.