Whenever fsck runs in background mode, it creates a file called fsck_snapshot in the .snap directory on the volume being checked. Fsck then runs its check against this file instead of the live filesystem. Filesystem snapshots (which fsck_snapshot essentially is) are designed to persist across mounts and reboots, so if fsck does not terminate properly for some reason (hard reboot etc.), the file is left behind. This is partially solved on the next background fsck run (commonly just after the system reboots, if the fs is marked dirty), since fsck overwrites the leftover fsck_snapshot with a new one and removes it when it's done.
The problem occurs when you mark the filesystem clean before the next background fsck run (for example by running fsck in single-user mode). In that case the fsck_snapshot file persists and may come to consume most of the filesystem (depending on the state of the filesystem when the snapshot was made).
Implement code (maybe in the loader, after the fs is mounted) to check for leftover fsck snapshots and remove them if appropriate.
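To illustrate the kind of check proposed above, here is a minimal sketch in Python. It is not how fsck or the boot scripts actually work. The function names, the `fsck_running` flag and the `dry_run` default are hypothetical, and deciding whether a background fsck is still active is left to the caller. It only assumes the snapshot lives at `<mount point>/.snap/fsck_snapshot`, as described in this report.

```python
import os

# File name taken from this report; everything else here is illustrative.
SNAPSHOT_NAME = "fsck_snapshot"


def find_stale_fsck_snapshots(mount_points):
    """Return the <mount>/.snap/fsck_snapshot paths that currently exist."""
    stale = []
    for mount in mount_points:
        candidate = os.path.join(mount, ".snap", SNAPSHOT_NAME)
        if os.path.isfile(candidate):
            stale.append(candidate)
    return stale


def remove_stale_fsck_snapshots(mount_points, fsck_running=False, dry_run=True):
    """Remove leftover snapshots unless a background fsck may still need them.

    fsck_running is a hypothetical flag supplied by the caller; this sketch
    does not try to detect a running fsck itself. With dry_run=True nothing
    is deleted and the function only reports what it would remove.
    """
    if fsck_running:
        return []
    removed = []
    for path in find_stale_fsck_snapshots(mount_points):
        if not dry_run:
            os.remove(path)
        removed.append(path)
    return removed
```

For example, `remove_stale_fsck_snapshots(["/", "/usr"])` would only list candidates, and passing `dry_run=False` would actually unlink them. Defaulting to a dry run seems the safer choice, since deleting a snapshot that a live fsck still depends on would be worse than leaving a stale one around.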
How-To-Repeat: 1) run fsck in background mode
2) halt -qn before fsck finishes (or otherwise terminate it improperly ... SIGKILL does not seem to work since fsck is in biord state)
3) boot into single-user mode
4) run fsck to mark the filesystem clean
5) reboot into normal mode and watch the file grow with every change on the live filesystem
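One possible way to observe the growth in step 5 is to sample the disk space allocated to the leftover file. The helper below is a hypothetical sketch, not part of any FreeBSD tool. It relies on `st_blocks` being counted in 512-byte units, and it assumes that allocated blocks are a more useful measure than the apparent file size for a snapshot file.

```python
import os


def allocated_bytes(path):
    """Return the disk space allocated to path, in bytes.

    st_blocks counts 512-byte units, so this reflects the space actually
    consumed rather than the apparent size reported by st_size.
    """
    return os.stat(path).st_blocks * 512
```

Calling `allocated_bytes("/.snap/fsck_snapshot")` (path shown for the root filesystem as an example) before and after some write activity on the live filesystem should show whether the leftover snapshot is still growing.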
Over to maintainer(s).
For bugs matching the following criteria:
Status: In Progress Changed: (is less than) 2014-06-01
Reset to default assignee and clear in-progress tags.
Mail being skipped
Please, does this bug explain the clean-then-dirty behaviour that's observed in the following transcript?
First observed (and reproducible) whilst working with faulty hardware.
Reproducible today with a new SSD.
(In reply to Graham Perrin from comment #3)
Your transcript does not use snapshots, so this bug which is about snapshots does not apply to your transcript.
The update to the block counts should not have affected the file type, so it would appear that when the inode block with the updated count and size fields was written to disk, other parts of it were scrambled. This implies some kind of error in writing the inode block to the disk.
(In reply to Kirk McKusick from comment #4)
Thank you. (I wondered whether there might be a shared underlying cause.)
I'll raise a separate bug report.