Bug 195485 - [ufs] mksnap_ffs(8) cannot create snapshot with journaled soft updates enabled
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: bin
Version: 10.1-RELEASE
Hardware: Any Any
Importance: --- Affects Some People
Assignee: freebsd-fs mailing list
Reported: 2014-11-28 21:21 UTC by Frank Wall
Modified: 2018-11-12 07:30 UTC
Description Frank Wall 2014-11-28 21:21:51 UTC
On FreeBSD 10.1 amd64 it is still not possible to create UFS snapshots if journaled soft updates are enabled:

mksnap_ffs: Cannot create snapshot /home/.snap/dump_snapshot: /home: Snapshots are not yet supported when running with journaled soft updates

This long-standing limitation should be fixed to allow running dump(8) in this configuration.

Maybe a note should be added to the dump(8) man page and to the FreeBSD Handbook, section 18.8.1, "File System Backups", to properly document this limitation.
Comment 1 t_uemura 2018-11-09 02:00:57 UTC
Hi. Is there any chance of re-enabling snapshots on SU+J?

Rev. 232351 and many more changes have been applied to ffs_softdep.c in the
past 6 years, so I think snapshot support can be re-enabled on SU+J
filesystems, at least via a sysctl to get wider evaluation.

For me, on a couple of machines running a recent (as of September)
11-STABLE amd64, snapshots are working without deadlocks or panics.
Comment 2 Kirk McKusick freebsd_committer 2018-11-12 07:30:26 UTC
(In reply to t_uemura from comment #1)
Short answer: snapshots work while SU+J is running. The problem arises because the fsck code that does the journal recovery does not know how to repair snapshots. Thus after a crash recovery all the snapshots that were on the filesystem are possibly corrupted and will cause a panic if used.

Long answer: when files are deleted, the blocks are normally returned to the list of free blocks so that they can be allocated to new files. When a filesystem contains snapshots, each freed block is first offered to each of the snapshots so that they can claim it if it is part of one of the files in the snapshot. By claiming a block they prevent it from being put on the list of free blocks, and thus its contents will be preserved for the snapshotted file.

The journal recovery code has never had the logic added to it to do these checks. Hence, when it frees blocks, it does not check the snapshots to see if they want to claim these blocks. Thus blocks that should be claimed by the snapshots are instead put on the list of free blocks and will eventually be reused. If one of these blocks is part of the metadata of a file in a snapshot (such as a block of indirect pointers) and that block gets overwritten with other data, then attempts to access that file in the snapshot will cause a data inconsistency leading to a kernel panic.
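As an illustration only, the difference between the two freeing paths can be sketched with a toy model. Nothing here is FreeBSD kernel or fsck code: the `Snapshot` and `Filesystem` classes and all method names are hypothetical, and the model simplifies matters by treating a block as preserved whenever any snapshot claims it.

```python
# Toy model only: these classes and names are illustrative assumptions,
# not FFS kernel code and not fsck journal-recovery code.

class Snapshot:
    def __init__(self, name, blocks):
        self.name = name
        self.wanted = set(blocks)   # blocks used by files in the snapshot
        self.claimed = set()        # blocks the snapshot has taken over

    def offer(self, block):
        """Claim the block if a snapshotted file still needs it."""
        if block in self.wanted:
            self.claimed.add(block)
            return True
        return False


class Filesystem:
    def __init__(self, snapshots):
        self.snapshots = list(snapshots)
        self.free_list = set()

    def free_block_runtime(self, block):
        """Running-system path: offer the block to every snapshot first."""
        claims = [snap.offer(block) for snap in self.snapshots]
        if not any(claims):
            self.free_list.add(block)

    def free_block_recovery(self, block):
        """Journal-recovery path as described above: no snapshot check."""
        self.free_list.add(block)

    def endangered(self):
        """Blocks a snapshot needs that are nevertheless on the free list."""
        needed = set().union(*(snap.wanted for snap in self.snapshots))
        return needed & self.free_list


snap = Snapshot("dump_snapshot", blocks={10, 11})
fs = Filesystem([snap])
fs.free_block_runtime(10)    # claimed by the snapshot, not freed
fs.free_block_runtime(20)    # no snapshot wants it, so it is freed
fs.free_block_recovery(11)   # freed without asking the snapshot
print(sorted(fs.endangered()))  # prints [11]
```

In this model, block 11 ends up both wanted by the snapshot and available for reuse, which is the state that can later corrupt a snapshotted file once the block is reallocated and overwritten.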

The correct solution is to extract the code from the kernel that handles freeing of blocks and add it to the journal recovery code in fsck. This is a lot of complicated code and would take a lot of effort to do. As ZFS provides cheap snapshots, that is the filesystem of preference for folks that want snapshot functionality. The only remaining use for snapshots in UFS is the ability to do live dumps.  Thus I have not been motivated to go to the effort to migrate the kernel code to fsck (and nobody has offered to pay me the $25K to have me do it).

An easier solution would be to simply delete all the snapshots as part of doing the filesystem recovery. The problem is that while there is a list of all the inode numbers for the active snapshots in the superblock, we do not know the pathnames for all of these snapshots, so we would have to do a complete traversal of the filesystem to find them, which would largely negate the speed benefit of journaling.

Another easy solution would be to truncate all the snapshots to zero length and stop offering them as snapshots. This would be much quicker, as we have the list of inode numbers that need to be truncated, and all we would be left to clean up would be a list of zero-length files, which could be handled by a find after the system is up and running.
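A rough sketch of why truncation is cheap, again as a toy model rather than anything resembling the real fsck: the dict-based `superblock` and `inodes` structures, the `is_snapshot` field, and the function name `truncate_snapshots` are all hypothetical stand-ins. The point is only that the work is driven by the list of snapshot inode numbers, so no directory traversal is needed. Releasing the blocks held by each truncated snapshot is not modeled.

```python
# Toy model only: "superblock" and "inodes" are illustrative dicts,
# not the on-disk UFS structures or any fsck_ffs interface.

def truncate_snapshots(superblock, inodes):
    """Truncate every inode listed as an active snapshot and clear the list.

    The loop visits only the listed snapshot inodes, so the cost depends on
    the number of snapshots rather than on the size of the filesystem.
    """
    truncated = []
    for ino in superblock["snapshot_inodes"]:
        inodes[ino]["size"] = 0
        inodes[ino]["is_snapshot"] = False  # no longer offered as a snapshot
        truncated.append(ino)
    superblock["snapshot_inodes"] = []
    return truncated


superblock = {"snapshot_inodes": [7, 9]}
inodes = {
    2: {"size": 512, "is_snapshot": False},    # an ordinary file
    7: {"size": 4096, "is_snapshot": True},
    9: {"size": 8192, "is_snapshot": True},
}

done = truncate_snapshots(superblock, inodes)
print(done)  # prints [7, 9]
```

The zero-length files left behind still have directory entries, which is the cleanup the comment suggests deferring to a find run once the system is up.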

I am happy to review changes if someone wants to implement this solution (or the more difficult correct solution noted above).