Bug 263765 - panic: ffs_sync: modification on read-only filesystem
Summary: panic: ffs_sync: modification on read-only filesystem
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: CURRENT
Hardware: Any Any
: --- Affects Only Me
Assignee: freebsd-fs (Nobody)
URL:
Keywords: crash
Depends on:
Blocks: 263979
  Show dependency treegraph
 
Reported: 2022-05-04 00:44 UTC by Mark Johnston
Modified: 2022-10-12 00:50 UTC (History)
4 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Mark Johnston freebsd_committer freebsd_triage 2022-05-04 00:44:28 UTC
I was using a VM to test some code and crashed it many times.  After a while it consistently panics during boot, while upgrading the read-only root mount to read-write:

Trying to mount root from ufs:gpt/rootfs [rw]...                                                                                                                                                                                                                                                                              
WARNING: / was not properly dismounted                                                                                                                                                                                                                                                                                        
WARNING: /: mount pending error: blocks 367136 files 5                                                                                                                                                                                                                                                                        
No suitable dump device was found.                                                                                                                                                                                                                                                                                            
Setting hostuuid: 4ff7a735-851c-7030-87f0-cf2c2b9ef44e.                                                                                                                                                                                                                                                                       
Setting hostid: 0x1c981ebb.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
Starting file system checks:                                                                                                                                   
WARNING: / was not properly dismounted                                                                                                                         
panic: /: ffs_sync: modification on read-only filesystem                                                                                                 
cpuid = 1                                                                                                                                                      
time = 1651623448                                                                                                                                              
KDB: stack backtrace:                                                                                                                                          
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe007a140750                                                                                 
vpanic() at vpanic+0x17f/frame 0xfffffe007a1407a0                                                                                                              
panic() at panic+0x43/frame 0xfffffe007a140800                                                                                                                 
ffs_sync() at ffs_sync+0x6e6/frame 0xfffffe007a1408a0                                                                                                          
vfs_write_suspend() at vfs_write_suspend+0x165/frame 0xfffffe007a1408f0                                                                                        
vfs_write_suspend_umnt() at vfs_write_suspend_umnt+0x35/frame 0xfffffe007a140920
ffs_mount() at ffs_mount+0xa2b/frame 0xfffffe007a140a70                                                                                                        
vfs_domount_update() at vfs_domount_update+0x277/frame 0xfffffe007a140bf0                                                                                      
vfs_domount() at vfs_domount+0x26b/frame 0xfffffe007a140d20                                                                                                    
vfs_donmount() at vfs_donmount+0x878/frame 0xfffffe007a140dc0                                                                                                  
sys_nmount() at sys_nmount+0x69/frame 0xfffffe007a140e00                                                                                                       
amd64_syscall() at amd64_syscall+0x12e/frame 0xfffffe007a140f30                                                                                                
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe007a140f30                                                                                     
--- syscall (378, FreeBSD ELF64, sys_nmount), rip = 0x28658b33e3da, rsp = 0x286588ae7e38, rbp = 0x286588ae83a0 ---

The problem appears to be that the fmod flag in the superblock got set, somehow:

magic   19540119 (UFS2)
last mounted time       Mon May  2 21:00:40 2022
last modified time      Mon May  2 21:00:52 2022
superblock location     65536   id      [ 62658b45 0a5e546e ]
ncg     82      size    13107200        blocks  12694551
bsize   32768   shift   15      mask    0xffff8000
fsize   4096    shift   12      mask    0xfffff000
frag    8       shift   3       fsbtodb 3
minfree 8%      optim   time    symlinklen 120
maxbsize 32768  maxbpg  8192    maxcontig 2     contigsumsize 2
nbfree  1377660 ndir    4007    nifree  6527315 nffree  6053
bpg     20042   fpg     160336  ipg     80256   unrefs  0
nindir  4096    inopb   128     maxfilesize     2252349704110079
sbsize  4096    cgsize  32768   csaddr  5056    cssize  4096
sblkno  24      cblkno  32      iblkno  40      dblkno  5056
cgrotor 7       fmod    1       ronly   0       clean   0
metaspace 0     avgfpdir 64     avgfilesize 16384
flags   unclean soft-updates 

Running fsck from single-user mode fixed the problem, but it reappeared several times.

I'm not really sure how fmod can end up set; it looks like it's cleared any time FFS updates the superblock.  (But then why is it in the superblock at all?)

I don't appear to have checksums enabled here, maybe enabling them would help narrow down the problem.
Comment 1 Konstantin Belousov freebsd_committer freebsd_triage 2022-05-05 00:01:05 UTC
Are you having background fsck enabled?

If fs_fmod is 1, which it should be after any alloc/free on the filesystem,
then this is expected and even desirable outcome.
Comment 2 Mark Johnston freebsd_committer freebsd_triage 2022-05-05 00:31:30 UTC
(In reply to Konstantin Belousov from comment #1)
Background fsck is enabled.

At the time of the panic, the filesystem is still mounted read-only.  fs->fs_fmod is set to 1 because the corresponding fmod flag was set in the on-disk superblock.  So, somehow a superblock with fs->fs_fmod = 1 was written to disk, even though ffs_sbput() clears it, and after that the system always panics during boot.
Comment 3 Konstantin Belousov freebsd_committer freebsd_triage 2022-05-05 01:03:32 UTC
I see, ffs_sbput() indeed clears the flag, but nothing syncs it with updates
to fs_fmod in ffs_alloc.c (just for example).  As result, if a block or inode
is allocated or freed after fs_fmod is cleared but before struct fs is copied
to the b_data of superblock in ffs_use_bwrite(), then we get fmod == 1 on disk.

I am not sure, perhaps this could be considered a bug.  If we really do not
want to see fmod == 1 on disk, then zeroing should be moved to the
ffs_use_bwrite() function, right before recalculation of the checksum.

More important, I believe, UFS mount lock should be taken around bcopy()
in ffs_use_bwrite(), although there is probably not too many volatile
fields which can be corrupted, if any.

I believe that zeroing in ffs_sbput() must be left for userspace consumers?
Comment 4 Mark Johnston freebsd_committer freebsd_triage 2022-05-05 14:17:29 UTC
(In reply to Konstantin Belousov from comment #3)
Hmm, above the definition of fs_fmod we even have "/* these fields are cleared at mount time */", but this is not respected.

You're right, there is no synchronization for the fs_fmod field changes.  So this is probably not a new bug, but I'm surprised that I never saw it before.

> More important, I believe, UFS mount lock should be taken around bcopy()
in ffs_use_bwrite(), although there is probably not too many volatile
fields which can be corrupted, if any.

Such corruptions would presumably be detected by the superblock check hash.

(It would be nice if makefs supported enabling check hashes.)
Comment 5 Konstantin Belousov freebsd_committer freebsd_triage 2022-05-08 14:06:56 UTC
https://reviews.freebsd.org/D35149
Comment 6 commit-hook freebsd_committer freebsd_triage 2022-05-09 21:11:09 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=ca7c2d2eedf690ae0c780451f53d9ce36bb2c337

commit ca7c2d2eedf690ae0c780451f53d9ce36bb2c337
Author:     Konstantin Belousov <kib@FreeBSD.org>
AuthorDate: 2022-05-09 20:46:05 +0000
Commit:     Konstantin Belousov <kib@FreeBSD.org>
CommitDate: 2022-05-09 20:46:05 +0000

    UFS: clear fs_fmod once more, in the buffer data copy.

    This is needed for in-kernel copy of the code, where allocation might
    happen after fs_fmod is cleared in ffs_sbput() but before the write.

    Reported by:    markj
    Reviewed by:    chs, markj
    PR:     263765
    Sponsored by:   The FreeBSD Foundation
    MFC after:      1 week
    Differential revision:  https://reviews.freebsd.org/D35149

 sys/ufs/ffs/ffs_vfsops.c | 1 +
 1 file changed, 1 insertion(+)
Comment 7 commit-hook freebsd_committer freebsd_triage 2022-05-09 21:11:11 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=4ac2df8f4cd91017c000543224204f823008f699

commit 4ac2df8f4cd91017c000543224204f823008f699
Author:     Konstantin Belousov <kib@FreeBSD.org>
AuthorDate: 2022-05-08 14:00:37 +0000
Commit:     Konstantin Belousov <kib@FreeBSD.org>
CommitDate: 2022-05-09 20:45:27 +0000

    ffs_use_bwrite: make the superblock snapshot more consistent

    Copy in-memory struct fs to the superblock buffer under the UFS mutex.

    Reviewed by:    chs, markj
    PR:     263765
    Sponsored by:   The FreeBSD Foundation
    MFC after:      1 week
    Differential revision:  https://reviews.freebsd.org/D35149

 sys/ufs/ffs/ffs_vfsops.c | 2 ++
 1 file changed, 2 insertions(+)
Comment 8 commit-hook freebsd_committer freebsd_triage 2022-05-16 22:49:25 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=2430388070fcedb3d30b8dd0e9fffacdf630dc34

commit 2430388070fcedb3d30b8dd0e9fffacdf630dc34
Author:     Konstantin Belousov <kib@FreeBSD.org>
AuthorDate: 2022-05-09 20:46:05 +0000
Commit:     Konstantin Belousov <kib@FreeBSD.org>
CommitDate: 2022-05-16 22:32:55 +0000

    UFS: clear fs_fmod once more, in the buffer data copy.

    PR:     263765

    (cherry picked from commit ca7c2d2eedf690ae0c780451f53d9ce36bb2c337)

 sys/ufs/ffs/ffs_vfsops.c | 1 +
 1 file changed, 1 insertion(+)
Comment 9 commit-hook freebsd_committer freebsd_triage 2022-05-16 22:49:26 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=bc6860ca7cfa4f3d317dd6380b3852a9db99bb16

commit bc6860ca7cfa4f3d317dd6380b3852a9db99bb16
Author:     Konstantin Belousov <kib@FreeBSD.org>
AuthorDate: 2022-05-08 14:00:37 +0000
Commit:     Konstantin Belousov <kib@FreeBSD.org>
CommitDate: 2022-05-16 22:32:55 +0000

    ffs_use_bwrite: make the superblock snapshot more consistent

    PR:     263765

    (cherry picked from commit 4ac2df8f4cd91017c000543224204f823008f699)

 sys/ufs/ffs/ffs_vfsops.c | 2 ++
 1 file changed, 2 insertions(+)