Bug 275871 - [unionfs] [nullfs] [zfs] corrupt filesystem
Summary: [unionfs] [nullfs] [zfs] corrupt filesystem
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 14.0-RELEASE
Hardware: Any Any
: --- Affects Many People
Assignee: Jason A. Harmening
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-12-21 19:29 UTC by Karlo Miličević
Modified: 2024-04-07 00:35 UTC (History)
5 users (show)

See Also:


Attachments
unionfs workaround for zfs_mkdir() issue (687 bytes, patch)
2023-12-23 05:09 UTC, Jason A. Harmening
no flags Details | Diff
updated patch, with debug print removed (597 bytes, patch)
2023-12-23 06:04 UTC, Jason A. Harmening
no flags Details | Diff
patch for unionfs_vput_pair() panicking on VSOCK vnodes (1.34 KB, patch)
2024-01-30 03:45 UTC, Jason A. Harmening
no flags Details | Diff
Implement VOP_UNP_* for unionfs (6.70 KB, patch)
2024-02-03 18:51 UTC, Jason A. Harmening
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Karlo Miličević 2023-12-21 19:29:17 UTC
I stumbled upon a way to reliably create files with "/" in their names in FreeBSD 14.0.
It works fine when using UFS or FreeBSD 13.
The filesystem works fine with those files - they are just impossible to destroy or access apart from destroying them with zfs operations (rollback or destroy).

I provide a small script that produces a file named "a/b" in directory "/var/repro/y":

#!/bin/sh
zfs destroy zroot/var/repro
zfs create zroot/var/repro
mkdir -p /var/repro/z/a
echo > /var/repro/z/a/b
chmod +x /var/repro/z/a/b
mkdir -p /var/repro/x
mkdir -p /var/repro/y
mount_nullfs /var/repro/z /var/repro/x
mount_unionfs /var/repro/y /var/repro/x
/var/repro/x/a/b
umount /var/repro/x
umount /var/repro/x
rmdir /var/repro/x
rm -r /var/repro/z
ls -Rl /var/repro/y

IMPORTANT: If you 'ls' /var/repro/x/a/b before executing it everything works fine.

NOTE:
I stumbled upon this when trying to run IMUNES (https://github.com/imunes/imunes) on the newest release.
Currently, this bug is blocking the project from porting to 14.0.
Comment 1 Jason A. Harmening freebsd_committer freebsd_triage 2023-12-23 05:05:19 UTC
The fundamental issue here is that ZFS doesn't correctly respect cnp->cn_namelen in its mkdir operation, which breaks unionfs shadow directory creation.

I've filed https://github.com/openzfs/zfs/issues/15705 to address this upstream, but in the meantime it can be worked around in unionfs using the attached patch.
Comment 2 Jason A. Harmening freebsd_committer freebsd_triage 2023-12-23 05:09:10 UTC
Created attachment 247210 [details]
unionfs workaround for zfs_mkdir() issue
Comment 3 Jason A. Harmening freebsd_committer freebsd_triage 2023-12-23 06:04:47 UTC
Created attachment 247211 [details]
updated patch, with debug print removed
Comment 4 Karlo Miličević 2024-01-29 20:39:50 UTC
I tested the patch and it fixes the repro.
However, when I run the program that produced original issue - kernel panics.

Here is stacktrace:
(kgdb) bt
#0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:57
#1  doadump (textdump=textdump@entry=1) at /usr/src/sys/kern/kern_shutdown.c:405
#2  0xffffffff80b56f8c in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:526
#3  0xffffffff80b5747f in vpanic (fmt=0xffffffff81231b76 "%s: fault on nofault entry, addr: %#lx", ap=ap@entry=0xfffffe016afbf7a0) at /usr/src/sys/kern/kern_shutdown.c:970
#4  0xffffffff80b572d3 in panic (fmt=<unavailable>) at /usr/src/sys/kern/kern_shutdown.c:894
#5  0xffffffff80ed7085 in vm_fault_lookup (fs=0xfffffe016afbf800) at /usr/src/sys/vm/vm_fault.c:912
#6  vm_fault (map=<optimized out>, vaddr=vaddr@entry=18446744071597604864, fault_type=2 '\002', fault_flags=fault_flags@entry=0, m_hold=m_hold@entry=0x0) at /usr/src/sys/vm/vm_fault.c:1569
#7  0xffffffff80ed59f0 in vm_fault_trap (map=<optimized out>, vaddr=vaddr@entry=18446744071597605869, fault_type=<optimized out>, fault_flags=fault_flags@entry=0, signo=0x0, ucode=0x0)
    at /usr/src/sys/vm/vm_fault.c:712
#8  0xffffffff810364c9 in trap_pfault (frame=0xfffffe016afbf990, usermode=false, signo=<unavailable>, ucode=<unavailable>) at /usr/src/sys/amd64/amd64/trap.c:845
#9  <signal handler called>
#10 atomic_fcmpset_int (src=1627419759, dst=<optimized out>, expect=<optimized out>) at /usr/src/sys/amd64/include/atomic.h:183
#11 refcount_acquire_if_gt (count=0xffffffff821e43ed, n=0) at /usr/src/sys/sys/refcount.h:129
#12 refcount_acquire_if_not_zero (count=0xffffffff821e43ed) at /usr/src/sys/sys/refcount.h:138
#13 vget_prep (vp=vp@entry=0xffffffff821e4241 <.L.str.22+1>) at /usr/src/sys/kern/vfs_subr.c:3262
#14 0xffffffff80c40ace in vref (vp=vp@entry=0xffffffff821e4241 <.L.str.22+1>) at /usr/src/sys/kern/vfs_subr.c:3356
#15 0xffffffff82e1c7fd in unionfs_vput_pair (ap=0xfffffe016afbfb20) at /usr/src/sys/fs/unionfs/union_vnops.c:2664
#16 0xffffffff80c0fd53 in VOP_VPUT_PAIR (dvp=<optimized out>, vpp=vpp@entry=0xfffffe016afbfcd0, unlock_vp=true) at ./vnode_if.h:2415
#17 0xffffffff80c0d4f0 in uipc_bindat (fd=<optimized out>, so=<optimized out>, nam=0xfffff80037f50820, td=0xfffffe016c3e0ac0) at /usr/src/sys/kern/uipc_usrreq.c:646
#18 0xffffffff80bffb42 in sobind (so=0xffffffff821e4241 <.L.str.22+1>, so@entry=0xfffff8000e2a2b40, nam=0xfffffe016afbfcd0, nam@entry=0xfffff80037f50820, td=0xfffff8000e959700, td@entry=0xfffffe016c3e0ac0)
    at /usr/src/sys/kern/uipc_socket.c:940
#19 0xffffffff80c06fd5 in kern_bindat (td=td@entry=0xfffffe016c3e0ac0, dirfd=dirfd@entry=-100, fd=<optimized out>, sa=sa@entry=0xfffff80037f50820) at /usr/src/sys/kern/uipc_syscalls.c:223
#20 0xffffffff80c06e6b in sys_bind (td=0xfffffe016c3e0ac0, uap=0xfffffe016c3e0ec0) at /usr/src/sys/kern/uipc_syscalls.c:190
#21 0xffffffff81036ba9 in syscallenter (td=0xfffffe016c3e0ac0) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:187
#22 amd64_syscall (td=0xfffffe016c3e0ac0, traced=0) at /usr/src/sys/amd64/amd64/trap.c:1197
#23 <signal handler called>
#24 0x00000008250a912a in ?? ()
Backtrace stopped: Cannot access memory at address 0x820e29598
(kgdb) fr
#0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:57
57              __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,

I can provide more information if needed.
Comment 5 Jason A. Harmening freebsd_committer freebsd_triage 2024-01-30 03:45:04 UTC
Created attachment 248076 [details]
patch for unionfs_vput_pair() panicking on VSOCK vnodes

Oof, apologies for the breakage.  Unionfs has been considered "broken" for so long that there are still some big holes in our test coverage.  I think I know what's happening based on the backtrace though. Can you apply the attached patch on top of you other local patches?
Comment 6 Karlo Miličević 2024-01-30 13:55:03 UTC
I tested them together and it seems to behave correctly now.
I also tested it together with the patch that attached to bug #275871.
Everything seems to work correctly.

Thank you!
Comment 7 Jason A. Harmening freebsd_committer freebsd_triage 2024-02-03 18:51:01 UTC
Created attachment 248162 [details]
Implement VOP_UNP_* for unionfs

After digging through UIPC code and unionfs revision history, I think a better approach for fixing the VSOCK issue is to remove all the special-case VSOCK handling in unionfs and implement the VOP_UNP_* entrypoints.  

Karlo, can I ask you to apply this patch instead of the previous patch and re-test your program?  I've tested it locally and verified that it allows unionfs to work with unix sockets on 15-current, but it would still be good to make sure it doesn't break your program before I commit it.

To be clear, you should still apply the patch from PR 275870 as well as the first patch from this PR ("updated patch, with debug print removed"), but you should revert the most recent patch ("patch for unionfs_vput_pair() panicking on VSOCK vnodes") and instead apply this new patch.

Thanks for testing!
Comment 8 Karlo Miličević 2024-02-03 20:57:57 UTC
I applied patched as you described and tested it.
It also seems to work. :)

P.S. In my last comment, I meant to reference bug #275870.
Comment 9 Karlo Miličević 2024-02-03 21:01:10 UTC
I spoke too soon.
Here is how I made it crash again. :)
I don't know if it is related, though.

root@rat2 /v/repro# mount_unionfs a b
root@rat2 /v/repro# mount_unionfs a b
root@rat2 /v/repro# mount_unionfs b a
# mkdir -p /var/repro/a/panic: __lockmgr_args: downgrade a recursed lockmgr zfs @ /usr/src/sys/fs/unionfs/union_vnops.c:1933

cpuid = 14
time = 1706997578
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe016cd41820
vpanic() at vpanic+0x132/frame 0xfffffe016cd41950
panic() at panic+0x43/frame 0xfffffe016cd419b0
__lockmgr_args() at __lockmgr_args+0x720/frame 0xfffffe016cd41a40
lockmgr_lock_flags() at lockmgr_lock_flags+0xe9/frame 0xfffffe016cd41a80
unionfs_lock() at unionfs_lock+0x28f/frame 0xfffffe016cd41b00
_vn_lock() at _vn_lock+0x47/frame 0xfffffe016cd41b60
unionfs_readdir() at unionfs_readdir+0x144/frame 0xfffffe016cd41cb0
VOP_READDIR_APV() at VOP_READDIR_APV+0x20/frame 0xfffffe016cd41cd0
kern_getdirentries() at kern_getdirentries+0x221/frame 0xfffffe016cd41dd0
sys_getdirentries() at sys_getdirentries+0x29/frame 0xfffffe016cd41e00
amd64_syscall() at amd64_syscall+0x109/frame 0xfffffe016cd41f30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe016cd41f30
--- syscall (554, FreeBSD ELF64, getdirentries), rip = 0x8279d9a0a, rsp = 0x82972e988, rbp = 0x82972e9d0 ---
KDB: enter: panic
[ thread pid 760 tid 100628 ]
Stopped at      kdb_enter+0x32: movq    $0,0xe20903(%rip)
db>
Comment 10 Karlo Miličević 2024-02-03 21:08:22 UTC
I'm sorry for spamming.
Here is a more minimal repro:

# mount_unionfs a b
# mount_unionfs b a
# ls a

--- same stacktrace

I know usage like tihs is unrealistic, but I wanted to make sure it does not... crash.
Comment 11 Jason A. Harmening freebsd_committer freebsd_triage 2024-02-03 23:33:58 UTC
No need to apologize, this is the way things go with unionfs: fix one thing, then another thing pops up.  Sometimes the fix for the first thing accidentally *causes* the second thing.  I hope eventually we'll have enough tests in place to make that less likely.

The good news here is that this doesn't seem related to the previous patches.

Just to be sure, this scenario isn't required in order for your program to work, is it?  If not I'd rather focus on getting the other fixes in the pipeline reviewed, committed, and MFCed to 14 first, because I suspect the fix for this issue may not be so easy since it involves unionfs on unionfs.  I haven't had the courage to test that kind of thing locally yet, since just getting one level of unionfs to work has been enough trouble.

(BTW I think the only reason this didn't panic (or at least not the same way) on 13 was that unionfs_readdir() and other functions weren't holding the lock exclusive when they needed to which caused all kinds of other problems under stress testing, see the earlier note about fixing the first thing causing the second thing :) )
Comment 12 Karlo Miličević 2024-02-04 00:38:17 UTC
This scenario isn't expected by the program so the bug does not affect it. I was just curious how unionfs would handle this case.

If the bugs aren't related - should another bug report be opened for it, or do you yourself track what else needs to be fixed with unionfs?

Thank you again for all the help and effort.
Comment 13 Jason A. Harmening freebsd_committer freebsd_triage 2024-02-04 00:44:41 UTC
(In reply to Karlo Miličević from comment #12)

Yes, please do file another bug report.  I'm not sure if bugzilla will allow you to assign it directly to me, but if so please do that too.  Otherwise just add me to the CC list and I'll grab it.
Comment 14 Karlo Miličević 2024-02-04 00:54:59 UTC
I looked a bit at open issues and found bug #172334 which seems to have same repro steps.
Comment 15 Jason A. Harmening freebsd_committer freebsd_triage 2024-02-04 00:59:33 UTC
Even better, thanks!
Comment 16 commit-hook freebsd_committer freebsd_triage 2024-02-18 15:20:52 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=a2ddbe019d51b35f9da2cb5ddca8c69f0ee422da

commit a2ddbe019d51b35f9da2cb5ddca8c69f0ee422da
Author:     Jason A. Harmening <jah@FreeBSD.org>
AuthorDate: 2023-12-24 04:48:19 +0000
Commit:     Jason A. Harmening <jah@FreeBSD.org>
CommitDate: 2024-02-18 15:19:23 +0000

    unionfs: work around underlying FS failing to respect cn_namelen

    unionfs_mkshadowdir() may be invoked on a non-leaf pathname component
    during lookup, in which case the NUL terminator of the pathname buffer
    will be well beyond the end of the current component.  cn_namelen in
    this case will still (correctly) indicate the length of only the
    current component, but ZFS in particular does not currently respect
    cn_namelen, leading to the creation on inacessible files with slashes
    in their names.  Work around this behavior by temporarily NUL-
    terminating the current pathname component for the call to VOP_MKDIR().

    https://github.com/openzfs/zfs/issues/15705 has been filed to track
    a proper upstream fix for the issue at hand.

    PR:             275871
    Reported by:    Karlo Miličević <karlo98.m@gmail.com>
    Tested by:      Karlo Miličević <karlo98.m@gmail.com>
    Reviewed by:    kib, olce
    MFC after:      2 weeks
    Differential Revision: https://reviews.freebsd.org/D43818

 sys/fs/unionfs/union_subr.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)
Comment 17 commit-hook freebsd_committer freebsd_triage 2024-03-04 18:51:59 UTC
A commit in branch stable/14 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=c18e6a5a5c63d85d664d60dc3bf0232ea21bf8d2

commit c18e6a5a5c63d85d664d60dc3bf0232ea21bf8d2
Author:     Jason A. Harmening <jah@FreeBSD.org>
AuthorDate: 2023-12-24 04:48:19 +0000
Commit:     Jason A. Harmening <jah@FreeBSD.org>
CommitDate: 2024-03-04 18:31:25 +0000

    unionfs: work around underlying FS failing to respect cn_namelen

    unionfs_mkshadowdir() may be invoked on a non-leaf pathname component
    during lookup, in which case the NUL terminator of the pathname buffer
    will be well beyond the end of the current component.  cn_namelen in
    this case will still (correctly) indicate the length of only the
    current component, but ZFS in particular does not currently respect
    cn_namelen, leading to the creation on inacessible files with slashes
    in their names.  Work around this behavior by temporarily NUL-
    terminating the current pathname component for the call to VOP_MKDIR().

    https://github.com/openzfs/zfs/issues/15705 has been filed to track
    a proper upstream fix for the issue at hand.

    PR:             275871
    Reported by:    Karlo Miličević <karlo98.m@gmail.com>
    Tested by:      Karlo Miličević <karlo98.m@gmail.com>
    Reviewed by:    kib, olce
    Differential Revision: https://reviews.freebsd.org/D43818

    (cherry picked from commit a2ddbe019d51b35f9da2cb5ddca8c69f0ee422da)

 sys/fs/unionfs/union_subr.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)
Comment 18 commit-hook freebsd_committer freebsd_triage 2024-03-24 02:12:35 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=eee6217b40df5877674df1d9aec7d5bd04202876

commit eee6217b40df5877674df1d9aec7d5bd04202876
Author:     Jason A. Harmening <jah@FreeBSD.org>
AuthorDate: 2024-02-03 17:17:58 +0000
Commit:     Jason A. Harmening <jah@FreeBSD.org>
CommitDate: 2024-03-24 02:10:53 +0000

    unionfs: implement VOP_UNP_* and remove special VSOCK vnode handling

    unionfs has a bunch of clunky special-case code to avoid creating
    unionfs wrapper vnodes for AF_UNIX sockets.  This was added in 2008
    to address PR 118346, but in the intervening years the VOP_UNP_*
    operations have been added to provide a clean interface to allow
    sockets to work in the presence of stacked filesystems.

    PR:                     275871
    Reviewed by:            kib (prior version), olce
    Tested by:              Karlo Miličević <karlo98.m@gmail.com>
    MFC after:              2 weeks
    Differential Revision:  https://reviews.freebsd.org/D44288

 sys/fs/unionfs/union_vnops.c | 173 +++++++++++++++++++++----------------------
 1 file changed, 84 insertions(+), 89 deletions(-)
Comment 19 commit-hook freebsd_committer freebsd_triage 2024-04-07 00:31:57 UTC
A commit in branch stable/14 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=7b86d14bfccb589612cbc6b2e54b405da9839232

commit 7b86d14bfccb589612cbc6b2e54b405da9839232
Author:     Jason A. Harmening <jah@FreeBSD.org>
AuthorDate: 2024-02-03 17:17:58 +0000
Commit:     Jason A. Harmening <jah@FreeBSD.org>
CommitDate: 2024-04-07 00:27:00 +0000

    unionfs: implement VOP_UNP_* and remove special VSOCK vnode handling

    unionfs has a bunch of clunky special-case code to avoid creating
    unionfs wrapper vnodes for AF_UNIX sockets.  This was added in 2008
    to address PR 118346, but in the intervening years the VOP_UNP_*
    operations have been added to provide a clean interface to allow
    sockets to work in the presence of stacked filesystems.

    PR:                     275871
    Reviewed by:            kib (prior version), olce
    Tested by:              Karlo Miličević <karlo98.m@gmail.com>
    Differential Revision:  https://reviews.freebsd.org/D44288

    (cherry picked from commit eee6217b40df5877674df1d9aec7d5bd04202876)

 sys/fs/unionfs/union_vnops.c | 173 +++++++++++++++++++++----------------------
 1 file changed, 84 insertions(+), 89 deletions(-)
Comment 20 Jason A. Harmening freebsd_committer freebsd_triage 2024-04-07 00:35:53 UTC
It took me forever to purge the queue of pending unionfs work and then get everything MFCed, but this should now be fixed with the latest MFC.