Bug 266236 - ZFS NFS : .zfs/snapshot : Stale file handle
Summary: ZFS NFS : .zfs/snapshot : Stale file handle
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 13.1-STABLE
Hardware: amd64 Any
Importance: --- Affects Many People
Assignee: Mark Johnston
URL: https://www.freebsd.org/security/advi...
Keywords:
Depends on:
Blocks:
 
Reported: 2022-09-05 15:36 UTC by Michel Le Cocq
Modified: 2022-11-21 06:54 UTC
13 users

See Also:
markj: mfc-stable13+


Attachments
fix lock leak (622 bytes, patch)
2022-10-06 17:07 UTC, Mark Johnston
proposed patch (707 bytes, patch)
2022-10-07 13:43 UTC, Mark Johnston

Description Michel Le Cocq 2022-09-05 15:36:57 UTC
Hi, since upgrading to FreeBSD 13.1-RELEASE I can no longer access the .zfs/snapshot folder over NFS.

On an Ubuntu or Debian client, when I try to access .zfs/snapshot I get: Stale file handle

medic:/home/user1 on /home/user1 type nfs (rw,relatime,vers=3,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.0.80,mountvers=3,mountport=850,mountproto=udp,local_lock=none,addr=192.168.0.80)

    I have 2 servers, one on 13.0-p7 and the other on 13.1-p2
    Several disk bays, all multi-attached
    Each bay is connected to both servers

I have used this setup since 12.0-RELEASE, and before 13.1 snapshot access worked fine.

I use CARP to distribute the load over my two servers; in case of trouble, or when an upgrade is needed, I can import all my pools on one server and then upgrade the other.

So I have several IPs for this data service, in fact one per exported pool.

I only get the stale file handle on .zfs/snapshot over NFS on the 13.1 server; if I import my pool on the 13.0 server it works as normal.

Locally (on FreeBSD) I can list the snapshots normally on both servers.

I have to upgrade both servers to 13.1 because with 13.0 I was facing another problem, which is solved in 13.1.

As I said, the NFS setup is based on a CARP IP:

lagg1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500 options=4e507bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
[...]
        inet 192.168.0.80 netmask 0xffffff00 broadcast 192.168.0.255 vhid 80
[...]
        laggproto lacp lagghash l2,l3,l4
        laggport: bnxt0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
        laggport: bnxt1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
        groups: lagg
        carp: MASTER vhid 80 advbase 1 advskew 100
[...]
        media: Ethernet autoselect
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>


My NFS config :

rpcbind_enable="YES"
nfs_server_enable="YES"
nfs_server_flags="-u -t -h 192.168.0.80 -h 192.168.0.81 -h 192.168.0.82 -h 192.168.0.83 --minthreads 12 --maxthreads 24"
mountd_enable="YES"
rpc_lockd_enable="YES"
rpc_statd_enable="YES"


My sharenfs setup on the pool/vol :

# zfs get sharenfs tank/home/user1
NAME              PROPERTY  VALUE                                  SOURCE
tank/home/user1  sharenfs  -network 192.168.0.0 -mask 255.255.255.0  local

It seems TrueNAS 13 has the same problem; see the TrueNAS forum thread: "Stale file handle" when list snapshots (.zfs)

Another issue, which also appears on TrueNAS:
Deleting a snapshot in which a simple "ls" via NFS has been attempted blocks completely and leaves the zfs destroy process in an unkillable I/O wait state.
On TrueNAS it seems that in this case the whole system becomes unstable or even totally unusable...

If anyone can help.
Thanks.
Comment 2 Graham Perrin freebsd_committer freebsd_triage 2022-09-05 21:20:51 UTC
Cross-reference: <https://old.reddit.com/r/freebsd/comments/x6c78e/-/>
Comment 3 Michel Le Cocq 2022-09-06 12:11:18 UTC
A short procedure to reproduce this bug.

Install a fresh FreeBSD 13.1-p2 server.

	root@server:~# freebsd-version 
	13.1-RELEASE-p2
	
Inside a ZFS pool, create a dataset and share it over NFS.

	root@server:~# zfs get name,mountpoint,sharenfs tank/zfsnfstest
	NAME             PROPERTY    VALUE             SOURCE
	tank/zfsnfstest  name        tank/zfsnfstest   -
	tank/zfsnfstest  mountpoint  /tank/zfsnfstest  local
	tank/zfsnfstest  sharenfs    on                local
	
Mount the shared dataset locally over NFS.

	root@server:~# mount -t nfs 127.0.0.1:/tank/zfsnfstest /mnt
	
Create a snapshot.

	root@server:~# zfs snapshot tank/zfsnfstest@1
	
Check that you can access the snapshot locally.

	root@server:~# ls -l /tank/zfsnfstest/.zfs/snapshot/1/
	total 0
	root@server:~#
	
Try to access it through the NFS mount.

	root@server:~# ls -l /mnt/.zfs/snapshot/1/
	total 0
	ls: /mnt/.zfs/snapshot/1/: Stale NFS file handle
	
Here we see that the snapshot is not accessible over NFS.

Try to remove the snapshot you created earlier.

	root@server:~# zfs destroy tank/zfsnfstest@1

This process never finishes...
		
	root@server:~ # ps aux
	USER    PID  %CPU %MEM    VSZ  RSS TT  STAT STARTED      TIME COMMAND
	[...]
	root  58027   0.0  0.0  18012 7256  1  D+   12:27     0:00.01 zfs destroy tank/zfsnfstest@1
	
D means uninterruptible sleep (usually I/O), so you can't kill it!

	root@server:~ # kill 58027
	root@server:~ # ps aux | grep 58027
	root  58027   0.0  0.0  18012 7256  1  D+   12:27     0:00.01 zfs destroy tank/zfsnfstest@1
	root@server:~ # kill -1 58027
	root@server:~ # ps aux | grep 58027
	root  58027   0.0  0.0  18012 7256  1  D+   12:27     0:00.01 zfs destroy tank/zfsnfstest@1
	root@server:~ # kill -9 58027
	root@server:~ # ps aux | grep 58027
	root  58027   0.0  0.0  18012 7256  1  D+   12:27     0:00.01 zfs destroy tank/zfsnfstest@1
	root@server:~ #
Comment 5 Michel Le Cocq 2022-09-12 19:26:50 UTC
Changed component to kern.
Comment 6 eborisch+FreeBSD 2022-09-13 02:30:41 UTC
Some extra info:

Steps I took to reproduce:

1) Mount NFS-exported zfs filesystem on client.
2) Try to enter a snapshot (.zfs/snapshot/foo) directory on client -> Stale file handle error.
3) Unmount on client; stop nfsd on server
4) mount -v on server shows the requested snapshot as mounted
5) Try explicit unmount of the snapshot path -> umount hangs (but the snapshot path no longer shows up in mount -v)
6) Try deleting snapshot -> zfs hangs

procstat -k $hung_unmount_pid: 

  PID    TID COMM                TDNAME              KSTACK                       
 5260 101043 umount              -                   mi_switch _sleep rms_wlock zfsvfs_teardown zfs_umount dounmount kern_unmount amd64_syscall fast_syscall_common 


procstat -k $hung_zfs_destroy:

  PID    TID COMM                TDNAME              KSTACK                       
 5826 101058 zfs                 -                   mi_switch _sleep vfs_busy zfs_vfs_ref getzfsvfs_impl getzfsvfs zfsctl_snapshot_unmount zfs_ioc_destroy_snaps zfsdev_ioctl_common zfsdev_ioctl devfs_ioctl vn_ioctl devfs_ioctl_f kern_ioctl sys_ioctl amd64_syscall fast_syscall_common

At this point no _zfs_ or _zpool_ commands succeed. (Or at least, none that I tried; all hang.)

Restart required to unwedge.
Comment 7 Graham Perrin freebsd_committer freebsd_triage 2022-09-30 06:32:34 UTC
Should there be an upstream issue for this bug report? 

OpenZFS issues in 2017 and 2019 (both now closed): 


> Unable to view snapshots over NFS in 0.7.1 · Issue #6594 · openzfs/zfs
> 
> … Stale file handle …

> Unable to see ZFS snapshots over NFS · Issue #8645 · openzfs/zfs
> 
> I see that there have been similar issues like this before, but they are 
> all closed, …


Cross-reference: <https://markmail.org/thread/r6dptuewsf2k2nxe>
Comment 8 florian.millet 2022-09-30 11:52:50 UTC
We are hit by this problem as well.

I would add that the version of NFS used by the client has no impact: both NFSv3 and NFSv4 show the problem.

When FreeBSD changed from ZoF to OpenZFS 2.x in 13, there does seem to be a change in the unmounting of the .zfs directory (see https://cgit.freebsd.org/src/tree/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_ctldir.c?h=releng/13.1 vs https://cgit.freebsd.org/src/tree/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ctldir.c?h=releng/12.3)

A new function, zfsctl_snapshot_unmount, has been added in OpenZFS 2.x, and if you compare it to the old zfsctl_umount_snapshot, the logic is quite different. Perhaps we have a regression here?
Comment 9 eborisch+FreeBSD 2022-09-30 14:02:20 UTC
(In reply to Graham Perrin from comment #7)

There is now: https://github.com/openzfs/zfs/issues/13974
Comment 10 Michel Le Cocq 2022-10-06 14:05:04 UTC
How can I help with this bug?

I have two FreeBSD servers with several disk bays.

Server1 : 13.0-RELEASE-p7
Server2 : 13.1-RELEASE-p2

My disk bays are attached to both.

So I can choose to import the pool on Server1 or Server2.

The point is that on Server1 with 13.0-RELEASE-p7 there is no snapshot access problem, but NFS clients very often hang like this:

Oct  6 15:15:57 xxx kernel: [275827.527427] nfs: server Server1 not responding, still trying
...
Oct  6 15:18:58 xxx kernel: [276007.920813] nfs: server Server1 OK

Since upgrading Server2 to 13.1-RELEASE-p2, I get 'Stale NFS file handle' on the NFS client.

	root@poste:~# ls -l /mnt/.zfs/snapshot/1/
	total 0
	ls: /mnt/.zfs/snapshot/1/: Stale NFS file handle

And worse, on this release an action on the ZFS dataset results in a process in uninterruptible sleep, and then I must reboot...
Comment 11 Michel Le Cocq 2022-10-06 15:01:46 UTC
With 13.1-RELEASE-p2 there are no more "nfs: server Server1 not responding" messages at all.
Comment 12 Mark Johnston freebsd_committer freebsd_triage 2022-10-06 15:57:12 UTC
I can reproduce this with a loopback NFSv3 mount of a dataset containing a snapshot.  Some dtracing shows the ESTALE comes from nfs_access(), the RPC is returning the error because it can't translate the file handle to a vnode.  In particular, I can see zfs_fhtovp() is returning EINVAL when dealing with a "long" FID, used for files under .zfs/snapshot.

  1  70351                 zfs_fhtovp:entry fid_t {
    u_short fid_len = 0x12 <-- LONG_FID_LEN
    u_short fid_data0 = 0x22
    char [16] fid_data = [ "" ]
}
  1  70352                zfs_fhtovp:return nfsd 22
              kernel`nfsvno_fhtovp+0x3d
              kernel`nfsrvd_dorpc+0x120
              kernel`nfssvc_program+0x68c
              kernel`svc_run_internal+0xb4f
              kernel`svc_thread_start+0xb
              kernel`fork_exit+0x7e
              kernel`0xffffffff8107803e

With DTrace I can see that vfs_hash_get() and VOP_LOOKUP are not getting called from zfs_fhtovp(), which means the error is probably coming from a recently added check:

        if (fidp->fid_len == LONG_FID_LEN && (fid_gen > 1 || setgen != 0)) {
                dprintf("snapdir fid: fid_gen (%llu) and setgen (%llu)\n",
                    (u_longlong_t)fid_gen, (u_longlong_t)setgen);
                return (SET_ERROR(EINVAL));
        }

That came in with https://github.com/openzfs/zfs/pull/12905 but let me see first if that really is the source of the error.
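
To make the effect of that check concrete, here is a small user-space sketch of the condition quoted above. It is not the kernel code: snapdir_fid_check() is a hypothetical helper, LONG_FID_LEN is hard-coded to the 0x12 shown in the trace, and the three inputs are passed in directly instead of being decoded from the file handle.

```c
#include <errno.h>
#include <stdint.h>

/* fid_len of a "long" FID, taken from the DTrace output above. */
#define LONG_FID_LEN 0x12

/*
 * Hypothetical stand-in for the check in zfs_fhtovp(): returns EINVAL
 * when a long FID is rejected and 0 when the lookup would proceed.
 */
int
snapdir_fid_check(uint16_t fid_len, uint64_t fid_gen, uint64_t setgen)
{
	if (fid_len == LONG_FID_LEN && (fid_gen > 1 || setgen != 0))
		return (EINVAL);
	return (0);
}
```

In this model, any long FID carrying a generation number greater than 1 is rejected with EINVAL (22, the value returned by zfs_fhtovp in the trace), while short FIDs are never affected by the check.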
Comment 13 Mark Johnston freebsd_committer freebsd_triage 2022-10-06 16:54:22 UTC
Reverting https://github.com/openzfs/zfs/pull/12905 does fix the problem for me.  There is an additional bug there in that there is a missing zfs_exit() call before the return, which causes a lock leak.

In my case, we end up with fid_gen == 4 and setgen == 0.  I don't understand where that generation number comes from. zfs_fid() sets it from a ZPL attribute, and "stat -f '%v'" on the snapshot dir prints 0.  

To repro, I just enabled an NFSv3 server in a VM:

mountd_enable="YES"
mountd_flags="-n"
nfs_server_enable="YES"
rpc_lockd_enable="YES"
rpc_statd_enable="YES"
rpcbind_enable="YES"
zfs_enable="YES"

Create a dataset with a snapshot and export it:

# zfs snapshot test@1
# zfs set sharenfs=on test

Mount it locally and check .zfs/snapshot:

# mount localhost:/test /mnt
# ls /mnt/.zfs/snapshot
1
# ls /mnt/.zfs/snapshot/1
snapdir fid: fid_gen (4) and setgen (0)
snapdir fid: fid_gen (4) and setgen (0)
snapdir fid: fid_gen (4) and setgen (0)
ls: /mnt/.zfs/snapshot/1: Stale NFS file handle
Comment 14 eborisch+FreeBSD 2022-10-06 16:59:18 UTC
(In reply to Mark Johnston from comment #13)

Thank you for looking into this!

To clarify:
 1) With your reproduced error state, is deleting the snapshot still possible, or does that hang as expected?
 2) Does reverting https://github.com/openzfs/zfs/pull/12905 fix that (hang after failed-access/destroy), as well?

Your comments may already answer (2), and I've just failed to parse it sufficiently.
Comment 15 Mark Johnston freebsd_committer freebsd_triage 2022-10-06 17:07:12 UTC
Created attachment 237124
fix lock leak

(In reply to eborisch+FreeBSD from comment #14)
The hang you are seeing is due to the aforementioned lock leak.  That at least has a straightforward fix, patch attached.  But that doesn't fix the "stale NFS handle" errors; I'm not sure exactly what's going on there.
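
For readers unfamiliar with this class of bug, the following is a toy user-space model of such a leak, not the ZFS code. Everything named toy_* is hypothetical: a plain counter stands in for the real lock (the one umount waits on in rms_wlock in comment 6), and toy_enter()/toy_exit() stand in for the enter/exit calls around the body of zfs_fhtovp().

```c
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for the teardown lock: counts threads currently inside. */
struct toy_fs {
	int readers;
};

void
toy_enter(struct toy_fs *fs)
{
	fs->readers++;
}

void
toy_exit(struct toy_fs *fs)
{
	fs->readers--;
}

/* The error path returns without leaving, so the read hold is leaked. */
int
toy_fhtovp_leaky(struct toy_fs *fs, bool bad_fid)
{
	toy_enter(fs);
	if (bad_fid)
		return (EINVAL);	/* missing toy_exit() */
	toy_exit(fs);
	return (0);
}

/* Same function with the hold dropped on the error path. */
int
toy_fhtovp_fixed(struct toy_fs *fs, bool bad_fid)
{
	toy_enter(fs);
	if (bad_fid) {
		toy_exit(fs);
		return (EINVAL);
	}
	toy_exit(fs);
	return (0);
}

/* A writer (e.g. unmount) can only proceed once no readers remain. */
bool
toy_writer_would_block(const struct toy_fs *fs)
{
	return (fs->readers > 0);
}
```

After a single rejected handle, the leaky variant leaves the reader count at 1 forever, so in this model any later writer waits indefinitely; the fixed variant returns the same error but leaves the count at 0.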
Comment 16 Bane Ivosev 2022-10-07 08:34:01 UTC
Hi, we have the same problem after upgrading to 13.1, easily reproducible every time. First, our production NFS/ZFS Supermicro storage froze after we tried to access a snapshot from an NFS client (not immediately: the server froze later, during the regular snapshot management procedure). We then tested this bug on our backup storage, with the same result.

We have now tried the same with 12.3 in a KVM VM.

server:
zfs create tank0/test
zfs snap tank0/test@1

client:
mount -o vers=3,tcp server:/tank0/test /mnt
ls -al /mnt/.zfs/snapshot/1

Results:

12.3 server zfs -> 12/13 nfs client, everything is ok, as expected
12.3 server openzfs 2022Q1 -> 12/13 nfs client, everything is ok

12.3 server openzfs 2022Q2 -> 12/13 nfs client stale file handle, server hang on any zfs command or can't finish reboot/halt, need power cycle

Same results with openzfs 2022Q3 and 2022Q4.
Comment 17 florian.millet 2022-10-07 13:27:02 UTC
(In reply to Mark Johnston from comment #15)
I tested a modified version of your patch (it didn't compile, so I replaced your line with ZFS_EXIT(zfsvfs);) and I can confirm that the deadlock no longer happens.

I can also confirm as you specified that we indeed still have the NFS Stale Handle error while listing the directories under .zfs/snapshot/

Thank you for the patch!
Comment 18 Mark Johnston freebsd_committer freebsd_triage 2022-10-07 13:43:33 UTC
Created attachment 237140
proposed patch

It looks like the fid_gen > 1 check is related to the implementation of zfsctl_snapdir_fid(), which is Linux-specific and sets the generation number to `!!d_mountpoint(dentry)`.  There is a comment explaining this: "we encode whether snapdir is already mounted in gen field".

As far as I can see we do not do anything similar when constructing FIDs on FreeBSD, so the check is wrong there and we should simply drop it.
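
A rough user-space sketch of the difference, under stated assumptions: linux_style_snapdir_gen() is a hypothetical helper that mirrors the `!!d_mountpoint(dentry)` encoding described above, and snapdir_fid_check_patched() is a hypothetical model of the check with the fid_gen clause dropped, in the form it takes in the diff quoted in comment 22. LONG_FID_LEN is hard-coded to the 0x12 from the trace in comment 12.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* fid_len of a "long" FID, as seen in the trace in comment 12. */
#define LONG_FID_LEN 0x12

/* Linux-style encoding: the gen field only records mounted or not. */
uint64_t
linux_style_snapdir_gen(bool mounted)
{
	return (!!mounted);
}

/* Model of the check once the fid_gen > 1 clause is removed. */
int
snapdir_fid_check_patched(uint16_t fid_len, uint64_t setgen)
{
	if (fid_len == LONG_FID_LEN && setgen != 0)
		return (EINVAL);
	return (0);
}
```

With the Linux-style encoding the generation can only be 0 or 1, so a `fid_gen > 1` test never fires for valid handles there. On FreeBSD the value observed in comment 13 was 4, which is why the model above no longer looks at fid_gen at all.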
Comment 19 Mark Johnston freebsd_committer freebsd_triage 2022-10-07 14:06:49 UTC
(In reply to florian.millet from comment #17)
Thanks for testing.  Please give the patch from comment 18 a try.  You'll need to change the zfs_exit() line again: the interface is different on FreeBSD 13 vs. the development branch, and I'm testing on the latter.

See also https://github.com/openzfs/zfs/pull/14001
Comment 20 florian.millet 2022-10-07 16:25:50 UTC
(In reply to Mark Johnston from comment #19)
I tested the new patch and I have weird problems: NFS no longer works on my ZFS shares:
- A Linux client with NFSv3 gets a Connection Timeout
- A FreeBSD client with NFSv3 gets an NFS Stale Handle when trying to mount
- A FreeBSD client with NFSv4 mounts the share but I see nothing, no files; if I try a touch I get an Input/Output Error

I'm quite surprised by this behavior.
Comment 21 Mark Johnston freebsd_committer freebsd_triage 2022-10-07 17:10:42 UTC
(In reply to florian.millet from comment #20)
That is indeed surprising.  I can't reproduce any such problems locally.  Just to double check, can you show the patch that you've applied to 13?
Comment 22 florian.millet 2022-10-07 17:15:10 UTC
(In reply to Mark Johnston from comment #21)
I tried this :

diff --git a/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vfsops.c b/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vfsops.c
index cdd762dcbcbf..05d41d4e3b2a 100644
--- a/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vfsops.c
+++ b/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vfsops.c
@@ -1845,7 +1845,8 @@ zfs_fhtovp(vfs_t *vfsp, fid_t *fidp, int flags, vnode_t **vpp)
                return (SET_ERROR(EINVAL));
        }

-       if (fidp->fid_len == LONG_FID_LEN && (fid_gen > 1 || setgen != 0)) {
+       if (fidp->fid_len == LONG_FID_LEN && setgen != 0) {
+               ZFS_EXIT(zfsvfs);
                dprintf("snapdir fid: fid_gen (%llu) and setgen (%llu)\n",
                    (u_longlong_t)fid_gen, (u_longlong_t)setgen);
                return (SET_ERROR(EINVAL));
Comment 23 Bane Ivosev 2022-10-07 18:46:41 UTC
Mark, I just tried your patch and I don't get the stale file handle anymore. Great. I'll do more testing.
Comment 24 Bane Ivosev 2022-10-07 18:59:09 UTC
I ran tests in VMs with 13.1-RELEASE-p2 source and this patch applied:

-       if (fidp->fid_len == LONG_FID_LEN && (fid_gen > 1 || setgen != 0)) {
+       if (fidp->fid_len == LONG_FID_LEN && setgen != 0) {
+               ZFS_EXIT(zfsvfs);

I ran make kernel, then did an NFS mount from a 12/13 client and ls .zfs/snapshot/snap1. Everything works normally: no more stale file handle and no freeze on the server side.

Thanks Mark!
Comment 25 eborisch+FreeBSD 2022-10-07 19:43:06 UTC
(In reply to florian.millet from comment #22)

I have the _exact_ same changes (git blob hashes from 'git diff' match), and it is working for me. (Access to snapshots over NFS succeeds; zfs doesn't hang; can destroy entered-via-nfs snapshots.)

What's your 'uname -a' look like?
Comment 26 florian.millet 2022-10-08 16:37:28 UTC
(In reply to eborisch+FreeBSD from comment #25)
My test environment is far from standard, so I must have made a mistake. Let me retry from scratch on Monday to see if I can reproduce my NFS problems.
Comment 27 Michel Le Cocq 2022-10-09 08:27:17 UTC
Thanks to all of you for all this work.
I'd also like to test.

Can you give me a procedure for doing that?

Thanks
Comment 28 Michel Le Cocq 2022-10-10 17:13:10 UTC
I made the _exact_ same changes in sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vfsops.c:

-       if (fidp->fid_len == LONG_FID_LEN && (fid_gen > 1 || setgen != 0)) {
+       if (fidp->fid_len == LONG_FID_LEN && setgen != 0) {
+               ZFS_EXIT(zfsvfs);

On a small test server everything now seems to work as normal for me.
- Access to snapshots over NFS succeeds
- zfs doesn't hang
- can destroy entered-via-nfs snapshots

I just did a git pull of src, removed the one line and added the two others, then created a new ZFS boot environment and ran:

make -j 8 buildkernel
make -j 8 installkernel

root@smallfish:~ # uname -a
FreeBSD smallfish 13.1-RELEASE-p2 FreeBSD 13.1-RELEASE-p2 #0 releng/13.1-752f813d6-dirty: Mon Oct 10 11:46:40 CEST 2022     root@smallfish:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64
root@smallfish:~ # 

Cool :-)

Now I am waiting for the fix to arrive via freebsd-update for my production environment.

Thanks to all of you.

Do you have an idea of when it will be available in freebsd-update?
Comment 29 Mark Johnston freebsd_committer freebsd_triage 2022-10-11 13:47:58 UTC
(In reply to florian.millet from comment #26)
Have you been able to reproduce the same NFS problems as before?

(In reply to Michel Le Cocq from comment #28)
I will aim to get it into the next patch release, but I'm not sure exactly when that will be.
Comment 30 florian.millet 2022-10-11 16:54:29 UTC
(In reply to Mark Johnston from comment #29)
Sorry for the delay, I retried everything from scratch on my test environment and this time everything is working fine, I don't know what I did the first time around, sorry for the noise. 
We also did performance benchmarks to verify that there is no regression (seeing the patch there was no reason to have any, but better safe than sorry), and results show no regressions.

I didn't reply immediately because I deployed the patch on a production server and was waiting for the customer to try everything to see if it works for him, but I do not yet have a definitive answer.

So apparently everything is working fine, if my customer says ok, we will deploy this patch as-is to our production servers.
Comment 31 eborisch+FreeBSD 2022-10-11 21:05:25 UTC
For record keeping, Mark Johnston's fix has been merged upstream (OpenZFS):

https://github.com/openzfs/zfs/commit/ed566bf1cd0bdbf85e8c63c1c119e3d2ef5db1f6
Comment 32 florian.millet 2022-10-13 00:00:05 UTC
FYI, the patched server that we deployed for one of our customers was tested by him, and he told us that everything is working as expected.

We will now integrate it into our production servers. Thank you for providing this patch.
Comment 33 Michel Le Cocq 2022-10-18 14:56:15 UTC
(In reply to Mark Johnston from comment #29)
Do you have an idea when this patch will arrive in an update release of FreeBSD 13.1?
Comment 34 Mark Johnston freebsd_committer freebsd_triage 2022-10-18 14:59:16 UTC
(In reply to Michel Le Cocq from comment #33)
I already staged everything (i.e., wrote an erratum notice) for the bug, but I'm not sure when the next patch release will be.  It should be soon since we have 5+ ENs to release but I'm afraid I don't have a precise date.
Comment 35 Michel Le Cocq 2022-10-19 16:30:48 UTC
ok thanks
Comment 36 commit-hook freebsd_committer freebsd_triage 2022-10-24 16:05:41 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=6fe0a6c80a1aff14236924eb33e4013aa8c14f91

commit 6fe0a6c80a1aff14236924eb33e4013aa8c14f91
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2022-10-24 15:55:48 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2022-10-24 16:05:17 +0000

    zfs: Fix a pair of bugs in zfs_fhtovp()

    This cherry-picks upstream ed566bf1cd0bdbf85e8c63c1c119e3d2ef5db1f6

        - Add a zfs_exit() call in an error path, otherwise a lock is
          leaked.
        - Remove the fid_gen > 1 check.  That appears to be Linux-specific:
          zfsctl_snapdir_fid() sets fid_gen to 0 or 1 depending on whether
          the snapshot directory is mounted.  On FreeBSD it fails, making
          snapshot dirs inaccessible via NFS.

    PR:             266236
    MFC after:      3 days

 sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vfsops.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
Comment 37 commit-hook freebsd_committer freebsd_triage 2022-10-27 12:00:36 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=562c9ac58c7678b13f52b0bfe63148e40d7bf63d

commit 562c9ac58c7678b13f52b0bfe63148e40d7bf63d
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2022-10-24 15:55:48 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2022-10-27 12:00:01 +0000

    zfs: Fix a pair of bugs in zfs_fhtovp()

    This cherry-picks upstream ed566bf1cd0bdbf85e8c63c1c119e3d2ef5db1f6

        - Add a zfs_exit() call in an error path, otherwise a lock is
          leaked.
        - Remove the fid_gen > 1 check.  That appears to be Linux-specific:
          zfsctl_snapdir_fid() sets fid_gen to 0 or 1 depending on whether
          the snapshot directory is mounted.  On FreeBSD it fails, making
          snapshot dirs inaccessible via NFS.

    PR:             266236

    (cherry picked from commit 6fe0a6c80a1aff14236924eb33e4013aa8c14f91)

 sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vfsops.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
Comment 38 commit-hook freebsd_committer freebsd_triage 2022-11-01 20:34:48 UTC
A commit in branch releng/13.1 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=7ab877cb3f9d55a394e43a2c1a1e2711df12226d

commit 7ab877cb3f9d55a394e43a2c1a1e2711df12226d
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2022-10-24 15:55:48 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2022-11-01 13:28:11 +0000

    zfs: Fix a pair of bugs in zfs_fhtovp()

    This cherry-picks upstream ed566bf1cd0bdbf85e8c63c1c119e3d2ef5db1f6

        - Add a zfs_exit() call in an error path, otherwise a lock is
          leaked.
        - Remove the fid_gen > 1 check.  That appears to be Linux-specific:
          zfsctl_snapdir_fid() sets fid_gen to 0 or 1 depending on whether
          the snapshot directory is mounted.  On FreeBSD it fails, making
          snapshot dirs inaccessible via NFS.

    Approved by:    so
    PR:             266236
    Security:       FreeBSD-EN-22:24.zfs

    (cherry picked from commit 6fe0a6c80a1aff14236924eb33e4013aa8c14f91)
    (cherry picked from commit 562c9ac58c7678b13f52b0bfe63148e40d7bf63d)

 sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vfsops.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
Comment 39 Graham Perrin freebsd_committer freebsd_triage 2022-11-21 06:54:47 UTC
Thanks!