FreeBSD ilythia.eden.le-fay.org 14.0-RELEASE-p3 FreeBSD 14.0-RELEASE-p3 #0: Mon Dec 11 04:56:01 UTC 2023 root@amd64-builder.daemonology.net:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64 this system has an NFSv4.2 mount from a FreeBSD 14.0 server using sec=krb5p. after a few hours of activity (mostly low-volume random reads at ~5-10Mbps throughput), the NFS mount has hung and cannot be accessed; 'df' and 'nfsstat -m' also hang. there's no NFS network traffic and no indication in the logs on either system of a problem. the Kerberos ticket isn't expired, and renewing the ticket anyway made no difference. according to kgdb, nfscl seems to be stuck in nfsv4_sequencelookup: (kgdb) where #0 sched_switch (td=td@entry=0xfffffe017c43a740, flags=flags@entry=259) at /usr/src/sys/kern/sched_ule.c:2297 #1 0xffffffff80b5028b in mi_switch (flags=flags@entry=259) at /usr/src/sys/kern/kern_synch.c:548 #2 0xffffffff80ba001b in sleepq_switch (wchan=wchan@entry=0xfffff800754bd9c0, pri=pri@entry=68) at /usr/src/sys/kern/subr_sleepqueue.c:607 #3 0xffffffff80ba05ef in sleepq_timedwait (wchan=wchan@entry=0xfffff800754bd9c0, pri=68) at /usr/src/sys/kern/subr_sleepqueue.c:689 #4 0xffffffff80b4f9e0 in _sleep (ident=ident@entry=0xfffff800754bd9c0, lock=lock@entry=0xfffff800754bd810, priority=priority@entry=68, wmesg=0xffffffff81195590 "nfsclseq", sbt=4294967000, pr=pr@entry=0, flags=256) at /usr/src/sys/kern/kern_synch.c:219 #5 0xffffffff809ed270 in nfsv4_sequencelookup (nmp=nmp@entry=0xfffff802f5506000, sep=sep@entry=0xfffff800754bd810, slotposp=slotposp@entry=0xfffffe017b087b38, maxslotp=maxslotp@entry=0xfffffe017b087b34, slotseqp=slotseqp@entry=0xfffffe017b087b3c, sessionid=sessionid@entry=0xfffffe017b087b40 "k\214\020\201\377\377\377\377\377\242\352\200\377\377\377\377\n\340h D\257\353\v ", fnd_init=false) at /usr/src/sys/fs/nfs/nfs_commonsubs.c:4989 #6 0xffffffff809e1c39 in nfsv4_setsequence (nmp=0xfffff802f5506000, nd=0xfffffe017b087c38, sep=0xfffff800754bd810, dont_replycache=0, cred=0xfffff800087f3c00) at /usr/src/sys/fs/nfs/nfs_commonsubs.c:4892 #7 0xffffffff809e12f1 in nfscl_reqstart (nd=0xfffffe017b087c38, procnum=32, nmp=0xfffff802f5506000, nfhp=0x0, fhlen=0, opcntpp=<optimized out>, sep=<optimized out>, vers=0, minorvers=0, cred=0xfffff800087f3c00) at /usr/src/sys/fs/nfs/nfs_commonsubs.c:433 #8 0xffffffff80a11e77 in nfsrpc_renew (clp=<optimized out>, dsp=dsp@entry=0x0, cred=cred@entry=0xfffff800087f3c00, p=p@entry=0xfffffe017c43a740) at /usr/src/sys/fs/nfsclient/nfs_clrpcops.c:4874 #9 0xffffffff809f7ae5 in nfscl_renewthread (clp=clp@entry=0xfffffe0180c75000, p=0xfffffe017c43a740) at /usr/src/sys/fs/nfsclient/nfs_clstate.c:2769 #10 0xffffffff80a2bde4 in start_nfscl (arg=<unavailable>, arg@entry=0xfffffe0180c75000) at /usr/src/sys/fs/nfsclient/nfs_clport.c:782 #11 0xffffffff80afdb7f in fork_exit (callout=0xffffffff80a2bdd0 <start_nfscl>, arg=0xfffffe0180c75000, frame=0xfffffe017b087f40) at /usr/src/sys/kern/kern_fork.c:1160 #12 <signal handler called> #13 0x00000328911e38ca in ?? () Backtrace stopped: Cannot access memory at address 0x3288e984798 (kgdb) frame 5 #5 0xffffffff809ed270 in nfsv4_sequencelookup (nmp=nmp@entry=0xfffff802f5506000, sep=sep@entry=0xfffff800754bd810, slotposp=slotposp@entry=0xfffffe017b087b38, maxslotp=maxslotp@entry=0xfffffe017b087b34, slotseqp=slotseqp@entry=0xfffffe017b087b3c, sessionid=sessionid@entry=0xfffffe017b087b40 "k\214\020\201\377\377\377\377\377\242\352\200\377\377\377\377\n\340h D\257\353\v ", fnd_init=false) at /usr/src/sys/fs/nfs/nfs_commonsubs.c:4989 warning: Source file is more recent than executable. 4989 mtx_sleep(&sep->nfsess_slots, &sep->nfsess_mtx,
this seems to be connected to Kerberos ticket expiry: the problem recurred just now when my ticket expired. when this happens, klist shows: % klist Credentials cache: FILE:/tmp/krb5cc_10006 Principal: lexi@EDEN.LE-FAY.ORG Issued Expires Principal Dec 24 04:13:34 2023 >>>Expired<<< krbtgt/EDEN.LE-FAY.ORG@EDEN.LE-FAY.ORG Dec 24 06:58:04 2023 >>>Expired<<< host/hemlock.eden.le-fay.org@EDEN.LE-FAY.ORG Dec 24 13:37:42 2023 >>>Expired<<< nfs/hemlock.eden.le-fay.org@EDEN.LE-FAY.ORG Dec 24 13:37:42 2023 >>>Expired<<< nfs/hemlock.eden.le-fay.org@EDEN.LE-FAY.ORG Dec 24 13:37:42 2023 >>>Expired<<< nfs/hemlock.eden.le-fay.org@EDEN.LE-FAY.ORG Dec 24 13:37:42 2023 >>>Expired<<< nfs/hemlock.eden.le-fay.org@EDEN.LE-FAY.ORG [... many identical entries elided...] Dec 24 13:37:42 2023 >>>Expired<<< nfs/hemlock.eden.le-fay.org@EDEN.LE-FAY.ORG Dec 24 13:37:42 2023 >>>Expired<<< nfs/hemlock.eden.le-fay.org@EDEN.LE-FAY.ORG Dec 24 06:58:04 2023 >>>Expired<<< host/hemlock.eden.le-fay.org@EDEN.LE-FAY.ORG Dec 24 06:58:04 2023 >>>Expired<<< host/hemlock.eden.le-fay.org@EDEN.LE-FAY.ORG in additional, the 'Invalid' counter in nfsstat -cE increases to 64 (it was previously 0): Rpc Info: TimedOut Invalid X Replies Retries Requests 0 64 0 0 4832814 however, the mount never recovers even after renewing the tickets.
You haven't used the "intr" and/or "soft" mount options by any chance? These are guaranteed to break NFSv4 mounts. The renewthread is stuck in nfsv4_sequencelookup() because there are no session slots available. - This can happen for "soft" or "intr" mounts because the RPC reply does not get processed properly to release the session slot. - If the mount does not use "soft" nor "intr", then there is some other reason that the replies are not being processed. --> I'll think about this. It might be a side effect of a RPCSEC_GSS session being expired so the client cannot de-encrypt the RPC reply, or something like that. Typically, the RPCSEC_GSS session (not the same thing as a NFSv4 session, just to make the terminology confusing) expires when the TGT used to acquire it expires. It will be re-created upon the next RPC if there is a valid TGT, but that might be too late for processing of an RPC reply. (With sec=krb5p the reply is encrypted, so it cannot be processed without a valid RPCSEC_GSS session. Once all session slots are in use and no RPC reply is processed to release one, the mount is hung and the only fix is to "umount -N <mnt-path>" or reboot the client machine.
no, i'm not using soft or intr (assuming the default is hard,nointr): hemlock.eden.le-fay.org:/public /data/public nfs rw,nfsv4,minorversion=2,sec=krb5p,gssname=host,proto=tcp6 0 0 in case it makes a difference, the max lifetime on my user TGT (lexi@EDEN.LE-FAY.ORG) is very long, about 30 days. in the second comment, i'd only requested a normal 10 hour ticket. in the original report, i think the lifetime may have been much longer - which is why i thought the ticket hadn't expired. i'm not sure if this is relevant (i'm sure it *shouldn't* be, but perhaps this configuration hasn't been tested as much).
The renew thread RPC (the one that your backtrace shows as hung in nfsv4_sequencelookup()) would be using the host/FQDN@REALM entry in the default /etc/krb5.keytab. (ie. Your TGT would not be used for this, since the renew thread runs as "root".) Could the client's keytab entry be broken somehow? (As you probably are aware, when you create a new keytab for a principal, old keytabs no longer work.) You could go onto the client and do # kinit -k host/FQDN to see that the keytab entry for "host" works ok. (I am still wondering what will happen if the TGT for the keytab expires just after the RPC request is sent to the server. I'll look into that.) There could also be a problem where other RPCs were using all slots, so the renew thread slept in nfsv4_sequencelookup() and then another thread needed the renew thread to do some work. --> The renew thread should really have a dedicated slot in the NFSv4 session, so that no other thread can cause it to block. --> I might work on a patch for that. Does this hang occur frequently or was this a "one time" hang on a mount that usually behaves ok?
the client keytab should be fine, since i only created it about 2 days ago when i installed this host and there was no previous principal for this fqdn. 'kinit -k host/fqdn' on the client seems to work: # kinit -k host/ilythia.eden.le-fay.org # klist Credentials cache: FILE:/tmp/krb5cc_0 Principal: host/ilythia.eden.le-fay.org@EDEN.LE-FAY.ORG Issued Expires Principal Dec 24 15:54:16 2023 Dec 25 01:54:16 2023 krbtgt/EDEN.LE-FAY.ORG@EDEN.LE-FAY.ORG > Does this hang occur frequently or was this a "one time" hang on a mount that usually behaves ok? as i mentioned, i only installed this host 2 days ago, so i can't say for sure, but so far it has reliably occurred twice about 10 hours after booting so it seems to be 100% reproducible. (iow, it never *hasn't* occurred at that time.) in the mean time, i've temporarily switched the mount from sec=krb5p to sec=krb5 to see if this fixes the issue; aiui, this means no GSS should be involved after the initial mount, so i expect it will, but i'm happy to do any other testing you need.
Not exactly. sec=krb5 says that a RPCSEC_GSS (think Kerberos) session is used to identify the user for all RPCs. The difference w.r.t. krb5p is that it does not encrypt the NFS payload. I doubt it will make any difference, but it sounds like you'll know soon enough. I now have a hunch w.r.t. what might be broken, but I need to look at the code (and if my hunch seems correct) and maybe come up with a patch. You could try the "syskrb5" mount option, which avoids use of the keytab and allows "system operations that maintain the state" to use AUTH_SYS, while the rest (all involving file data) use Kerberos. (If my hunch is correct, this will not fix the problem, but might cause it to behave better.) rick ps: I accidentally did this as an email reply...
Created attachment 247232 [details] Free a session slot when a server replies RPC_AUTHERR This patch might fix the problem. At least it will give us a little more information w.r.t. what is causing the hang. If the reporter can test it on their client, that could be useful.
sec=krb5 didn't fix the problem. i didn't have a chance to test syskrb5; instead i've applied the patch, and i'll let you know if 10 hours if that's made a difference.
unfortunately the patch doesn't seem to make any difference. when the mount is idle, the behaviour is as expected: - ticket expires - access to the mount returns EACCES - obtaining a new ticket (kinit) restores access to the mount - this is probably the same behaviour as without the patch, but i didn't think to test it when the mount is busy (~20Mbps read traffic in this case), the behaviour is the same as before: instead of returning EACCESS, the mount simply hangs. the printf in the patch is *not* triggered, and obtaining a new ticket doesn't fix the mount. this is on releng/14.0; i'm about to upgrade this system to stable/14 (for unrelated reasons) so i'll test it there as well just in case there's a difference.
in case this makes a difference, the vast majority of the I/O is not on the mounted filesystem itself, but a child filesystem mounted as a mirror mount (which doesn't appear in df, so i forgot to mention it).
Hmm. I am not sure what you mean by a mirror mount under the NFS mount? Maybe you can explain that more. (It is probably a part of this, since no one else has been reporting similar hangs.) Try "syskrb5" on the mount, it should at least avoid the renewthread from getting hung. Next time you have the hang, please collect the following information on the NFS client: # ps axHl # procstat -a -kk which should give us some idea what is hanging. There is really no diffeence in the NFS code between releng/14.0 and stable/14, but maybe some other part of the system that is involved in the hang has changed?
Btw, I am not sure what your goals are for using sec=krb5[p], but if they are "on the wire data encryption" and client machine identification, the using NFS-over-TLS is an alternative. You can see: https://people.freebsd.org/~rmacklem/nfs-over-tls-setup.txt If you are using sec=krb5[p] so that users are identified via their kerberos principal (instead of uid), then you are stuck with Kerberos and the hassles of TGT expiration. You are probably aware that there are utilities out there that run as daemons and renew TGTs if they are renewable.
by mirror mount, i mean that on the NFS server, the filesystem has several sub-filesystems: Filesystem Size Used Avail Capacity Mounted on data/public 28T 45M 28T 0% /data/public data/public/Books 28T 48M 28T 0% /data/public/Books data/public/Miscellaneous 28T 178M 28T 0% /data/public/Miscellaneous data/public/Software 28T 166G 28T 1% /data/public/Software [...] which are individually exported in /etc/exports: /data/public -sec=krb5:krb5i:krb5p -network 2001:8b0:aab5::/48 /data/public/Books -sec=krb5:krb5i:krb5p -network 2001:8b0:aab5::/48 [...] V4: /data -sec=sys:krb5:krb5i:krb5p -network 2001:8b0:aab5::/48 but on the client, only the root filesystem, /data/public, is mounted in /etc/fstab. when something accesses a sub-filesystem, like /data/public/Software, the client automatically mounts it in the appropriate place. possibly FreeBSD calls this something else; i think "mirror mount" might be the Solaris or Linux term for it (e.g., https://docs.oracle.com/cd/E26502_01/html/E28997/ggcvr.html).
Ok. I didn't realize you were referring to the NFS server when you said "mirror mount". Doing this on the server is fine and should not affect the "hang". (And, yes, the term is used in FreeBSD, although I'll admit it has always seemed a weird term for exported subtrees.) Btw, the client does not construct separate mounts for these sub-filesystems on the server, it simply crosses the NFS server mount point. (Something that did not work for NFSv3.)
Created attachment 247257 [details] patch i see - on some other systems these do appear as separate mounts, but the FreeBSD method seems cleaner. in any case, this doesn't seem to be related: i tested with a direct mount of the subfilesystem and the behaviour was the same. syskrb5 made no difference. however, i did some more debugging and i think the reason the suggested patch doesn't work is because it should use RPC_AUTHERROR, not RPC_AUTHERR. after changing that, it *does* seem to work: ticket expiry causes the application to get EACCES, but the mount *doesn't* hang and i can access it with ls (also returning EACCES); renewing the ticket allows the mount to work again. i've attached the full patch i'm currently running. this is on: FreeBSD 14.0-STABLE #5 stable/14-n266050-ac83f3fa44f3-dirty: Tue Dec 26 07:15:04 GMT 2023 lexi@ilythia.eden.le-fay.org:/src/stable/14/sys/amd64/compile/ILYTHIA
Good news (and good catch). I never even noticed there were 2 different macros with similar names. And, yes, I had meant RPC_AUTHERROR when I did the patch. As an aside, great minds think alike. I had added printf()s almost identical to yours, but I had not been able to reproduce the problem yet, so I didn't see the error code. Anyhow, looks like it is fixed. This bug has been in the code ever since NFSv4.1 was added to the client. I guess it shows how few users use sec=krb5[ip]. Thanks for reporting this and with your help debugging it.
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=a558130881e9d574dc5f37827fe2284667d5aba8 commit a558130881e9d574dc5f37827fe2284667d5aba8 Author: Rick Macklem <rmacklem@FreeBSD.org> AuthorDate: 2023-12-26 22:33:39 +0000 Commit: Rick Macklem <rmacklem@FreeBSD.org> CommitDate: 2023-12-26 22:33:39 +0000 nfscl: Fix handling of expired Kerberos credentials (NFSv4.1/4.2) If the NFS server detects that the Kerberos credentials provided by a NFSv4.1/4.2 mount using sec=krb5[ip] have expired, the NFS server replies with a krpc layer error of RPC_AUTHERROR. When this happened, the client erroneously left the NFSv4.1/4.2 session slot busy, so that it could not be used by other RPCs. If this happened for all session slots, the mount point would hang. This patch fixes the problem by releasing the session slot and resetting its sequence# upon receiving a RPC_AUTHERROR reply. This bug only affects NFSv4.1/4.2 mounts using sec=krb5[ip], but has existed since NFSv4.1 client support was added to FreeBSD. So, why has the bug remained undetected for so long? I cannot be sure, but I suspect that, often, the client detected the Kerberos credential expiration before attempting the RPC. For this case, the client would not do the RPC and, as such, there would be no busy session slot. Also, no hang would occur until all session slots are busied (64 for a FreeBSD client/server), so many cases of the bug probably went undetected? Also, use of sec=krb5[ip] mounts are not that common. PR: 275905 Tested by: Lexi <lexi.freebsd@le-fay.org> MFC after: 1 week sys/fs/nfs/nfs_commonkrpc.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+)
I was able to reproduce the problem and see that the patch (the second variant, but without the printfs) worked, so I have committed it to main. It will be MFC'd in a week. Thanks for reporting it and for fixing/testing the patch.
thanks for the fix! this was much more pleasant than getting NFS bugs fixed in some other systems i use...
A commit in branch stable/14 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=dabf1797a76a7c87b8f38fb300fb03f20cf6f973 commit dabf1797a76a7c87b8f38fb300fb03f20cf6f973 Author: Rick Macklem <rmacklem@FreeBSD.org> AuthorDate: 2023-12-26 22:33:39 +0000 Commit: Rick Macklem <rmacklem@FreeBSD.org> CommitDate: 2024-01-02 01:20:17 +0000 nfscl: Fix handling of expired Kerberos credentials (NFSv4.1/4.2) If the NFS server detects that the Kerberos credentials provided by a NFSv4.1/4.2 mount using sec=krb5[ip] have expired, the NFS server replies with a krpc layer error of RPC_AUTHERROR. When this happened, the client erroneously left the NFSv4.1/4.2 session slot busy, so that it could not be used by other RPCs. If this happened for all session slots, the mount point would hang. This patch fixes the problem by releasing the session slot and resetting its sequence# upon receiving a RPC_AUTHERROR reply. This bug only affects NFSv4.1/4.2 mounts using sec=krb5[ip], but has existed since NFSv4.1 client support was added to FreeBSD. So, why has the bug remained undetected for so long? I cannot be sure, but I suspect that, often, the client detected the Kerberos credential expiration before attempting the RPC. For this case, the client would not do the RPC and, as such, there would be no busy session slot. Also, no hang would occur until all session slots are busied (64 for a FreeBSD client/server), so many cases of the bug probably went undetected? Also, use of sec=krb5[ip] mounts are not that common. PR: 275905 (cherry picked from commit a558130881e9d574dc5f37827fe2284667d5aba8) sys/fs/nfs/nfs_commonkrpc.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+)
A commit in branch stable/13 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=c85ff48a092e4c5989f91a4a1ccf99441ffb3170 commit c85ff48a092e4c5989f91a4a1ccf99441ffb3170 Author: Rick Macklem <rmacklem@FreeBSD.org> AuthorDate: 2023-12-26 22:33:39 +0000 Commit: Rick Macklem <rmacklem@FreeBSD.org> CommitDate: 2024-01-02 01:41:19 +0000 nfscl: Fix handling of expired Kerberos credentials (NFSv4.1/4.2) If the NFS server detects that the Kerberos credentials provided by a NFSv4.1/4.2 mount using sec=krb5[ip] have expired, the NFS server replies with a krpc layer error of RPC_AUTHERROR. When this happened, the client erroneously left the NFSv4.1/4.2 session slot busy, so that it could not be used by other RPCs. If this happened for all session slots, the mount point would hang. This patch fixes the problem by releasing the session slot and resetting its sequence# upon receiving a RPC_AUTHERROR reply. This bug only affects NFSv4.1/4.2 mounts using sec=krb5[ip], but has existed since NFSv4.1 client support was added to FreeBSD. So, why has the bug remained undetected for so long? I cannot be sure, but I suspect that, often, the client detected the Kerberos credential expiration before attempting the RPC. For this case, the client would not do the RPC and, as such, there would be no busy session slot. Also, no hang would occur until all session slots are busied (64 for a FreeBSD client/server), so many cases of the bug probably went undetected? Also, use of sec=krb5[ip] mounts are not that common. PR: 275905 (cherry picked from commit a558130881e9d574dc5f37827fe2284667d5aba8) sys/fs/nfs/nfs_commonkrpc.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+)
A commit in branch stable/12 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=263ed87413aece1786df7275cc541dad324c76a5 commit 263ed87413aece1786df7275cc541dad324c76a5 Author: Rick Macklem <rmacklem@FreeBSD.org> AuthorDate: 2023-12-26 22:33:39 +0000 Commit: Rick Macklem <rmacklem@FreeBSD.org> CommitDate: 2024-01-02 01:55:14 +0000 nfscl: Fix handling of expired Kerberos credentials (NFSv4.1/4.2) If the NFS server detects that the Kerberos credentials provided by a NFSv4.1/4.2 mount using sec=krb5[ip] have expired, the NFS server replies with a krpc layer error of RPC_AUTHERROR. When this happened, the client erroneously left the NFSv4.1/4.2 session slot busy, so that it could not be used by other RPCs. If this happened for all session slots, the mount point would hang. This patch fixes the problem by releasing the session slot and resetting its sequence# upon receiving a RPC_AUTHERROR reply. This bug only affects NFSv4.1/4.2 mounts using sec=krb5[ip], but has existed since NFSv4.1 client support was added to FreeBSD. So, why has the bug remained undetected for so long? I cannot be sure, but I suspect that, often, the client detected the Kerberos credential expiration before attempting the RPC. For this case, the client would not do the RPC and, as such, there would be no busy session slot. Also, no hang would occur until all session slots are busied (64 for a FreeBSD client/server), so many cases of the bug probably went undetected? Also, use of sec=krb5[ip] mounts are not that common. PR: 275905 (cherry picked from commit a558130881e9d574dc5f37827fe2284667d5aba8) sys/fs/nfs/nfs_commonkrpc.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+)
Patch has been MFC'd.