Created attachment 223621 [details] add soshutdown() calls to server side krpc for non-functional TCP conn Jason Breitman reported "stuck" Linux NFSv4.1 mounts against a FreeBSD NFSv4.1 server. Although the underlying cause is not known, the TCP connection is in FIN_WAIT2 on the client and CLOSE_WAIT on the server. The server side TCP remains in CLOSE_WAIT because the server's krpc cannot soclose() the socket until the backchannel is re-assigned to another TCP connection. This re-assignment happens when the client establishes a new TCP connection and does a BindConnectionToSession operation on the new connection. This patch adds soshutdown(SHUT_WR) calls in the 3 places where the server krpc knows that the TCP socket is no longer usable. --> I think this will allow the TCP connection to proceed past CLOSE_WAIT and allow the TCP connection closure to complete. This will hopefully get the Linux mount "unstuck".
I am hoping that testing will indicate that, at least, the patch does not result in a regression. If so, I will commit it. (Unfortunately Jason cannot patch his production server.)

Jason does report that this script can be run on the Linux client to "unstick" the mount. (Essentially, it looks for the stuck TCP connection and then blocks network traffic to get the closure to complete.)

#!/bin/sh

progName="nfsClientFix"
delay=15
nfs_ip=NFS.Server.IP.X

nfs_fin_wait2_state() {
    /usr/bin/netstat -an | /usr/bin/grep ${nfs_ip}:2049 | /usr/bin/grep FIN_WAIT2 > /dev/null 2>&1
    return $?
}

nfs_fin_wait2_state
result=$?
if [ ${result} -eq 0 ] ; then
    /usr/bin/logger -s -i -p local7.error -t ${progName} "NFS Connection is in FIN_WAIT2!"
    /usr/bin/logger -s -i -p local7.error -t ${progName} "Enabling firewall to block ${nfs_ip}!"
    /usr/sbin/iptables -A INPUT -s ${nfs_ip} -j DROP

    while true
    do
        /usr/bin/sleep ${delay}
        nfs_fin_wait2_state
        result=$?
        if [ ${result} -ne 0 ] ; then
            /usr/bin/logger -s -i -p local7.notice -t ${progName} "NFS Connection is OK."
            /usr/bin/logger -s -i -p local7.error -t ${progName} "Disabling firewall to allow access to ${nfs_ip}!"
            /usr/sbin/iptables -D INPUT -s ${nfs_ip} -j DROP
            break
        fi
    done
fi
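One possible refinement of the detection step in the script above, offered only as a sketch: grep treats the dots in the IP address as wildcards and matches substrings, so 10.0.0.9:2049 would also match 110.0.0.9:2049. Comparing whole netstat fields avoids that. The function name nfs_fin_wait2_in is made up for illustration, and the column positions (foreign address in field 5, state in field 6) assume the usual Linux "netstat -an" layout.

```shell
#!/bin/sh
# Sketch only, not part of Jason's script.
# nfs_fin_wait2_in SERVER_IP
#   Reads "netstat -an" output on stdin and returns 0 if a connection to
#   SERVER_IP:2049 is in FIN_WAIT2, non-zero otherwise.
#   Assumes Linux netstat columns:
#     Proto Recv-Q Send-Q Local-Address Foreign-Address State
nfs_fin_wait2_in() {
    awk -v peer="$1:2049" '
        # Exact field comparison: no regex dots, no substring matches.
        $5 == peer && $6 == "FIN_WAIT2" { found = 1 }
        END { exit !found }
    '
}
```

It would be used as "/usr/bin/netstat -an | nfs_fin_wait2_in ${nfs_ip}" in place of the two greps.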
I've asked someone to help test this. I imagine there is no need for a setup with kerberos?
You can certainly test it without Kerberos. Since we do not know the underlying cause, I do not know if the problem can occur on non-Kerberized mounts. If you look at the email here, you can see that TCP window size adjustment might be at least part of the underlying cause? http://docs.FreeBSD.org/cgi/mid.cgi?YQXPR0101MB0968FB1FF0FC481CE37E9A81DD649
I appreciate your work on the patch and testing. Are you able to provide me with a target date for the patch to be available via a standard package update?
The patch will probably be in FreeBSD 12.3 and 13.1, whenever those releases occur (six months or more from now, I think). Unless someone, such as yourself, can confirm that it fixes the problem, I have no basis on which to ask re@ to consider it for an errata fix. (My testing can only try to confirm that it does not cause a regression, since I have no idea how to reproduce your issue.) A tester has a problem (which I think is a different one), but the patch did not fix the problem for them.
Are you able to provide me with a process to install and uninstall the patch? Ultimately I would want a package that I could add and remove so that I have a rollback plan if the patch has a negative side effect.
This is how I would do it... But your mileage may vary :-)

Install:
> emacs /etc/freebsd-update.conf # Remove "kernel" from Components
> cd /usr/src
> svn checkout https://svn.freebsd.org/base/releng/12.2
> patch </PATH/TO/PATCH/FILE
> make buildkernel
> mv /boot/kernel /boot/kernel.ORIGINAL
> make installkernel
> reboot

Backout:
> mv /boot/kernel /boot/kernel.BACKOUT
> mv /boot/kernel.ORIGINAL /boot/kernel
> emacs /etc/freebsd-update.conf # Reinstall "kernel" in Components
# cp -r /boot/kernel /boot/kernel.pre-update
> freebsd-update fetch install # Optionally...
> reboot

It's all described in the FreeBSD handbook (somewhere).
The only thing I'd add to what Peter said is that, if the kernel won't boot for some reason after doing "make installkernel", you can use "3" during booting to get the boot prompt and then type: boot kernel.ORIGINAL Oh, and I'd NEVER use emacs;-)
I was able to apply the patch today and will let you know if it resolves the issue. We should gain confidence after 14 days without an issue and I believe we can say the patch was the solution after 21 days. I used vi to edit the file based on your recommendation. :)
Created attachment 223815 [details]
enable the 6minute krpc timeout for NFSv4.1/4.2 client mounts

The server side krpc has always had a 6 minute "no activity" timeout for connections. Without this patch, the timeout is only applied to TCP connections that are not used for a back channel (NFSv3 and NFSv4.0 mounts, plus FreeBSD NFSv4.1/4.2 mounts from clients not running the nfscbd(8) daemon).

The thinking w.r.t. not doing the timeout for connections with a back channel was to avoid loss of the back channel. This is not a serious concern, since a normal NFSv4.1/4.2 client will renew the lease every minute or so and, as such, only a network partitioning or similar will result in a 6 minute timeout.

However, I have been able to get a Linux NFSv4.1 mount "stuck" indefinitely after a 2 minute network partitioning without the timeout. So, this simple patch enables the 6 minute timeout for all connections.
Oh, and for older systems without the first patch found in PR#254560, there is a third case of
    nd->nd_xprt->xp_idletimeout = 0;
that should be deleted. (Or apply the patch in PR#254560 before "the 6minute" one here.)

This patch is only needed if the NFS server has Linux NFSv4.1 or 4.2 mounts on it.
Created attachment 223831 [details] enable the 6minute... for FreeBSD12 and FreeBSD13.0 Same patch as 223815, but for FreeBSD12 and FreeBSD13.0. (223815 is for FreeBSD-current.)
I wanted to provide you with an update. It has been 14 days without issues, which is a good sign.

It should be noted that I only applied patch 223621 (add soshutdown() calls to server side krpc for non-functional TCP conn) for bug #254590. I did not apply the other patch, as it was posted after my maintenance window. It should also be noted that my original kernel was 12.1.

I will apply the same patch to my other production server this coming weekend, provided that we continue without issues on the patched server.
Sounds good. Actually, not applying the second patch for testing was preferred. I am still not 100% sure the timeout should be enabled for NFSv4.1/4.2. I put it here as "something to try" if the first patch did not resolve the problem. If I recall, you felt that, if you can run one more week without the issue, then it can be considered resolved. Is that correct? Thanks for testing this.
Correct. 21 days without an issue will be a strong indicator that the bug is resolved. We were seeing 1 or more NFS hangs every 7 - 10 days for 4 weeks.
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=db8c27f499105dcc9872dcc46e88bdd570c24fee

commit db8c27f499105dcc9872dcc46e88bdd570c24fee
Author:     Rick Macklem <rmacklem@FreeBSD.org>
AuthorDate: 2021-04-27 22:32:35 +0000
Commit:     Rick Macklem <rmacklem@FreeBSD.org>
CommitDate: 2021-04-27 22:32:35 +0000

    nfsd: fix a NFSv4.1 Linux client mount stuck in CLOSE_WAIT

    It was reported that a NFSv4.1 Linux client mount against
    a FreeBSD12 server was hung, with the TCP connection in
    CLOSE_WAIT state on the server.
    When a NFSv4.1/4.2 mount is done and the back channel is
    bound to the TCP connection, the soclose() is delayed until
    a new TCP connection is bound to the back channel, due to
    a reference count being held on the SVCXPRT structure in
    the krpc for the socket. Without the soclose() call, the socket
    will remain in CLOSE_WAIT and this somehow caused the Linux
    client to hang.

    This patch adds calls to soshutdown(.., SHUT_WR) that
    are performed when the server side krpc sees that the
    socket is no longer usable. Since this can be done
    before the back channel is bound to a new TCP connection,
    it allows the TCP connection to proceed to CLOSED state.

    PR:     254590
    Reported by:    jbreitman@tildenparkcapital.com
    Reviewed by:    tuexen
    Comments by:    kevans
    MFC after:      2 weeks
    Differential Revision:  https://reviews.freebsd.org/D29526

 sys/rpc/svc.c | 5 +++++
 1 file changed, 5 insertions(+)
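For readers following the TCP state names in the commit message, here is a small sketch of the close-side transitions involved, taken from the standard TCP state diagram (RFC 793). It is only a lookup table, not kernel code. The function name tcp_next and the event labels (recv_fin, send_fin, recv_ack) are made up for illustration. The point it shows is that a socket in CLOSE_WAIT only moves on once the local side sends its FIN, which is what a shutdown of the write side does.

```shell
#!/bin/sh
# Sketch only: a lookup of the TCP close transitions relevant to this PR.
# tcp_next STATE EVENT prints the state that follows.
tcp_next() {
    case "$1:$2" in
        # Server side (the peer closed first):
        ESTABLISHED:recv_fin) echo CLOSE_WAIT ;;
        CLOSE_WAIT:send_fin)  echo LAST_ACK ;;   # shutdown(SHUT_WR) sends the FIN
        LAST_ACK:recv_ack)    echo CLOSED ;;
        # Client side (it closed first):
        ESTABLISHED:send_fin) echo FIN_WAIT1 ;;
        FIN_WAIT1:recv_ack)   echo FIN_WAIT2 ;;
        FIN_WAIT2:recv_fin)   echo TIME_WAIT ;;
        # Anything else: no transition in this table, stay in the same state.
        *)                    echo "$1" ;;
    esac
}
```

Without the send_fin step on the server, the server stays in CLOSE_WAIT and the client stays in FIN_WAIT2, which matches the states reported at the top of this PR.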
First attachment patch is committed to main and will be MFC'd in two weeks. I have not yet decided if the 6minute timeout should be enabled for NFSv4.1/4.2.
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=3e67975a0c0807073daff24a3b6fa8942d3305d2

commit 3e67975a0c0807073daff24a3b6fa8942d3305d2
Author:     Rick Macklem <rmacklem@FreeBSD.org>
AuthorDate: 2021-04-27 22:32:35 +0000
Commit:     Rick Macklem <rmacklem@FreeBSD.org>
CommitDate: 2021-05-11 01:12:21 +0000

    nfsd: fix a NFSv4.1 Linux client mount stuck in CLOSE_WAIT

    It was reported that a NFSv4.1 Linux client mount against
    a FreeBSD12 server was hung, with the TCP connection in
    CLOSE_WAIT state on the server.
    When a NFSv4.1/4.2 mount is done and the back channel is
    bound to the TCP connection, the soclose() is delayed until
    a new TCP connection is bound to the back channel, due to
    a reference count being held on the SVCXPRT structure in
    the krpc for the socket. Without the soclose() call, the socket
    will remain in CLOSE_WAIT and this somehow caused the Linux
    client to hang.

    This patch adds calls to soshutdown(.., SHUT_WR) that
    are performed when the server side krpc sees that the
    socket is no longer usable. Since this can be done
    before the back channel is bound to a new TCP connection,
    it allows the TCP connection to proceed to CLOSED state.

    PR:     254590

    (cherry picked from commit db8c27f499105dcc9872dcc46e88bdd570c24fee)

 sys/rpc/svc.c | 5 +++++
 1 file changed, 5 insertions(+)
A commit in branch stable/12 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=16e172a410bff6d2c67523fe949424ab055b46a6

commit 16e172a410bff6d2c67523fe949424ab055b46a6
Author:     Rick Macklem <rmacklem@FreeBSD.org>
AuthorDate: 2021-04-27 22:32:35 +0000
Commit:     Rick Macklem <rmacklem@FreeBSD.org>
CommitDate: 2021-05-11 01:19:54 +0000

    nfsd: fix a NFSv4.1 Linux client mount stuck in CLOSE_WAIT

    It was reported that a NFSv4.1 Linux client mount against
    a FreeBSD12 server was hung, with the TCP connection in
    CLOSE_WAIT state on the server.
    When a NFSv4.1/4.2 mount is done and the back channel is
    bound to the TCP connection, the soclose() is delayed until
    a new TCP connection is bound to the back channel, due to
    a reference count being held on the SVCXPRT structure in
    the krpc for the socket. Without the soclose() call, the socket
    will remain in CLOSE_WAIT and this somehow caused the Linux
    client to hang.

    This patch adds calls to soshutdown(.., SHUT_WR) that
    are performed when the server side krpc sees that the
    socket is no longer usable. Since this can be done
    before the back channel is bound to a new TCP connection,
    it allows the TCP connection to proceed to CLOSED state.

    PR:     254590

    (cherry picked from commit db8c27f499105dcc9872dcc46e88bdd570c24fee)

 sys/rpc/svc.c | 5 +++++
 1 file changed, 5 insertions(+)
Rick Macklem: I posted a fairly detailed comment in [178231] that looks like it could be similar if not the same occurrence as this. I can reproduce the issue in minutes and would be interested to see if this is related. I have however not seen my issue on NFSv3 clients, only when we tried upgrading to NFSv4. I detailed how to produce the event consistently in the other thread should you want to see. I will be loading the patch(s) provided here in tomorrow morning's staging. Would you be interested in any other tests or output I can provide to garner a better idea as to what is causing these freezes while I'm at it? Chris Stephan
When the client is hung, you need to collect some information on the NFS server for the client connection.

# netstat -a
--> See what state the TCP connection is in and whether or not Recv-Q is 0.

If state == CLOSE_WAIT, you need the first patch here.

If state == ESTABLISHED and the Recv-Q is non-empty, then you either need to revert r367492 or upgrade to a stable/13 kernel (see PR#256280).

If the TCP connection for the client is ESTABLISHED and Recv-Q == 0, then it is some other issue. Issues I am aware of are related to delegations, so make sure delegations are not enabled on the server. (sysctl vfs.nfsd.issue_delegations=0, which should be the default.)

If delegations are not enabled, along with the TCP connection being in ESTABLISHED state and Recv-Q == 0, then I don't know what your problem is.
--> If you can capture packets with "tcpdump -s 0 -w out.pcap host <client>" run on the server and then reproduce the hang, I can look at out.pcap to try and see what is going on.

And always use "hard" mounts on the clients and never "soft". (That is in the BUGS section of "man mount_nfs".)
--> Regarding the first fstab example in your post on PR178231: I assume you meant either/or and you don't have both lines in /etc/fstab. That would be weird and incorrect.

Finally, if your clients are new enough, "umount -N /mnt/path" is the way to get rid of a hung NFS mount point. "umount -f /mnt/path" won't ever work.
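The triage above can be written down as a small filter, shown here only as a sketch. It assumes the FreeBSD "netstat -an" column order (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, state) and FreeBSD's "addr.port" address form. The function name nfs_triage and the wording of the advice strings are made up for illustration.

```shell
#!/bin/sh
# Sketch only: classify NFS server TCP connections per the triage above.
# Reads "netstat -an" output (FreeBSD server side) on stdin and prints one
# line of advice for each connection to local port 2049.
nfs_triage() {
    awk '
        $1 ~ /^tcp/ && $4 ~ /\.2049$/ {
            state = $6
            if (state == "CLOSE_WAIT")
                advice = "apply the soshutdown() patch from this PR"
            else if (state == "ESTABLISHED" && $2 + 0 > 0)
                advice = "revert r367492 or upgrade (see PR#256280)"
            else if (state == "ESTABLISHED")
                advice = "other issue: check delegations, capture packets"
            else
                next    # LISTEN and other states are not covered here
            printf "%s %s Recv-Q=%s: %s\n", $5, state, $2, advice
        }
    '
}
```

On the server it would be run as "netstat -an | nfs_triage" while the client is hung. An ESTABLISHED connection with Recv-Q == 0 is also what a healthy mount looks like, so that line only means something for a client that is actually stuck.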
(In reply to Rick Macklem from comment #23)

Added your requests to the test plan for 2020-10-03.

In regards to hard vs soft: I saw that, but was grasping at straws, with very few references to working NFSv4 setups on the internet and limited information outside of `man nfsv4' (even from the Linux community). At one point, I found a blog post with limited instruction on NFSv4 between Solaris clients and FreeBSD 11.1 recommending using a soft mount. I was short on examples, so when we started running into issues so quickly, I assumed the BUGS section was dated information and it was worth a shot. Needless to say, it didn't resolve anything, but I'm not sure it made it any worse either.

Just to be clear, intr will be deprecated for all client versions going forward, not just NFSv3, correct? Or is this a Linux-only change?
I, personally, am not a fan of the "soft" or "intr" mount options, because they can result in I/O syscalls returning EINTR, which is not POSIX and not expected by most applications. Having said that, you can safely use them for NFSv3. It is the open/lock state in NFSv4 that gets "confused" when an RPC involving locking terminates via interruption (a signal) or times out ("soft").
I am looking to apply updates to my servers and want to verify that the fix established here has been applied to FreeBSD 12.X. How can I see which kernel versions this patch was applied to so that I can be confident when moving forward?
The patch is in 12.3. To see this, you'd need to play around with git. (There might be some clone in github that you can look at, so you don't have to bother to git clone, etc.?) --> I'll admit I don't even know what the release branches are called in the git clone, since I only mess with main and stable/N.
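For anyone who does want to check with git, a minimal sketch follows. It relies only on the standard "git branch --contains" and "git tag --contains" options. The function name refs_containing is made up, and the naming I would expect to see in the output (releng/X.Y for release branches, release/X.Y.Z for release tags) is an assumption about the FreeBSD src repository that you should confirm against your own clone.

```shell
#!/bin/sh
# Sketch only: list every branch and tag in a clone that contains a commit.
# refs_containing REPO_DIR COMMIT
refs_containing() {
    repo=$1
    commit=$2
    # -a includes remote-tracking branches (e.g. origin/stable/12).
    git -C "$repo" branch -a --contains "$commit" --format='%(refname:short)'
    git -C "$repo" tag --contains "$commit"
}

# Example (hypothetical clone path; the hash is the stable/12 commit above):
#   refs_containing /usr/src 16e172a410bff6d2c67523fe949424ab055b46a6
```

If a release branch or release tag shows up in the list, that release has the fix.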
Thank you. Knowing that the patch is in 12.3 is helpful.
Since the 13.1 release has this fix, close the PR.