254590 – NFSv4.1 mounts from the Linux client gets "stuck" with partially closed TCP connection

Status:	Closed FIXED

Alias:	None

Product:	Base System
Classification:	Unclassified
Component:	kern
Version:	12.1-RELEASE
Hardware:	Any Any

Importance:	--- Affects Some People
Assignee:	Rick Macklem

Reported:	2021-03-26 20:49 UTC by Rick Macklem
Modified:	2022-05-20 22:22 UTC
CC List:	8 users

See Also:	256280

Flags:	rmacklem: mfc-stable13+ rmacklem: mfc-stable12+ rmacklem: mfc-stable11?

Attachments
add soshutdown() calls to server side krpc for non-functional TCP conn (802 bytes, patch) 2021-03-26 21:01 UTC, Rick Macklem
enable the 6minute krpc timeout for NFSv4.1/4.2 client mounts (910 bytes, patch) 2021-04-05 02:04 UTC, Rick Macklem
enable the 6minute... for FreeBSD12 and FreeBSD13.0 (1.01 KB, patch) 2021-04-05 14:47 UTC, Rick Macklem

Description Rick Macklem freebsd_committer

2021-03-26 20:49:56 UTC

Comment 1 Rick Macklem freebsd_committer

2021-03-26 21:01:29 UTC

Created attachment 223621
add soshutdown() calls to server side krpc for non-functional TCP conn

Jason Breitman reported "stuck" Linux NFSv4.1 mounts
against a FreeBSD NFSv4.1 server.
Although the underlying cause is not known, the
TCP connection is in FIN_WAIT2 on the client
and CLOSE_WAIT on the server.

The server side TCP remains in CLOSE_WAIT because
the server's krpc cannot soclose() the socket until
the backchannel is re-assigned to another TCP connection.
This re-assignment happens when the client establishes
a new TCP connection and does a BindConnectionToSession
operation on the new connection.

This patch adds soshutdown(SHUT_WR) calls in the 3 places
where the server krpc knows that the TCP socket is no
longer usable.
--> I think this will allow the TCP connection to proceed
    past CLOSE_WAIT and allow the TCP connection closure
    to complete.
This will hopefully get the Linux mount "unstuck".
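
To see whether a server is affected, one can look for NFS connections parked in CLOSE_WAIT. The following is a minimal sketch, not part of the patch. It assumes FreeBSD's "netstat -an -p tcp" column layout (state in the sixth column, local address ending in ".2049"), and the function name nfs_conns_in_state is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper, not part of the patch: read "netstat -an -p tcp"
# output on stdin and print the lines for the NFS port (2049) whose TCP
# state matches $1.
# Assumes FreeBSD's layout: Proto Recv-Q Send-Q Local Foreign (state),
# with the port appended to the address after a dot.
nfs_conns_in_state() {
    awk -v st="$1" '$4 ~ /\.2049$/ && $6 == st'
}

# Usage on the server (run by hand, not executed here):
#   netstat -an -p tcp | nfs_conns_in_state CLOSE_WAIT
```

Any line printed for CLOSE_WAIT that stays there across repeated runs would match the symptom described above.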

Comment 2 Rick Macklem freebsd_committer

2021-03-26 21:07:24 UTC

I am hoping testers

Comment 3 Rick Macklem freebsd_committer

2021-03-26 21:11:43 UTC

I am hoping that testing will indicate that, at least,
the patch does not result in a regression.
If so, I will commit it.
(Unfortunately Jason cannot patch his production
 server.)

Jason does report that this script can be run on the
Linux client to "unstick" the mount.
(Essentially, it looks for the stuck TCP connection and then
 blocks network traffic to get the closure to complete.)
#!/bin/sh

progName="nfsClientFix"
delay=15
nfs_ip=NFS.Server.IP.X

nfs_fin_wait2_state() {
    /usr/bin/netstat -an | /usr/bin/grep ${nfs_ip}:2049 | /usr/bin/grep FIN_WAIT2 > /dev/null 2>&1
    return $?
}


nfs_fin_wait2_state
result=$?
if [ ${result} -eq 0 ] ; then
    /usr/bin/logger -s -i -p local7.error -t ${progName} "NFS Connection is in FIN_WAIT2!"
    /usr/bin/logger -s -i -p local7.error -t ${progName} "Enabling firewall to block ${nfs_ip}!"
    /usr/sbin/iptables -A INPUT -s ${nfs_ip} -j DROP

    while true
    do
        /usr/bin/sleep ${delay}
        nfs_fin_wait2_state
        result=$?
        if [ ${result} -ne 0 ] ; then
            /usr/bin/logger -s -i -p local7.notice -t ${progName} "NFS Connection is OK."
            /usr/bin/logger -s -i -p local7.error -t ${progName} "Disabling firewall to allow access to ${nfs_ip}!"
            /usr/sbin/iptables -D INPUT -s ${nfs_ip}  -j DROP
            break
        fi
    done
fi
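
The loop in the script waits indefinitely, so if the connection never leaves FIN_WAIT2 the DROP rule stays in place. A hedged sketch of a bounded wait follows. The name wait_until_gone and its variables are hypothetical, and the retry count in the usage comment is an arbitrary choice, not something Jason reported using.

```shell
#!/bin/sh
# Hypothetical helper: run the check command ("$@" after the first two
# arguments) up to MAX_TRIES times, sleeping DELAY seconds between tries.
# Returns 0 as soon as the check fails (the condition is gone),
# or 1 if the check still succeeds after MAX_TRIES attempts.
# usage: wait_until_gone MAX_TRIES DELAY CHECK_COMMAND [ARGS...]
wait_until_gone() {
    _wug_max=$1
    _wug_delay=$2
    shift 2
    _wug_tries=0
    while [ "$_wug_tries" -lt "$_wug_max" ]; do
        if ! "$@"; then
            return 0
        fi
        _wug_tries=$((_wug_tries + 1))
        sleep "$_wug_delay"
    done
    return 1
}

# Possible use inside the script above (not executed here):
#   wait_until_gone 20 "${delay}" nfs_fin_wait2_state
#   /usr/sbin/iptables -D INPUT -s ${nfs_ip} -j DROP   # always remove the rule
```

With this shape the firewall rule is removed whether or not the connection cleared, and the return value tells the caller which case occurred.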

Comment 4 Ryan Moeller freebsd_committer

2021-03-26 21:39:39 UTC

I've asked someone to help test this. I imagine there is no need for a setup with Kerberos?

Comment 5 Rick Macklem freebsd_committer

2021-03-26 23:35:43 UTC

You can certainly test it without Kerberos.
Since we do not know the underlying cause,
I do not know if the problem can occur on
non-Kerberized mounts.

If you look at the email here, you can see
that TCP window size adjustment might be at
least part of the underlying cause?
http://docs.FreeBSD.org/cgi/mid.cgi?YQXPR0101MB0968FB1FF0FC481CE37E9A81DD649

Comment 6 Jason 2021-03-31 19:12:42 UTC

I appreciate your work on the patch and testing.  
Are you able to provide me with a target date for the patch to be available via a standard package update?

Comment 7 Rick Macklem freebsd_committer

2021-03-31 22:35:40 UTC

The patch will probably be in FreeBSD 12.3 and 13.1,
whenever those releases occur (six months or more from now, I think).

Unless someone, such as yourself, can confirm that
it fixes the problem, I have no basis on which to
ask re@ to consider it for an errata fix.
(My testing can only try to confirm that it
 does not cause a regression, since I have no
 idea how to reproduce your issue.)

A tester has a problem (which I think is a
different one), but the patch did not fix the
problem for them.

Comment 8 Jason 2021-04-01 15:13:32 UTC

Are you able to provide me with a process to install and uninstall the patch?  Ultimately I would want a package that I could add and remove so that I have a rollback plan if the patch has a negative side effect.

Comment 9 Peter Eriksson 2021-04-01 15:38:12 UTC

This is how I would do it... But your mileage may vary :-)


Install:

> emacs /etc/freebsd-update.conf # (Remove "kernel" from Components)
> cd /usr/src
> svn checkout https://svn.freebsd.org/base/releng/12.2
> patch </PATH/TO/PATCH/FILE
> make buildkernel
> mv /boot/kernel /boot/kernel.ORIGINAL
> make installkernel
> reboot


Backout:

> mv /boot/kernel /boot/kernel.BACKOUT
> mv /boot/kernel.ORIGINAL /boot/kernel
> emacs /etc/freebsd-update.conf # (Reinstall "kernel" in Components)
# cp -r /boot/kernel /boot/kernel.pre-update
> freebsd-update fetch install # Optionally...
> reboot


It's all described in the FreeBSD handbook (somewhere).

Comment 10 Rick Macklem freebsd_committer

2021-04-01 22:34:18 UTC

The only thing I'd add to what Peter said is that,
if the kernel won't boot for some reason after
doing "make installkernel", you can use "3" during
booting to get the boot prompt and then type:

boot kernel.ORIGINAL

Oh, and I'd NEVER use emacs;-)

Comment 11 Jason 2021-04-03 14:24:18 UTC

I was able to apply the patch today and will let you know if it resolves the issue.  We should gain confidence after 14 days without an issue and I believe we can say the patch was the solution after 21 days.  I used vi to edit the file based on your recommendation.  :)

Comment 12 Rick Macklem freebsd_committer

2021-04-05 02:04:04 UTC

Created attachment 223815
enable the 6minute krpc timeout for NFSv4.1/4.2 client mounts

The server side krpc has always had a 6 minute
"no activity" timeout for connections. Without
this patch, the timeout is applied to TCP
connections that are not used for a back channel.
(NFSv3, NFSv4.0 mounts, plus FreeBSD NFSv4.1/4.2
mounts from clients not running the nfscbd(8)
daemon.)

The thinking w.r.t. not doing the timeout for
connections with a back channel was to avoid loss
of the backchannel. This is not a serious concern,
since a normal NFSv4.1/4.2 client will renew the
lease every minute or so and, as such, only a
network partitioning or similar will result in a
6 minute timeout.

However, I have been able to get a Linux NFSv4.1
mount "stuck" indefinitely after a 2minute network
partitioning without the timeout.

So, this simple patch enables the 6minute timeout
for all connections.
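
The arithmetic behind the argument can be shown with a toy check. This is an illustration only: the real test lives in the kernel krpc, not in sh, and the function name idle_expired and the strict "more than" comparison are assumptions made for the sketch.

```shell
#!/bin/sh
# Illustration only, not the kernel code.
# usage: idle_expired LAST_ACTIVITY NOW [TIMEOUT]
# Succeeds (exit 0) if the connection has been idle for more than
# TIMEOUT seconds. TIMEOUT defaults to 360 (6 minutes).
idle_expired() {
    _ie_timeout=${3:-360}
    [ $(( $2 - $1 )) -gt "$_ie_timeout" ]
}

# A client renewing its lease every minute or so is never idle long enough:
#   idle_expired 0 60    -> fails (connection kept)
# A connection silent since a network partition eventually is:
#   idle_expired 0 400   -> succeeds (connection timed out)
```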

Comment 13 Rick Macklem freebsd_committer

2021-04-05 02:09:13 UTC

Oh, and for older systems without the first patch
found in PR#254560, there is a third case of
  nd->nd_xprt->xp_idletimeout = 0;
that should be deleted.
(Or apply the patch in PR#254560 before "the 6minute" one here.)

This patch is only needed if the NFS server has Linux
NFSv4.1 or 4.2 mounts on it.

Comment 14 Rick Macklem freebsd_committer

2021-04-05 14:47:39 UTC

Created attachment 223831
enable the 6minute... for FreeBSD12 and FreeBSD13.0

Same patch as 223815, but for FreeBSD12 and FreeBSD13.0.
(223815 is for FreeBSD-current.)

Comment 15 Jason 2021-04-19 13:36:28 UTC

I wanted to provide you with an update.
It has been 14 days without issues which is a good sign.

It should be noted that I only applied patch 223621 - add soshutdown() calls to server side krpc for non-functional TCP conn for bug #254590.
I did not apply the other patch as it was posted after my maintenance window.

It should also be noted that my original kernel was 12.1.

I will apply the same patch to my other production server this coming weekend, provided that we continue without issues on the patched server.

Comment 16 Rick Macklem freebsd_committer

2021-04-19 13:52:42 UTC

Sounds good.

Actually, not applying the second patch
for testing was preferred.
I am still not 100% sure the timeout
should be enabled for NFSv4.1/4.2.

I put it here as "something to try"
if the first patch did not resolve the
problem.

If I recall, you felt that, if you
can run one more week without the
issue, then it can be considered
resolved. Is that correct?

Thanks for testing this.

Comment 17 Jason 2021-04-19 14:21:18 UTC

Correct.
21 days without an issue will be a strong indicator that the bug is resolved.
We were seeing 1 or more NFS hangs every 7 - 10 days for 4 weeks.

Comment 18 commit-hook freebsd_committer

2021-04-27 22:36:53 UTC

A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=db8c27f499105dcc9872dcc46e88bdd570c24fee

commit db8c27f499105dcc9872dcc46e88bdd570c24fee
Author:     Rick Macklem <rmacklem@FreeBSD.org>
AuthorDate: 2021-04-27 22:32:35 +0000
Commit:     Rick Macklem <rmacklem@FreeBSD.org>
CommitDate: 2021-04-27 22:32:35 +0000

    nfsd: fix a NFSv4.1 Linux client mount stuck in CLOSE_WAIT

    It was reported that a NFSv4.1 Linux client mount against
    a FreeBSD12 server was hung, with the TCP connection in
    CLOSE_WAIT state on the server.
    When a NFSv4.1/4.2 mount is done and the back channel is
    bound to the TCP connection, the soclose() is delayed until
    a new TCP connection is bound to the back channel, due to
    a reference count being held on the SVCXPRT structure in
    the krpc for the socket. Without the soclose() call, the socket
    will remain in CLOSE_WAIT and this somehow caused the Linux
    client to hang.

    This patch adds calls to soshutdown(.., SHUT_WR) that
    are performed when the server side krpc sees that the
    socket is no longer usable.  Since this can be done
    before the back channel is bound to a new TCP connection,
    it allows the TCP connection to proceed to CLOSED state.

    PR:     254590
    Reported by:    jbreitman@tildenparkcapital.com
    Reviewed by:    tuexen
    Comments by:    kevans
    MFC after:      2 weeks
    Differential Revision:  https://reviews.freebsd.org/D29526

 sys/rpc/svc.c | 5 +++++
 1 file changed, 5 insertions(+)

Comment 19 Rick Macklem freebsd_committer

2021-04-27 22:41:30 UTC

First attachment patch is committed to main and will
be MFC'd in two weeks.

I have not yet decided if the 6minute timeout should
be enabled for NFSv4.1/4.2.

Comment 20 commit-hook freebsd_committer

2021-05-11 01:16:02 UTC

A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=3e67975a0c0807073daff24a3b6fa8942d3305d2

commit 3e67975a0c0807073daff24a3b6fa8942d3305d2
Author:     Rick Macklem <rmacklem@FreeBSD.org>
AuthorDate: 2021-04-27 22:32:35 +0000
Commit:     Rick Macklem <rmacklem@FreeBSD.org>
CommitDate: 2021-05-11 01:12:21 +0000

    nfsd: fix a NFSv4.1 Linux client mount stuck in CLOSE_WAIT

    PR:     254590
    (cherry picked from commit db8c27f499105dcc9872dcc46e88bdd570c24fee)

 sys/rpc/svc.c | 5 +++++
 1 file changed, 5 insertions(+)

Comment 21 commit-hook freebsd_committer

2021-05-11 01:24:05 UTC

A commit in branch stable/12 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=16e172a410bff6d2c67523fe949424ab055b46a6

commit 16e172a410bff6d2c67523fe949424ab055b46a6
Author:     Rick Macklem <rmacklem@FreeBSD.org>
AuthorDate: 2021-04-27 22:32:35 +0000
Commit:     Rick Macklem <rmacklem@FreeBSD.org>
CommitDate: 2021-05-11 01:19:54 +0000

    nfsd: fix a NFSv4.1 Linux client mount stuck in CLOSE_WAIT

    PR:     254590
    (cherry picked from commit db8c27f499105dcc9872dcc46e88bdd570c24fee)

 sys/rpc/svc.c | 5 +++++
 1 file changed, 5 insertions(+)

Comment 22 Chris Stephan 2021-10-02 02:01:15 UTC

Rick Macklem:

I posted a fairly detailed comment in [178231] that looks like it could be similar if not the same occurrence as this. I can reproduce the issue in minutes and would be interested to see if this is related. I have however not seen my issue on NFSv3 clients, only when we tried upgrading to NFSv4. I detailed how to produce the event consistently in the other thread should you want to see. I will be loading the patch(s) provided here in tomorrow morning's staging. Would you be interested in any other tests or output I can provide to garner a better idea as to what is causing these freezes while I'm at it?

Chris Stephan

Comment 23 Rick Macklem freebsd_committer

2021-10-02 02:37:36 UTC

When the client is hung, you need to collect some information
on the NFS server for the client connection.
# netstat -a
--> See what state the TCP connection is in and whether or not
    RecQ is 0.
    If state == CLOSE_WAIT, you need the first patch here.
    If state == ESTABLISHED and the RecQ is non-empty, then
    you either need to revert r367492 or upgrade to a
    stable/13 kernel (see PR#256280).

If the TCP connection for the client is ESTABLISHED and RecQ == 0,
then it is some other issue. Issues I am aware of are related to
delegations, so make sure delegations are not enabled on the server.
(sysctl vfs.nfsd.issue_delegations = 0, which should be the default.)

If delegations are not enabled, along with the TCP connection being in
ESTABLISHED state and RecQ == 0, then I don't know what your problem is.
--> If you can capture packets "tcpdump -s 0 -w out.pcap host <client>"
    done on the server and then reproduce the hang, I can look at out.pcap
    to try and see what is going on.

And always use "hard" mounts on the clients and never "soft".
(That is in the BUGS section of "man mount_nfs".)
--> Your first example of an fstab line in your post on PR178231.
    I assume you meant either/or and you don't have both lines in /etc/fstab.
    That would be weird and incorrect.

Finally, if your clients are new enough "umount -N /mnt/path" is the
way to get rid of a hung NFS mount point. "umount -f /mnt/path" won't
ever work.
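
The triage in this comment can be sketched as a small lookup. This is a hedged illustration: the function names are hypothetical, the advice strings only paraphrase the comment, and the wrapper assumes FreeBSD's "netstat -an -p tcp" column order (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, state).

```shell
#!/bin/sh
# Hypothetical helper: map a TCP state and Recv-Q value, as seen on the
# NFS server for the hung client's connection, to the advice above.
# usage: nfs_conn_triage STATE RECVQ
nfs_conn_triage() {
    case "$1" in
    CLOSE_WAIT)
        echo "apply the soshutdown() patch from this PR" ;;
    ESTABLISHED)
        if [ "$2" -gt 0 ]; then
            echo "revert r367492 or upgrade to a stable/13 kernel (PR#256280)"
        else
            echo "other issue: check vfs.nfsd.issue_delegations is 0, then capture packets"
        fi ;;
    *)
        echo "state not covered by this triage: $1" ;;
    esac
}

# Hypothetical wrapper: read one netstat line on stdin and triage it.
# Assumed columns: Proto Recv-Q Send-Q Local Foreign (state).
triage_netstat_line() {
    read -r _tn_proto _tn_recvq _tn_sendq _tn_local _tn_foreign _tn_state
    nfs_conn_triage "$_tn_state" "$_tn_recvq"
}

# Usage on the server (run by hand, not executed here):
#   netstat -an -p tcp | grep '<client-ip>' | triage_netstat_line
```

The wrapper reads only the first line it is given, so the netstat output should be filtered down to the one connection of interest first.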

Comment 24 Chris Stephan 2021-10-02 03:14:27 UTC

(In reply to Rick Macklem from comment #23)

Added your requests to the test plan for 2021-10-03.

In regards to hard vs soft: I saw that, but was grasping at straws. There are very few references to working NFSv4 setups on the internet, and limited information outside of `man nfsv4' (even from the Linux community). At one point, I found a blog post with limited instructions on NFSv4 between Solaris clients and FreeBSD 11.1 that recommended using a soft mount. I was short on examples, so when we started running into issues so quickly, I assumed the BUGS section was dated information and that it was worth a shot. Needless to say, it didn't resolve anything, but I'm not sure it made it any worse either. Just to be clear, will intr be deprecated for all client versions going forward, not just NFSv3, or is this a Linux-only change?

Comment 25 Rick Macklem freebsd_committer

2021-10-02 05:00:43 UTC

I, personally, am not a fan of "soft" or "intr"
mount options, because they can result in I/O syscalls
returning EINTR, which is not POSIX and not expected
by most applications.

Having said that, you can safely use them for NFSv3.
It is the open/lock state in NFSv4 that gets "confused"
when an RPC involving locking terminates via interruption
(a signal) or times out ("soft").

Comment 26 Jason 2022-02-14 15:56:04 UTC

I am looking to apply updates to my servers and want to verify that the fix established here has been applied to FreeBSD 12.X.  How can I see which kernel versions this patch was applied to so that I can be confident when moving forward?

Comment 27 Rick Macklem freebsd_committer

2022-02-14 22:22:14 UTC

The patch is in 12.3.

To see this, you'd need to play around with git.
(There might be some clone in github that
 you can look at, so you don't have to bother to git clone, etc.?)
--> I'll admit I don't even know what the release branches are called
    in the git clone, since I only mess with main and stable/N.
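
One way to check this from a clone of the FreeBSD src repository is to ask git whether the stable/12 commit from comment 21 is an ancestor of a release branch. This is a sketch only: commit_in_branch is a hypothetical helper, and the branch name origin/releng/12.3 is an assumption modeled on the releng/12.2 path in the svn URL from comment 9.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if COMMIT is contained in BRANCH.
# usage: commit_in_branch REPO_DIR COMMIT BRANCH
commit_in_branch() {
    git -C "$1" merge-base --is-ancestor "$2" "$3"
}

# Example (run by hand in a real clone; the branch name is an assumption):
#   commit_in_branch /usr/src 16e172a410bff6d2c67523fe949424ab055b46a6 \
#       origin/releng/12.3 && echo "fix is in this branch"
```

Because the fix was cherry-picked, each branch carries its own commit hash, so the hash to test is the one reported for the branch the release was cut from.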

Comment 28 Jason 2022-02-15 15:58:15 UTC

Thank you.
Knowing that the patch is in 12.3 is helpful.

Comment 29 Rick Macklem freebsd_committer

2022-05-20 22:22:05 UTC

Since the 13.1 release has this fix, close the PR.