During recent testing of a Linux NFSv4.1 client mount against a FreeBSD server, both the client and the server were observed to misbehave after a network partition between them. The FreeBSD server did not answer a retried RPC from the session's cached reply, as it should. The Linux client sometimes advances the sequence# for the session slot by 2 instead of 1. The attached patches alleviate these problems and should be applied to all FreeBSD NFS servers handling NFSv4 mounts. Fortunately, network partitioning should be a rare event, and the patches only matter when it happens.
Created attachment 223859
fix NFSv4.1/4.2 server session for RPC retries

This patch fixes the NFSv4 server so that it correctly sends a reply from the one cached in the session's slot when an RPC retry occurs. (RPC retries are rare for NFSv4, but can occur after a new TCP connection has been established for an NFSv4.1/4.2 mount by the client.)

Two things needed to be fixed:
- don't set nd_repstat to NFSERR_IO when the pseudo error NFSERR_REPLYFROMCACHE is returned.
- actually use the reply in "m".
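For readers who want to see the shape of the two fixes, here is a minimal userland sketch. It is not the nfsd code: the struct, the function names, the value chosen for the pseudo error, and the use of a string in place of an mbuf chain are all assumptions made for illustration.

```c
#include <stddef.h>

#define ERR_IO             5      /* NFS4ERR_IO is 5 in the protocol */
#define ERR_REPLYFROMCACHE 30000  /* hypothetical value for the pseudo error */

/* Hypothetical stand-in for the per-RPC state. */
struct rpc_state {
    int         repstat;  /* status that will be encoded in the reply */
    const char *reply;    /* reply that will be sent to the client */
};

/*
 * Hypothetical Sequence handler: for a retry it hands back the slot's
 * cached reply in *m and returns the pseudo error.
 */
static int
handle_sequence(int is_retry, const char *cached, const char **m)
{
    if (is_retry) {
        *m = cached;
        return ERR_REPLYFROMCACHE;
    }
    *m = NULL;
    return 0;
}

static void
dispatch(struct rpc_state *nd, int is_retry, const char *cached)
{
    const char *m = NULL;
    int error = handle_sequence(is_retry, cached, &m);

    if (error == ERR_REPLYFROMCACHE) {
        /* Fix 2: actually use the reply returned in "m". */
        nd->reply = m;
        /* Fix 1: leave repstat alone, the pseudo error is not a failure. */
        return;
    }
    if (error != 0) {
        /* Any other unexpected error is reported as an I/O error. */
        nd->repstat = ERR_IO;
        return;
    }
    nd->reply = "fresh reply";  /* placeholder for normal processing */
}
```

The point of the sketch is only the ordering: the pseudo error has to be recognized before the generic error mapping runs, otherwise the retry is answered with an I/O error instead of the cached reply.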
Created attachment 223860
cut the Linux client some slack w.r.t. session sequence#

After a network partitioning is healed, some versions of the Linux client advance the sequence# for the session slot by 2 instead of 1. This patch allows these cases to work. Although technically a violation of RFC5661, it seems harmless to do, since NFS4ERR_SEQ_MISORDERED will still be generated if an "out of order" RPC is subsequently received, because that RPC will have a sequence# less than what the server expects. When this goes into main, etc., I will enable it based on a sysctl, so that the server can optionally be RFC5661 conformant.
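The slot check described here can be sketched roughly as follows. This is an illustration, not the patch itself: the enum, the struct, and the function name are hypothetical, and the allow_skip flag stands in for whatever knob enables the relaxed behaviour.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result codes; the real server returns NFS error numbers. */
enum seq_result {
    SEQ_NEW_REQUEST,  /* execute the RPC and cache its reply */
    SEQ_REPLAY,       /* retry: answer from the slot's cached reply */
    SEQ_MISORDERED    /* reply NFS4ERR_SEQ_MISORDERED */
};

/* Hypothetical per-slot state: the last sequence# accepted on the slot. */
struct slot {
    uint32_t seqid;
};

static enum seq_result
check_slot_seq(struct slot *sl, uint32_t req_seqid, bool allow_skip)
{
    /* Same sequence# as the last request: this is a retry. */
    if (req_seqid == sl->seqid)
        return SEQ_REPLAY;

    /*
     * RFC5661 accepts only last + 1.  With allow_skip set, last + 2 is
     * accepted as well.  The casts keep the arithmetic wrapping at 32 bits.
     */
    if (req_seqid == (uint32_t)(sl->seqid + 1) ||
        (allow_skip && req_seqid == (uint32_t)(sl->seqid + 2))) {
        sl->seqid = req_seqid;
        return SEQ_NEW_REQUEST;
    }

    /* Anything else, including a smaller sequence#, is misordered. */
    return SEQ_MISORDERED;
}
```

With the slot at 5 and allow_skip set, a request with sequence# 7 is accepted and the slot moves to 7; a late request with sequence# 6 then falls through to SEQ_MISORDERED, which is the "out of order" case mentioned above.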
Created attachment 223861
make the session's cached reply work for multiple retries of an RPC

Having multiple retries of the same RPC should be extremely rare, since a correctly functioning client will create a new TCP connection for each of them. As such, the unpatched code assumed it would *never* happen. However, it seems prudent to handle that case as far as possible. This patch adds m_copym(..M_NOWAIT) calls so that the session slot will retain the cached reply for a subsequent retry unless the m_copym() fails.
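The idea of keeping the cached reply across retries can be illustrated in userland as below. This is a sketch under loud assumptions: plain malloc()/memcpy() stand in for m_copym(), and the struct and function names are hypothetical, not taken from the patch.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an mbuf chain holding an encoded reply. */
struct reply {
    size_t         len;
    unsigned char *data;
};

/* Hypothetical session slot that owns one cached reply. */
struct cache_slot {
    struct reply *cached;
};

/* Stand-in for a non-blocking copy: returns NULL if allocation fails. */
static struct reply *
reply_copy(const struct reply *r)
{
    struct reply *c = malloc(sizeof(*c));

    if (c == NULL)
        return NULL;
    c->data = malloc(r->len > 0 ? r->len : 1);
    if (c->data == NULL) {
        free(c);
        return NULL;
    }
    if (r->len > 0)
        memcpy(c->data, r->data, r->len);
    c->len = r->len;
    return c;
}

/*
 * Return a reply to send for a retry.  If the copy succeeds, the slot
 * keeps the original for a later retry.  If the copy fails, the original
 * is handed over and the slot no longer has a cached reply.
 */
static struct reply *
slot_take_reply(struct cache_slot *sl)
{
    struct reply *orig = sl->cached;
    struct reply *copy;

    if (orig == NULL)
        return NULL;
    copy = reply_copy(orig);
    if (copy != NULL)
        return copy;
    sl->cached = NULL;
    return orig;
}
```

The fallback branch mirrors the "unless the m_copym() fails" wording: when no copy can be made, the first retry is still answered, at the cost of having nothing cached for a second one.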
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=9edaceca8165e2864267547311daf145bb520270

commit 9edaceca8165e2864267547311daf145bb520270
Author:     Rick Macklem <rmacklem@FreeBSD.org>
AuthorDate: 2021-04-11 23:51:25 +0000
Commit:     Rick Macklem <rmacklem@FreeBSD.org>
CommitDate: 2021-04-11 23:51:25 +0000

    nfsd: cut the Linux NFSv4.1/4.2 some slack w.r.t. RFC5661

    Recent testing of network partitioning a FreeBSD NFSv4.1
    server from a Linux NFSv4.1 client identified problems
    with both the FreeBSD server and Linux client.

    Sometimes, after some Linux NFSv4.1/4.2 clients establish
    a new TCP connection, they will advance the sequence number
    for a session slot by 2 instead of 1.
    RFC5661 specifies that a server should reply
    NFS4ERR_SEQ_MISORDERED for this case.
    This might result in a system call error in the client and
    seems to disable future use of the slot by the client.
    Since advancing the sequence number by 2 seems harmless,
    allow this case if vfs.nfs.linuxseqsesshack is non-zero.

    Note that, if the order of RPCs is actually reversed,
    a subsequent RPC with a smaller sequence number value
    for the slot will be received.  This will result in
    a NFS4ERR_SEQ_MISORDERED reply.
    This has not been observed during testing.
    Setting vfs.nfs.linuxseqsesshack to 0 will provide
    RFC5661 compliant behaviour.

    This fix affects the fairly rare case where a NFSv4
    Linux client does a TCP reconnect and then apparently
    erroneously increments the sequence number for the
    session slot twice during the reconnect cycle.

    PR:     254816
    MFC after:      2 weeks

 sys/fs/nfs/nfs_commonsubs.c | 23 ++++++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)
Comment on attachment 223860
cut the Linux client some slack w.r.t. session sequence#

It turns out that the Linux client intentionally does an RPC of just Sequence with the seqid advanced by 2, to test the session slot for the correct sequence#. As such, the server should conform to RFC5661, and this patch is not recommended.