Bug 245619 - BROKEN NFS SERVER OR MIDDLEWARE after resume
Summary: BROKEN NFS SERVER OR MIDDLEWARE after resume
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: CURRENT
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: freebsd-bugs (Nobody)
Reported: 2020-04-14 14:51 UTC by Edward Tomasz Napierala
Modified: 2020-04-16 17:58 UTC (History)

Description Edward Tomasz Napierala freebsd_committer freebsd_triage 2020-04-14 14:51:42 UTC
Sometimes, after resuming a FreeBSD 13-CURRENT laptop which was suspended with something mounted over NFS, the following shows in the system log:

Apr  3 00:09:15 brick kernel: nfs server tank:/tank/movies: is alive again
Apr  3 00:09:15 brick kernel: newnfs: server 'tank' error: fileid changed. fsid 0:0: expected fileid 0x4, got 0x2. (BROKEN NFS SERVER OR MIDDLEWARE)
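For anyone collecting occurrences of this warning from /var/log/messages, a minimal Python sketch for pulling the fields out of the line is shown here. The regular expression is derived only from the single log line quoted above, not from the kernel source, so treat the layout as an assumption; the function name parse_fileid_warning is hypothetical.

```python
import re

# Pattern inferred from the one warning line quoted in this report;
# other kernel versions may format the message differently.
FILEID_WARNING = re.compile(
    r"newnfs: server '(?P<server>[^']+)' error: fileid changed\. "
    r"fsid (?P<fsid>\S+): expected fileid (?P<expected>0x[0-9a-fA-F]+), "
    r"got (?P<got>0x[0-9a-fA-F]+)\."
)


def parse_fileid_warning(line):
    """Return the warning's fields as a dict, or None if the line does not match."""
    m = FILEID_WARNING.search(line)
    if m is None:
        return None
    return {
        "server": m.group("server"),
        "fsid": m.group("fsid"),
        # The fileids are logged in hex; convert them to integers.
        "expected": int(m.group("expected"), 16),
        "got": int(m.group("got"), 16),
    }
```

Lines that are not the fileid warning, such as the "is alive again" message, yield None and can be skipped.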

The server is also 13-CURRENT; it's NFSv3 backed by ZFS.

Everything seems to work just fine, though; it's just that the message seems weird.

Thanks!
Comment 1 Conrad Meyer freebsd_committer freebsd_triage 2020-04-14 16:55:10 UTC
The message indicates you got a GETATTR response for a file you didn't request.

In this case the NFS request was for "inode" (fileid) 0x4, but you got back a GETATTR response for 0x2.  Some NFS WAN caching proxy devices have a GETATTR cache, which is fine in itself, but a bug in such a cache produced corrupt responses.  Without the loud warning, it is very confusing when your NFS client mount is randomly corrupt.
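
The kind of check described above can be modelled roughly as follows. This is an illustrative Python sketch, not the FreeBSD client code: CachedNode and check_getattr_reply are hypothetical names, and the message text is copied from the log line in the description.

```python
from dataclasses import dataclass


@dataclass
class CachedNode:
    """Hypothetical stand-in for what the client remembers about a file."""
    fsid: str
    fileid: int


def check_getattr_reply(node, reply_fileid, server):
    """Return a warning string if the reply describes a different file, else None."""
    if reply_fileid == node.fileid:
        # The attributes belong to the file that was asked about.
        return None
    # The reply is for some other file: report it loudly instead of
    # silently applying the wrong attributes to the cached node.
    return (
        f"server '{server}' error: fileid changed. fsid {node.fsid}: "
        f"expected fileid {node.fileid:#x}, got {reply_fileid:#x}."
    )
```

With a cached fileid of 0x4 and a reply carrying 0x2, this model produces the same "expected fileid 0x4, got 0x2" wording as the warning in the report.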
Comment 2 Conrad Meyer freebsd_committer freebsd_triage 2020-04-14 16:56:23 UTC
(In this instance, it's obviously not a middleware issue, but it might be a client or server bug.)
Comment 3 Edward Tomasz Napierala freebsd_committer freebsd_triage 2020-04-14 20:11:26 UTC
My suspicion (a shot in the dark) is that it might be related to NFS reconnecting to the server: it only happens immediately after a long (think overnight) suspend, so any TCP connection would have timed out by then.