Summary: | Using x11/konsole causes NFS hangs | | |
---|---|---|---|
Product: | Base System | Reporter: | Kurt Jaeger <pi> |
Component: | kern | Assignee: | freebsd-kde (group) <kde> |
Status: | New | | |
Severity: | Affects Only Me | CC: | emaste, pi, rmacklem, tcberner |
Priority: | --- | | |
Version: | 13.0-RELEASE | | |
Hardware: | amd64 | | |
OS: | Any | | |
Description
Kurt Jaeger
2021-11-30 19:53:35 UTC
ktrace of the hanging processes did not show anything, but maybe I made some errors using ktrace here... A reboot of the NFS server did not change the problem. But using /usr/local/bin/konsole (installed by package konsole-21.08.3_1) triggers the problem, while using xterm works.

Hmm, looks like you are using NFSv3. If you are running rpc.lockd, be aware that the protocol it implements (NLM) is fundamentally broken. If this is the case, try either the "nolockd" mount option or "nfsv4,minorversion=1" for your mount.

If this doesn't resolve the problem, please provide the output of all of the following once the hang has occurred:
  nfsstat -m
  ps axHl
  procstat -kk
  netstat -a
  nfsstat -E -c   <-- done at least twice, about a minute apart

Also, since you know how to reproduce the problem, you can capture packets while it happens. Before the hang, run:
  tcpdump -s 0 -w out.pcap host <nfs-server>
Then reproduce the problem and kill tcpdump 1 minute after the hang has occurred. Put out.pcap here as an attachment. (Btw, if you want to look at out.pcap, use wireshark. tcpdump knows almost nothing about NFS packets, whereas wireshark can decode them nicely.)

I have not heard of a FreeBSD client hang problem. There is a known TCP layer issue, but it causes hangs when the server is running 13.0 and I do not think the problem occurs when the server is 12.3. I have heard of issues when using vnet jails, but I doubt you are doing that.

Also, hangs are often network fabric related. Disabling TSO on both client and server often helps. Or try using a different network driver/chip if that is feasible (for example, for re devices the driver in ports sometimes works better than what is in the system, depending on which re chip you have).

Oh, and if "netstat -a" shows the TCP connection to the server as ESTABLISHED with a non-zero Recv-Q while it is hung, you are probably hitting the TCP issue. See PR#254590 for what to do about it. (It is specific to 13.0 and is fixed in stable/13.)
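The "netstat -a" check described above (connection to the server ESTABLISHED, Recv-Q non-zero) can be automated roughly as follows. This is only a sketch: the function name hung_nfs_candidates, the sample addresses, and the sample output are made up for illustration, and it assumes the usual six-column TCP line layout (Proto, Recv-Q, Send-Q, local address, foreign address, state). Real output may differ, for example when host names are resolved.

```python
def hung_nfs_candidates(netstat_output, server_prefix):
    """Return (local, foreign, recv_q) for ESTABLISHED TCP connections whose
    foreign address starts with server_prefix and whose Recv-Q is non-zero."""
    hits = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed columns: Proto Recv-Q Send-Q Local Foreign State
        if len(fields) != 6 or not fields[0].startswith("tcp"):
            continue
        _proto, recv_q, _send_q, local, foreign, state = fields
        if state != "ESTABLISHED" or not recv_q.isdigit():
            continue
        if foreign.startswith(server_prefix) and int(recv_q) > 0:
            hits.append((local, foreign, int(recv_q)))
    return hits


# Hypothetical sample output; the addresses are invented for this example.
SAMPLE = """\
Active Internet connections (including servers)
Proto Recv-Q Send-Q Local Address          Foreign Address        (state)
tcp4     512      0 10.0.0.5.1017          10.0.0.1.2049          ESTABLISHED
tcp4       0      0 10.0.0.5.22            10.0.0.9.51234         ESTABLISHED
tcp4       0      0 *.22                   *.*                    LISTEN
"""

if __name__ == "__main__":
    # The trailing dot keeps "10.0.0.1." from also matching e.g. 10.0.0.19.
    for local, foreign, recv_q in hung_nfs_candidates(SAMPLE, "10.0.0.1."):
        print(f"{local} -> {foreign}: Recv-Q={recv_q}")
```

A hit from this check is only a hint that the TCP issue may be involved; the other requested outputs are still needed to confirm it.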
(In reply to Rick Macklem from comment #3) Thanks for the pointers. We're now testing with rw,soft,bg,tcp via v6 and will wait 24 hours to see whether the problem recurs.

(In reply to Kurt Jaeger from comment #5) The error did not reappear. It's unclear whether the mount option change is a good fix (and we close this PR) or whether some part of KDE, NFS, ... needs changes.

(In reply to Kurt Jaeger from comment #6) Ah, I have an idea. We have a test desktop and will try to reproduce the problem on it with the old fstab settings 8-}

(In reply to Kurt Jaeger from comment #7) If I remember correctly, we have had a long-standing issue with FAM on NFS that led to hangs (that's why it is off by default in devel/kf5-kcoreaddons). Regards, Tobias