I've upgraded my aarch64 box (a RockPro64 with 4G RAM) which serves a ZFS pool over NFS. Before the upgrade (on FreeBSD 13.2) my upload speed to the share was around 50-70MiB/s. After the upgrade to FreeBSD 14.0, it struggles to reach even 10MiB/s write. I've tried switching from NFS v3 to v4; it's even worse (6MiB/s write). The disk is a WD Gold, 10TiB, so I'm quite sure it's not the disk speed or the network speed (my router has 1G ports).

My /etc/rc.conf:

# NFS
hostid_enable="YES"
nfscbd_enable="YES"
rpcbind_enable="YES"
nfs_server_enable="YES"
nfsv4_server_only="NO"
nfsv4_server_enable="YES"
nfsuserd_enable="YES"
mountd_enable="YES"
mountd_flags="-r"
rpc_lockd_enable="YES"
rpc_statd_enable="YES"

I've bumped /etc/sysctl.conf settings to huge values:

net.inet.raw.maxdgram=262144
net.inet.raw.recvspace=1048576
net.inet.tcp.sendspace=1048576
vfs.nfsd.srvmaxio=1048576
vfs.nfsd.maxthreads=128
net.inet.tcp.rfc1323=1
net.inet.tcp.sendbuf_max=16777216
net.inet.tcp.recvbuf_max=16777216

But this improved nothing. Maybe it's even worse…

My NFS client (macOS 13+) uses these options:

mount_nfs \
  -o \
  rw,vers=4,deadtimeout=0,readahead=6,noatime,sync,async,hard,bg,intr,inet,tcp,nfc,rsize=1048576,wsize=1048576,dsize=1048576 \
  vks4.home:/Copies/VMs \
  /Users/Shared/NFS/VMs

My sequential read from NFS is as before (~50-70MiB/s), which is "okay" for that hardware. But what can I do to bring the write speed back to 50MiB/s? Did I do something wrong? Thanks
cc'ing rmacklem@
Check on server:

sysctl vfs.nfsd.srvmaxio
sysctl vfs.nfsd.maxthreads

On clients, try readahead=16.

> sync,async

Your mount options include both "sync" and "async" - these contradict each other. Which one do you actually want?
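For example, to check the live values against what you set in /etc/sysctl.conf (if I remember correctly, vfs.nfsd.srvmaxio can only be changed while nfsd is not running, so a runtime sysctl may not take effect):

sysctl vfs.nfsd.srvmaxio vfs.nfsd.maxthreads
grep vfs.nfsd /etc/sysctl.conf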
(In reply to Vladimir Druzenko from comment #2)
From the server:

vfs.nfsd.srvmaxio: 1048576
vfs.nfsd.maxthreads: 64

I've tried with 128 threads before, but that changed nothing. The server is… close to being idle during the transfer. In btop I only see "intr" and the "nfsd: server" subprocess… but both use up to 15% CPU per process. Load is 0.7.

I dropped the "sync" option on the clients and added "readahead=16". That sped the write up from 6MiB/s to 8MiB/s.

Should I consider modifying /boot/loader.conf to add "vfs.maxbcachebuf=1048576" there? It's mentioned in the 14.0 release notes.
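If I understand those notes right, that would be something like this in /boot/loader.conf (untested on my side, and only read at boot):

# /boot/loader.conf
vfs.maxbcachebuf="1048576"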
Try on client:

dd if=/dev/zero of=/Users/Shared/NFS/VMs/ZERO bs=1M count=16384 status=progress

(I don't know the correct dd command line on macOS.)
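If the GNU-style flags don't work there, macOS ships a BSD dd, so I'd expect something like this (lowercase size suffix; and Ctrl-T prints progress via SIGINFO if status=progress isn't supported):

dd if=/dev/zero of=/Users/Shared/NFS/VMs/ZERO bs=1m count=16384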
There is very little difference in the NFS server for 13.2 vs 14.0. As such, the hit is most likely a network fabric issue or a ZFS issue.

The only thing I can suggest to try is:

rsize=131072,wsize=131072

It should perform about as well as 1Mbyte, but??? If it does help a lot, there is something in the network fabric (most likely the NIC/driver) that cannot handle the burst of TCP segments well.

Did you happen to have "sync=disabled" set on your 13.2 ZFS? Setting this runs the risk of data loss when the NFS server crashes/reboots, but will help w.r.t. write performance.
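To be concrete, a sketch of both suggestions (the dataset name pool/Copies/VMs is a guess - substitute your real one, and remember the data-loss risk of sync=disabled mentioned above):

On the macOS client, remount with the smaller transfer size:
mount_nfs -o rw,vers=4,tcp,rsize=131072,wsize=131072 vks4.home:/Copies/VMs /Users/Shared/NFS/VMs

On the server, check/set the ZFS sync policy:
# zfs get sync pool/Copies/VMs
# zfs set sync=disabled pool/Copies/VMs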
Oh, and it might be worth capturing packets while writes are slow and taking a look at them in wireshark. (Unlike tcpdump, wireshark knows NFS.) Something like:

# tcpdump -s 0 -w out.pcap host <nfs-client-host>

on the NFS server, and then look at out.pcap in wireshark. (I just install wireshark on my Windows laptop. No need to bother with X windows.)

You might see error replies for NFS RPCs or TCP timeouts/retransmits that would explain the slowdown. (Or TCP reconnects. I once saw a case where the network switch would decide to inject an RST in the TCP stream, forcing the NFS client to create a new connection. Why did it do this? No idea.)
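A few wireshark display filters that should surface the interesting traffic (these should be standard wireshark field names):

tcp.analysis.retransmission || tcp.analysis.duplicate_ack
nfs.status != 0
tcp.flags.reset == 1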
(In reply to Vladimir Druzenko from comment #4)

dd if=/dev/zero of=/Users/Shared/NFS/VMs/ZERO bs=1M count=16384 status=progress
179306496 bytes (179 MB, 171 MiB) transferred 15.001s, 12 MB/s

It starts at around 20MiB/s, then slows down to ~11MiB/s. This is after some more tweaks on my side.
(In reply to Rick Macklem from comment #6) It can't be a network thing. When I download VM images from that NFS, the transfer is stable around ~55MiB/s. The issue is only when I upload/write to the share. Then it's ~5x slower.
(In reply to Rick Macklem from comment #6)
I double-checked with Wireshark; there are some "TCP Dup ACK" failures during the upload process. Example line:

802281 62.076919 192.168.0.34 192.168.0.60 TCP 78 [TCP Dup ACK 802084#97] 2049 → 54276 [ACK] Seq=442949 Ack=757712677 Win=28968 Len=0 TSval=1156306898 TSecr=465886986 SLE=757724261 SRE=757864717
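I can also check the server-side TCP counters to see whether segments are actually being lost on the write path:

# netstat -s -p tcp | grep -Ei 'retrans|out-of-order|discard'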
By network fabric I mean everything from the TCP stack down, at both ends. A problem can easily manifest itself only during writing. Writing to an NFS server is very different traffic than reading from an NFS server. I am not saying that it is a network fabric problem, just that good read performance does not imply it is not a network fabric problem.

I once saw a case where everything worked fine over NFS (where I worked as a sysadmin) until one specific NFS RPC was done. That NFS RPC (and only that NFS RPC) would fail. It turned out to be a hardware bug in a network switch. Move the machine to a port on another switch and the problem went away. Move it onto the problem switch and the issue showed up again. There were no other detectable problems with this switch, and the manufacturer returned it after a maintenance cycle claiming it was fixed. It still had the problem, so it went in the trash. (It probably had a memory problem that flipped a bit for this specific case or some such.)

Two examples of how a network problem might affect NFS write performance, but not read performance. Write requests are the only large RPC messages sent from client->server. With a 1Mbyte write size, each write results in about 700 1500-byte TCP segments (for an ordinary ethernet packet size).
-> If the burst of 700 packets causes one to be dropped on the server (receive) end sometimes... (Found by seeing an improvement with a smaller wsize.)
-> If the client/sender has a TSO bug (the most common problem is mishandling a TSO segment that is slightly less than 64Kbytes). (Found by disabling TSO in the client. Disabling TSO also changes the timing of the TCP segments, and this can sometimes avoid bugs.)

Have you tried a smaller rsize/wsize yet, as I suggested?

NFS traffic is also very different than typical TCP traffic. For example, both 13.0 and 13.1 shipped with bugs in the TCP stack that affected the NFS server (intermittent hangs in these cases).

If it isn't a network fabric problem, it is probably something related to ZFS. I know nothing about ZFS, so I can't even suggest anything beyond "sync=disabled". Since an NFS server uses both storage (hardware + ZFS) and networking, any breakage anywhere in these can cause a big performance hit. NFS itself just translates between the NFS RPC messages and VFS/VOP calls. It is conceivable that some change in the NFS server is causing this, but these changes are few and others have not reported similar write performance problems for 14.0, so it seems unlikely.
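If you want to test the TSO theory, on a FreeBSD box it can be toggled per interface with ifconfig (the interface name below is just an example - use whatever your NIC driver calls it):

# ifconfig em0 -tso     (disable)
# ifconfig em0 tso      (re-enable)

Since the writes are sent by the Mac, the client side matters more here; I believe macOS has a net.inet.tcp.tso sysctl for the same thing, but I have not verified that.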
(In reply to Rick Macklem from comment #10)
Yes, I've tried both sync=disabled (it changed nothing) and smaller r/wsize (a smaller size of ~256K offers the best throughput in my tests).

After a Saturday of hacking, I've managed to reach ~20MiB/s write and 43-50MiB/s read. It's not terrible, but I will try some more tricks and will let you know if I achieve anything.

The router issue makes more sense the more I think about it. I got mine from my ISP, and indeed I sometimes have weird network problems, so maybe that's related. I will also take a closer look at what I can do to improve this.

Thanks for your ideas :) Much appreciated.
If you are playing with network related stuff, here's a bit more (no pun intended ;-).

Look at any stats generated by both server and client NIC drivers for errors, etc. If you have a different NIC lying about, particularly if it has a different chipset in it, try it.

Look for a tunable in the NIC driver that adjusts interrupt moderation. Interrupt moderation is good for streaming traffic, not so good for NFS. Once an NFS client sends an RPC message, it waits for a response. Any delay in the reply slows it down, and interrupt moderation can delay the interrupt and, therefore, the RPC reply.

And don't forget simple stuff like cables. They can get damaged at any time.

Good luck with it, rick
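Concretely, something like this on the FreeBSD end (the dev.em tree is just an example; your NIC driver's sysctl branch will be named after its driver):

# netstat -i              (per-interface error counters)
# sysctl dev.em.0         (driver stats/tunables, if the driver exports them)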
I just played around on my old dell laptop (which is running something close to 14.0). I mounted it locally (so it is using lo0) and I see a reasonable write rate when I do:

# dd if=/dev/zero of=/mnt/xxxx bs=1M count=1000

(about 200Mbytes/sec), but if I do:

# dd if=/tmp/somefile of=/mnt/xxxx bs=1M

I see much slower writing (about 30Mbytes/sec). I'll try UFS and see if I see the slow writing there as well. I am wondering if ZFS has changed the way it does compression? (I know so little about ZFS, I don't even know how to turn compression on/off on ZFS.)

Btw, you could try using /dev/zero for input. You could also try doing a local mount on the NFS server (which gets the network out of the picture and only uses lo0).
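For your setup, the loopback test would look something like this (export path taken from your first comment; for NFSv4 the exact path depends on the V4: root line in your /etc/exports, and the zfs dataset name is a guess):

# mount -t nfs -o nfsv4 localhost:/Copies/VMs /mnt
# dd if=/dev/zero of=/mnt/ZERO bs=1M count=1000

And compression can be inspected/toggled per dataset:

# zfs get compression pool/Copies/VMs
# zfs set compression=off pool/Copies/VMs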