Bug 208130 - smbfs is slow because it (apparently) doesn't do any caching/buffering
Summary: smbfs is slow because it (apparently) doesn't do any caching/buffering
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: CURRENT
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: freebsd-fs mailing list
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-03-18 22:55 UTC by noah.bergbauer
Modified: 2018-10-20 17:29 UTC
CC: 3 users

See Also:


Description noah.bergbauer 2016-03-18 22:55:44 UTC
I set up an smbfs mount on FreeBSD 10.2-RELEASE today and noticed that it's very slow. How slow? Some numbers: reading a 600MB file from the share with dd reports around 1 MB/s, while doing the same in a Linux VM running inside bhyve on this very same machine yields a whopping 100 MB/s. I conclude that the SMB server is irrelevant in this case.

There's a recent [discussion](https://lists.freebsd.org/pipermail/freebsd-hackers/2015-November/048597.html) about this on freebsd-hackers which reveals an interesting detail: the situation can be improved massively, up to around 60MB/s, on the FreeBSD side just by using a larger dd buffer size (e.g. 1MB). Interestingly, using very small buffers has only a negligible impact on Linux (until the whole affair gets CPU-bottlenecked, of course).
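
Roughly the kind of test I mean (the mount point and file name are just examples, not my exact setup):

    # small read buffer: one SMB round-trip per 1k read, ~1 MB/s here
    dd if=/mnt/smb/bigfile of=/dev/null bs=1k
    # large read buffer: far fewer round-trips, ~60 MB/s here
    dd if=/mnt/smb/bigfile of=/dev/null bs=1m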

I know little about SMB but a quick network traffic analysis gives some insights: FreeBSD's smbfs seems to translate every read() call from dd directly into an SMB request. So with a small buffer size of e.g. 1k, something like this seems to happen:

* client requests 1k of data
* client waits for a response (network round-trip)
* client receives response
* client hands data to dd which then issues another read()
* client requests 1k of data
* ...

Note how we're spending most of our time waiting for network round-trips. Because a bigger buffer means a larger SMB request, more data moves per round-trip, which leads to higher network saturation and less wasted time.
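
A rough sanity check, assuming each request costs about 1 ms end to end (a made-up round figure covering the round-trip plus processing, not a measurement): a strictly synchronous reader can never exceed buffer size divided by that per-request latency:

    # upper bound ~= bs / latency; the 1 ms figure is an assumption for illustration
    for bs in 1024 65536 1048576; do
        awk -v bs=$bs 'BEGIN { printf "bs=%-8d -> at most %.1f MB/s\n", bs, bs / 0.001 / 1e6 }'
    done

That lines up with the ~1 MB/s I see at bs=1k, and it explains why bigger buffers help right up until the link itself becomes the limit.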

I'm unable to spot a similar pattern on Linux. Here, a steady flow of data is maintained even with small buffer sizes, so apparently some caching/buffering must be happening. Linux's cifs has a "cache" option and indeed, disabling it produces exactly the same performance (and network) behavior I'm seeing on FreeBSD.
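
For reference, the Linux-side check looks roughly like this (server, credentials and paths are placeholders; cache=none is the mount.cifs option in question):

    # disabling the client-side cache reproduces the FreeBSD behaviour:
    # small reads drop from line rate to ~1 MB/s
    mount -t cifs //server/share /mnt/cifs -o username=user,cache=none
    dd if=/mnt/cifs/bigfile of=/dev/null bs=1k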


So to sum things up: The fact that smbfs doesn't have anything like Linux's cache causes a 100-fold performance hit. Obviously, that's a problem.
Comment 1 noah.bergbauer 2018-10-06 21:00:37 UTC
Revisiting this 2.5 years later, no improvement.

But one possible workaround is to mount the smbfs share, then create an md(4) disk backed by a file on the mount, and finally mount UFS on top of that. The filesystem-level buffer management ensures that inefficiently small IO never hits the network (increase the md disk sector size as needed; 4K already gives me well over 40 MiB/s).
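
Roughly like this (the unit number, paths and size are placeholders, not my exact setup):

    mount_smbfs //user@server/share /mnt/smb                     # plain smbfs mount
    truncate -s 10g /mnt/smb/backing.img                         # backing file on the share
    mdconfig -a -t vnode -f /mnt/smb/backing.img -S 4096 -u 1    # md(4) disk with 4K sectors
    newfs /dev/md1                                               # first time only
    mount /dev/md1 /mnt/fast                                     # IO now goes through the buffer cache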

This obviously breaks file sharing, but at least it allows using Samba shares for simple remote storage (a poor man's iSCSI) with decent performance.

It's a shame there is no filesystem-level equivalent of gcache(8) (which by the way is the solution if you just need a block device instead of a filesystem).
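
For the block-device-only case, something along these lines should do (untested sketch; names are placeholders and the exact options and defaults are in gcache(8)):

    mdconfig -a -t vnode -f /mnt/smb/backing.img -u 2   # md(4) disk backed by a file on the smbfs mount
    gcache create smbcache /dev/md2                     # cached provider should appear as /dev/cache/smbcache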
Comment 2 Conrad Meyer freebsd_committer 2018-10-06 21:10:05 UTC
As I understand it, our smbfs also does not support newer SMB protocol versions (2 and 3) — and SMB1 has known security problems, so newer versions of Windows (and Samba, by default) do not speak the v1 protocol at all.  It might be best to use something other than smbfs.
Comment 3 noah.bergbauer 2018-10-06 21:42:18 UTC
(In reply to Conrad Meyer from comment #2)

>SMB1 has known security problems, [...] It might be best to use something other than smbfs.

True that. But what else is there? At least in this particular case I'm forced to choose between Samba, FTP, SFTP and WebDAV. This is all I'm given. The kernel has no SSHFS (judging by how the protocol works, performance probably wouldn't be too great either) and last I checked davfs was awfully slow. Let's not even talk about FTP. This leaves me with smbfs as my only choice.

Fun fact: The cloud storage provider in question was (as far as I remember) actually running FreeBSD on their storage servers as of ~2 years ago and probably still is today.


The handful of times I tried to play around with NFS (a few years ago) I got disappointing performance even on loopback/tap links (bhyve VM), especially considering how complicated it is to work with. Right now I'm just not confident that I could properly secure an NFS server.
Comment 4 Conrad Meyer freebsd_committer 2018-10-06 21:59:44 UTC
(In reply to noah.bergbauer from comment #3)
Yes, all of these seem like bad options unfortunately.  There is a FUSE SMB which may support newer versions of SMB and have block caching.  Even with additional round-trips between kernel and user, it may be faster than sending every single 1kB request out to the network.
Comment 5 noah.bergbauer 2018-10-10 22:38:18 UTC
(In reply to Conrad Meyer from comment #4)

If you're talking about sysutils/fusefs-smbnetfs, it maxes out around 8 MB/s.

But I wonder: why is FreeBSD smbfs capped at about 60 MB/s while Linux cifs (inside bhyve on the same machine!) easily saturates the Gigabit link (120 MB/s)? From a quick peek at the code, the maximum read size seems to be 60KB, which my measurements somewhat confirm. Synchronously transferring 60KB buffers at 60MB/s works out to roughly 1ms per request - yet in reality the network RTT is only 0.35 ms. Perhaps it takes one time slice for the reply to be processed (kern.hz=1000)?
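
The arithmetic, for reference (MB vs MiB is fuzzy here, so treat the output as approximate):

    awk 'BEGIN {
        req  = 60 * 1024       # bytes per synchronous SMB read
        tput = 60 * 1e6        # observed throughput in bytes/s
        rtt  = 0.00035         # measured network round trip in seconds
        per_req = req / tput
        printf "per-request time %.2f ms, unexplained %.2f ms\n",
               per_req * 1000, (per_req - rtt) * 1000
    }'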

It's just a bit surprising that remotely mounting a filesystem (or even just a block device) from one FreeBSD server to another is this hard.
Comment 6 noah.bergbauer 2018-10-20 17:29:16 UTC
I still don't know what causes the extra 0.65ms of latency (see my previous comment for details), but I noticed that there appears to be some per-mount synchronization going on: two IO streams on a single smbfs mount share the 60MB/s bandwidth, i.e. they get 30 each, whereas two separate mounts of the same Samba share each get the full 60MB/s. And indeed: just slap a gmultipath(8) in there and the bandwidth issue is gone!
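
Roughly (names and unit numbers are placeholders; this assumes the UFS image from the md(4) workaround above):

    mkdir -p /mnt/smb1 /mnt/smb2
    mount_smbfs //user@server/share /mnt/smb1
    mount_smbfs //user@server/share /mnt/smb2                    # second, independent mount of the same share
    mdconfig -a -t vnode -f /mnt/smb1/backing.img -S 4096 -u 1
    mdconfig -a -t vnode -f /mnt/smb2/backing.img -S 4096 -u 2
    gmultipath create -A smbmp /dev/md1 /dev/md2                 # -A = active/active, spreads IO over both mounts
    mount /dev/multipath/smbmp /mnt/fast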

This hack is quite a mess, but so far it's stable, fast, and it works astonishingly well. I'm actually planning to use this in production.