Bug 259651 - bhyve process uses all memory/swap
Summary: bhyve process uses all memory/swap
Status: Closed Overcome By Events
Alias: None
Product: Base System
Classification: Unclassified
Component: bhyve
Version: 12.2-RELEASE
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: Graham Perrin
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-11-04 20:34 UTC by Jeremy Lea
Modified: 2023-04-30 16:15 UTC
CC List: 4 users

See Also:


Description Jeremy Lea 2021-11-04 20:34:27 UTC
I've had a Windows Server running in bhyve on FreeNAS for a few years now.  It uses DFS-R to sync a few Windows file systems to my remote backup location.  The VM has several zvol-backed AHCI devices and a virtio network adapter.  It has been running (mostly) stably for a long time with adequate performance (as in, it can mostly saturate the 1 Gb link it's on, and disk speeds in the VM are as fast as I expect from the low-power backing store).  Recently I made a few changes to the VM and the host, some of which are hard to reverse, and the VM has started to consume all available RAM, then all the swap, and eventually it gets killed by the OOM handler...  A few crashes corrupted the DFS-R databases, so now the machine wants to do a huge amount of IO (both network and disk) to resync (but that's my problem).

There are other reports online of RAM exhaustion from bhyve, but I couldn't find an open bug, so I'm filing one.  My problem seemed to start on updating to TrueNAS-12.0-U5.1, but I also did some other reconfiguration around this time, and judging from the other reports, this might be a long-standing issue.

The other change I made was to mess with the CPU/RAM allocation for this VM, and I accidentally entered the total core count where the per-CPU core count goes, so I allocated more vCPUs than my CPU has threads (2xCPUs, 2xcores, 2xthreads, 8GB RAM)...  Needless to say, the VM quickly swamped the host.  However, this also caused the memory use to grow.  I've now scaled the CPUs back (1xCPU, 1xcore, 2xthreads, 6GB RAM) and the memory use is staying stable - although it's currently rebuilding a DFS-R database, so it's not maxing out the VM CPUs.
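(If it helps anyone else double-checking their allocation, the host's real topology is easy to confirm before sizing the VM with something like:

sysctl hw.ncpu kern.smp.cores kern.smp.threads_per_core

which on a 2-core/2-thread part like this i3 should report 4, 2 and 2.)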

The behavior I observe is that the memory use stays stable as long as the host CPU use is reasonable.  As soon as the host starts to max out its real cores (it's a 2xcore, 2xthread CPU) and the bhyve VM is doing a lot of IO, the memory use grows rapidly.  When the bhyve process is stopped (by shutting down the VM, if you can get in quickly enough), it takes a very long time to exit and sits in a 'tx->tx' state.  It looks like it's trying to flush buffers, although the zpool seems to show only reads while the process is exiting.  My guess is that bhyve has a huge amount of outstanding IO, but I'm not sure how to monitor that.  When the host CPU is really busy, these IO buffers are not being freed properly and eventually leak.
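For reference, some host-side numbers that should show this happening (nothing bhyve-internal; <vmname> below is just a placeholder for the VM's name):

top -b -o res | grep bhyve                                      # resident size of the bhyve process
sysctl kstat.zfs.misc.arcstats.size vm.stats.vm.v_free_count    # ARC size vs. free host memory
gstat -p                                                        # per-disk IO while the guest is busy or exiting
bhyvectl --vm=<vmname> --get-stats                              # per-VM counters (no obvious outstanding-IO number there, though)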

Around the same time as making these changes, I also turned on dedup on one of the zvols (the backups on that disk are rewritten every day, even though they're the same, so I was getting a lot of snapshot growth).  I've turned that off, but it didn't seem to change the behavior.  I also added the ZIL and L2ARC devices to the pool around this time.  I've not tried removing them.
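(For concreteness, the dedup change was just the usual property flip on that one zvol - the dataset name here is a placeholder:

zfs set dedup=on tank/backups-zvol
zfs set dedup=off tank/backups-zvol

and, as I understand it, turning it off only affects newly written blocks; anything already deduplicated stays in the DDT until it's rewritten or freed, so the off switch wouldn't immediately undo whatever memory the DDT costs.)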

The host and the VM have been set up and working for a long time, so I'm going to ignore suggestions to get a bigger box or tune my ARC values...  But I'm happy to debug it - I've been able to reproduce this relatively reliably with different CPU settings, although it does rely on Windows cooperating.  I can't mess with it too much, since I need to keep the other backups going directly via TrueNAS to the other pools ;-).

TrueNAS Server:
ThinkServer TS140, Intel(R) Core(TM) i3-4130 CPU @ 3.40GHz, 20GB RAM.
zpool: 2 striped mirrored 3TB TOSHIBA HDWD130, with mirrored 12GB ZIL and 19GB L2ARC on SATA SSDs.
TrueNAS-12.0-U6 (FreeBSD 12.2-RELEASE-p10), 25GB of swap.

Windows Server VM:
2xCPU, 1xcore, 1xthread, 6GB RAM (original, see other comments).
4xAHCI zvol-backed disks with 64K clusters, one of which had dedup on for a period as an experiment, 512B blocks. (The VM BSODs immediately if I try using virtio-blk.)
1xVirtIO NIC (em0), with 0.1.208 virtio-win drivers.
Windows Server 2019, fully patched.
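TrueNAS generates the bhyve invocation itself, but for anyone trying to reproduce this on plain FreeBSD, a hand-rolled equivalent of the above (zvol paths, tap interface and VM name are placeholders, plus whatever framebuffer/console devices you want) would look roughly like:

bhyve -c sockets=2,cores=1,threads=1 -m 6G -A -H -w \
  -s 0,hostbridge -s 31,lpc \
  -s 3,ahci-hd,/dev/zvol/tank/win-disk0 \
  -s 4,ahci-hd,/dev/zvol/tank/win-disk1 \
  -s 5,ahci-hd,/dev/zvol/tank/win-disk2 \
  -s 6,ahci-hd,/dev/zvol/tank/win-disk3 \
  -s 10,virtio-net,tap0 \
  -l bootrom,/usr/local/share/uefi-firmware/BHYVE_UEFI.fd \
  winserver2019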
Comment 1 Michael Osipov 2023-03-07 12:45:06 UTC
Out of interest, does the issue still persist with 12.4-RELEASE?
Comment 2 void 2023-03-14 23:55:15 UTC
On a 13.2-STABLE system running a few bhyve VMs, I have found that setting

vfs.zfs.arc_max=4294967296

helps. It seems to default to 0; if left at 0, RAM will be consumed by the ARC, which (in my context) leads to swapping.
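For anyone wanting to try that, something along these lines should do it (4294967296 is 4 GiB; on 13.x the sysctl can be changed at runtime, and the loader.conf line persists it across reboots):

sysctl vfs.zfs.arc_max=4294967296
echo 'vfs.zfs.arc_max="4294967296"' >> /boot/loader.conf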
Comment 3 Jeremy Lea 2023-04-25 17:21:39 UTC
(In reply to Michael Osipov from comment #1)

Sorry it's taken so long to get back to this.  I upgraded the machine in question to FreeNAS 13 (or whatever it's called today), and I don't see this anymore, even if I bump the bhyve machine to using the full host CPU (2 cores x 2 threads), so I'd say that whatever was causing it is fixed in 13.
Comment 4 Michael Osipov 2023-04-25 17:25:25 UTC
(In reply to Jeremy Lea from comment #3)

Good that it is fixed, obviously; bad that we don't know the reason :-(
Comment 5 Graham Perrin 2023-04-30 16:15:28 UTC
(In reply to Jeremy Lea from comment #3)

> … I'd say that whatever was causing it is fixed in 13.

Thanks; closing. 


(In reply to void from comment #2)

> vfs.zfs.arc_max

Others to consider include: 

vfs.zfs.arc.sys_free
<https://openzfs.github.io/openzfs-docs/Performance%20and%20Tuning/Module%20Parameters.html#zfs-arc-sys-free>

vfs.zfs.arc_free_target
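
For anyone tuning along these lines, the current values and the ARC's own view of its limits can be checked with something like:

sysctl vfs.zfs.arc_max vfs.zfs.arc.sys_free vfs.zfs.arc_free_target
sysctl kstat.zfs.misc.arcstats.size kstat.zfs.misc.arcstats.c_max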