There are many instances I've seen where enabling certain ZFS features
(such as dedup) will starve userland of all memory and/or drive the
system into swap to die a slow and painful death if not limited,
whereas with limits in place things may go slowly, but they complete
in a reasonable period of time.
Example: I was talking with a user for whom a zpool import -F took 48+
hours and then crashed because it ran out of memory, whereas after I put
in saner limits, rerunning the zpool import -F finished replaying the
ZIL within 45 minutes. And this held in both multiuser and single-user mode.
The salient point I'm raising in this ticket is that the
defaults need to be low, but smart. In particular, if the default
for vfs.zfs.arc_max were set to 50% ~ 75% of vm.kmem_size, and
vfs.zfs.arc_meta_limit to 10% ~ 20% of that, this should suffice
for most scenarios, such that userland and other kernel consumers aren't
kicked out of the running for memory. If someone has 1GB of RAM on their
box and is running a couple of TB of storage, they should go buy more
RAM or switch to UFS -- but 32GB, 48GB, or 192GB machines shouldn't tip
over in the field because the code doesn't implement sane defaults.
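To make that concrete, a hypothetical /boot/loader.conf for a machine whose vm.kmem_size works out to roughly 32G might look like the following. The values are illustrative only, picked from the 50% / 15% end of the ranges above, not recommendations:

```
# /boot/loader.conf (illustrative values, not recommendations)
# vfs.zfs.arc_max: ~50% of a ~32G vm.kmem_size (16G, in bytes)
vfs.zfs.arc_max="17179869184"
# vfs.zfs.arc_meta_limit: ~15% of arc_max (2.5G, in bytes)
vfs.zfs.arc_meta_limit="2684354560"
```

The point is only that both knobs get a ceiling well below kmem_size by default; the exact percentages are tunable per workload.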
I'd agree pretty strongly. I might be naïve, but allowing any subsystem to grab all memory by default seems an exceptionally bad idea. Default to 50% at most, and let the admin adjust it upwards if that seems appropriate and feasible.
Rather than start a new report I will jump into this old one.
Currently arc_max is set to the larger of (5/8 of kmem_size) or (kmem_size minus 1G). For a system with 8G of RAM, 5/8 gives 5G, and kmem_size minus 1G comes to about 6GB.
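That default calculation can be sketched as follows. The function name is illustrative, not the actual FreeBSD source; note that kmem_size on an 8G machine is somewhat below 8G, which is why "kmem_size minus 1G" lands around 6GB rather than 7GB:

```c
#include <stdint.h>

#define GB	(1024ULL * 1024 * 1024)

/*
 * Sketch of the current default described above: arc_max is the
 * larger of 5/8 of kmem_size and kmem_size minus 1G.
 */
uint64_t
default_arc_max(uint64_t kmem_size)
{
	uint64_t five_eighths = kmem_size / 8 * 5;
	uint64_t minus_1g = (kmem_size > GB) ? kmem_size - GB : 0;

	return (five_eighths > minus_1g ? five_eighths : minus_1g);
}
```

For a kmem_size of 7G (roughly what an 8G RAM box gets), the "minus 1G" branch wins and yields 6G, matching the numbers above.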
While that sounds OK, we also have max_wired, which seems to default to 5G (I see that on both an 8G and a 16G setup). The issue is that ARC memory is wired but not counted against max_wired, which means an 8G system allows 10G to get wired, and a 16GB system allows 20GB to be wired!
When a system has more than 70% of physical RAM wired, it gets extremely slow, and over 80% wired gives a very high chance of needing a hard reset.
At a minimum, we need to consider max_wired in the arc_max calculation.
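One possible shape for that, sketched only to illustrate the idea: cap arc_max so that ARC plus max_wired stays under the ~70% wired threshold observed above. The function and parameter names are hypothetical, not the actual FreeBSD implementation:

```c
#include <stdint.h>

#define GB	(1024ULL * 1024 * 1024)

/*
 * Hypothetical cap: keep (max_wired + arc_max) under ~70% of physical
 * RAM, since ARC wiring is not counted against max_wired itself.
 */
uint64_t
capped_arc_max(uint64_t arc_max, uint64_t max_wired, uint64_t physmem)
{
	uint64_t wired_budget = physmem / 10 * 7;	/* ~70% of RAM */
	uint64_t headroom;

	if (max_wired >= wired_budget)
		return (0);		/* no headroom left for ARC */
	headroom = wired_budget - max_wired;
	return (arc_max < headroom ? arc_max : headroom);
}
```

On the 8G example above (arc_max around 6G, max_wired 5G) this would cut ARC back to roughly 0.6G, which shows just how little headroom the current defaults actually leave once max_wired is accounted for.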
I do believe the arc_max default should be even smaller, to better suit the more common setups that run processes which use RAM. A sysadmin setting up a file server who wants to use most of the RAM for cache should know how to tune this as desired; the other 80-90% of users shouldn't have to adjust a setting that is tuned for the smaller use case.