| Summary: | tmpfs: memory reserve does not account for ARC | | |
|---|---|---|---|
| Product: | Base System | Reporter: | (intentionally left blank) <throwaway_vthgwq4> |
| Component: | kern | Assignee: | freebsd-fs (Nobody) <fs> |
| Status: | Open | | |
| Severity: | Affects Many People | CC: | markj, mizhka |
| Priority: | --- | | |
| Version: | 15.0-CURRENT | | |
| Hardware: | Any | | |
| OS: | Any | | |
|
Description
(intentionally left blank)
2024-01-17 10:02:58 UTC
Assign to Mike, the author of the change that introduced the regression. Note from me: there are far more applications and use cases affected than just x11-wm/sway.

I attempted to set up an environment to test this without sway. I created a tmpfs with no size specified, filled the ZFS ARC by reading from ZFS, then copied from ZFS to the tmpfs. The ARC was reduced in size, but probably not to its minimum, before writes to the tmpfs failed. At that point, swap was 98% full. Perhaps more could have been written to the tmpfs if it had been done more slowly, allowing the ARC to shrink, but the system was already running quite close to the edge of exhausting memory and swap and starting to kill processes.

Can someone describe a setup where tmpfs refuses writes while the system is in a safe state (e.g. where the ARC has sufficient room to shrink, and the system is not genuinely short of memory + swap)? Size parameters would be helpful, i.e. memory size, swap size, and expected capacity of the tmpfs.

I'm also curious whether a setting of vfs.tmpfs.memory_percent between 95 and 100% would allow more tmpfs usage while staying in a safe range. I haven't actually tested values other than 95% and 100%; at 100% it is very easy to get the system to kill processes and/or hang.

Hi,
I'm facing the same issue on my laptop.
Laptop specs:
- 32 GB RAM
- ZFS pool with the ARC limited to 2 GiB (vfs.zfs.arc_max="2G" in loader.conf)
- tmpfs /tmp limited to 4 GiB (size=4096MiB in fstab)
It's also worth mentioning that I use Firefox (with 1000 tabs), which consumes as much RAM as it can.
As a result, after several hours of active use, top shows the following:
Mem: 3662M Active, 11G Inact, 1038M Laundry, 13G Wired, 56K Buf, 1677M Free
ARC: 1613M Total, 724M MFU, 575M MRU, 813K Anon, 28M Header, 264M Other
871M Compressed, 2108M Uncompressed, 2.42:1 Ratio
I assume that most of the wired memory is consumed by the pgcache zones:
# vmstat -z | grep pgcache
ITEM SIZE LIMIT USED FREE REQ FAIL SLEEP XDOMAIN
vm pgcache: 4096, 0, 3588181, 3137,56579092, 18726, 0, 0
vm pgcache: 4096, 0, 511022, 2416,10129459, 254, 0, 0
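The zone usage above can be totalled with a quick pipeline. A hedged sketch that replays the sample output shown (the column order is assumed to match this system's `vmstat -z`, where the last token of the first comma-separated field is ITEM SIZE and the third field is USED):

```shell
# Sum SIZE * USED over the "vm pgcache" zones to estimate how much
# memory they hold; the heredoc replays the sample output above.
cat <<'EOF' |
vm pgcache:             4096,      0, 3588181,    3137,56579092,   18726,   0,   0
vm pgcache:             4096,      0,  511022,    2416,10129459,     254,   0,   0
EOF
awk -F',' '/pgcache/ {
    n = split($1, a, " ")         # last token of first field is SIZE
    total += a[n] * $3            # $3 is the USED column
} END { printf "%.2f GiB\n", total / 1073741824 }'
```

On the sample above this totals roughly 15.64 GiB, in the same ballpark as the 13G of wired memory reported by top.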
After a recent build{kernel,world}, I found that I can't write to /tmp even when there is free memory (as in the top output above):
- packer tries to write logs to /tmp and fails
- thunderbird fails to send emails
- firefox fails to download files
For me this is a good sign that the system is close to OOM, but it's a bit annoying and a bit unexpected.
In theory, it would be better to shrink the pgcache (or whatever else is shrinkable) to avoid errors like these tmpfs write failures.
I have another point for discussion, related to user experience. See the output below:
mizhka@tamagawa ~ % echo 1 > /tmp/packer-log3595262476
zsh: no space left on device: /tmp/packer-log3595262476
mizhka@tamagawa ~ % df -h /tmp
Filesystem Size Used Avail Capacity Mounted on
tmpfs 4.0G 284K 4.0G 0% /tmp
It's a bit hard to understand why tmpfs throws a "no space" error when "Avail" is 4.0G according to df.
Maybe it would be possible to mark the filesystem read-only instead of returning a "no space" error, to highlight the lack of memory?
Thanks,
Michael
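For anyone wanting to experiment with the threshold discussed in this thread, it can be set persistently. A minimal sketch, assuming the stock tunable name; the only values actually tested in this thread are 95 (the default at the time) and 100:

```
# /etc/sysctl.conf
# Percentage of memory + swap that tmpfs may consume before refusing
# writes. 100 masked the failure in testing, but made it much easier
# to drive the system into killing processes or hanging.
vfs.tmpfs.memory_percent=100
```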
(In reply to Michael Zhilin from comment #3) Thanks for the report. Swap space must have been nearly full; did you happen to notice? It would have been in the "top" header.

When the file system size was specified, as in this case, the free space shown by df is relative to that size; otherwise it is relative to free memory + swap. I agree that it can be confusing, especially if it changes due to other usage. I'll think about this. It should be possible to display the minimum of the two values, but that might make the file system appear to shrink (which happens when size is not specified).

I experimented yesterday with calls to the VM system, and indirectly to ZFS, reporting memory pressure. It worked to some extent, but too slowly. I will probably change the default memory_percent to 100 for now.

It looks like 2e68c5a44c40848c43849cc3650331cf82714dfa hides the problem for the time being?

(In reply to Mark Johnston from comment #5) Yes, I believe so. There is still a change present, but everyone who has tested with 100 percent said the problem was masked. I am investigating a more complete fix.

^Triage: with Bugmeister hat, reset assignee.
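The "minimum of the two values" idea above can be illustrated with made-up numbers. A hedged sketch, loosely based on figures from this thread (the free-swap value is hypothetical, standing in for the "swap nearly full" condition):

```shell
# A size-limited tmpfs reports Avail against its mount-time limit,
# while writes actually fail once free memory + swap is exhausted;
# the effective capacity is the minimum of the two.
size_limit_mib=4096          # mount option size=4096MiB, as in this report
mem_free_mib=1677            # "1677M Free" from the top output above
swap_free_mib=80             # hypothetical: swap was nearly full
avail_mib=$((mem_free_mib + swap_free_mib))
if [ "$size_limit_mib" -lt "$avail_mib" ]; then
    eff_mib=$size_limit_mib
else
    eff_mib=$avail_mib
fi
echo "df Avail (from size limit): ${size_limit_mib} MiB"
echo "effective writable space:   ${eff_mib} MiB"
```

With these numbers, df would show 4096 MiB available while writes start failing after roughly 1757 MiB, which matches the confusion described in comment #3.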