When attempting to create a large sparse file on a tmpfs file system, I get this message: "truncate: ... No space left on device"
It appears to work on a similarly sized mfs file system.
Sample test script output:
# sh t_tmpfs
=== tmpfs ===
truncate: /tmp/_trunc_/sparse: No space left on device
0 -rw-r--r-- 1 root wheel 0 Oct 14 17:24 /tmp/_trunc_/sparse
=== mfs ===
96 -rw-r--r-- 1 root wheel 2147483648 Oct 14 17:24 /tmp/_trunc_/sparse
Test script follows:
# example values; the original report did not include the variable
# definitions (DIR, TFILE, and TSIZE are recoverable from the output above)
DIR=/tmp/_trunc_; TFILE=sparse; TSIZE=2g
SSIZE=10m; TUNIT=9

mkdir $DIR || exit 1
echo "=== tmpfs ==="
mount -t tmpfs -osize=$SSIZE /tmp $DIR
truncate -s$TSIZE $DIR/$TFILE
ls -ls $DIR/$TFILE
umount $DIR
echo "=== mfs ==="
mount -t mfs -o-s=$SSIZE md$TUNIT $DIR
truncate -s$TSIZE $DIR/$TFILE
ls -ls $DIR/$TFILE
umount $DIR
mdconfig -du $TUNIT
rmdir $DIR
Well, tmpfs does support sparse files.
What you reported is indeed wrong behavior, but it is caused not by a supposed lack of support for sparse files; it is caused by incorrect code that tries to avoid OOM situations arising from over-committing the backing for tmpfs files. See sys/fs/tmpfs_subr.c, the function tmpfs_pages_check_avail(), and the functions tmpfs_mem_avail() and tmpfs_pages_used() referenced from there.
In particular, tmpfs_mem_avail() is completely wrong: it misinterprets v_free_count.
I think that tmpfs should only check the current page usage of the specific mount point against the per-mount-point limit, if any. Trying to derive a limit from some formula involving v_free_count and other VM metrics cannot work, because of how the VM algorithms behave. The main reason is that we support paging memory out to backing store, so v_free_count indicates wasted memory (as opposed to free or freeable memory in the common sense of the word).
Created attachment 187197 [details]
Certainly not my intention to mislead with the bug subject!
I have no idea how to approach a fix.
This is probably related to tmpfs's reluctance to use VM: in order to use tmpfs as /tmp and /usr/obj for a kernel build on a read-only RPI3, the attached patch adds a vfs.tmpfs.inactive_percent sysctl that "allows" tmpfs to use X% of inactive memory (default 0%). Without it, builds _may_ fail.
(In reply to Keith White from comment #3)
The tmpfs_mem_avail() function should die. tmpfs_mount.tm_pages_max should be enough; it already allows the administrator to limit the memory usage per mount point (mount -o size).
(In reply to Konstantin Belousov from comment #4)
After some dtracing (and head scratching), I believe I see a solution to my problem: use resident_page_count instead of the file size. It will complicate tmpfs_write(), though...
Created attachment 187384 [details]
POC for using resident_page_count
This patch allows me to use large files with holes, e.g.:
# mkdir /tmp/_x
# mount -t tmpfs -osize=10m tmp /tmp/_x
# truncate -s4g /tmp/_x/4g
# ls -ls /tmp/_x/4g
0 -rw-r--r-- 1 root wheel 4294967296 Oct 22 19:51 /tmp/_x/4g
# df /tmp/_x
Filesystem 1K-blocks Used Avail Capacity Mounted on
tmpfs 10240 4 10236 0% /tmp/_x
# du /tmp/_x
# umount /tmp/_x
# rmdir /tmp/_x
(In reply to Keith White from comment #5)
object->resident_page_count is equally meaningless for your purposes. What is not clear in my comment #4?
The patch does not need any of these dynamic calculations using values that are meaningless for this purpose. If you want to limit a tmpfs mount's memory use, specify an explicit limit to mount_tmpfs. Attempts to misuse pagedaemon or object-internal counters will not work out; their purpose is very different, and they do not fit resource limiting for tmpfs.
I read it but probably misunderstood! If I create a file with holes, I don't want to be "charged" for them until the holes are filled in. I may want to create a disk image file, say, that is 4g when I know that I will only store 20m in total. FFS-like filesystems allow this; tmpfs has a more straightforward idea of a file (allocate up-front, no hole management?). A "BUGS" section in tmpfs(5) would have guided me away from attempting to use tmpfs for sparse files. Otherwise tmpfs is an excellent fit for me, since I'm running diskless+swapless.
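For contrast, the FFS-like behavior can be demonstrated on any filesystem with hole support (a minimal sketch; the path and sizes are arbitrary examples, not from this report):

```sh
# A 4 GiB file with nothing written: the apparent size is 4 GiB, but the
# blocks column of ls -ls and the du output stay near zero, because the
# holes are not backed by storage and are not "charged".
f=$(mktemp /tmp/sparse.XXXXXX)
truncate -s 4G "$f"
ls -ls "$f"    # first column (blocks) is ~0; size column is 4294967296
du -k "$f"     # ~0 KiB actually allocated
rm "$f"
```

This is the accounting that the thread says tmpfs currently lacks: truncate should extend only the file size, with pages charged as data is written.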
I'll re-read your comments. I see a drawing-board over there that I should get back to...
(In reply to Keith White from comment #8)
Let me explain it in full:
1. tmpfs should not try to use the current counters of the active/inactive/free queues, since they are irrelevant to the system's ability to satisfy page requests. If a page is needed, the queues are scanned and a usable page might appear even if there are no free pages or all swap space is used (e.g. we can write out a dirty file page, or reuse a clean file page).
2. tmpfs should provide a global limit on the number of used pages; in fact it already has one, -o size. The limit is compared against tm_pages_used, a maintained counter of the supposedly used pages.
3. Your problem is that tm_pages_used is too harsh. It simply sums up all file sizes, while it really should count only the file pages that were actually written to. In other words, instead of adjusting tm_pages_used in tmpfs_reg_resize(), it should be adjusted in tmpfs_write(). [There is an additional complication, see below.]
4. tmpfs_mem_avail() should be removed. See item 1.
The complication is due to tmpfs using in-place mapping: the vm object that contains the pages with the file data directly provides the pages used for file mappings. This is a highly desirable feature, because it avoids duplicating memory for mmapped tmpfs files and makes mmap zero-copy.
The problem is that page faults on a sparsely allocated mmapped file range instantiate the file pages, which must be accounted against tm_pages_max. The tmpfs vm objects are already flagged, so this is not too hard to do; just note that you cannot limit the patch to fs/tmpfs only.
This is my current opinion on the issue; I hope it is clear enough.
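The accounting proposed in item 3 above (charge pages in tmpfs_write(), not in tmpfs_reg_resize()) corresponds to what hole-supporting filesystems already expose to userspace; an illustrative sketch (the path and sizes are made-up examples):

```sh
# Writing 20 MiB into the middle of a 4 GiB sparse file allocates only
# the written pages: the apparent size stays 4 GiB while the usage
# reported by du grows by just ~20 MiB.
f=$(mktemp /tmp/holes.XXXXXX)
truncate -s 4G "$f"
dd if=/dev/zero of="$f" bs=1048576 count=20 seek=100 conv=notrunc 2>/dev/null
ls -ls "$f"    # size column still 4294967296
du -k "$f"     # ~20480 KiB: only the pages actually written
rm "$f"
```

Under the proposal, tm_pages_used would track the du-like number (plus pages instantiated by mmap faults), rather than the sum of all file sizes.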
(In reply to Konstantin Belousov from comment #9)
Yes, this helps to clear things up. Very many thanks!