During a compilation of FreeBSD 12.1 for the Raspberry Pi 2B, the build failed at the file /usr/src/contrib/googletest/googlemock/test/gmock-matchers_test.cc with an out-of-memory condition. Further analysis revealed that it takes more than 1.5 GB of memory to build this file, far too much for my puny computer. This makes it impossible to finish a FreeBSD build and, given the lack of binary upgrade options, makes it very difficult for me to upgrade to FreeBSD 12.1. Please find out what causes this pathological memory usage and make it possible to build FreeBSD on a machine with no more than 1 GB of RAM, as it was before. Not that 1 GB of RAM (mostly due to having to build clang) isn't already an annoyingly high memory requirement for upgrading a system from source.
I'm doing a buildworld for amd64 13-CURRENT in a FreeBSD 12.1 guest on a macOS VirtualBox host, with 2.4 GB of RAM assigned to the FreeBSD guest. I have 7 GB of swap. The build gets to the Build Everything stage; when it reaches gmock-matchers_test.cc, the process is killed with an out-of-swap signal.

swapinfo:

Device               1K-blocks     Used    Avail Capacity
/dev/ada0p2            2097152    11524  2085628     1%
/dev/zvol/zroot/swap   5242880    12012  5230868     0%
Total                  7340032    23536  7316496     0%

How much swap does this file need to compile? swapon complains if I try to add more.
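One way around swapon's complaints, if the existing partitions and zvols are exhausted, is file-backed swap. This is a sketch of the standard swapon(8)/fstab(5) mechanism; /usr/swap0 and md99 are assumed names, and the file would first be created (e.g. with dd) and given 0600 permissions:

```
# Hypothetical /etc/fstab entry for file-backed swap on FreeBSD.
# An md(4) device is created automatically from the backing file;
# "late" delays activation until filesystems are mounted, since
# the backing file lives on one of them.
md99	none	swap	sw,file=/usr/swap0,late	0	0
```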
(In reply to Dave Evans from comment #1)

If you get console messages something like (extracted example):

. . . kernel: pid 7963 (strip), jid 0, uid 0, was killed: out of swap space

note that this message's detailed wording is frequently a misnomer. Do you also have any messages of the form:

. . . sentinel kernel: swap_pager_getswapspace(32): failed

If yes: you really were out of swap space. If no: you were not out of swap space, or at least it is highly unlikely that you were.

FreeBSD kills processes for multiple potential reasons. For example:

a) still low on free RAM after a number of tries to increase it above a threshold;
b) slow paging I/O;
c) . . . (I do not know the full list) . . .

Unfortunately, FreeBSD is not explicit about which category of problem leads to the kill activity that happens. You might learn more by watching how things are going via top or some such program or other way of monitoring. You will likely find the swap space not low.

Below are some notes about specific tunables that might or might not be of help. (There may be more tunables that can help that I do not know about.)

For (a), there is a way to test whether it is the issue by adding to the number of tries before the kernel gives up and starts killing things. That will either:

1) let it get more done before kills start;
2) let it complete before the count is reached; or
3) make no significant difference.

(3) would imply that (b) or (c) are involved instead. (1) might be handled by having it do even more tries. To delay how long persistently low free RAM is tolerated, one can increase vm.pageout_oom_seq from 12 to something larger. I have less experience with managing slow paging, but there are some notes about it below.

The examples that follow are what I use in contexts with sufficient RAM that I do not have to worry about running out of swap/page space. I've set these in /etc/sysctl.conf. (Of course, I'm not trying to deliberately run out of RAM.)
#
# Delay when persistent low free RAM leads to
# Out Of Memory killing of processes:
vm.pageout_oom_seq=120

I'll note that figures like 1024 or 1200 or even more are possible. This controls how many tries at regaining sufficient free RAM are made before that low level stops being tolerated; after that it starts Out Of Memory kills to get some free RAM. No figure makes the delay unbounded, but there may be figures large enough to effectively delay it beyond any reasonable time to wait.

As for paging I/O (WARNING: all the tunables below are specific to head (13), or were last I checked):

#
# With plenty of swap/paging space (will not
# run out), avoid pageout delays leading to
# Out Of Memory killing of processes:
vm.pfault_oom_attempts=-1

(Note: in my context "plenty" really means sufficient RAM that paging is rare. But others have reported using the -1 in contexts where paging was heavy at times, and OOM kills that had been happening were eliminated by the assignment.)

I've no experience with the below alternative to that -1 use:

#
# For possibly insufficient swap/paging space
# (might run out), increase the pageout delay
# that leads to Out Of Memory killing of
# processes:
#vm.pfault_oom_attempts= ???
#vm.pfault_oom_wait= ???
# (The multiplication is the total, but there
# are other potential tradeoffs in the factors
# multiplied, even for nearly the same total.)

I'm not claiming that these 3 vm.???_oom_??? figures are always sufficient, nor that tunables are always available that would be sufficient, nor that it is easy to find the ones that do exist and might help for specific OOM kill issues. I have seen reports of OOM kills for other reasons even when both vm.pageout_oom_seq and vm.pfault_oom_attempts=-1 were in use. As I understand it, FreeBSD does not report what kind of condition led to the decision to do an OOM kill. So the above notes may or may not help you.
These notes aside: this memory usage is far from the norm for compiling a C++ source file. I believe there must be a bug in clang or llvm, or some unfortunate design in the source file itself, that causes this memory usage. This is more than twice the highest memory usage I had observed before (roughly 800 MB for one of the X86 instruction selection files in the LLVM source), and unlike that case, there is no apparent explanation for the memory usage here.
(In reply to Mark Millard from comment #2)

I should have noted: if a process stays runnable, FreeBSD does not stop it and swap it out, but instead just pages it. (For FreeBSD, swapping basically seems to mean that the kernel stacks were also moved out to swap space and would have to be brought back in for the process to run.) Thus, one or more processes that use large amounts of memory relative to the RAM size but also stay runnable are not stopped and swapped out to make room. In such a context, if free RAM stays low despite other efforts to gain some back, processes are then killed instead. vm.pageout_oom_seq controls how many attempts are made to gain more free RAM before the kills start.
(In reply to Dave Evans from comment #1)

Was this a -j1 build? Something larger? (There could be other things worth reporting about the build context.)

It can be interesting to watch the system with top sorted by resident RAM use, decreasing (top -o res). It can give a clue about the major things contributing to low free RAM while you watch. I'm not sure one can infer much from which process FreeBSD decides to kill; other evidence is likely better about what is contributing to the sustained low free RAM that likely led to the process kill(s).

To Robert: I've been replying mostly to Dave because it has been a significant time since I've experimented with a 1 GiByte machine for buildworld buildkernel and the like. Dave indicated over 2 GiBytes for his context. You could try vm.pageout_oom_seq=1200 and a -j1 build and see if it helps you. Reporting the result here might be useful.

Actually, you indicated "upgrade to FreeBSD 12.1", so your context is apparently older, such as 12.0. I'm not sure if vm.pageout_oom_seq goes back that far. That might leave you only with -j1 (which, for all I know, you might already have been using).
(In reply to Mark Millard from comment #5) Looks like vm.pageout_oom_seq goes back to 10.3.0-RELEASE so experiments with it on a 12.0-RELEASE based system should be possible.
Thanks for all the useful comments.

I've now set kern.maxswzone=42949664, which as far as I can tell from loader(8) is the value to be used for a theoretical 8 GB of swap. I've configured 4 GB of swap and rebooted. I then ran a stress test of three simultaneous compilations of the offending file and monitored the system with top.

Each job peaked at size: 1500M, resident: 600M. Swap usage peaked at 75%, or 3054M. The three jobs took 30 minutes to complete, as I would expect. There were no out-of-swap messages, which is good.

The initial problem was that the default kern.maxswzone was set way too low; it is not something I've ever tweaked before. It was probably not allowing more than 1 GB or less of swap.

This experience has taught me to read the output of dmesg more frequently and studiously. It also helps to read the man pages.
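For anyone else hitting this limit: kern.maxswzone is a boot-time tunable, so it goes in /boot/loader.conf rather than /etc/sysctl.conf and takes effect at the next boot. A fragment using the value quoted above (see loader(8) for how the value scales with the amount of swap to be made usable):

```
# /boot/loader.conf
# Raise the swap-metadata zone so a larger swap total is usable
# (value from this report, sized for a theoretical 8 GB of swap):
kern.maxswzone=42949664
```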
(In reply to Mark Millard from comment #5)

I was not specifying any value for make -j. The virtual machine is set up to use one CPU core.
In reply to comment #5 of Mark:

That was a -j1 world build with swap enabled. In a separate SSH session, I watched the clang process spike to 1.5 GB (with some 700 MB resident, not sure) before it got killed. I was eventually able to get the compilation to run through by temporarily configuring extra swap space, but it was a real pain to do.

Please though: I am extremely sure this is a compiler bug or a poorly designed program, not a configuration issue. Tweaking VM settings will not solve the underlying issue, which is probably a memory leak or something similar. And if people have to perform arcane tweaks to be able to upgrade their system at all (as no upgrade path other than upgrading from source is supported on ARM32), that's really bad news for people who actually want to run FreeBSD on their ARM boxes. Please solve the underlying issue. I want a solution, not a bandaid.
(In reply to Robert Clausecker from comment #9)

> I am extremely sure this is a compiler bug or poorly designed program

You may or may not be right. I do not have the knowledge to know what is appropriate to expect for the test case. I do not know what reasonable memory use figures would be. I do not know which side of that "or" should get the blame (or if either should). I've no clue just what would need to change in either (if anything).

> Please solve the underlying issue. I want a solution, not a bandaid.

I am not an llvm developer, nor a Google Test developer. (I've only done a few small patches for FreeBSD, generally for personal use. So, effectively, I'm not a FreeBSD developer either: a user.)

llvm is an upstream project used by FreeBSD but not developed by FreeBSD. There is a:

https://bugs.llvm.org/

That would be a more appropriate place for requesting a fix or redesign on the compiler side of things. If they improved things, FreeBSD would pick up the change at some point.

Google Test is an upstream project used by FreeBSD but not developed by FreeBSD. It looks like it uses:

https://github.com/google/googletest/issues

for submitting and tracking issues. That would be a more appropriate place for requesting a fix or redesign on the Google Test side of things.

I've no evidence for which place would be the right one to submit something. I do know that FreeBSD's Bugzilla is not the right place for upstream project changes. (Although having a FreeBSD Bugzilla item pointing to an upstream item for reference can be appropriate at times.)
It's not just ARM. I ran into the same bug on amd64 trying to build a release of stable/12 at r358079. I'm going to set WITHOUT_GOOGLETEST=1 in src.conf as a workaround.
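For reference, that workaround is a one-line knob in src.conf(5). The WITH_/WITHOUT_ build options are tested for being defined, so the exact value assigned does not matter:

```
# /etc/src.conf
# Skip building googletest (and its tests, including
# gmock-matchers_test.cc) during buildworld:
WITHOUT_GOOGLETEST=1
```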
Just to add to the examples of what it takes to build and link gmock-matchers_test . . .

In /usr/src/lib/googletest/gmock_main/tests/ I tried building gmock-matchers_test on an Orange Pi+ 2ed (armv7 Cortex-A7 with 2 GiBytes of RAM and 1740Mi of swap/paging space). The context is head -r358132.

I use a modified version of top that keeps track of its sampled "Maximum Observed" figures: MaxObsActive, MaxObsWired, MaxObs(Act+Wir), and, for swap use, MaxObsUsed (if any). It is also biased to present more digits (smaller unit size) and to be explicit about the powers-of-2 factors in use for memory size display.

After finishing in somewhat over 20 minutes (under 25?), the odd variant of top was showing:

1019Mi MaxObsActive, 193444Ki MaxObsWired, 1146Mi MaxObs(Act+Wir)
Swap: 1740Mi Total, 1740Mi Free

That spans the link as well. So swap/paging space was not observed to be used, but it clearly would have been on a 1 GiByte machine. Similarly, free RAM was never observed to be low, but it would have been on a 1 GiByte machine.

An example aarch64 is a Rock64 (not Pro) with 4 GiBytes of RAM:

1753Mi MaxObsActive, 633084Ki MaxObsWired, 2368Mi MaxObs(Act+Wir)
Swap: 4608Mi Total, 4608Mi Free

(It shows a lot more Wired even without the build, just because of the larger amount of RAM.)

So, even just looking at MaxObsActive, a 1 GiByte RAM machine would be paging/swapping and a 2 GiByte machine would likely do some as well (far less). There is a significant MaxObsActive difference between the armv7 and aarch64 contexts. It would be interesting to see what a 2 GiByte aarch64 would be like.
(In reply to Mark Millard from comment #12)

Adding a Pine64+2G example (so 2 GiBytes of RAM on aarch64, again head -r358132 based):

1682Mi MaxObsActive, 278228Ki MaxObsWired, 1845Mi MaxObs(Act+Wir)
Swap: 3584Mi Total, 3584Mi Free

It did not use swap, but it looks like it was fairly close to doing so.

Note: It is expected that MaxObs(Act+Wir) <= MaxObsActive + MaxObsWired. The figures on the right-hand side need not come from similar time frames, but the left-hand side is from figures sampled in comparatively similar time frames. Plus just the math: the maximum of a sum is at most the sum of the maximums.
(In reply to Mark Millard from comment #13)

I should have noted that I used my normal settings for controlling the criteria for kills due to Out Of Memory and related issues:

#
# Delay when persistent low free RAM leads to
# Out Of Memory killing of processes:
vm.pageout_oom_seq=120
#
# With plenty of swap/paging space (will not
# run out), avoid pageout delays leading to
# Out Of Memory killing of processes:
vm.pfault_oom_attempts=-1

(That last one may only be for head, but the first has been around for longer.)
This particular source file is indeed a rather pathological case. On my 13.0-CURRENT test system, using clang 10.0.0-rc3 (with assertions enabled), it takes a maxrss of 1982620 KiB, so ~1936 MiB, to compile with -O2. GCC 9.2.0 from ports fares even worse: it takes about 20% more time to compile and a maxrss of 2684812 KiB, so ~2622 MiB. I also tried the clang90 port, which has assertions disabled; it takes a maxrss of 1755320 KiB, so ~1714 MiB.

For now, my advice would be to compile this file with -O1, or even -O0, as it seems to be an internal test for googletest itself, and not something that we actively need to have heavily optimized.
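A sketch of what that advice could look like as a local patch, assuming the per-source-file flag support in FreeBSD's share/mk (bsd.sys.mk). The Makefile path is the test directory mentioned elsewhere in this report, and the exact variable name should be checked against the mk files of the branch in use:

```
# Hypothetical addition to lib/googletest/gmock_main/tests/Makefile:
# compile just the pathological file with -O1 instead of the default -O2.
CXXFLAGS.gmock-matchers_test.cc+=	-O1
```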
(In reply to Dimitry Andric from comment #15)

I'll note one possibility: jemalloc behavior may be contributing. (Not that I have specific evidence one way or the other.)

QUOTING (although I changed the top-post order to bottom posting) . . .

On Thu, Jan 9, 2020 at 1:45 PM Bryan Drewery <bdrewery at freebsd.org> wrote:
>
> Do you plan to get this back in soon? I hope to see it before 12.2 if
> possible. Is there some way I can help?
>
> I'm interested in these changes in 5.2.1 (I think)
> - Properly trigger decay on tcache destroy. (@interwq, @amosbird)
> - Fix tcache.flush. (@interwq)
> - Fix a side effect caused by extent_max_active_fit combined with
>   decay-based purging, where freed extents can accumulate and not be
>   reused for an extended period of time. (@interwq, @mpghf)
>
> I have a test case where virtual memory was peaking at 275M on 4.x, 1GB
> on 5.0.0, around 750M on 5.1.0, and finally 275M again on 5.2.0. The
> 5.0/5.1 versions appeared to be a widespread leak to us.

. . .

I think it's fine to get jemalloc 5.2.1 in again now. The previous fails were due to ancient gcc421. Now the in-tree gcc has been removed and the default compiler of non-llvm platforms are all using gcc6 from ports. The CI environment has also been updated to follow the current standard. I've tested a patch that combines r354605 + r355975 and it builds fine on amd64 (clang10) and mips (gcc6).

Best,
Li-Wen
(In reply to Mark Millard from comment #12)

I had an RPi3 based on head -r358966 do a buildworld buildkernel of the same version, from scratch, -j4 style. The RPi3 is a 1 GiByte RAM context. I had a 3072 MiByte swap partition. That, and the ufs file system, were on a USB SSD, not the microsd card. The build completed without any /var/log/messages or console output during the build.

My modified version of top reported (details copied from an ssh window):

For Mem: 738512Ki MaxObsActive, 190608Ki MaxObsWired, 906372Ki MaxObs(Act+Wir)
For Swap: 1927Mi MaxObsUsed

(top was started before the build. "MaxObs" is short for "Maximum Observed".)

The build took a few minutes under 31 hrs. The build used (the PINE64 media are also set up to boot the RPi3, explaining some naming):

vm.pageout_oom_seq=120
vm.pfault_oom_attempts=-1

vfs.root.mountfrom="ufs:/dev/gpt/PINE642Groot"
dumpdev="/dev/gpt/PINE642Gswp2"

/dev/gpt/PINE642Groot / ufs rw,noatime 1 1
/dev/gpt/PINE642Gswap none swap sw 0 0

(So this avoided the microsd card for ufs and swap/page space.)

Overall, it looks like having more than 2 GiBytes of swap partition is appropriate for -j4: 1927 MiBytes is not much less than 2048 MiBytes. But, with appropriate configuration anyway, the RPi3 can do buildworld buildkernel for head 13, even -j4 style.

This was aarch64. An armv7 system with 1 GiByte of RAM does not allow as much swap/page space without complaining at boot, so it does not appear that such a -j4 build would be appropriate for armv7. But I've not investigated what would fit.
(In reply to Mark Millard from comment #17) Poor wording: "the PINE64 media are also set up to boot the RPi3, explaining some naming" Better: my PINE64 media are also set up to boot the RPi3, explaining some naming Note: This works because the dd based PINE64+ 2GB material and the msdosfs based RPi3 materials do not interfere with each other and can both be in place. After that, FreeBSD need not care which it is.
(In reply to Mark Millard from comment #17)

A 1 GiByte RAM armv7 test . . .

I tested an RPi2 V1.2 based on armv7 head -r359427 as the context (self-hosted, from-scratch build), using -j2 with an 1800 MiByte swap partition on the 1 GiByte RPi2. vm.pageout_oom_seq=120 and vm.pfault_oom_attempts=-1 and a USB SSD and the like again, avoiding the microsd card after the kernel loads.

The 1800 MiByte swap avoided boot notices of the form:

warning: total configured swap (... pages) exceeds maximum recommended amount (... pages).

I stayed somewhat under the recommended maximum. (stable/12 reportedly lists a smaller recommended maximum when its figure is exceeded, somewhat more than 1200 MiByte.)

The build completed fine, with my odd top variant showing "Maximum Observed" figures:

Mem: 758544Ki MaxObsActive, 189972Ki MaxObsWired, 928060Ki MaxObs(Act+Wir)
Swap: 527388Ki MaxObsUsed

But it turned out that the high memory use time frame for gmock-matchers_test.cc coincided with very low memory use activity elsewhere in the build. So the 527388Ki MaxObsUsed is on the low side for figuring the margin needed to cover -j2 variability in what the paired activity might be. Other pairings could easily have used over 700 MiBytes more (say, linking clang), and so have reached the realm of 1400 to 1500 MiBytes of swap, leaving, say, 400 to 300 MiBytes unused. (I happened to be there to watch the top display over the period of time at issue, seeing the growth to 527388Ki MaxObsUsed.)

I'd not push it to -j3 for armv7 FreeBSD with 1 GiByte RAM. Having swap fairly near (but under) the recommended maximum seems appropriate for -j2. Appropriately configured, -j1 seems unlikely to be a problem for 1 GiByte RAM stable/12 (swap space contributing, vm.pfault_oom_attempts=-1 contributing, vm.pageout_oom_seq=120 contributing). I've not experimented with the more problematic microsd cards instead of the particular USB SSDs that I have around.
In the microsd card context, vm.pfault_oom_attempts=-1 is likely appropriate to keep paging-activity latency from leading to OOM kills.

For reference: the build took somewhat less than 38 hrs.

I will note that stable/12 supports vm.pfault_oom_attempts=-1 as of -r351776 (2019-Sep-3) and has supported vm.pageout_oom_seq for much longer.
(In reply to Mark Millard from comment #19)

It looks like head -r366850 will make parallel build activity more likely while gmock-matchers_test.cc is building, increasing the peak attempted memory use for -j2 overall. It might be that for 1 GiByte armv7 contexts, -j1 will effectively be required if gmock-matchers_test.cc is to be part of the build. -j2 for 1 GiByte aarch64 contexts with sufficient page/swap space may page/swap heavily over this period. Use of the tuning controls to avoid OOM kills would seem to be required for such contexts.
See also Alex's https://reviews.freebsd.org/D26751, which is supposed to lower the CPU and RAM requirements.
https://reviews.freebsd.org/D26067 may also be of interest as it stops building this test by default.