I have a VM with 1GB RAM and 3GB swap. I could compile the -current kernel without problems in the last months. Now the builds fails with "out of swap space" . A change in the last 10 days broke the build of the kernel. Sad. I do not remember that we make the decision to require to have more than 1GB RAM to build a kernel. Please fix this. Thank you.
The last output of make buildkernel is: ctfconvert -L VERSION -g vers.o linking kernel.full ctfmerge -L VERSION -g -o kernel.full ... I see in top(1) that ctfmerge is huge, 250MB rss. Then the terminal freeze. I cannot provide more information, because the kernel kills all monitoring tools (ssh, screen, top, munin-node etc.)
Maybe try lowering the -j level? Add more swap? Trim your kernel configuration file so it uses less devices? Turn off debug information? Etc, etc.
The VM has only one CPU. I do not use make -j, it does not make sense on a single CPU machine. How much swap space should I add? 10GB, 20GB? I do not want to tune the kernel, the default kernel should just compile (GENERIC or GENERIC-NODEBUG).
(In reply to Wolfram Schneider from comment #3) Also, as I understand it, with only 1 GiByte of RAM assigned FreeBSD will self limit the swap space, reporting somethings like (not necessarily with matching figures): warning: total configured swap (524288 pages) exceeds maximum recommended amount (405460 pages). [Note the above is from a RPi2B V1.1 with 1 GiByte of RAM (armv7). An RPi3B (aarch64) allows far more but still limits it.] It will also say something like: warning: increase kern.maxswzone or reduce amount of swap. but looking at the code that "increase kern.maxswzone" only applies if one has already set kern.maxswzone to a figure that limited the amount of swap. The default value of zero makes kern.maxswzone be otherwise ignored. (But i386 seemed to have a non-zero default, if I remember correctly.) Quoting "man 8 loader" : kern.maxswzone . . . Note that swap metadata can be fragmented, which means that the system can run out of space before it reaches the theoretical limit. Therefore, care should be taken to not configure more swap than approximately half of the theoretical maximum. . . . It turns out that the "maximum recommended amount" earlier is that "half". But the theoretical maximum is not always the "8 times the amount of physical memory" that man page indicates (or anywhere near even 6 times). (The RPi2B V1.1 is an example of that.)
(In reply to Wolfram Schneider from comment #3) Possibly of interest is some recent Out Of Memory handling related work, see: https://reviews.freebsd.org/D13671 and its associated check-ins.
Usually my machine has 300MB free memory, and 200MB inact memory. cc(1) needs less than 50-100MB RES memory, and c++(1) less than 100-200MB RES memory. I saw that ctfmerge(1) grows up to 690 MB RES memory for a short period of time.
Did you check that your kernel configuration doesn't have WITH_CTF=1? I agree such ever increasing memory requirements are bad, but there's not much to do, other than add more swap.
(In reply to Pedro F. Giffuni from comment #7) Relative to increasing swap on a RPi2B V1.1 and its 1 GiByte of RAM, I'll reference: https://lists.freebsd.org/pipermail/freebsd-hackers/2018-January/052165.html which is an example of pfault/vmwait mixure hangups during buildworld when top shows: Swap: 2048M Total, 52M Used, 1996M Free, 2% Inuse The RPi2B V1.1 was getting the report: warning: total configured swap (524288 pages) exceeds maximum recommended amount (405460 pages). So, the hangup is apparently an example of the "man 8 loader" note: Note that swap metadata can be fragmented, which means that the system can run out of space before it reaches the theoretical limit. Therefore, care should be taken to not configure more swap than approximately half of the theoretical maximum. ctfmerge is not the only place that people have trouble with self-hosting based on 1 GiByte of RAM for a 32-bit architecture.
(In reply to Mark Millard from comment #8) I probably should have noted that with the much larger swap that a 1 GiByte RPi3 (aarch64) allows without the complaint about "exceeds maximum recommended amount", I'm not aware of examples of the problems I and others have had with RPi2B V1.1's. (I also experiment with an RPi3.) Apparently, there is an advantage to the infrastructure for larger addresses, even with just the same amount of RAM (1 GiByte). Note: I've had pfault and vmwait mixture hangups building ports on the RPi2B V1.1. It is not just buildworld and buildkernel. I've not seen such for the RPi3 for the swap size I use there, somewhat under what would produce a "exceeds maximum recommended amount" notice.
I run a test to find the first commit which breaks the buildkernel on my VM. I run the tests von 2 VMs, one with 1GB swap and the other VM without swap, but with a 3GB swap file. The first bad commit is: 86c1e7ab7b74d489659c9e4f3332a6d5f8bc9690 r327949 | pfg | 2018-01-13 22:30:30 all following commits are bad, and the next good commit is 9026a8ae0a17bbcad9b415afd5386f0d15688e4f r327953 | pfg | 2018-01-14 00:31:34 I got the same results on both machines.
Created attachment 189930 [details] Drop size attributes from mallocarray Sorry .. and thank you for your extraordinary diligence with this issue. Before I revert everything (which I will), can you test this patch?
(In reply to Pedro F. Giffuni from comment #11) The gnu documentation uses a distinct notation for the 2 arguments multiplied case, not just listing each separately: QUOTE void* my_calloc(size_t, size_t) __attribute__((alloc_size(1,2))) void* my_realloc(void*, size_t) __attribute__((alloc_size(2))) declares that my_calloc returns memory of the size given by the product of parameter 1 and 2 and that my_realloc returns memory of the size given by parameter 2. END QUOTE It indicates that the information is used for improved __builtin_object_size correctness. https://clang.llvm.org/docs/AttributeReference.html also lists such, with some notes on limitations on __builtin_object_size use. Might there be something analogous/appropriate for __alloc_size for the two size argument case? (Looks like __alloc_size in FreeBSD translates to a __attribute__((alloc_size(. . .))) use.)
Hi Mark; (In reply to Mark Millard from comment #12) Our current notation is correct: I actually prefer the documented GCC notation, but I had to change it after a long discussion about old compilers not supporting VA_ARGS. See the commit log for r280801.
For the record: The function of mallocarray(9) is panicing as soon as an overflow is detected. The attributes, OTOH, try to detect obvious cases of an overflow at compile time. I prefer to prevent as much as possible any overflow but avoid panics so I think it's better to keep the attributes. I don't plan to drop the attributes, I am just curious if they are causing the excessive swap. I am preparing to revert r327828, r327949, r327953, r328016-r328026, r328041, which vastly reduce the mallocarray calls in the kernel.
Created attachment 189933 [details] mallocarray(9) use revert patch For reference.
(In reply to Pedro F. Giffuni from comment #13) Okay. It is interesting that a sequence of individual attributes gets an implicit multiply to produce a correct __builtin_object_size value. Just for reference for anyone reading: From cdefs.h : #if __GNUC_PREREQ__(4, 3) || __has_attribute(__alloc_size__) #define __alloc_size(x) __attribute__((__alloc_size__(x))) #else #define __alloc_size(x) #endif #if __GNUC_PREREQ__(4, 9) || __has_attribute(__alloc_align__) #define __alloc_align(x) __attribute__((__alloc_align__(x))) #else #define __alloc_align(x) #endif So. . . __alloc_size(1) __alloc_size(2) translates to (when not a pair of no-ops): __attribute__((__alloc_size__(1))) __attribute__((__alloc_size__(2))) Side note: If that produces an accurate __builtin_object_size value, I wonder why gcc bothered to create the notation (by example): __attribute__((__alloc_size__(1,2))) and only document that notation for the issue. I've not managed to find anything about it. My curiosity need not be anyone's action-item. I do not expect that you would happen to know.
A commit references this bug: Author: pfg Date: Sun Jan 21 15:42:42 UTC 2018 New revision: 328218 URL: https://svnweb.freebsd.org/changeset/base/328218 Log: Revert r327828, r327949, r327953, r328016-r328026, r328041: Uses of mallocarray(9). The use of mallocarray(9) has rocketed the required swap to build FreeBSD. This is likely caused by the allocation size attributes which put extra pressure on the compiler. Given that most of these checks are superfluous we have to choose better where to use mallocarray(9). We still have more uses of mallocarray(9) but hopefully this is enough to bring swap usage to a reasonable level. Reported by: wosch PR: 225197 Changes: head/sys/amd64/amd64/bpf_jit_machdep.c head/sys/arm/xscale/ixp425/if_npe.c head/sys/arm64/arm64/busdma_bounce.c head/sys/cam/cam_queue.c head/sys/cam/ctl/ctl_frontend.c head/sys/compat/ndis/subr_ndis.c head/sys/dev/aacraid/aacraid.c head/sys/dev/advansys/advansys.c head/sys/dev/ath/if_ath_rx_edma.c head/sys/dev/beri/virtio/virtio.c head/sys/dev/bnxt/if_bnxt.c head/sys/dev/bwn/if_bwn.c head/sys/dev/bwn/if_bwn_phy_lp.c head/sys/dev/ciss/ciss.c head/sys/dev/cxgbe/crypto/t4_crypto.c head/sys/dev/e1000/if_em.c head/sys/dev/esp/ncr53c9x.c head/sys/dev/fb/splash.c head/sys/dev/gpio/gpiobus.c head/sys/dev/if_ndis/if_ndis.c head/sys/dev/iwi/if_iwi.c head/sys/dev/ixl/if_ixlv.c head/sys/dev/ixl/ixl_pf_iov.c head/sys/dev/ixl/ixl_pf_main.c head/sys/dev/kbd/kbd.c head/sys/dev/liquidio/base/lio_request_manager.c head/sys/dev/liquidio/lio_main.c head/sys/dev/mpr/mpr.c head/sys/dev/mpr/mpr_mapping.c head/sys/dev/mps/mps.c head/sys/dev/mps/mps_mapping.c head/sys/dev/mpt/mpt_cam.c head/sys/dev/mrsas/mrsas.c head/sys/dev/mxge/if_mxge.c head/sys/dev/netmap/if_ptnet.c head/sys/dev/nvme/nvme_ns.c head/sys/dev/pst/pst-iop.c head/sys/dev/ral/rt2560.c head/sys/dev/ral/rt2661.c head/sys/dev/rp/rp.c head/sys/dev/rp/rp_isa.c head/sys/dev/rp/rp_pci.c head/sys/dev/sound/midi/midi.c head/sys/dev/sound/pci/hda/hdaa.c head/sys/dev/syscons/fire/fire_saver.c head/sys/dev/virtio/console/virtio_console.c head/sys/dev/virtio/mmio/virtio_mmio.c head/sys/dev/virtio/network/if_vtnet.c head/sys/dev/virtio/pci/virtio_pci.c head/sys/dev/vmware/vmxnet3/if_vmx.c head/sys/dev/vnic/nicvf_queues.c head/sys/dev/xen/blkback/blkback.c head/sys/dev/xen/blkfront/blkfront.c head/sys/fs/nfsclient/nfs_clvnops.c head/sys/geom/uzip/g_uzip_zlib.c head/sys/gnu/dev/bwn/phy_n/if_bwn_phy_n_core.c head/sys/i386/i386/bpf_jit_machdep.c head/sys/i386/i386/k6_mem.c head/sys/kern/init_main.c head/sys/kern/kern_cpu.c head/sys/kern/kern_ctf.c head/sys/kern/kern_pmc.c head/sys/kern/subr_bus.c head/sys/kern/subr_taskqueue.c head/sys/kern/subr_vmem.c head/sys/mips/mips/busdma_machdep.c head/sys/mips/nlm/dev/sec/nlmrsa.c head/sys/net/if_vlan.c head/sys/net/iflib.c head/sys/netgraph/ng_bridge.c head/sys/netgraph/ng_deflate.c head/sys/netgraph/ng_parse.c head/sys/netinet6/in6_jail.c head/sys/powerpc/pseries/phyp_vscsi.c head/sys/x86/cpufreq/est.c
Can we close this issue?
I can compile r328218 without problems. Many thanks!
(In reply to Pedro F. Giffuni from comment #18) That -r328218 (the reversion) makes the problem disappear for Wolfram needs to be explained in my view. It is far from obvious that it should change swapping behavior. That is does suggests that the pre-reversion context got at least one large size allocation that now is again not nearly as large. (No evidence of having more allocations, only the sizes seem likely to vary.) It might be appropriate to try subsets of the original list of changes to try to isolate which one(s) matter. (Painful exploration.) (For the above wording I'm presuming that use of: __alloc_size(1) _alloc_size(2) is not tied into the problem and that likely the experiments would be done via a new __alloc_size2(1,2) being used, once a new macro is created, at least for the testing. So the first experiment might be if the problem is even reproducible after such an adjustment.)
(In reply to Mark Millard from comment #20) Well I am assuming the machine failed due to lack of swap space and not due to an overflow in some call to mallocarray. If a runtime overflow had been triggered during the build process, it would be present in other build machines, independent of the swap size. It would have also been clearly traceable. As I wrote it would have been interesting to see if it was the attributes that were causing the issue but ultimately I prefer to deal with such overflows at compile time so I think the attributes must stay.
(In reply to Pedro F. Giffuni from comment #21) Okay. It is just that, if I understand right, the amount of swap space is the same both ways and it has to be additional use of more swap space in order for Wolfram's context to run out of swap space when mallocarray is in use but not when malloc is used directly. (And mallocarray just calls malloc and returns what malloc returns, leaving the additional interface size_t's and such as what is different.) In other words: my notes were based on what is varying vs. what is not varying (as far as I know).
(In reply to Mark Millard from comment #22) Well, exactly what happened is a mistery to me. Both malloc and mallocarray are C functions so it shouldn't be much different to use either by the compiler. I am only especulating that because mallocarray has two alloc_size attributes, while malloc only has one, the compiler is adding a bunch more intermediate code and choking with checks. Maybe there is something to report to upstream (llvm) but I am not sure what ;). mallocarray(9) is rather weird since its only useful if it causes panics: we never detected any specific panic, so the checks were not all that useful and we can live without them. I had another set of malloc--> mallocarray changes that will now go to the bitbucket ... but at least the exrcise was useful to detect some variables that could be unsigned to reduce casting.
(In reply to Wolfram Schneider from comment #19) If you are into an abundance of caution, -r328238 has introduced some of the signed -> unsigned type changes that -r328218 reverted. While mallocarray use has not be reinstated where it was reverted, the still-existing use now has __alloc_size2(n,s) on mallocarray and others with 2 parameters that are multiplied to get the overall size ( -r328221 and -r328237 ). (No longer a pair of separate __alloc_size(x) 's.) So you might want to check that the build problem has not returned. (I do not know why the code that was reverted by -r328218 caused what it did.)
(In reply to Mark Millard from comment #24) I'm busy this week. I takes a lot of time to run the buildkernel on a small VM. I will run the tests again next week, and report back.
Created attachment 190022 [details] malloc(9) --> mallocarray(9) diff Here is a mallocarray(9) conversion patch, in case someone feels like playing with it.