lang/gcc11 is now the default version of gcc in ports, as of git cad2f66a384d74cffa870d978722e6fd579c7d2f. On updating, the build failed because /tmp ran out of space. At the time, /tmp was a 2GiB tmpfs, backed by 32GiB of swap. stable/13 at git dac438a9b599cfec13b463a1a8d90283f7ed830f, amd64. Growing /tmp in 1GiB steps (with reboots between), the build finally finished successfully with a 5GiB /tmp. (4GiB was insufficient.) I'm not sure this is really reasonable for just a toolchain; at the very least, this SHALL be warned about, like www/chromium does in its pre-everything:: target. Is this because of the LTO_BOOTSTRAP option (the default on amd64)? Note that only 1.7MiB of /tmp was in use after the lang/gcc11 build finished, and lang/gcc10 with default options built fine with a 2GiB /tmp.
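For reference, the /tmp tmpfs size is controlled by the size= option in /etc/fstab; a minimal sketch of the line that ended up working here (flags per the usual tmpfs(5) conventions):

  tmpfs  /tmp  tmpfs  rw,mode=1777,size=5g  0  0

(A reboot, or a remount of /tmp, is needed for the new size to take effect.)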
Rust already solved a similar problem. Here is the commit that solved the issue there: https://cgit.freebsd.org/ports/commit/lang/rust/Makefile?id=b1670e2c3d42a2aeacff843ef0ccea21c0929d03
I'm not sure moving TMPDIR to WRKDIR is a good idea. I'd generally be wary of putting /tmp on tmpfs in low-memory environments. I like the idea of a warning from www/chromium more. Ideally, we would have something like https://gitweb.gentoo.org/repo/gentoo.git/tree/eclass/check-reqs.eclass
Very nice together with WRKDIRPREFIX=/tmp/work in make.conf :-D Several years ago I created a 16GB /tmp especially for building lang/rust, but it requires 17GB now! :-o
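For anyone wanting to copy this setup, the make.conf side is just the following (the path is my choice; anything under the tmpfs works):

  # /etc/make.conf
  WRKDIRPREFIX=/tmp/work    # port WRKDIRs land on the /tmp tmpfs

Combined with a tmpfs-backed /tmp, a reboot wipes all build leftovers.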
(In reply to Lorenzo Salvadore from comment #1) Is this a poudriere(-devel) based build? If so, what is USE_TMPFS set to in /usr/local/etc/poudriere.conf? I'll note that if USE_TMPFS includes the workdir (for example, when set to "yes"), something like rust can use 20+ GiBytes of tmpfs. The cited rust change does not solve the problem it was trying to solve in general, only for a limited range of build styles. poudriere.conf also has:

# List of package globs that are not allowed to use tmpfs for their WRKDIR
# Note that you *must* set TMPFS_BLACKLIST_TMPDIR
# EXAMPLE: TMPFS_BLACKLIST="rust"

The TMPFS_BLACKLIST_TMPDIR location needs sufficient free disk space, even though this avoids RAM+SWAP being needed. I avoid large tmpfs use in poudriere bulk runs by using USE_TMPFS="data".
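To make that concrete, a sketch of the poudriere.conf variants discussed above (the gcc* glob and the blacklist path are illustrative choices, not recommendations):

  # /usr/local/etc/poudriere.conf
  USE_TMPFS=data          # what I use: tmpfs only for poudriere's data, not WRKDIR
  # Alternative: keep tmpfs WRKDIRs but exempt the known disk hogs:
  #USE_TMPFS=yes
  #TMPFS_BLACKLIST="rust gcc*"
  #TMPFS_BLACKLIST_TMPDIR=/usr/local/poudriere/blacklist-wrkdirs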
(In reply to Mark Millard from comment #4) portupgrade for me. It serializes port build order, so parallel builds are only within each single port.
(In reply to VVD from comment #3) For parallel build systems, something like TMPDIR=/tmp/portbuild/lang_gcc11/ would be reasonable, to ease deleting the leftovers of broken builds. I'm using tmpfs for /tmp, so a simple reboot cleans everything. But if someone has a dedicated physical partition for /tmp, cleanup could be difficult in certain cases if everything is directly under /tmp.
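A sketch of how that could look in /etc/make.conf, since make(1) parses it (the path scheme is just an illustration; the directory must exist before the build starts):

  .if ${.CURDIR:M*/lang/gcc11}
  TMPDIR=/tmp/portbuild/lang_gcc11
  .endif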
(In reply to Piotr Kubaj from comment #2) Interesting. If each port could state minimum requirements (default: unlimited, to ease porting small ones) for TMPDIR etc., Mk/bsd.port.mk would be able to check whether there is enough space and stop before extraction and the later stages start. If we set some limited default instead, we would need to consider local builds on small embedded computers.
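Purely as a hypothetical sketch of that idea -- the TMPDIR_MIN_MB knob and the target name are invented here, nothing like this exists in Mk/ today:

  # In the port's Makefile (hypothetical):
  #   TMPDIR_MIN_MB=5120
  # In Mk/bsd.port.mk, hooked in before extract (hypothetical):
  .if defined(TMPDIR_MIN_MB)
  check-tmpdir-space:
  	@avail=$$(df -m ${TMPDIR:U/tmp} | awk 'NR==2 {print $$4}'); \
  	if [ "$${avail}" -lt "${TMPDIR_MIN_MB}" ]; then \
  		echo "===> ${TMPDIR:U/tmp} has $${avail}MiB free, ${TMPDIR_MIN_MB}MiB needed"; \
  		exit 1; \
  	fi
  extract: check-tmpdir-space
  .endif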
(In reply to Lorenzo Salvadore from comment #1) It would be the simplest option. But it could possibly make smaller ports build slower, if WRKDIR ends up on a slower drive than TMPDIR. It's a trade-off. To make this safer for ports that BUILD_DEPENDS on gcc11 (including the USE_GCC and USES=compiler cases), the default configuration should live in Mk/bsd.gcc.mk or Mk/Uses/compiler.mk.
(In reply to Tomoaki AOKI from comment #5) Sort of like telling poudriere to use just one builder, such as via -J1. I'm not familiar with portupgrade. It may not have anything analogous to poudriere.conf's USE_TMPFS.
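For the record, that would be along the lines of:

  poudriere bulk -J 1 -j 13amd64 lang/gcc11

where 13amd64 is just a placeholder jail name.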
I am adding to CC the person who pointed me to the rust commit, as well as the commit author. Maybe they are able to help, or might find this conversation useful for improving the rust solution. By the way, I confirm that I also had similar problems in poudriere, which I fixed with USE_TMPFS=data. But I can't remember if it was with gcc11. Probably not, especially because I do not build packages for my system; I only test specific ports and fetch their dependencies with the -b flag.
(In reply to Tomoaki AOKI from comment #0)
> Is this because of LTO_BOOTSTRAP option (default of amd64)?

I think you might be right about that. Have you tried compiling the port with STANDARD_BOOTSTRAP instead? What about without any bootstrap option? See also bug #261977, which contains a reference to commit https://cgit.freebsd.org/ports/commit/?id=aadf6428cc480fbeda72ec90d53ef340e95f49ca that recently introduced LTO_BOOTSTRAP as the default.

(In reply to Piotr Kubaj from comment #2)
If the issue is indeed LTO, we could add a warning as you suggested. I think it would be nice to introduce it as a pkg-help file, which is displayed when choosing options, and which should explain that, if the machine is not powerful enough, the default options should be changed to disable LTO_BOOTSTRAP.

On the other hand, we could also change the default options, as you already did on some architectures. We would lose the optimization in prebuilt packages, but I don't know how much it is worth: I think we risk that many people who compile their ports with poudriere without modifying port options would get into trouble... Is the performance improvement from LTO really significant? If not, I would give it up for the sake of convenience.

Another possibility could be separate ports or flavors: one without LTO and one with LTO. But maintaining all the versions of GCC that we already have seems complex enough; I don't think it is wise to increase complexity.
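For anyone testing along: the option can be flipped interactively with 'make config' in lang/gcc11, or non-interactively in /etc/make.conf. A sketch, assuming the usual lang_gcc11 options name derived from the port origin:

  # /etc/make.conf
  lang_gcc11_UNSET=LTO_BOOTSTRAP
  lang_gcc11_SET=STANDARD_BOOTSTRAP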
(In reply to Lorenzo Salvadore from comment #11) FreeBSD's port build servers are not used for native armv7 or armv6 build activity: qemu is used instead. lang/gcc* have the issue that, after the stage built by system clang, the remaining stages no longer use system clang to cross compile: just qemu. (armv7 and armv6 also have address space limitations in contexts where the hardware can directly execute the code, not that official builds use such hardware.) LTO_BOOTSTRAP based builds targeting armv7/armv6 never worked on those builders, as I understand it. Disabling the LTO_BOOTSTRAP style for armv7/armv6 was what finally allowed the FreeBSD servers to build the lang/gcc* armv7/armv6 ports that had tried to use LTO_BOOTSTRAP. (There is more to the story of what alternative was selected, as I understand it, but that is not relevant here.)

As always, the default OPTIONS are the definition of what the official port build servers build --and those definitions are not based on what people who choose to do their own builds select for OPTIONS. (Note: I build my own.) The OPTIONS allow avoiding the LTO bootstrap. (And I do avoid LTO_BOOTSTRAP.)
(In reply to Mark Millard from comment #12) Thanks Mark, I agree that default options should not be based on what users choose as their options when they build their own ports. However I wonder:

- if it makes sense to select default options that do not work on many machines. Such defaults will create issues for many users, and some of them will not be able to figure out how to solve the problem. Some will ask for help and we risk repeating the same answer many times.

- if it is correct to assume that the port build servers can build anything. I think they can indeed compile GCC with LTO (maybe they have already done it), but how much time do they need? We have lots of ports; having the servers stuck on a port that could be compiled much faster with different default options might not be a good idea, as it could slow down the package creation process too much.

Do we have some tool to check how much time the port build servers spend on specific ports? I tried https://pkg-status.freebsd.org/ , but I could not find anything.

In the meantime, I tested building GCC 11 without any bootstrap option and with STANDARD_BOOTSTRAP: I had very reasonable times in both cases. I still have to check how GCC 11 behaves with LTO_BOOTSTRAP in my case.
(In reply to Lorenzo Salvadore from comment #13) The point of the defaults is only for the FreeBSD build server activity, so just those machines. I'd expect the defaults to be tailored to work only for that context. Setting defaults otherwise involves too many unknowns, too many conflicting goals, too wide a variety of machines, etc. for the build server activity to remain readily/well managed. A means of control (OPTIONS) has been given. (One can also work in a way that varies the Makefiles themselves.)

Yes, it does mean that understanding how to do one's own builds involves more than if someone had already done the tailoring based on something that one just happens to find fits one's desires for building ports. (I actually experimented with "bulk -a -c" to learn enough to not have to worry much about builds failing for resource limitations, spanning 8 GiByte RPi4Bs through a 1st generation ThreadRipper 1950X as builders. On very rare occasion I start a "bulk -a -c" test in one of the contexts to see if I should adjust things. That is not the only type of rare test.)

Measuring and comparing the times of individual builds produced via parallel builders, each allowed to have parallel processes, is very problematical. The easier you make individual comparisons, the more likely you are causing significant extra idle "freebsd cpu" time. The more time spent making progress on something whenever something could be worked on, the harder it is to compare individual times in a useful way. (Personally, I use criteria that lead to high load averages compared to the "freebsd cpu" count most of the time. This makes individual comparisons very messy. But the total "bulk -a -c" time is still very useful. Smaller but specific subsets can also serve. The official servers are biased differently in how they are handled.)

https://pkg-status.freebsd.org/?all=1&type=package can be used to look at ongoing and past production of packages from ports. (It is just a place to find other pages about specific builds and see some summary information.) This is not an appropriate place to go into detail, but a recent "bulk -a -c" sort of run for main targeting amd64 is visible via: http://beefy18.nyi.freebsd.org/build.html?mastername=main-amd64-default&build=pb790baec9029_s70b5d8fa0f I got to it by starting from https://pkg-status.freebsd.org/?all=1&type=package .
(In reply to Mark Millard from comment #14) Relative to poudriere, I should have mentioned poudriere.conf and other poudriere configuration files along with my mentioning OPTIONS.
(In reply to Mark Millard from comment #14) Thanks Mark, those links are exactly what I was searching for. I read there that our package build servers took 8 hours to build gcc11, which is 6 hours more than was necessary for gcc10, for which LTO_BOOTSTRAP is not available. Of course, this is only for one version (CURRENT) and one architecture (amd64): LTO_BOOTSTRAP is the default on 6 architectures and at the moment we support 3 releases. So LTO_BOOTSTRAP by default probably increases package build time by a few days (distributed over all the machines building packages; I don't know how many there are). Is that really acceptable? Also, please keep in mind that everything we are discussing for gcc11 also concerns gcc11-devel, so all measurements should be multiplied by 2.

Moreover, we have to think about exp-runs: is it acceptable to increase exp-run build times by 6 hours per jail? I am unsure. I add antoine@ to CC, so we can ask him directly. I would also like to ask some package build server maintainers, but I am unsure whom I should ask.

Unless the LTO optimization is really significant (and I fear it is not), I would disable it by default for the sake of more efficient build machines and faster exp-runs. If there is no agreement on this, I think at least explaining the issue in pkg-help is necessary.
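A rough back-of-the-envelope from those numbers, assuming the amd64 delta applies everywhere (it will not, exactly, especially for the qemu-built architectures):

  6 extra hours x 6 architectures x 3 releases x 2 ports (gcc11 + gcc11-devel)
  = 216 extra hours, i.e. about 9 days of builder time per round of updates,

spread across the build machines, and not counting gcc12/gcc12-devel/gcc13-devel.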
(In reply to Lorenzo Salvadore from comment #11) Very roughly tested results:

- With both bootstrap options disabled, the build succeeded with 2GiB of /tmp (my previous working size).
- STANDARD_BOOTSTRAP also allowed the build with 2GiB of /tmp.
- With LTO_BOOTSTRAP (the default), at least 5GiB of /tmp is required. (reproduced)

Sorry, build times were not measured. Note that I saw free space on /tmp drop to 195MiB with LTO_BOOTSTRAP and a 5GiB /tmp. Possibly less free space was left at some point, but in any case, 5GiB was sufficient for me.
(In reply to Lorenzo Salvadore from comment #16) The FreeBSD port build servers use MAKE_JOBS_NUMBER=2 in their make environment, but LTO_BOOTSTRAP does not fully respect that process count limit within its builder. If I remember right, main-amd64 uses 14 builders. (There is no ongoing build, so I cannot check.) There is also the question of how many threads per process (at times, anyway). MAKE_JOBS_NUMBER=2 is visible in the logs.

Pretend, for the moment, that MAKE_JOBS_NUMBER=2 was fully respected, with only one thread per process. That would mean 28 processes on a system with 32 "freebsd cpus" (16 hyper-threaded cores). In this kind of context, building lang/gcc1[1-3]* (so 5 ports) in separate builders at the same time would take more like 8 hours elapsed than 40 hours elapsed. (I ignore that lang/gcc11* takes less time than lang/gcc12* or lang/gcc13-devel [if I remember right].) And other ports would also be built during those same 8+ hours: 23 other builders would be available and probably active much of the time.

Things are actually more complicated, and I cannot run such a build on the system to see what would actually happen. But it seems likely that fewer than 40 hours would elapse, and other ports would build over the time interval(s) during which at least one lang/gcc1[1-3]* builder was still active. The general point would apply even if lang/gcc1[2-3]* took more like 12+ hours, though the detailed numbers would be different.
(In reply to Mark Millard from comment #18) Gack. I mixed up builder counts and process counts in one place in my example. Correcting: And other ports would also be built during those same 8+ hours: 9 (== 14-5) other builders would be available and probably active much of the time.
(In reply to Lorenzo Salvadore from comment #16) Another interesting build-time figure from the build I used as an example: 74:30:50 elapsed (a little over 3 days) for building 31767 ports (plus 111 failed; the rest of the 32460 queued were skipped or ignored). Those 3 days or so include building all of lang/gcc* , devel/llvm* , lang/rust , and some other ports that take even longer than these do (individually, not in total across a compiler family). Only one server is used to do the build (beefy18 for the example).

Of course, the timescale would be different for, say, arm64 than for amd64. An example for arm64 is: 143:10:05 with 30694 built, 127 failed. So more like 6 days. arm64 is built on native hardware (ampere2 for the example), not via qemu.
A few thoughts here:

- Should build cluster defaults possibly be refactored into a separate cluster-specific configuration? (That will obviously have to be documented and publicly available so we stand a chance of analyzing pkg-fallout mail.)

- Should there be a tuning guide, revisited, say, yearly, based on configuration data we have or on polling users, to decide what the default port settings should look like? Say, "what did people usually buy five or four years ago", which should cover most, while those who run written-off stuff may occasionally tweak. (*)

- Or do we need to offer a template file for make.conf with a handful of commented-out sections that, taken together, cover most end-user machines?

I know I have had fierce discussions about optimization for graphics/rawtherapee, which is default-tuned for run-time performance, and one user was pissing at me in an egoistic local-optimization style ("but I don't need it") and generalizing from there -- arguing from truly obsolete hardware, like a decade old, and that I should not rely on GCC (that was at a time when base clang was not up to snuff the way it is today, but let's leave that aside). We will not be able to cater for those in the builders or port defaults either.

(*) This includes my low-end rental virtual root-access server, too, which has 1 Xeon core and 2 GB RAM, but I try to avoid building the big stuff there. I only get upset when default Python 3 stops building there. ;-)
I have the impression that I am the only one worried about package build server load, so I would say that, at least for now, writing a pkg-help is enough. We can always find a better solution later if needed. I have created a Phabricator review with a pkg-help draft: https://reviews.freebsd.org/D35688

(In reply to Mark Millard from comment #18)
> And other ports would also be built during those same 8+
> hours: 23 other builders would be available and probably
> be active much of the time.

If all remaining builders are active, it probably means that the builders making gcc could be used to make something else if compilation were quicker. On the other hand, if some builders are inactive, then they are inactive because they have gcc as a dependency, and then we have a bottleneck.

(In reply to Matthias Andree from comment #21)
> - should build cluster defaults possibly be refactored to a separate cluster-specific configuration (that will obviously have to be documented and publicly available so we stand a chance of analyzing pkg-fallout mail)

I think it is a good idea, but I don't see it happening: it would make maintenance and testing more complex for too little gain. Do we have other cases where default options are good for the package build servers but often bad for users building their own ports? (By often, I mean often enough to have reports of users failing their compilations. In this case we have this PR and some more cases on EFNet.)

> - should there be a tuning guide which gets revisited, say, yearly, from configuration data we have, or from polling with users, to decide what the default port settings should look like? Say, "what did people usually buy five or four years ago" which should cover most, and those who run written-off stuff may occasionally tweak.

We have plenty of different ports, each with its own particular cases. I think we can have some general guidelines, as we already have, but choosing the best combination of default options is the maintainer's responsibility.
(In reply to Lorenzo Salvadore from comment #16) FYI: a beefy18 main-amd64 build has started and it actually uses 13 builders. See: http://beefy18.nyi.freebsd.org/build.html?mastername=main-amd64-default&build=p4cf95047288c_s836d47d38e

It was showing a load average of around 24.4 (at 51%) when I looked. So it appears to have 48 FreeBSD "cpus" (24 hyperthreaded cores?). And 24.4 with MAKE_JOBS_NUMBER=2 per builder means around 12 or 13 builders active, if nearly all are using 2 active processes via make. (Some ports only use 1.) That matches the list of 13 that is shown.

So the load average is only about half of the FreeBSD "cpu" count: not a bias toward minimizing time-to-completion. (But some ports do not respect MAKE_JOBS_NUMBER=2 in their load average contribution, so at times more than half may be in use.) With the configuration avoiding load averages larger than the FreeBSD "cpu" count, the port build times are closer to being independent, more like they would be if built separately. Thus comparing individual port build times against other configurations that do likewise is useful. (But comparing usefully against a high load average configuration [from lack of such use of MAKE_JOBS_NUMBER] would be much messier.)
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=320e9debc3c3b4a90292a9aa29f139be9df00f40

commit 320e9debc3c3b4a90292a9aa29f139be9df00f40
Author:     Lorenzo Salvadore <salvadore@FreeBSD.org>
AuthorDate: 2022-07-01 09:40:20 +0000
Commit:     Lorenzo Salvadore <salvadore@FreeBSD.org>
CommitDate: 2022-07-07 22:56:01 +0000

    lang/gcc>=11: Warn about LTO_BOOTSTRAP

    Warn users about the amount of ram and time needed to build GCC
    with LTO_BOOTSTRP enabled.

    PR:             264949
    Reported by:    Tomoaki AOKI <junchoon@dec.sakura.ne.jp> and others
    Reviewed by:    gerald
    Differential Revision: https://reviews.freebsd.org/D35688

 lang/gcc11-devel/pkg-help (new) | 3 +++
 lang/gcc11/pkg-help (new)       | 3 +++
 lang/gcc12-devel/pkg-help (new) | 3 +++
 lang/gcc12/pkg-help (new)       | 3 +++
 lang/gcc13-devel/pkg-help (new) | 3 +++
 5 files changed, 15 insertions(+)
I have committed the warning. Hopefully this will be helpful.
What is the point of LTO bootstrap? The bootstrap compiler is built in order to build gcc later, in the second phase. LTO increases the build time of the bootstrap by hours, and later saves maybe ~10 minutes during the gcc build. So what is the point of having LTO_BOOTSTRAP=ON? What does it achieve? Yuri
(In reply to Yuri Victorovich from comment #26) When a FreeBSD package builder updates a lang/gcc11, how many machines is that update installed to? On those machines, how many times is that lang/gcc11 used before the machine gets its next lang/gcc11 update? For such usage, how much was gained in total from how lang/gcc11 was built on the FreeBSD package builder? Does this make FreeBSD's package builder use of LTO_BOOTSTRAP an overall gain for folks who do not build their own lang/gcc11 toolchain updates?

Of course, this is very different from, say, my context: the resource/time usage of lang/gcc11 between lang/gcc11 updates, which I build via poudriere, normally totals well less than the build resource/time usage of a single lang/gcc11 LTO_BOOTSTRAP style build. (Use of lang/gcc* by ports I build is rare.) (This applies to whatever lang/gcc* I use; gcc11 is just a potential example.) For my context, I normally have both LTO_BOOTSTRAP and STANDARD_BOOTSTRAP disabled for all my lang/gcc* build activities, at least where the 2 *_BOOTSTRAP options exist. So no bootstrap at all. The build is vastly faster and otherwise less resource intensive than with either *_BOOTSTRAP option, which is reasonable for my specific usage pattern.

Nothing forces one to use LTO_BOOTSTRAP for one's own builds of lang/gcc11. It is avoidable by configuration activity. (The defaults are set up for the FreeBSD package build servers.)
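For completeness, in my setup that amounts to something like the following in /etc/make.conf, one line per lang/gcc* actually built (the options-name prefix follows the usual origin-to-underscore mangling):

  lang_gcc11_UNSET=LTO_BOOTSTRAP STANDARD_BOOTSTRAP

poudriere users can get the same effect per port via 'poudriere options lang/gcc11' and unticking both options in the dialog.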
(In reply to Mark Millard from comment #27) Maybe I am confused, but to me LTO_BOOTSTRAP is LTO for the bootstrap phase. This speeds up the second phase, and not gcc itself. It is unclear if the second phase uses LTO at all.
(In reply to Yuri Victorovich from comment #28) Building a lang/gcc* always starts with a system-clang based build that does not involve LLVM's LTO. (That system clang is used is a FreeBSD-ism; other contexts need not use llvm/clang for their system toolchain.) If neither LTO_BOOTSTRAP nor STANDARD_BOOTSTRAP is enabled, then that is all that is done, and the gcc* built was not built by a gcc* at all.

But if one of the *_BOOTSTRAP options is enabled, then the gcc* built by system clang in turn builds itself (the bootstrap stages). The bootstrap stages can include multiple builds of the gcc*, where one gcc* build is used to make another gcc* for comparison, to be sure no differences in the output show up. This activity involves LTO when LTO_BOOTSTRAP is enabled.

See https://gcc.gnu.org/install/build.html . Quoting some of the material:

QUOTE
Building a native compiler

For a native build, the default configuration is to perform a 3-stage bootstrap of the compiler when ‘make’ is invoked. This will build the entire GCC system and ensure that it compiles itself correctly. It can be disabled with the --disable-bootstrap parameter to ‘configure’, but bootstrapping is suggested because the compiler will be tested more completely and could also have better performance.

The bootstrapping process will complete the following steps:

Build tools necessary to build the compiler.

Perform a 3-stage bootstrap of the compiler. This includes building three times the target tools for use by the compiler such as binutils (bfd, binutils, gas, gprof, ld, and opcodes) if they have been individually linked or moved into the top level GCC source tree before configuring.

Perform a comparison test of the stage2 and stage3 compilers.

Build runtime libraries using the stage3 compiler from the previous step.
END QUOTE
Thanks Lorenzo, I was able to use your advice to figure out why gcc was taking so long on my home server. With LTO optimization as the default, my builds were taking well over 3h45m (I cancelled at that point, while trying to figure out why and experimenting with different configurations). This was even with me running the poudriere bulk build under 'nice -n -20' on an AMD FX(tm)-8120 Eight-Core Processor with 32 GB of RAM (all spinning disks). Once I changed the gcc option to 'STANDARD_BOOTSTRAP', the build finished in 1h20m!

On my Framework laptop, also running FreeBSD, the 11th gen Intel Core i7-1165G7 CPU has no issues building with default settings. I didn't check the time on that side, though; it just never felt "stuck". The laptop is also on an NVMe drive, which I'm sure makes a big difference.

build of lang/gcc11 | gcc11-11.3.0_5 ended at Sun Nov 6 19:16:57 EST 2022
build time: 01:20:57

I would probably recommend using whatever build option does the most good; in this case I would think allowing processors to keep building and updating most of the system faster would be better than the benefits of LTO as a default option. This of course is a shift toward allowing older hardware to continue -relatively smoothly- to receive/compile updates in a faster manner. But I'm happy that there is an option at least :).