Bug 261977 - lang/gcc12-devel: enable LTO
Summary: lang/gcc12-devel: enable LTO
Status: Closed FIXED
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: Piotr Kubaj
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-02-15 22:58 UTC by Piotr Kubaj
Modified: 2022-06-29 23:23 UTC
CC List: 9 users

See Also:
bugzilla: maintainer-feedback? (toolchain)


Attachments
patch (501 bytes, patch)
2022-02-15 22:58 UTC, Piotr Kubaj

Description Piotr Kubaj freebsd_committer freebsd_triage 2022-02-15 22:58:56 UTC
Created attachment 231848
patch

Tested to build on amd64, i386, powerpc64 and powerpc, so it seems safe for 32-bit architectures as well.

Older GCC ports should be switched as well later, once this change has been tested.
Comment 1 dewayne 2022-02-18 02:22:38 UTC
(In reply to Piotr Kubaj from comment #0)
Thank you for the patch.  I've applied the one-line change to gcc10 on FreeBSD12.2S, and without any other changes this is my observation.
 
With the patch
-rwxr-xr-x  3 root  wheel  854248 17 Feb 17:57 /usr/local/bin/gcc10

Without the patch
-rwxr-xr-x  3 root  wheel  1202736 16 Jan  2021 /usr/local/bin/gcc10*

Building gcc10 with the patch took longer to build/package.  I've compiled a few (simple) packages and they perform as expected.
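
For reference, the size difference between the two listings can be worked out directly. This is only arithmetic on the two byte counts shown above; note that the two binaries carry different dates (Feb 2022 vs. Jan 2021), so they may not be the same gcc10 version and the comparison is indicative at best.

```python
# File sizes in bytes, copied from the two `ls -l` listings above.
size_with_lto = 854248
size_without_lto = 1202736

# Absolute and relative reduction of the LTO-built gcc10 binary.
saved_bytes = size_without_lto - size_with_lto
saved_fraction = saved_bytes / size_without_lto

print(f"saved {saved_bytes} bytes ({saved_fraction:.1%} smaller)")
```

That works out to roughly 29% smaller for this one binary.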
Comment 2 Gerald Pfeifer freebsd_committer freebsd_triage 2022-03-02 08:11:36 UTC
(In reply to Piotr Kubaj from comment #0)
> Older GCC ports should be switched as well later when this change
> is tested.

I don't recommend that, with the exception of lang/gcc11* *maybe*.

If anyone has time and energy available, updating the default version
of GCC from 10 to 11 is more important:

   https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=258378
   [exp-run] Update GCC_DEFAULT from 10 to 11
Comment 3 commit-hook freebsd_committer freebsd_triage 2022-03-02 12:06:44 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=9982a1e984870436263fe7fd4d257bf7dd9d2b23

commit 9982a1e984870436263fe7fd4d257bf7dd9d2b23
Author:     Piotr Kubaj <pkubaj@FreeBSD.org>
AuthorDate: 2022-03-02 11:58:04 +0000
Commit:     Piotr Kubaj <pkubaj@FreeBSD.org>
CommitDate: 2022-03-02 11:58:04 +0000

    lang/gcc12-devel: switch to LTO bootstrap

    PR:     261977
    Approved by:    toolchain (maintainer timeout)

 lang/gcc12-devel/Makefile | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
Comment 4 Piotr Kubaj freebsd_committer freebsd_triage 2022-03-02 12:07:43 UTC
OK, committed for 12. Let's see whether it builds now fine for all the packaged architectures and versions.
I think it would still be nice to have it for 11 later.
Comment 5 commit-hook freebsd_committer freebsd_triage 2022-03-12 20:50:33 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=c1d00dc843e253fc9c7029fdff88535802466f58

commit c1d00dc843e253fc9c7029fdff88535802466f58
Author:     Piotr Kubaj <pkubaj@FreeBSD.org>
AuthorDate: 2022-03-12 20:46:50 +0000
Commit:     Piotr Kubaj <pkubaj@FreeBSD.org>
CommitDate: 2022-03-12 20:46:50 +0000

    lang/gcc11-devel: switch to LTO bootstrap

    Following successful builds of lang/gcc12-devel on amd64, i386, aarch64, powerpc
    and powerpc64 and lack of action from toolchain@, switch lang/gcc11-devel to
    LTO as well.

    PR:     261977

 lang/gcc11-devel/Makefile | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
Comment 6 Mark Millard 2022-03-27 01:43:44 UTC
(In reply to commit-hook from comments #3 and #5)

These changes have caused large increases in my build times and resource usage
when building the gcc* ports, without providing an OPTION to control the behavior.

For example, I built lang/gcc12-devel on a 16-core Cortex-A72 HoneyComb with 64
GiBytes of RAM. This was the only builder active and the machine was otherwise
unloaded. It was allowed use of all 16 cores. It took 4 hours to build and
at one point got: 12278Mi MaxObs(Act+Lndry+SwapUsed). [But no swap usage was
ever reported, so: Act+Lndry.] At least twice there were 7 /usr/local/bin/ld
processes doing lto1 activity at the same time over long periods. The large
Act+Lndry was from one of these times.

(I have a patched version of top that monitors and reports various
"Maximum Observed" figures.)

I'll note that my prior build of devel/llvm14 (.r4) by itself via the
HoneyComb [also unloaded] took under 1hr 40 min. But, in that case, I
had used:

# more /usr/local/etc/poudriere.d/options/devel_llvm14/options 
# This file is auto-generated by 'make config'.
# Options for llvm14-14.0.0.r2
_OPTIONS_READ=llvm14-14.0.0.r2
_FILE_COMPLETE_OPTIONS_LIST=BE_AMDGPU BE_WASM CLANG DOCS EXTRAS FLANG LIT LLD LLDB MLIR OPENMP PYCLANG BE_FREEBSD BE_NATIVE BE_STANDARD
OPTIONS_FILE_SET+=BE_AMDGPU
OPTIONS_FILE_UNSET+=BE_WASM
OPTIONS_FILE_SET+=CLANG
OPTIONS_FILE_SET+=DOCS
OPTIONS_FILE_SET+=EXTRAS
OPTIONS_FILE_UNSET+=FLANG
OPTIONS_FILE_SET+=LIT
OPTIONS_FILE_SET+=LLD
OPTIONS_FILE_SET+=LLDB
OPTIONS_FILE_SET+=MLIR
OPTIONS_FILE_SET+=OPENMP
OPTIONS_FILE_UNSET+=PYCLANG
OPTIONS_FILE_UNSET+=BE_FREEBSD
OPTIONS_FILE_SET+=BE_NATIVE
OPTIONS_FILE_UNSET+=BE_STANDARD

So one could argue with how to make such a comparison.

I looked around at the package status information from various
builds and sometimes lang/gcc12-devel gets runaway_process and
other times completes. My guess for the official build servers
is that it depends on what other builders are doing over the
same time frame.

But there is another, possibly related issue. In my 16-core context,
top reported:

last pid: . . .;  load averages:  . . . MaxObs:  28.02,  17.04,  16.87

So, on the timescale of the first load average, the lang/gcc12-devel
build does not always stay limited to the hardware threads available.
(I happen to have my configuration set up for high load average
contexts.)

Overall, the implications of the LTO-based builds are messy for those
whose systems have significantly fewer resources.

I understand having default options that match what the FreeBSD build
servers are supposed to build. But not having control of such things
without editing of the Makefiles seems odd for the general audience
that does local builds. (I can maintain adjusted Makefiles so I am
not stopped from reverting the code for my own activities. But still
. . .)
Comment 7 Piotr Kubaj freebsd_committer freebsd_triage 2022-03-27 08:43:34 UTC
You should contact portmgr@ to increase timeouts on the package builders.
Regarding adding option, please provide a tested patch :)
Comment 8 Mark Millard 2022-03-27 15:47:14 UTC
(In reply to Mark Millard from comment #6)

Just for reference: I tested the build time for reverting the
code in my context:

# git -C /usr/ports/ diff lang/gcc12-devel/Makefile
diff --git a/lang/gcc12-devel/Makefile b/lang/gcc12-devel/Makefile
index 644abf2cbb86..fab28c952f80 100644
--- a/lang/gcc12-devel/Makefile
+++ b/lang/gcc12-devel/Makefile
@@ -83,7 +83,7 @@ CONFIGURE_OUTSOURCE=  yes
 .if empty(PORT_OPTIONS:MBOOTSTRAP)
 CONFIGURE_ARGS+=--disable-bootstrap
 .else
-CONFIGURE_ARGS+=--with-build-config=bootstrap-lto-noplugin
+CONFIGURE_ARGS+=--with-build-config=bootstrap-debug
 ALL_TARGET=    bootstrap-lean
 .endif
 INSTALL_TARGET=        install-strip

The result was 01:38:45 for using bootstrap-debug:

[01:39:00] [01] [01:38:45] Finished lang/gcc12-devel | gcc12-devel-12.0.1.s20220306_1: Success

instead of the earlier 04:06:27 for using bootstrap-lto-noplugin :

[04:33:13] [01] [04:06:27] Finished lang/gcc12-devel | gcc12-devel-12.0.1.s20220306_1: Success
Comment 9 Mark Millard 2022-03-27 16:11:21 UTC
(In reply to Mark Millard from comment #8)

Hmm. I forgot to quote about memory use as well: For bootstrap-debug use
it was: 5317Mi MaxObs(Act+Lndry+SwapUsed) vs., for example, the 12278Mi MaxObs(Act+Lndry+SwapUsed) when bootstrap-lto-noplugin is used.
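
Putting the figures from comments 6, 8 and 9 side by side gives the following ratios. This is a minimal sketch that only does arithmetic on the numbers already quoted; `to_seconds` is a helper name introduced here for illustration, and the results describe this one 16-core aarch64 machine only.

```python
def to_seconds(hms: str) -> int:
    """Convert an HH:MM:SS duration string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds

# Build durations reported by poudriere for lang/gcc12-devel.
lto_time = to_seconds("04:06:27")    # bootstrap-lto-noplugin
debug_time = to_seconds("01:38:45")  # bootstrap-debug

# Maximum observed Act+Lndry+SwapUsed, in MiB.
lto_mem_mib = 12278
debug_mem_mib = 5317

time_ratio = lto_time / debug_time
mem_ratio = lto_mem_mib / debug_mem_mib

print(f"build time: {time_ratio:.2f}x, peak memory: {mem_ratio:.2f}x")
```

In this context the LTO bootstrap took about 2.5 times as long and peaked at about 2.3 times the memory.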
Comment 10 Mark Millard 2022-03-27 19:18:08 UTC
(In reply to Piotr Kubaj from comment #7)

While I had been referencing a port-specific option, I see there
is a USE_LTO= available in the infrastructure.

I've no clue of the intent. Maybe use of bootstrap-lto-noplugin
vs. bootstrap-debug should respect the status of USE_LTO?

I leave it to someone that knows the intended direction for having
optional LTO use as the ports progress --or if there is an intent
to have it be optional at all long term. (Even if I consider lack
of optionality for LTO being an odd choice when there are notable
time or memory tradeoffs.)
Comment 11 Matthias Andree freebsd_committer freebsd_triage 2022-03-28 09:46:36 UTC
We are acting as though memory (perhaps on rented machines) were ubiquitous. It is not, and the memory requirements shown here are nothing but insane.  This is not only about GCC, but also other parts of the typical "local build" set, for instance, rust and webkit stuff. Cores galore have become cheap, but memory still has not.  What Mark's comment on memory essentially means is that we break the 8 GB barrier...
Comment 12 Mark Millard 2022-03-28 10:29:42 UTC
(In reply to Matthias Andree from comment #11)

Careful with size calculations: my figures were for a 16 core system
using all the cores. I've not investigated if, say, a 4 core system
using all its cores would use more like 1/4 the memory or not. It
would not get as many ld's going at once and I do not know the
relative sizes of the various ld's that might overlap in time for
whatever the configuration is.

The same sort of point goes for the memory use for bootstrap-debug
style builds: it was the same 16 core type of context.

General expectations of relatively bigger vs. smaller memory use
between the two bootstrap-* styles is likely very reasonable. But
I do not know about detailed sizing relationships across a variety
of contexts.

I only got into testing this as a way to provide some background
information for:

https://lists.freebsd.org/archives/freebsd-toolchain/2022-March/000450.html

where Dimitry Andric did not have direct access to information
he wanted about a recent build failure on a FreeBSD aarch64 server.

I do not really plan on doing more in this bugzilla submittal's
comments. I've provided my evidence about tradeoffs and presented
my points. I'm unclear which mechanism would be appropriate to
express picking the bootstrap-* style to use.
Comment 13 Matthias Andree freebsd_committer freebsd_triage 2022-03-28 11:02:20 UTC
My point is that cores are a commodity, but RAM is not. Not on amd64 and even less so on the single-board RISC stuff (ARM64, or MIPS), but still your data point is quite transferable to my big local builder which has 32 GB RAM 8-core/16-thread Ryzen 7 1700, usually running Linux as Desktop OS with a FreeBSD headless builder that gets like 12 GB RAM allotted, which would now be insufficient...

We really need to take care that our toolchain does not end up running only on the big irons but no longer on end user's machines.
Comment 14 Dimitry Andric freebsd_committer freebsd_triage 2022-03-28 11:11:44 UTC
LTO is a trade-off, where you sacrifice more memory and CPU at link time for possibly faster and/or smaller final binaries. That it eats more resources is almost guaranteed, but slightly "better" binaries may not be worth the cost...

(Side note: one annoying aspect is that because it can be so time consuming, these heavy link jobs tend to cluster up, if you do a multi-jobbed build, either via poudriere or some other build system. Which then means that you have a bunch of huge linking processes running in parallel, eating up even more resources!)

That said, I would say the safe, conservative choice is to make use of LTO an opt-in (either global, or per port), instead of an opt-out?

I mean, if you have a huge machine with plenty of CPU and RAM, then by all means LTO the hell out of everything, but most people aren't that lucky (or rich). :-)
Comment 15 Matthias Andree freebsd_committer freebsd_triage 2022-03-28 11:57:41 UTC
That would certainly fit MY bill. The thing with LTO is that in essence the code generation moves into the "linker" stage which in fact, with LTO, is also the final cross-module optimizing compiler stage... Earlier LTO implementations were single-threaded and thus crap slow. (Tried that with graphics/rawtherapee or g.../darktable, the latter of which has different LTO issues though)
Comment 16 Piotr Kubaj freebsd_committer freebsd_triage 2022-03-28 12:34:56 UTC
I don't have exact statistics but I believe it's fair to say that most users currently use pre-built packages. And FreeBSD Foundation has enough money to buy some more RAM.

As such, it would be beneficial to have smaller and faster binaries by default.
People who build their own packages are free to disable LTO. It would be obviously still opt-out, but if you build your own packages, you probably do some customizations anyway.

So please just add an option (enabled by default) to use LTO and disable it on your own builders if that causes issues.
Comment 17 Robert Clausecker freebsd_committer freebsd_triage 2022-04-19 15:23:36 UTC
The new LTO bootstrap makes the build dog slow.  lang/gcc11-devel for armv7 failed with QEMU on a Skylake XEON box, timing out after ~70 hours.  The native build on my RPi 4B is still going strong after 64 hours, total duration unknown.  This is not an acceptable build time for any package.

If it was me, I would turn off the whole bootstrap thing completely.  The gcc binary is just as good when compiled with clang and build times will be a lot better.
Comment 18 Piotr Kubaj freebsd_committer freebsd_triage 2022-04-19 16:27:11 UTC
(In reply to Robert Clausecker from comment #17)
Are build times that relevant? Sure, if you insist on building manually on a slow hardware, it's going to be painful, but we have binary packages for such users.

IMO faster and smaller packages for everyone is worth it and we have package cluster exactly to allow users to easily install ready-to-use packages.

If for some reason builds time out on the cluster (don't know, didn't check), then you should open a new PR to ask portmgr@ to increase timeouts (or maybe use more threads to build gcc). Since I'm not in portmgr@, I can't do that.

Although I was not around at that time, I believe when -O2 was added to default flags, many people also complained about that, due to more resources being used, but after all, it was worth the cost. Or do you also add -O0 to your builds to make them faster?
Comment 19 Robert Clausecker freebsd_committer freebsd_triage 2022-04-19 16:47:51 UTC
(In reply to Piotr Kubaj from comment #18)

Build times are not super relevant as long as they stay in a reasonable range.  Waiting 3 days for the compiler to build is not reasonable.  And as I said, this is both for native and for cross builds from reasonably fast hardware.

Whenever there is some sort of update to the ports tree that changes a dependency of gcc, Poudriere recompiles gcc.  So excessive build times significantly interfere with my ability to test ports.

Right now I do not change any build flags nor do I provide any custom options because if I did, my test results would not be applicable.

And even for normal, source-building users (which do exist), waiting three days for a compiler to be built is completely unreasonable.  And that's the time up until now.  The compiler build has not finished yet and I don't know when it will.

Have you weighed the extra build time against the performance advantage this brings?  How much does the LTO-built gcc speed up build times in comparison to a normally built (or even non-bootstrapped) gcc?
Comment 20 Matthias Andree freebsd_committer freebsd_triage 2022-04-19 16:51:56 UTC
(In reply to Piotr Kubaj from comment #18)
Build times are super relevant if we want to keep any faint trace of manpower working on ports.

For regular work on any of my ports, the usual *re*build set comprises 300 to 900 ports, and I have jumped through hoops for MONTHS to deal with the excessive RAM and disk space use of, usual offenders, webkit, rust (because it's a requisite to one of the popular graphics libraries), and usually at least two versions EACH of llvm and gcc, although I do not build ${COMPILER}-devel ports.

That also means that I spend one day setting the tables for ports work and then a very short time doing my ports work, test, commit, and then hope that a git pull --rebase won't break my port when one of its upstream requisites changed again.
Comment 21 Brooks Davis freebsd_committer freebsd_triage 2022-04-19 16:58:45 UTC
A metric to consider when enabling LTO by default for a compiler: does enabling LTO decrease the total build time for the full ports collection? If so, turn it on (subject to memory constraints). If not, keep it off.

My assumption is that few users are installing most of our toolchain ports because they want to use them directly. They are installing them because they are dependencies. Obviously some people are using them directly, but I suspect being pulled in as a dependency is more common.
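
The criterion proposed above can be written down as a small decision rule. This is only a sketch of one possible reading: the function name, the parameters and the idea of a fixed memory budget are assumptions made here, and the example inputs are made up rather than measured.

```python
def lto_default_on(total_with_lto_h, total_without_lto_h,
                   peak_mem_gib, mem_budget_gib):
    """Return True when the suggested criterion favors enabling LTO.

    total_*_h are total build times for the full ports collection, in hours,
    with and without an LTO-built compiler. The memory budget is a
    hypothetical stand-in for "subject to memory constraints".
    """
    faster_overall = total_with_lto_h < total_without_lto_h
    fits_in_memory = peak_mem_gib <= mem_budget_gib
    return faster_overall and fits_in_memory

# Made-up inputs, purely to exercise the rule; none of these are measurements.
slower_overall = lto_default_on(75.0, 70.0, 12.0, 64.0)
faster_overall = lto_default_on(68.0, 70.0, 12.0, 64.0)
too_little_ram = lto_default_on(68.0, 70.0, 12.0, 8.0)

print(slower_overall, faster_overall, too_little_ram)
```

Under this reading LTO is only turned on when it pays for itself across the whole collection and the builder can afford the peak memory.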
Comment 22 Robert Clausecker freebsd_committer freebsd_triage 2022-04-19 17:01:47 UTC
(In reply to Piotr Kubaj from comment #18)

> Sure, if you insist on building manually on a slow hardware,
> it's going to be painful, but we have binary packages for such users.

So please tell me: which fast and affordable hardware am I supposed to test armv7 ports on then?  Just let me know what the entry barrier is so I can check if I can afford to continue working on the FreeBSD ports collection.

Note that even cross-building doesn't help as even my Intel(R) Xeon(R) CPU E3-1270 v5 based home server takes more than 72 hours to build lang/gcc11-devel (after which the build was mercy-killed by Poudriere).  Is this machine already considered slow hardware that no reasonable developer would compile ports on?

What about native ARM boxes?  QEMU support for armv7 sucks and many ports fail due to concurrency and other issues, often with terrible failure modes.  So testing armv7 mostly cannot be done with QEMU.

Just please let me know what the target audience is for which the build times are reasonable.
Comment 23 Matthias Andree freebsd_committer freebsd_triage 2022-04-19 17:06:43 UTC
Re compiler bootstrapping with LTO, it is pointless per se because you build the first compiler stage with whatever is on the system (which is pretty reasonable on FreeBSD), then you build the second compiler stage, i. e. the full compiler with the stage-1-compiler, and in the third stage, you build the SAME THING again with the stage-2-compiler and compare stage 2 to stage 3, i. e. check if the self-compiled compiler is the same.

If it weren't for a comparison, we would not need LTO in stages one and two because they are disposed of, and only the compiler built in stage 3 will be used. And in essence, stage 3 is all you build if you build the compiler as a cross-compiler without full bootstrap (you knew that already). So if our base compilers are good enough, let's just build all GCC as cross, or at least all Tier-1 and Tier-2.
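
The stage comparison described above can be illustrated with a toy model. This is not how GCC's build system works internally; it is a sketch under the simplifying assumption that the bytes a compiler emits depend only on the source it was built from and on the source it is compiling, not on how well the compiler binary itself was optimized. All names (`Compiler`, `compile_gcc`, `GCC_SOURCE`) are hypothetical.

```python
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class Compiler:
    semantics: str     # which compiler's source this binary was built from
    machine_code: str  # digest standing in for the bytes of the binary

def compile_gcc(builder: Compiler, source: str) -> Compiler:
    """Model 'builder compiles source': output depends on the builder's
    code generation rules (its semantics) and on the source text."""
    digest = hashlib.sha256(f"{builder.semantics}|{source}".encode()).hexdigest()
    return Compiler(semantics=source, machine_code=digest)

GCC_SOURCE = "gcc-source"
host = Compiler(semantics="clang", machine_code="host-binary")

stage1 = compile_gcc(host, GCC_SOURCE)    # built by the system compiler
stage2 = compile_gcc(stage1, GCC_SOURCE)  # built by stage 1
stage3 = compile_gcc(stage2, GCC_SOURCE)  # built by stage 2

# The bootstrap comparison: stage 2 and stage 3 should be identical,
# while stage 1 differs because a different compiler generated it.
stages_match = stage2.machine_code == stage3.machine_code
stage1_differs = stage1.machine_code != stage2.machine_code
print(stages_match, stage1_differs)
```

In the model, stages 2 and 3 match because both were produced by compilers that implement the same GCC code generation, which is the property the real comparison step checks.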

And my builder runs up to(*) 16 Zen threads (AMD Ryzen 7 1700, 8 cores w/ 2 threads each), but I usually need to take that down to 4-ish or so because else this deadlocks pretty soon with the 12 or how many GBytes of RAM I pass to it. FreeBSD 13.0, ZFS-based poudriere, no swap.
Comment 24 Dimitry Andric freebsd_committer freebsd_triage 2022-04-19 17:29:56 UTC
The full 3 stage bootstrap is really only useful for:
* gcc developers (daily use, making releases etc)
* when you are bringing it up on a system with a crappy/ancient host compiler
* when you are compiling a random gcc trunk snapshot

For most of our gcc ports, these should not apply, as the ports (hopefully) use curated, i.e. release or stable-snapshot tarballs.

LTO is completely orthogonal to all the above. My suggestion is to enable it for the official package builders, and let everybody else opt-in?
Comment 25 Mark Millard 2022-04-19 19:05:59 UTC
(In reply to Brooks Davis from comment #21)

An interesting property of the criteria suggestion is:

How much value is there for more frequent package update
releases vs. less frequent package releases for the actual
timescale difference involved from before any gcc LTO
builds to now with 3 gcc LTO builds (gcc11, gcc11-devel,
and gcc12-devel)?

A quick set of data could be to look up some 30,000+ port
bulk -a runs on amd64 and aarch64 (tier 1's) in the
time frames and see if the total time is significantly
different and by how much.

A complication is the currently-frequent build failures
for aarch64 building gcc12-devel for NOHANG_TIME and/or
MAX_EXECUTION_TIME . One might have to search for a
successful case to have a reasonable comparison
(approximating "as if the timeouts were longer").

An incompleteness in the comparison could be the status
of gcc11 being the default vs. not. Once default, more
things will wait on its build. Exp-run like test of
default gcc11?
Comment 26 Gerald Pfeifer freebsd_committer freebsd_triage 2022-04-19 19:44:26 UTC
(In reply to Piotr Kubaj from comment #16)
> So please just add an option (enabled by default) to use LTO and
> disable it on your own builders if that causes issues.

I believe what you are seeing is backlash to your unilaterally switching
to LTO without providing an option for users.

IMnsHO you have received sufficient feedback to warrant adding such an
LTO option (and the default can then be easily tweaked).

(In reply to Dimitry Andric from comment #14)
> That said, I would say the safe, conservative choice is to make use
> of LTO an opt-in (either global, or per port), instead of an opt-out?

Well, first of all it should be an option to begin with. :-)
Comment 27 Mark Millard 2022-04-19 20:13:51 UTC
(In reply to Mark Millard from comment #25)

Looks like there has not been an aarch64 or amd64 30,000+ bulk -a
with gcc11 LTO style yet.

But looking at amd64's:

http://beefy16.nyi.freebsd.org/build.html?mastername=130amd64-default&build=cb1788291f45 

proved surprising:

gcc12-devel-12.0.1.s20220306_2 success 08:05:31
gcc11-devel-11.2.1.s20211009_1 success 15:26:37

Both show: --with-build-config=bootstrap-lto-noplugin

Did one run with more parallelism allowed --or just less
activity by other builders in the bulk -a over the time of
the specific gcc build?

For reference for non-LTO style:

gcc11-11.2.0                   success 04:08:59

So the gcc12-devel-12.0.1.s20220306_2 time still suggests
the LTO status of the build.
Comment 28 Mark Millard 2022-04-19 20:57:24 UTC
(In reply to Brooks Davis from comment #21)

FYI (30,000+ bulk -a examples, default and quarterly):

http://beefy16.nyi.freebsd.org/build.html?mastername=130amd64-default&build=cb1788291f45
http://beefy14.nyi.freebsd.org/build.html?mastername=130amd64-quarterly&build=04bac8927e5b

where: 75:34:51 and 75:19:28, both gcc1[12]-devels LTO style

vs.

http://beefy16.nyi.freebsd.org/build.html?mastername=130amd64-default&build=d79790970038
http://beefy14.nyi.freebsd.org/build.html?mastername=130amd64-quarterly&build=e082dc0ec3cc

where: 70:11:06 and 68:32:09 (Feb, before LTO style)

So, for the way the FreeBSD build servers are used (deliberate
slack capacity), the overall change that I can reference
(-devel's, not gcc11 as well) did not make a great difference in
overall time but did increase it.

(This does not cover what a gcc11 default with gcc11 built LTO
style would be like. More would likely wait for the LTO build
to finish.)
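
The size of the increase can be quantified from the four totals above. This sketch assumes the times pair up in the order the URLs are listed (default with default, quarterly with quarterly); `to_seconds` and `increase` are helper names introduced here.

```python
def to_seconds(hms: str) -> int:
    """Convert an HH:MM:SS duration string (hours may exceed 24) to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds

def increase(after: str, before: str):
    """Return (extra seconds, relative increase) of `after` over `before`."""
    delta = to_seconds(after) - to_seconds(before)
    return delta, delta / to_seconds(before)

# amd64 bulk -a totals: with LTO-style gcc1[12]-devel vs. February, before LTO.
default_delta, default_rel = increase("75:34:51", "70:11:06")
quarterly_delta, quarterly_rel = increase("75:19:28", "68:32:09")

print(f"default:   +{default_delta / 3600:.2f} h ({default_rel:.1%})")
print(f"quarterly: +{quarterly_delta / 3600:.2f} h ({quarterly_rel:.1%})")
```

That is an increase of roughly 8% and 10% respectively, although the bulk runs differ in more than just the gcc build style, so not all of it can be attributed to LTO.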
Comment 29 Matthias Andree freebsd_committer freebsd_triage 2022-04-19 21:17:52 UTC
So the question again, can LTO cost be compensated by removing bootstrap from the default options? What systems need the full bootstrap? What can get away with the simple non-bootstrap build?
Comment 30 Piotr Kubaj freebsd_committer freebsd_triage 2022-04-19 21:22:58 UTC
This issue is quickly getting out of hand. I'll do one last fix: default to LTO, but allow non-LTO.

However, some remarks:
- LTO is NOT used only for stage 1. In fact, if it were used only for stage 1 (like mat@ implied), then powerpc64 builds would be broken (because LTO with LLVM is known to be broken on powerpc64). That is not the case, LTOized GCC works fine on powerpc64, because -flto is only passed AFTER stage 1.
- comparison of bulk -a builds is just idiotic. How can you compare bulk -a with LTOized GCC when nothing else depends on it? It just doesn't make sense at all. Those -devel ports are only to serve as CI-like safety measure to make sure the newest GCC snapshots build. They use LTO, because the current release (lang/gcc11) also does, and I hope lang/gcc12 will as well.
- if you compare bulk -a builds with LTO for gcc*-devel ports, then we might as well just drop -devel ports completely - build times will be lower. In fact, they will be even lower if we drop all the gcc ports completely (along with the required reverse dependencies).
- Regarding performance benchmarks for LTO - has anyone done that when -O2 was introduced? This is one of the reasons why I hear things like "it's hard to introduce anything in FreeBSD, because many devs oppose it" and "I gave up fighting it" (from a former portmgr@ member, who also did some ppc work).
- How is it that e.g. Ubuntu or OpenSUSE can afford to enforce LTO for their own repo and we can't?
Comment 31 Mark Millard 2022-04-19 21:27:14 UTC
(In reply to Mark Millard from comment #28)

As for what I can get for aarch64 . . .

http://ampere2.nyi.freebsd.org/build.html?mastername=main-arm64-default&build=p1853d90f79b6_s27ac4281fd

actually has both gcc1[12]-devels in LTO style built successfully in
a 30,000+ port bulk -a . (Seems rare for the 30,000+ bulks on aarch64.)

where: 115:50:28

vs. [predating the LTO style use (Feb.)].

http://ampere2.nyi.freebsd.org/build.html?mastername=main-arm64-default&build=pde1a3d3a0c66_sa4a31271cc

where: 107:17:11

So, like amd64, not a huge difference in the time scale --for how
the FreeBSD servers are used-- but an increase observed.

But the increases for amd64 and aarch64 are small enough to not be
clearly mostly-gcc-LTO related for the cause of the variation.
Still, it would not be surprising for LTO being a contribution,
just not huge for the overall time involved.

As near as I can tell, unless having gcc11 as the default and
built via LTO so more ports wait and that changes things
significantly, having LTO used on the FreeBSD servers looks
reasonable relative to how other aspects are (and have been)
handled.

[A context configured to allow for high load averages relative to
hardware threads and large RAM+SWAP to match (avoiding processing
slack) might get significantly different results. But that is not
how FreeBSD build servers are used.]
Comment 32 Matthias Andree freebsd_committer freebsd_triage 2022-04-19 21:42:59 UTC
I do not object to LTO per se (in fact I have declined requests in the past to build graphics/rawtherapee without OMP and other features and with old base compilers), but I am complaining that we really need to keep the ports tree manageable for the average contributor who doesn't have this 200+-thread EPYC or XEON server with 1 TB of RAM to "poudriere testport" their builds quickly.

For a home computer, my 4 y.o. Ryzen machine is still somewhat beefy but the very frequent rebuilding of rust, gcc, whatnot just to test one of my ports is really getting on my nerves.

So again my provocation, if we don't want ports maintainers, then we continue  down that road named "who cares for build times if the high-performance cluster can build all ports in under 4 days".

I am assuming that some derivative of gcc12-devel might be our default ports compiler some day.
Comment 33 Mark Millard 2022-04-19 21:57:46 UTC
(In reply to Mark Millard from comment #31)

Relative to FreeBSD build server contexts, do not take my notes
as applying to QEMU contexts, such as targeting armv6 and armv7.
I suspect that the time frames for those targets would indicate
avoiding LTO style on FreeBSD servers. But, as far as I can tell,
there is no historical data available to make comparisons with,
so I'm unfortunately staying in the opinion-without-matching-data
realm for them.
Comment 34 Piotr Kubaj freebsd_committer freebsd_triage 2022-04-19 22:02:19 UTC
(In reply to Matthias Andree from comment #32)
OK, since no one replied to me about this point, I'll ask again: what's wrong with making Poudriere use pre-built packages?
Comment 35 Piotr Kubaj freebsd_committer freebsd_triage 2022-04-19 22:24:51 UTC
I'm currently testing the following patch:
--- lang/gcc12-devel/Makefile
+++ lang/gcc12-devel/Makefile
@@ -42,9 +42,12 @@ SUFFIX=              ${PORTVERSION:C/([0-9]+).*/\1/}
 CFLAGS:=       ${CFLAGS:N-mretpoline}
 CXXFLAGS:=     ${CXXFLAGS:N-mretpoline}

-OPTIONS_DEFINE=                BOOTSTRAP GRAPHITE
-OPTIONS_DEFAULT=       BOOTSTRAP
-BOOTSTRAP_DESC=                Build using a full bootstrap
+OPTIONS_DEFINE=                GRAPHITE
+OPTIONS_DEFAULT=       LTO_BOOTSTRAP
+OPTIONS_RADIO= BOOTSTRAP
+OPTIONS_RADIO_BOOTSTRAP=       LTO_BOOTSTRAP STANDARD_BOOTSTRAP
+LTO_BOOTSTRAP_DESC=    Build using a full LTO bootstrap
+STANDARD_BOOTSTRAP_DESC=       Build using a full bootstrap without LTO
 GRAPHITE_DESC=         Support for Graphite loop optimizations

 .if exists(/usr/lib32/libc.so)
@@ -82,9 +85,12 @@ GNU_CONFIGURE=       yes
 CONFIGURE_OUTSOURCE=   yes
 .if empty(PORT_OPTIONS:MBOOTSTRAP)
 CONFIGURE_ARGS+=--disable-bootstrap
-.else
+.elif ${PORT_OPTIONS:MLTO_BOOTSTRAP}
 CONFIGURE_ARGS+=--with-build-config=bootstrap-lto-noplugin
 ALL_TARGET=    bootstrap-lean
+.else
+CONFIGURE_ARGS+=--with-build-config=bootstrap-debug
+ALL_TARGET=    bootstrap-lean
 .endif
 INSTALL_TARGET=        install-strip
 .if ${UID} != 0
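
One detail in the patch above may be worth double-checking: the first conditional still tests `PORT_OPTIONS:MBOOTSTRAP`, while the selectable options are now named LTO_BOOTSTRAP and STANDARD_BOOTSTRAP. The sketch below emulates make's `:M` modifier with Python's `fnmatch`, assuming that `:M` keeps only whole words matching the pattern and that PORT_OPTIONS holds the names of enabled options but not the radio group name. `m_modifier` is a hypothetical helper and only an approximation of make's behavior.

```python
from fnmatch import fnmatchcase

def m_modifier(words: str, pattern: str) -> list:
    """Rough stand-in for make's ${VAR:Mpattern}: keep whole words that
    match the shell-style pattern."""
    return [word for word in words.split() if fnmatchcase(word, pattern)]

# Assumed contents of PORT_OPTIONS with the new default selected.
port_options = "LTO_BOOTSTRAP"

group_match = m_modifier(port_options, "BOOTSTRAP")      # as in the first .if
lto_match = m_modifier(port_options, "LTO_BOOTSTRAP")    # as in the .elif
glob_match = m_modifier(port_options, "*_BOOTSTRAP")     # wildcard variant

print(group_match, lto_match, glob_match)
```

If those assumptions hold, `empty(PORT_OPTIONS:MBOOTSTRAP)` would be true regardless of the radio selection, so the `--disable-bootstrap` branch would always be taken; testing the two option names explicitly would avoid that.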
Comment 36 Robert Clausecker freebsd_committer freebsd_triage 2022-04-19 22:38:13 UTC
(In reply to Piotr Kubaj from comment #34)

Pre-built packages are not in synch with the development ports tree and may cause or hide issues due to this mismatch.  So using them is not really an option when testing ports.
Comment 37 Mark Millard 2022-04-19 23:58:23 UTC
(In reply to Piotr Kubaj from comment #34)

This is just FYI. I've not well thought through how to
best use the properties of PACKAGE_FETCH_BRANCH=latest
or the like.

In my little bit of experimenting, it appeared that
keeping the local environment such that it avoided
the following types of rejections of using pre-built
packages was somewhat messy, at least for main [14]
and main [latest] (the kind of context I played with):

remote osversion too new: 1400056 (want <=1400047)
remote version mismatch: lumina-fm-1.6.2

osversion requires not getting behind the FreeBSD build
server --but how to notice updates to the server osversion
seems non-obvious and so tracking/timing the OS updates
seems messy.

Remote version matching requires a /usr/ports/ git status
matching what the server used, not getting ahead or behind,
at least for the ports one is trying to not rebuild.
Again, how to match what the available FreeBSD packages
used for /usr/ports/ content seems non-obvious.

It also seems to be that having options set to avoid
something like LTO style in local builds, prevents use
of the LTO style builds from the server: OPTIONS mismatch.
(Some OPTIONS need not imply that the results would be
incompatible for the usage but nothing allows declaring
and having it use such relationships.)
Comment 38 Mark Millard 2022-04-20 00:59:53 UTC
(In reply to Mark Millard from comment #37)

I should have also listed the following type of mismatch
issue as well:

qt5-uiplugin-5.15.2p17: deps wanted: qt5-core-5.15.2p263 qt5-gui-5.15.2p263 qt5-widgets-5.15.2p263
qt5-uiplugin-5.15.2p17: deps remote: qt5-core-5.15.2p263_2 qt5-gui-5.15.2p263 qt5-widgets-5.15.2p263

Same port, same version, one different dependency version.

But it gets back to well matching the /usr/ports/ commit that
the FreeBSD build server used for the package builds (for
packages one is trying to avoid rebuilding, including
dependencies).
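The dependency mismatch quoted above can be located mechanically. The following Python sketch compares the two "deps" lists from the log lines; the helper name dep_mismatch is hypothetical and the comparison is plain set arithmetic, not how pkg or poudriere implement the check.

```python
def dep_mismatch(wanted, remote):
    """Return (only_wanted, only_remote) for two whitespace-separated dependency lists."""
    wanted_set = set(wanted.split())
    remote_set = set(remote.split())
    # Entries present on only one side are what cause the rejection.
    return sorted(wanted_set - remote_set), sorted(remote_set - wanted_set)


wanted = "qt5-core-5.15.2p263 qt5-gui-5.15.2p263 qt5-widgets-5.15.2p263"
remote = "qt5-core-5.15.2p263_2 qt5-gui-5.15.2p263 qt5-widgets-5.15.2p263"
only_wanted, only_remote = dep_mismatch(wanted, remote)
print(only_wanted, only_remote)
```

For the two lines above, only the qt5-core entry differs: the local tree wants the version without the _2 suffix while the remote package was built against the one with it.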
Comment 39 Robert Clausecker freebsd_committer freebsd_triage 2022-04-20 01:01:34 UTC
Update on my native gcc11-devel armv7 build: after 74 hours, the build timed out.  From the progress I guess it would take 80-ish hours to complete.

For comparison, the pre-LTO build time was 20 hours.
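As a small worked example of the figures above (a Python sketch; the helper name slowdown is made up for illustration), the slowdown factor of the LTO build relative to the pre-LTO build is:

```python
def slowdown(lto_hours, baseline_hours):
    """Ratio of LTO build time to pre-LTO build time."""
    return lto_hours / baseline_hours


# 74 hours had elapsed at the timeout, so this ratio is a lower bound.
lower_bound = slowdown(74, 20)
# 80 hours is the guessed completion time from the comment.
estimated = slowdown(80, 20)
print(lower_bound, estimated)
```

So the native armv7 LTO build is at least about 3.7 times slower, and roughly 4 times slower if the 80-hour guess holds.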
Comment 40 Mark Millard 2022-04-20 01:55:54 UTC
(In reply to Piotr Kubaj from comment #35)

Note: I agree that most FreeBSD build server activity
for building lang/gcc*'s (gcc11+) can reasonably use
LTO style. But there is a reason for the "most" . . .

To my knowledge, no FreeBSD build server targeting
armv6 or armv7 has ever had such an LTO build of a gcc*
complete, one reason being that after the first
stage bootstrap, it is nearly all via QEMU, possibly
leading to 100+ hours being needed for a build (!no
good data!). In other words, these two target
architectures did not meet the Comment #4 criteria:

QUOTE
Let's see whether it builds now fine for all the packaged architectures and versions.
END QUOTE

Avoiding the LTO style builds targeting armv6 or armv7
may well be reasonable for the FreeBSD build servers and
for other contexts targeting armv6 or armv7.

To my knowledge, the rest of the architectures that the FreeBSD
servers target avoid needing to use QEMU (or analogous) and
so avoid the related issues.
Comment 41 Mark Millard 2022-04-20 02:12:59 UTC
(In reply to Mark Millard from comment #40)

Adding somewhat good data for FreeBSD servers targeting armv7:

Feb, pre-LTO style:

http://beefy12.nyi.freebsd.org/build.html?mastername=130releng-armv7-quarterly&build=35b5f38563a2

gcc11-devel-11.2.1.s20211009 success       76:54:27
gcc12-devel-12.0.0.s20211205 build/timeout 96:01:10


So if LTO-style takes 3 or more times longer to build, and using
the shorter time (so: a lower bound estimate), that is over 200
hours, well over a week.

The larger time (gcc12-devel's) and, say, 4 times longer is
more like 400 hours, so over 2 weeks.
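The estimates above can be reproduced with a short Python sketch. The helper names are hypothetical; the durations are the ones listed from the February build, and the factors 3 and 4 are the assumed slowdowns named in this comment, not measurements.

```python
def parse_hms(text):
    """Convert an 'HH:MM:SS' duration (hours may exceed 24) to fractional hours."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours + minutes / 60 + seconds / 3600


def estimate_hours(baseline, factor):
    """Scale a pre-LTO build duration by an assumed LTO slowdown factor."""
    return parse_hms(baseline) * factor


low = estimate_hours("76:54:27", 3)    # gcc11-devel, successful build
high = estimate_hours("96:01:10", 4)   # gcc12-devel, which itself timed out
print(f"{low:.1f} h = {low / 24:.1f} days")
print(f"{high:.1f} h = {high / 24:.1f} days")
```

This gives roughly 231 hours (about 9.6 days) for the lower estimate and roughly 384 hours (about 16 days) for the higher one. Since the gcc12-devel figure is a timeout rather than a completed build, the higher estimate is still not an upper bound.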
Comment 42 Piotr Kubaj freebsd_committer freebsd_triage 2022-04-20 06:12:46 UTC
Would that be good for you?
--- lang/gcc12-devel/Makefile
+++ lang/gcc12-devel/Makefile
@@ -42,9 +42,16 @@ SUFFIX=              ${PORTVERSION:C/([0-9]+).*/\1/}
 CFLAGS:=       ${CFLAGS:N-mretpoline}
 CXXFLAGS:=     ${CXXFLAGS:N-mretpoline}

-OPTIONS_DEFINE=                BOOTSTRAP GRAPHITE
-OPTIONS_DEFAULT=       BOOTSTRAP
-BOOTSTRAP_DESC=                Build using a full bootstrap
+OPTIONS_DEFINE=                GRAPHITE
+OPTIONS_DEFAULT=       LTO_BOOTSTRAP
+OPTIONS_EXCLUDE_armv6= LTO_BOOTSTRAP
+OPTIONS_EXCLUDE_armv7= LTO_BOOTSTRAP
+OPTIONS_DEFAULT_armv6= STANDARD_BOOTSTRAP
+OPTIONS_DEFAULT_armv7= STANDARD_BOOTSTRAP
+OPTIONS_RADIO= BOOTSTRAP
+OPTIONS_RADIO_BOOTSTRAP=       LTO_BOOTSTRAP STANDARD_BOOTSTRAP
+LTO_BOOTSTRAP_DESC=    Build using a full LTO bootstrap
+STANDARD_BOOTSTRAP_DESC=       Build using a full bootstrap without LTO
 GRAPHITE_DESC=         Support for Graphite loop optimizations

 .if exists(/usr/lib32/libc.so)
@@ -80,11 +87,14 @@ TARGLIB32=  ${PREFIX}/lib32 # The version information i
 LIBEXEC=       ${PREFIX}/libexec/gcc${SUFFIX}
 GNU_CONFIGURE= yes
 CONFIGURE_OUTSOURCE=   yes
-.if empty(PORT_OPTIONS:MBOOTSTRAP)
+.if empty(PORT_OPTIONS:M*BOOTSTRAP)
 CONFIGURE_ARGS+=--disable-bootstrap
-.else
+.elif ${PORT_OPTIONS:MLTO_BOOTSTRAP}
 CONFIGURE_ARGS+=--with-build-config=bootstrap-lto-noplugin
 ALL_TARGET=    bootstrap-lean
+.else
+CONFIGURE_ARGS+=--with-build-config=bootstrap-debug
+ALL_TARGET=    bootstrap-lean
 .endif
 INSTALL_TARGET=        install-strip
 .if ${UID} != 0
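To make the conditional chain in the proposed change easier to follow, here is a Python model of it. This is only a sketch of the .if/.elif/.else logic shown in the diff, not part of the ports framework; the function name configure_settings and the returned dictionary layout are assumptions.

```python
def configure_settings(port_options):
    """Model which configure arguments the proposed Makefile logic would select."""
    options = set(port_options)
    # Mirrors empty(PORT_OPTIONS:M*BOOTSTRAP): no selected option ends in BOOTSTRAP.
    if not any(name.endswith("BOOTSTRAP") for name in options):
        # ALL_TARGET is not set in this branch of the diff, so it is modeled as None.
        return {"configure_args": ["--disable-bootstrap"], "all_target": None}
    if "LTO_BOOTSTRAP" in options:
        return {
            "configure_args": ["--with-build-config=bootstrap-lto-noplugin"],
            "all_target": "bootstrap-lean",
        }
    # Remaining case: STANDARD_BOOTSTRAP, a full bootstrap without LTO.
    return {
        "configure_args": ["--with-build-config=bootstrap-debug"],
        "all_target": "bootstrap-lean",
    }


for selected in (["GRAPHITE"], ["LTO_BOOTSTRAP"], ["STANDARD_BOOTSTRAP", "GRAPHITE"]):
    print(selected, configure_settings(selected))
```

With neither radio option selected the build is not bootstrapped at all, which is the case the change makes reachable by deselecting both LTO_BOOTSTRAP and STANDARD_BOOTSTRAP.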
Comment 43 Robert Clausecker freebsd_committer freebsd_triage 2022-04-20 07:53:02 UTC
(In reply to Piotr Kubaj from comment #42)

Not really a fan of excluding the LTO_BOOTSTRAP option on armv6/armv7, as it is possible to build the compiler with it (it just takes crazy long).  Just make the default not to bootstrap at all on these architectures.  Also perhaps on riscv and arm64, which suffer from the same problem of most available hardware being slow (and I'm not going to shell out $1000+ to buy an Apple M1 just yet).

Of course this change would have to apply to all gcc ports.
Comment 44 Matthias Andree freebsd_committer freebsd_triage 2022-04-20 14:52:52 UTC
I ask again: 

1. why do we bootstrap by default? 
2. why do we use LTO on disposable parts of the build?

- LTO is pointless in all non-installed stages of the bootstrap
- only the final stage gets installed and used at scale

Are our base system compilers still insufficient to build GCC in a single "stage 3" build? If all supported Tier-1 and possibly Tier-2 base compilers can build GCC, let's get rid of the three-stage bootstrap and build The Real Thing That Gets Installed right away.
Comment 45 Piotr Kubaj freebsd_committer freebsd_triage 2022-04-20 19:16:12 UTC
(In reply to Matthias Andree from comment #44)
1. I assume it's to build GCC in a way that is as safe as possible, so that the bootstrapped compiler will work fine. When you combine that with LTO and us using clang to build, it's definitely much safer than plain building.
2. LTO is used only for stage 2 and stage 3 (you can see that in config/bootstrap-lto-noplugin.mk). IMO that is reasonable, since using LTOized GCC from stage 2 in stage 3 lets us make sure that the compiler works fine.

(In reply to Robert Clausecker from comment #43)
OK, I will disable bootstrap there according to your instructions. Regarding riscv64, we currently don't support GCC there.
Regarding your earlier question about capable ARM hardware, I'd consider getting something like https://www.ipi.wiki/products/ampere-altra-developer-platform.
Comment 46 Robert Clausecker freebsd_committer freebsd_triage 2022-04-20 23:40:47 UTC
(In reply to Piotr Kubaj from comment #45)

> OK, I will disable bootstrap there according to your instructions. Regarding riscv64, we currently don't support GCC there.

Thanks for the help.  I appreciate this.

> https://www.ipi.wiki/products/ampere-altra-developer-platform

This is a nice system and very well suited to my ports work (it even natively supports AArch32).  Unfortunately at $4000 it is very much outside of the range of what I can afford.  If this is the recommended system for ARM ports work, perhaps you could help me file a grant with the FreeBSD foundation so I can continue to improve the quality of the FreeBSD ports collection on armv7.
Comment 47 Matthias Andree freebsd_committer freebsd_triage 2022-04-21 16:52:26 UTC
(In reply to Piotr Kubaj from comment #45)
Piotr, it seems we are talking past each other. You "defend" the bootstrap feature's existence and motivation, and I am aiming at "why are we doing this in FreeBSD". So, my plea, explicitly, is: why do the FreeBSD GCC ports default to bootstrap builds as opposed to the single-stage one-shot cross builds?

The other question that newly comes to my mind is: unless we are already doing it, do we have a way inside FreeBSD to parallelize the LTO "link" stage (which is in fact the optimizer and code generation and linker stage unified) if it runs as a single thread in the current state?
Comment 48 Mark Millard 2022-04-21 17:45:59 UTC
(In reply to Matthias Andree from comment #47)

The lto related stages are parallel --and get higher load averages
than the hardware thread count some of the time. I've watched them
in top.
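For anyone wanting to make the same observation without watching top, a minimal Python sketch follows. It assumes a Unix-like system where os.getloadavg() is available; the helper name load_vs_threads is hypothetical, and a single sample says little compared with watching a whole build.

```python
import os


def load_vs_threads():
    """Return (1-minute load average, hardware thread count, whether load exceeds threads)."""
    load1, _load5, _load15 = os.getloadavg()
    # os.cpu_count() may return None if the count cannot be determined.
    threads = os.cpu_count() or 1
    return load1, threads, load1 > threads


load1, threads, oversubscribed = load_vs_threads()
print(f"load average {load1:.2f} on {threads} hardware threads; above count: {oversubscribed}")
```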
Comment 49 Mark Millard 2022-04-21 17:55:42 UTC
(In reply to Mark Millard from comment #48)

I should have been more careful. I've not watched the details
of a build system set up the way FreeBSD's servers are. I expect that
what I've seen implies that more hardware threads are sometimes
put to use than the server's poudriere has been told to use for
each builder (2 as I remember). It might even be exactly as I
said --but I've watched ALLOW_MAKE_JOBS= (which I use). I
cannot watch top on FreeBSD's servers.
Comment 50 Piotr Kubaj freebsd_committer freebsd_triage 2022-04-21 19:39:52 UTC
(In reply to Matthias Andree from comment #47)
For the standard bootstrap without LTO, I'm actually not sure what's the current motivation for that. I assume it's just safer, especially with the development snapshots.
It changes when the port is switched to use LTO - which I did. Upstream provides no way to use LTO without bootstrapping (unless you manually append -flto to CFLAGS / CXXFLAGS, but I trust upstream developers more).

(In reply to Mark Millard from comment #48)
I can confirm that LTO-related processes run in parallel with the full capacity of the box.
Comment 51 commit-hook freebsd_committer freebsd_triage 2022-04-21 19:44:04 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=2200a356e4b7e83058a1fb48d01326bfe8772909

commit 2200a356e4b7e83058a1fb48d01326bfe8772909
Author:     Piotr Kubaj <pkubaj@FreeBSD.org>
AuthorDate: 2022-04-21 19:40:05 +0000
Commit:     Piotr Kubaj <pkubaj@FreeBSD.org>
CommitDate: 2022-04-21 19:40:05 +0000

    lang/gcc12-devel: disable LTO on armv6/7

    PR:     261977
    Requested by:    fuz@fuz.su

 lang/gcc12-devel/Makefile | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)
Comment 52 Mark Millard 2022-04-21 20:53:25 UTC
(In reply to Piotr Kubaj from comment #50)

Sounds like a FreeBSD build server will be in an atypical
type of context if it happens to do multiple lang/gcc* builds
in parallel: likely a high load average (beyond 2*#BuildersAllowed
and possibly at times even well beyond the number of hardware
threads). This might lead to needing timeout adjustments if it
leads to NOHANG_TIME or MAXIMUM_EXECUTION_TIME or such timeouts
on some builders that overlap in time, at least if such timeouts
are frequent enough. Possibly something to monitor for.
Comment 53 commit-hook freebsd_committer freebsd_triage 2022-05-06 18:10:30 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=9c6b74370b44eaf613d26417185ceeb3a861d292

commit 9c6b74370b44eaf613d26417185ceeb3a861d292
Author:     Piotr Kubaj <pkubaj@FreeBSD.org>
AuthorDate: 2022-05-06 18:01:41 +0000
Commit:     Piotr Kubaj <pkubaj@FreeBSD.org>
CommitDate: 2022-05-06 18:01:41 +0000

    lang/gcc11-devel: update to the newest snapshot and disable LTO on armv6/7

    PR:     261977

 lang/gcc11-devel/Makefile | 23 +++++++++++++++++------
 lang/gcc11-devel/distinfo |  6 +++---
 2 files changed, 20 insertions(+), 9 deletions(-)
Comment 54 Mark Millard 2022-05-06 19:58:39 UTC
(In reply to commit-hook from comment #53)

Cool. Thanks.

I see that the new lang/gcc12 got the treatment as well.
Thanks again.

Will lang/gcc11 get the treatment as well? Or will lang/gcc12
be the start of the "not -devel" ones avoiding the LTO bootstrap
for armv7/armv6?
Comment 55 Piotr Kubaj freebsd_committer freebsd_triage 2022-05-06 22:01:42 UTC
(In reply to Mark Millard from comment #54)
I will switch gcc11 to disable bootstrapping on armv6/7 as well.
Comment 56 commit-hook freebsd_committer freebsd_triage 2022-06-28 17:25:29 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=aadf6428cc480fbeda72ec90d53ef340e95f49ca

commit aadf6428cc480fbeda72ec90d53ef340e95f49ca
Author:     Piotr Kubaj <pkubaj@FreeBSD.org>
AuthorDate: 2022-06-28 17:23:18 +0000
Commit:     Piotr Kubaj <pkubaj@FreeBSD.org>
CommitDate: 2022-06-28 17:24:24 +0000

    lang/gcc11: disable LTO on armv6/7

    PR:     261977

 lang/gcc11/Makefile | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)