Bug 263353 - lang/python3*: Fails to link with LTO: Python includes unconditionally adds -g to --with-lto
Summary: lang/python3*: Fails to link with LTO: Python includes unconditionally adds -...
Status: In Progress
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s) (show other bugs)
Version: Latest
Hardware: Any Any
: --- Affects Some People
Assignee: Piotr Kubaj
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-04-16 21:58 UTC by Matthias Andree
Modified: 2022-07-14 15:00 UTC (History)
6 users (show)

See Also:
koobs: maintainer-feedback+


Attachments
remove -g from the CFLAGS/LDFLAGS unless WITH_DEBUG is set not empty and not "no". (970 bytes, patch)
2022-04-16 21:58 UTC, Matthias Andree
koobs: maintainer-approval-
Details | Diff
failing poudriere build log with LTO and optimization, showing excess -g. (112.86 KB, text/plain)
2022-04-24 22:36 UTC, Matthias Andree
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Matthias Andree freebsd_committer freebsd_triage 2022-04-16 21:58:20 UTC
Created attachment 233259 [details]
remove -g from the CFLAGS/LDFLAGS unless WITH_DEBUG is set not empty and not "no".

I can no longer build Python 3.8 on a FreeBSD 13.0-RELEASE amd64 computer with 1.6 GB available memory (1 GB physical, 1 GB encrypted swap file on UFS).

The upstream configure.ac file in Python 3.8.13 as of 0058eede0ebf adds "-flto -g" to the build (through CFLAGS and LDFLAGS). This happens even without WITH_DEBUG=yes.

This considerably increases the memory that my FreeBSD 13.0-RELEASE ld.lld requires, and the kernel kills it with "out of swap space". 

I used to be able to build Python 3.8 on that machine earlier, and reviewing port history, I surmise that * 9a31e1b6d3bf 2022-03-09 | lang/python3*: add LTO option and enable by default everywhere except powerpc64 and riscv64 [Piotr Kubaj] 
caused the increase in memory consumption from the desired LTO.

Chopping out the -g from CFLAGS and LDFLAGS fixes this for me, reducing linker memory consumption to below 400 MB VMEM, and lets my low-on-RAM computer complete the build. (It is a single Xeon core I rented out for a virtual server.)

I am proposing the attached patch. I have not investigated yet whether Pythons 3.7 3.9 3.10 3.11 need an analogue patch, for lack of time. In Python 3.8 you will see "-flto -g" in the ${WRKSRC}/Makefile that results from a "make configure" with LTO option enabled, because the upstream configure.ac is written that way.
Comment 1 Dan Kotowski 2022-04-24 12:39:33 UTC
I'm seeing the same problem even on a 10C/20T Xeon with 32GB DDR3 + 64GB NVMe swap.
Comment 2 Kubilay Kocak freebsd_committer freebsd_triage 2022-04-24 22:03:48 UTC
@Matthias Can you test a build with -flto=thin -g please?

LTO is well known to require significantly more resources during link time, and is precisely why a =thin model/option exists. Everything else equal, failing to link due to insufficient resources isn't a bug, with or without -g.
Comment 3 Kubilay Kocak freebsd_committer freebsd_triage 2022-04-24 22:07:47 UTC
@Matthias Just realised that poyrts set --with-lto=thin already. Can you attach a full build log (compressed if necessary), so we can take a look at the compile invocations
Comment 4 Matthias Andree freebsd_committer freebsd_triage 2022-04-24 22:36:40 UTC
Created attachment 233456 [details]
failing poudriere build log with LTO and optimization, showing excess -g.

Kubilay, Python@ guys,

the issue is an excess "-g" in spite of NOT setting WITH_DEBUG, not LTO. The FreeBSD ports Mk/* stuff properly omits -g, but the Python 3.8 build stuff adds it specifically when LTO is in place. In the log, find lots of "-flto -g" (search for it, literally, of course without the quotes).

This goes unnoticed because everything's stripped in the stage area, so no stage-qa complaints, just excess disk space use.

The culprit is the upstream configure stuff, not something in FreeBSD.

I am attaching my log. But I seriously ask everyone to respect my debugging and solution and not fuss with distractions and excursions. Trust me to be able to read my Poudriere error logs. All these questions are wasting my time. 

So, unless there are substantiated vetos with better solutions  brought up, I may commit my patch 14 days after my proposal, or with maintainer approval, whichever comes first.
Comment 5 Kubilay Kocak freebsd_committer freebsd_triage 2022-04-24 23:03:23 UTC
(In reply to Matthias Andree from comment #4)

That's not how this works Matthias. Just like you'd like people to respect your time, everyone else also appreciates you respecting theirs, particularly as they spent time and effort on your bug reports.

There are multiple considerations here:

1) with-lto supports =thin argument, but it was only set in python311 in ports 9a31e1b6d3 because support for =thin was not backported to earlier branches. I asked in comment 1 for a test of whether you can link with =thin -g as that (thin), along with decoupling our LTo build from Pythons lto option is the most desirable target state

2) Bug 261974 added support via Pythons build system flag for LTO, not using our own toolchain flags. This means we were necessarily coupled to what Python does with this flag. That commit landed somewhat prematurely prematurely and we are now dealing with that.

2) Adding -g in the LTO was added a while ago [1][2], and earlier than 3.8, and affects multiple Python port versions

3) The definition and scope of 'debug' in freebsd ports isn't an exact overlap with upstreams definitions of 'debug' (not just compile args).

4) Python without compelling arguments/cases to the contrary prefers to remain as close to upstream as possible, and where changes need to be made, changes are submitted for upstream inclusion/improvement unless that is impossible.

We're not asking that you care about all of the above, but these are the things we need to consider, along with the overhead of carrying local patches, where alternatives may (and do) exist.

[1] https://github.com/python/cpython/commit/1bb9dd337ed5aa9eafc8e2ce017ceedf044145e3
[2] https://github.com/python/cpython/issues/74530
Comment 6 Matthias Andree freebsd_committer freebsd_triage 2022-04-24 23:21:56 UTC
koobs@, 

we cannot work like that and leave ports regressed and non-building just because someone wants to have the perfect fix, possibly sitting the fix out until the cows come home (*)

I don't buy your PGO consideration: What piece of the code would enable PGO anyways? The configure stage suggests to --enable-optimizations (which we do not do) to enable PGO, "make configure" is sufficient to see that. We don't see the regression that prompted the upstream change, and given the nature of the latter, I contend that it's trying to downstream (as in Python) fix compiler bugs.  And now you're talking to me about unfitting patches.

I fail to see what benefit -ftlto=thin would bring, because it does not take any of your objectsion away; but feel free to run it and see how far it reduces size of the wkrdir and peak linker memory use, and if it really achieves the same goals, commit before maintainer timeout. I am not wasting more time.

(*) I do not care about minimal deviation from upstream on a life support branch (which is what Python 3.7 and 3.8 are, security fix only).

The build regression is real, and the easy fix would have been to just revert the offending commit and possibly bump PORTEPOCH, which I haven't done although I could perfectly have invoked the "fix broken build" blanket just as well.

We can always refine fixes later, but barring a better solution in due time (end April), we need a solution and we need to get the short-term fix in place and get it out of the (time-)critical path.
Comment 7 Matthias Andree freebsd_committer freebsd_triage 2022-04-24 23:25:03 UTC
TL;DR: none of your objections are relevant, or would be remedied with -flto=thin.  The only item you mentioned that is half-way relevant is (3) but still we "port" ports by adjusting them to our conventions WRT compiler flags.
Comment 8 Kubilay Kocak freebsd_committer freebsd_triage 2022-04-24 23:48:48 UTC
(In reply to Matthias Andree from comment #6)

The issue applies to multiple python ports, you are welcome to provide a patch against all ports for consideration.

The issue here is not about a perfect fix, its about your assumption that asking for additional details, testing or questions means someone is looking for a perfect fix, or in some manner, to block you. This is not the case. Please leave your frustrations at the Bugzilla door.

In the mean time, since the LTO patch landed prematurely, it *is* indeed an option, and the report here is a symptom of the implementation of said option, you are welcome to provide a separate patch removing LTO from OPTIONS_DEFAULT (for all python ports), which we're happy to approve once provided.
Comment 9 Kubilay Kocak freebsd_committer freebsd_triage 2022-04-24 23:55:33 UTC
(In reply to Kubilay Kocak from comment #8)

Said patch for removal from all python ports could omit python311, as it uses (supports) -flto=thin.

Additionally testing with thin/-g combinations remain useful to determine the contribution of each in the failure cases, to inform and assist resolution upstream.
Comment 10 Piotr Kubaj freebsd_committer freebsd_triage 2022-04-25 15:26:03 UTC
@mandree

Since you're building manually, why don't use just disable LTO option?
Comment 11 Matthias Andree freebsd_committer freebsd_triage 2022-04-25 18:04:39 UTC
koobs@ if you would only *READ* what is written through this entire report. 
The -g is wrong, not LTO.
Comment 12 Matthias Andree freebsd_committer freebsd_triage 2022-04-25 18:05:02 UTC
t0-5 days and counting.
Comment 13 Kubilay Kocak freebsd_committer freebsd_triage 2022-04-27 00:05:03 UTC
Comment on attachment 233259 [details]
remove -g from the CFLAGS/LDFLAGS unless WITH_DEBUG is set not empty and not "no".

(In reply to Matthias Andree from comment #12)

Please provide a patch against all Python port versions for your original proposal, for all affected Python versions (except python311), and, (if you would like), a patch against all Python versions removing LTO from OPTIONS_DEFAULT, resolving this issue as reported, until a correct and complete change/resolution can be addressed by the team.
Comment 14 commit-hook freebsd_committer freebsd_triage 2022-04-27 09:27:11 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=423a79650284abeaca4aff2e7b10ee6c178cdfeb

commit 423a79650284abeaca4aff2e7b10ee6c178cdfeb
Author:     Piotr Kubaj <pkubaj@FreeBSD.org>
AuthorDate: 2022-04-27 09:22:45 +0000
Commit:     Piotr Kubaj <pkubaj@FreeBSD.org>
CommitDate: 2022-04-27 09:22:45 +0000

    lang/python3{7,8,9,10}: remove LTO from defaults

    LTO is still default in python3.11 because it uses thin LTO.

    Requested by:   koobs
    PR:     263353

 lang/python310/Makefile | 3 ++-
 lang/python37/Makefile  | 3 ++-
 lang/python38/Makefile  | 3 ++-
 lang/python39/Makefile  | 3 ++-
 4 files changed, 8 insertions(+), 4 deletions(-)
Comment 15 Dan Kotowski 2022-04-27 10:22:48 UTC
(In reply to commit-hook from comment #14)

Thank you, can confirm this fixes builds targeting i386
Comment 16 Matthias Andree freebsd_committer freebsd_triage 2022-04-27 22:19:28 UTC
(In reply to commit-hook from comment #14)
Thank you very much. The new version of the port successfully builds with 1 GB RAM + 1 GB swap + some memory load on the computer.
Comment 17 Kubilay Kocak freebsd_committer freebsd_triage 2022-04-28 02:30:55 UTC
Happy days. Will close as resolved (issue as reported), pending creation of a dependent issue for resolving the broader issue more permanently.

Thank you Piotr.
Comment 18 Daniel Engberg freebsd_committer freebsd_triage 2022-07-14 15:00:43 UTC
Is this tracked somewhere? I couldn't find anything of relevance reported upstream. One could also argue that -g / -ggdb clashes with "13.9. Respect CFLAGS" since we do have a WITH_DEBUG switch for the framework