Bug 257124 - multimedia/ffmpeg: Fails to link: ld: error: inline assembly requires more registers than available at line [on i386 with LTO option]
Status: Closed FIXED
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: i386 Any
Importance: --- Affects Some People
Assignee: freebsd-multimedia (Nobody)
URL: https://bugs.llvm.org/show_bug.cgi?id...
Keywords: needs-patch, needs-qa
Depends on:
Blocks:
 
Reported: 2021-07-11 23:37 UTC by Mikhail Teterin
Modified: 2021-08-29 14:08 UTC

See Also:
riggs: maintainer-feedback+
koobs: maintainer-feedback? (dim)
koobs: maintainer-feedback? (bapt)
riggs: maintainer-feedback+
riggs: merge-quarterly+


Attachments
Configured ffmpeg options (3.13 KB, text/plain)
2021-07-11 23:37 UTC, Mikhail Teterin

Description Mikhail Teterin freebsd_committer 2021-07-11 23:37:45 UTC
Created attachment 226381 [details]
Configured ffmpeg options

Trying to build the port, configured as per the attached options-file, I get the following errors. The machine is FreeBSD-12/i386.

The problem strikes with both base cc (clang-10.0.1) and the ports-installed clang12 (12.0.0).


ld: error: inline assembly requires more registers than available at line 1545398
ld: error: inline assembly requires more registers than available at line 1545398
ld: error: inline assembly requires more registers than available at line 1062512
ld: error: inline assembly requires more registers than available at line 1062512
ld: error: inline assembly requires more registers than available at line 1297481
ld: error: inline assembly requires more registers than available at line 1297481
clang: error: linker command failed with exit code 1 (use -v to see invocation)
gmake[2]: *** [ffbuild/library.mak:103: libpostproc/libpostproc.so.55] Error 1
gmake[2]: *** Waiting for unfinished jobs....
ld: error: inline assembly requires more registers than available at line 2148613353
ld: error: inline assembly requires more registers than available at line 2148613353
ld: error: inline assembly requires more registers than available at line 2148613353
ld: error: inline assembly requires more registers than available at line 2148625407
ld: error: inline assembly requires more registers than available at line 2148625407
ld: error: inline assembly requires more registers than available at line 2148625407
ld: error: inline assembly requires more registers than available at line 1018758
ld: error: inline assembly requires more registers than available at line 1018038
ld: error: inline assembly requires more registers than available at line 1018758
ld: error: inline assembly requires more registers than available at line 1018038
ld: error: inline assembly requires more registers than available at line 1001035
ld: error: inline assembly requires more registers than available at line 1001035
ld: error: inline assembly requires more registers than available at line 2148301486
ld: error: inline assembly requires more registers than available at line 2148301486
ld: error: inline assembly requires more registers than available at line 2148301486
ld: error: inline assembly requires more registers than available at line 2148313628
ld: error: inline assembly requires more registers than available at line 2148313628
ld: error: inline assembly requires more registers than available at line 2148313628
ld: error: inline assembly requires more registers than available at line 700842
ld: error: inline assembly requires more registers than available at line 700122
ld: error: too many errors emitted, stopping now (use -error-limit=0 to see all errors)
clang: error: linker command failed with exit code 1 (use -v to see invocation)
gmake[2]: *** [ffbuild/library.mak:103: libswscale/libswscale.so.5] Error 1


Is the LTO option to blame?

In addition, the maintainer of this port -- multimedia@FreeBSD.org -- is no longer approachable: the mailing list software rejects e-mails from non-members. I don't think this is suitable for an official port-maintainer contact.
Comment 1 Kubilay Kocak freebsd_committer freebsd_triage 2021-07-12 02:19:18 UTC
@Mikhail: Is the issue reproducible with the LTO option disabled?

Request feedback from two members of multimedia [1]

A separate issue should be created for any mail/maintainer changes required.

[1] https://wiki.freebsd.org/Multimedia
Comment 2 Mikhail Teterin freebsd_committer 2021-07-12 02:23:58 UTC
(In reply to Kubilay Kocak from comment #1)
> @Mikhail: Is the issue reproducible with the LTO option disabled?

Nope. With LTO disabled, I was able to build the new ffmpeg-4.4_2,1.
Comment 3 Daniel Engberg freebsd_committer 2021-07-12 08:20:17 UTC
I'd suggest that we remove the LTO option for i386 if that's the issue. There's no point in keeping broken options around, and looking forward I doubt that i386 issues are going away.
Comment 4 Dimitry Andric freebsd_committer 2021-07-12 12:20:39 UTC
Yes, it's likely that the low number of registers, combined with more aggressive whole program optimization (in particular inlining) will lead to this type of error.

Unfortunately it is not possible to figure out from lld's error messages *which* particular inline asm fragments are responsible for this. If we could somehow find that out, you could possibly mark the functions that contain those fragments as __noinline, to prevent the register allocation from blowing up.
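
As a rough sketch of what such marking could look like: the function below is hypothetical and not taken from FFmpeg, NOINLINE is a local macro standing in for FreeBSD's __noinline from <sys/cdefs.h>, and the plain C fallback exists only so the sketch builds on non-x86 targets.

```c
#include <stdint.h>

/* Local stand-in for FreeBSD's __noinline (an assumption for portability). */
#if defined(__GNUC__) || defined(__clang__)
#define NOINLINE __attribute__((noinline))
#else
#define NOINLINE
#endif

/*
 * Hypothetical function containing an inline asm fragment.
 * The noinline attribute asks the optimizer, including the LTO inliner,
 * to keep this body out of its callers, so the asm statement is always
 * register-allocated within this small function only.
 */
NOINLINE uint32_t add_u32_asm(uint32_t a, uint32_t b)
{
#if (defined(__i386__) || defined(__x86_64__)) && defined(__GNUC__)
    /* a += b, wrapping on overflow; both operands must be in registers. */
    __asm__("addl %1, %0" : "+r"(a) : "r"(b) : "cc");
    return a;
#else
    return a + b; /* fallback for non-x86 targets */
#endif
}
```

Whether this avoids the error for any particular FFmpeg fragment is untested here; it only removes cross-function inlining as a source of extra register pressure.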
Comment 5 Ed Maste freebsd_committer 2021-07-12 13:13:47 UTC
> I'd suggest that we remove the LTO option for i386

Yes, the benefit of fixing LTO for i386 is not worth the effort it will require.

I am not sure how this is generally handled in ports, but perhaps

# i386 is too register-starved for LTO (PR257124)
.if ${ARCH} == i386
OPTIONS_EXCLUDE+= LTO
.endif
Comment 6 Dimitry Andric freebsd_committer 2021-07-12 13:49:36 UTC
(In reply to Ed Maste from comment #5)
I've noticed this interesting construct in graphics/libjxl/Makefile:

OPTIONS_DEFINE= GIF GIMP JPEG LTO MANPAGES OPENEXR PIXBUF PNG
OPTIONS_DEFAULT=GIF GIMP JPEG LTO MANPAGES OPENEXR PIXBUF PNG
OPTIONS_EXCLUDE_i386=   LTO # ConvolutionWithTranspose(): JXL_CHECK: out->xsize() == in.ysize()
OPTIONS_EXCLUDE_powerpc64=      ${"${/usr/bin/ld:L:tA}"==/usr/bin/ld.lld:?LTO:} # LLVM bug 47353

This also seems to be used in a few other Makefiles.
Comment 7 Mikhail Teterin freebsd_committer 2021-07-12 16:21:11 UTC
> Yes, it's likely that the low number of registers
I'm confused... Is not the number of registers the same on the same processor -- whether it is running in 32- or 64-bit mode? Even if they are named/accessed differently?

> combined with more aggressive whole program optimization (in particular
> inlining) will lead to this type of error.
Frankly, if optimization results in errors, then it is not an optimization... I don't blame anyone here for the failure, just debating terminology :-)

If the otherwise valid code cannot be compiled (and/or linked), then it is a compiler (and/or linker) bug, is not it? Would it make sense to bring this up with LLVM-project directly?

> Yes, the benefit of fixing LTO for i386 is not worth the effort
> it will require.
I would've thought, with multimedia every CPU-instruction counts... Even if the hardware is fast enough for regular realtime playback, when performing format-conversions, CPU is almost always the bottleneck even on the fastest computers.

Perhaps, the option should carry a warning -- and be disabled by default on i386 -- but disabling it altogether seems too drastic.
Comment 8 Dimitry Andric freebsd_committer 2021-07-12 16:37:55 UTC
(In reply to Mikhail Teterin from comment #7)
> I'm confused... Is not the number of registers the same on the same processor -- whether it is running in 32- or 64-bit mode? Even if they are named/accessed differently?

No. The i386 architecture has 8 general purpose registers (32 bit only), of which only 6 are freely usable (as %esp and %ebp are used for the stack). The amd64 architecture has 16 general purpose registers (64 bit), of which 14 are freely usable.

Certain inline assembly constructs are parameterized such that the compiler can't supply the number of free registers they ask for, and then you get this error. If such assembly is critical, it is better to write the whole routines in .S files instead of attempting to do it inline in C or C++.
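
To illustrate how an asm statement's constraints turn into register demand, here is a minimal sketch (hypothetical functions, GCC/Clang extended asm, with a plain C fallback for non-x86 targets). In the first function every operand is constrained to "r", so in general four general purpose registers have to be free at the same time. In the second, the inputs use "rm", which allows the compiler to leave them in memory when registers are scarce.

```c
#include <stdint.h>

#if (defined(__i386__) || defined(__x86_64__)) && defined(__GNUC__)
#define HAVE_X86_GNU_ASM 1
#else
#define HAVE_X86_GNU_ASM 0
#endif

/* Hypothetical: all four operands must live in registers simultaneously. */
uint32_t sum4_regs(uint32_t a, uint32_t b, uint32_t c, uint32_t d)
{
#if HAVE_X86_GNU_ASM
    __asm__("addl %1, %0\n\t"
            "addl %2, %0\n\t"
            "addl %3, %0"
            : "+r"(a)
            : "r"(b), "r"(c), "r"(d)
            : "cc");
    return a;
#else
    return a + b + c + d;
#endif
}

/* Hypothetical: same computation, but each input may be a register or memory. */
uint32_t sum4_relaxed(uint32_t a, uint32_t b, uint32_t c, uint32_t d)
{
#if HAVE_X86_GNU_ASM
    __asm__("addl %1, %0\n\t"
            "addl %2, %0\n\t"
            "addl %3, %0"
            : "+r"(a)
            : "rm"(b), "rm"(c), "rm"(d)
            : "cc");
    return a;
#else
    return a + b + c + d;
#endif
}
```

With only 6 freely usable registers on i386, a fragment written in the first style leaves little room once inlining surrounds it with other live values. The relaxed form is just one possible way to reduce the demand, and not every instruction accepts a memory operand.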


> Frankly, if optimization results in errors, then it is not an optimization... I don't blame anyone here for the failure, just debating terminology :-)

The problem is that once you start inlining, assumptions on how many registers are available might become invalid, as many more variables that were first on some stack frame are now put in registers.


> If the otherwise valid code cannot be compiled (and/or linked), then it is a compiler (and/or linker) bug, is not it? Would it make sense to bring this up with LLVM-project directly?

You could try, but I think they'll say that your inline assembly is bad (even if you could point to the particular piece of inline assembly, which we can't, unfortunately). There is never a guarantee that a compiler can instantiate any inline assembly, it is really a best effort. Again, if you need full control at that level, you should write your assembly code in assembly sources. FFmpeg does this for a lot of things using nasm or yasm, but for some reason (probably historical?) they still put a whole bunch of inline assembly in C files.


> I would've thought, with multimedia every CPU-instruction counts... Even if the hardware is fast enough for regular realtime playback, when performing format-conversions, CPU is almost always the bottleneck even on the fastest computers.

I'm sure nobody will use an ancient i386 only machine as an FFmpeg conversion box. :)


> Perhaps, the option should carry a warning -- and be disabled by default on i386 -- but disabling it altogether seems too drastic.

I'm fine with that, but people will likely encounter the errors mentioned in the beginning, and submit more bugs. You could maybe use a BROKEN= syntax in the Makefile? I'm not entirely sure how that works though.
Comment 9 Ed Maste freebsd_committer 2021-07-12 16:53:43 UTC
(In reply to Mikhail Teterin from comment #7)
> I'm confused... Is not the number of registers the same on the same processor
> -- whether it is running in 32- or 64-bit mode? Even if they are
> named/accessed differently?

amd64 has r8 through r15 in addition to the 64-bit versions of the common registers (rax, rbx etc.)

> Frankly, if optimization results in errors, then it is not an optimization...
> I don't blame anyone here for the failure, just debating terminology :-)

Perhaps a non-functional optimization, but indeed it doesn't really matter.

> If the otherwise valid code cannot be compiled (and/or linked), then it is a
> compiler (and/or linker) bug, is not it? Would it make sense to bring this up
> with LLVM-project directly?

Perhaps, although I suspect there will not be a lot of interest in investigating i386-specific optimization issues.

> I would've thought, with multimedia every CPU-instruction counts... Even if
> the hardware is fast enough for regular realtime playback, when performing
> format-conversions, CPU is almost always the bottleneck even on the fastest
> computers.

Indeed, no disagreement that optimization is desirable on multimedia ports. My point is just that there is likely to be little developer effort available (upstream or in FreeBSD) to work on these issues on i386.

> Perhaps, the option should carry a warning -- and be disabled by default on
> i386 -- but disabling it altogether seems too drastic.

I believe it is disabled by default on all archs right now?

The option should either produce a warning or an error on i386, or not be available there; I don't have strong feelings on which it is.
Comment 10 Mikhail Teterin freebsd_committer 2021-07-12 17:01:37 UTC
> The problem is that once you start inlining

It is my understanding, that "inline" is a hint (as is/was "register")... If compiler knows, the target architecture cannot handle it, it will/should skip it.

> Perhaps, although I suspect there will not be a lot of interest

Ok, so we agree, that it is a clang/llvm bug and what's left is to figure out, what to do about it.

> investigating i386-specific optimization issues

Sad... i386 is still listed as Tier-1 in 11.x and 12.x, becoming Tier-2 in 13.x.

    https://www.freebsd.org/platforms/

I wonder, what LLVM's stance on this is.

> I believe it is disabled by default on all archs right now?

Yes, it is, you're right. I'd say, it can be enabled by default, where known to work, and marked with warning elsewhere. But not REMOVED altogether.

> I'm sure nobody will use an ancient i386 only machine

There are valid reasons to use i386 even on modern processors -- such as, for example, as a small-memory (under 4 GB) VM in a large (64-bit) host.
Comment 11 Ed Maste freebsd_committer 2021-07-13 15:27:53 UTC
(In reply to Mikhail Teterin from comment #10)
> It is my understanding, that "inline" is a hint (as is/was "register")...
> If compiler knows, the target architecture cannot handle it, it will/should
> skip it.

I have not actually looked at the source, but I think the error is referring to inline assembly, i.e., asm("whatever"); in C source.

> Ok, so we agree, that it is a clang/llvm bug

I'm not sure that is the case.
Comment 12 Mikhail Teterin freebsd_committer 2021-07-13 16:18:46 UTC
> I think the error is referring to inline assembly, i.e., asm("whatever");
Why would a problem with inline assembly be manifested only in case of LTO?

> > Ok, so we agree, that it is a clang/llvm bug
> I'm not sure that is the case.
If it is not, then the bug is with ffmpeg -- something FreeBSD port-maintainers may be able to patch and/or raise with the upstream authors.
Comment 13 Dimitry Andric freebsd_committer 2021-07-13 18:52:53 UTC
(In reply to Mikhail Teterin from comment #12)
> Why would a problem with inline assembly be manifested only in case of LTO?

LTO will enable global optimizations, such as inlining functions from different translation units. If functions get inlined, the number of available register slots can change, since there are now also other variables in the blocks of code being compiled. This can lead to a shortage of registers. In case of plain C or C++ code, the compiler knows how to rearrange some variables so they go to the stack instead, but for inline assembly there is no such choice, since the author specifies (or at least, can specify) precisely the registers to use. In some situations this can leave the compiler with no registers 'left' to put other variables in, and typically it will error out then. This is indeed a cop-out strategy, and happens often with inline assembly.

In the past, FFmpeg had many more cases of this, and could also make gcc keel over unless you used very specific optimization flags, which is probably why FFmpeg started rewriting most of their assembly functions in nasm.


> If it is not, then the bug is with ffmpeg -- something FreeBSD port-maintainers may be able to patch and/or raise with the upstream authors.

The problem here is pinpointing the particular function that is causing the error. This is at the moment not possible, since the LTO stage doesn't seem to keep enough information around to tell you which source file is involved. Also, since the line numbers it mentions are very large, I guess it is reporting something about an intermediate form (.ll or .bc file). So to be able to find the culprit(s), you'd have to somehow get to that intermediate form.
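
For readers wondering what "the author specifies precisely the registers to use" looks like in practice, here is a small sketch. It is a hypothetical function, not FFmpeg code, using GCC/Clang extended asm with a plain C fallback for non-x86 targets. The constraints "a", "b", "c", "d", "S" and "D" pin the operands to %eax, %ebx, %ecx, %edx, %esi and %edi, which are exactly the six freely usable i386 registers mentioned in comment 8.

```c
#include <stdint.h>

/* Hypothetical routine whose asm statement names every operand register. */
uint32_t sum6_pinned(uint32_t a, uint32_t b, uint32_t c,
                     uint32_t d, uint32_t e, uint32_t f)
{
#if (defined(__i386__) || defined(__x86_64__)) && defined(__GNUC__)
    __asm__("addl %%ebx, %%eax\n\t"
            "addl %%ecx, %%eax\n\t"
            "addl %%edx, %%eax\n\t"
            "addl %%esi, %%eax\n\t"
            "addl %%edi, %%eax"
            : "+a"(a)                                 /* %eax: in and out */
            : "b"(b), "c"(c), "d"(d), "S"(e), "D"(f)  /* fixed registers  */
            : "cc");
    return a;
#else
    return a + b + c + d + e + f; /* fallback for non-x86 targets */
#endif
}
```

On amd64 this leaves plenty of registers over. On i386 nothing else can be in a general purpose register while the statement executes, so anything further it needs at the same moment, such as a base register for an additional memory operand, has nowhere to go. Whether a real build fails depends on the compiler version, on flags such as -fPIC or frame pointer use, and on what LTO inlined around the fragment.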
Comment 14 Mikhail Teterin freebsd_committer 2021-07-13 19:01:41 UTC
(In reply to Dimitry Andric from comment #13)
Thank you, Dimitry, this is quite educational -- and will help people bumping into this in the future.

Still:

> If functions get inlined, the number of available register slots can change,
> since there are now also other variables in the blocks of code being compiled.
> This can lead to a shortage of registers.

If the code still links without LTO, should not the linker, upon detecting such a problem, abandon the LTO attempts for this file/function and move on as if LTO was not requested? With a warning, but not an error...

That's what a compiler would do when an unachievable optimization is requested. Is it too much to expect a graceful (rather than catastrophic) degradation from a linker as well?
Comment 15 Daniel Engberg freebsd_committer 2021-07-13 22:14:14 UTC
(In reply to Mikhail Teterin from comment #7)
I think you might be overestimating how much it actually matters for performance. Last time I checked it was a few percent at best, which on i386-only hardware most likely won't matter at all in the end. It's very unlikely to make your H.264 1080p clip play smoothly, and keep in mind that for the majority of this port's lifetime LTO hasn't even been enabled or, for that matter, available. See https://github.com/freebsd/freebsd-ports/commit/7edd31685d7170e930790542a302ad4115c83c36 (added in Dec of 2019)

I very much welcome "free" performance enhancements, and if someone wants to investigate it that's great, but it's probably not worth adding to the pile of "things to investigate with very limited resources".
Comment 16 Mikhail Teterin freebsd_committer 2021-07-13 22:38:50 UTC
(In reply to Daniel Engberg from comment #15)
> I think you might be overestimating how much it actually matters

Oh, I don't expect any dramatic performance improvements, no. I'm just annoyed, that things, that are supposed to work, do not.

Which annoyance is further compounded by the lack of concern on behalf of colleagues.

It is a more general problem, than i386 and ffmpeg: thunderbird wouldn't build with LTO enabled even on amd64 -- but firefox builds. Libreoffice fails with LTO too.

I don't expect anyone to rush into fixing all of these, but, at least, I'd like to see an agreement, that there is a BUG (or more).
Comment 17 Ed Maste freebsd_committer 2021-07-27 19:36:34 UTC
Nobody claims there is no bug, just that getting this port to work with LTO on i386 is rather low on the priority list.
Comment 18 commit-hook freebsd_committer 2021-08-29 14:05:58 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=0561f3e6355206d3ff81b2d9a69f62e10fc7f16e

commit 0561f3e6355206d3ff81b2d9a69f62e10fc7f16e
Author:     Thomas Zander <riggs@FreeBSD.org>
AuthorDate: 2021-08-29 14:00:36 +0000
Commit:     Thomas Zander <riggs@FreeBSD.org>
CommitDate: 2021-08-29 14:05:37 +0000

    multimedia/ffmpeg: Exclude LTO from OPTIONS on i386.

    Details:
    - The low number of i386 leads to register exhaustion when compiling
      with LTO. Due to the decreasing popularity of 32 bit i386 machines
      which require hyper-optimised ffmpeg builds, the option is excluded
      from the builds for now.

    PR:             257124
    MFH:            2021Q3

 multimedia/ffmpeg/Makefile | 3 +++
 1 file changed, 3 insertions(+)
Comment 19 commit-hook freebsd_committer 2021-08-29 14:08:00 UTC
A commit in branch 2021Q3 references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=ce777b825e3c72834daf274b53bfde010f24e460

commit ce777b825e3c72834daf274b53bfde010f24e460
Author:     Thomas Zander <riggs@FreeBSD.org>
AuthorDate: 2021-08-29 14:00:36 +0000
Commit:     Thomas Zander <riggs@FreeBSD.org>
CommitDate: 2021-08-29 14:06:31 +0000

    multimedia/ffmpeg: Exclude LTO from OPTIONS on i386.

    Details:
    - The low number of i386 leads to register exhaustion when compiling
      with LTO. Due to the decreasing popularity of 32 bit i386 machines
      which require hyper-optimised ffmpeg builds, the option is excluded
      from the builds for now.

    PR:             257124
    MFH:            2021Q3
    (cherry picked from commit 0561f3e6355206d3ff81b2d9a69f62e10fc7f16e)

 multimedia/ffmpeg/Makefile | 3 +++
 1 file changed, 3 insertions(+)