Bug 230888 - Missing 64 bit atomic functions for i386 (libatomic)
Summary: Missing 64 bit atomic functions for i386 (libatomic)
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: bin (show other bugs)
Version: CURRENT
Hardware: i386 Any
: --- Affects Some People
Assignee: freebsd-toolchain mailing list
URL:
Keywords: feature, needs-patch, needs-qa
: 220822 233725 234976 (view as bug list)
Depends on:
Blocks: 240594 242229
  Show dependency treegraph
 
Reported: 2018-08-25 12:53 UTC by Dimitry Andric
Modified: 2019-11-26 18:41 UTC (History)
14 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Dimitry Andric freebsd_committer 2018-08-25 12:53:48 UTC
We have always been missing 64 bit atomic functions for i386, and this issue has now unfortunately come up again when testing llvm 7.0.0 release candidates[1].

Our old gcc 4.2 never supported the necessary C11 constructs, so it was never a problem there.  But since clang has been introduced, it has silently emitted "lock cmpxchg8b" instructions, even though these are only supported by i586 and higher CPUs.

Upstream llvm attempted to fix this a number of times, most recently in <https://reviews.llvm.org/rL323281>.  However, for the default target CPU on i386-freebsd, which is i486, it will be forced to emit function calls to atomic primitives, as __atomic_load_8(), __atomic_fetch_add_8() and such.

Demonstration with ports gcc:

$ cat test-c11-atomic-2.c
_Atomic(long long) ll;

int main(void)
{
        ++ll;
        return 0;
}

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc8/gcc/i386-portbld-freebsd12.0/8.2.0/lto-wrapper
Target: i386-portbld-freebsd12.0
Configured with: /wrkdirs/share/dim/ports/lang/gcc8/work/gcc-8.2.0/configure --disable-multilib --disable-bootstrap --disable-nls --enable-gnu-indirect-function --libdir=/usr/local/lib/gcc8 --libexecdir=/usr/local/libexec/gcc8 --program-suffix=8 --with-as=/usr/local/bin/as --with-gmp=/usr/local --with-gxx-include-dir=/usr/local/lib/gcc8/include/c++/ --with-ld=/usr/local/bin/ld --with-pkgversion='FreeBSD Ports Collection' --with-system-zlib --with-isl=/usr/local --enable-languages=c,c++,objc,fortran --prefix=/usr/local --localstatedir=/var --mandir=/usr/local/man --infodir=/usr/local/info/gcc8 --build=i386-portbld-freebsd12.0
Thread model: posix
gcc version 8.2.0 (FreeBSD Ports Collection)

$ gcc -O2 -S test-c11-atomic-2.c -o -
        .file   "test-c11-atomic-2.c"
        .text
        .section        .text.startup,"ax",@progbits
        .p2align 4,,15
        .globl  main
        .type   main, @function
main:
.LFB0:
        .cfi_startproc
        leal    4(%esp), %ecx
        .cfi_def_cfa 1, 0
        andl    $-16, %esp
        pushl   -4(%ecx)
        pushl   %ebp
        .cfi_escape 0x10,0x5,0x2,0x75,0
        movl    %esp, %ebp
        pushl   %ecx
        .cfi_escape 0xf,0x3,0x75,0x7c,0x6
        subl    $4, %esp
        pushl   $5
        pushl   $0
        pushl   $1
        pushl   $ll
        call    __atomic_fetch_add_8
        movl    -4(%ebp), %ecx
        .cfi_def_cfa 1, 0
        addl    $16, %esp
        xorl    %eax, %eax
        leave
        .cfi_restore 5
        leal    -4(%ecx), %esp
        .cfi_def_cfa 4, 4
        ret
        .cfi_endproc
.LFE0:
        .size   main, .-main
        .comm   ll,8,8
        .ident  "GCC: (FreeBSD Ports Collection) 8.2.0"
        .section        .note.GNU-stack,"",@progbits

$ gcc -O2 test-c11-atomic-2.c
/tmp//cc4hCSj6.o: In function `main':
test-c11-atomic-2.c:(.text.startup+0x1d): undefined reference to `__atomic_fetch_add_8'
collect2: error: ld returned 1 exit status

We should really make a choice and either:
1) Implement __atomic_load_8, __atomic_fetch_8 and others for i386, with or without kernel support, and add them to libc or libgcc
2) Change the default target CPU to i586 (which is actually the de facto default for a few years now), or even i686.

I assume 1) is quite a lot of work, so 2) is likely the way of least effort, but might meet with some resistance to change and POLA complaints.

[1] http://lists.llvm.org/pipermail/release-testers/2018-August/000753.html
[2] https://reviews.llvm.org/rL323281
Comment 1 Dimitry Andric freebsd_committer 2018-08-25 13:01:19 UTC
See also bug 220822, bug 229605, and probably lots more.

Assigning to toolchain@ since that is a better place than bugs@, which I think nobody reads. :)
Comment 2 Conrad Meyer freebsd_committer 2018-08-25 21:59:18 UTC
Isn't (1) just: kernel: mask all interrupts, do operation, unmask; userspace: take a global mutex over the op?

Do we care if 64-bit atomics don't work from NMI contexts on i486-class hardware?  I doubt it.

I'm also pro gone_in(12)'ing i486 and lower.
Comment 3 David Chisnall freebsd_committer 2018-08-26 10:44:30 UTC
Is there a reason, given that LLVM's atomic.c is in contrib, that we don't just connect it to the libgcc_s build?

Note that the problem is not just 64-bit.  An increasing amount of x86-64 code depends on 128-bit atomics (which are single instructions if compiling with -mcx16, libcalls otherwise).  C11 allows arbitrary sized atomics.  It's allowed to write _Atomic(struct X) for any arbitrary X.  The code in atomic.c handles this, though not for atomic types in shared memory (which can't be supported without changing the ABI - something that WG21 thought through by making std::atomic a library feature and allowing larger atomic types to be implemented with an inline lock, and which WG14 completely messed up).
Comment 4 Conrad Meyer freebsd_committer 2018-08-26 17:22:24 UTC
(In reply to David Chisnall from comment #3)
Seems like a C lock implementation for large types is allowed to place the lock in the shared memory with the object.  (Not that using shared memory or _Atomic structs is necessarily a good idea.)

"The size, representation, and alignment of an atomic type need not be the same as those of the corresponding unqualified type."
Comment 5 Harlan Stenn 2018-08-27 10:57:53 UTC
This is also an issue for MariaDB and unpatched haproxy-1.8 .

As far as i486 goes, I'd rather see this fixed for the easy cases now, and perhaps the more difficult cases can be addressed later, if possible.

H
Comment 6 Dimitry Andric freebsd_committer 2019-03-09 15:14:43 UTC
*** Bug 233725 has been marked as a duplicate of this bug. ***
Comment 7 Dimitry Andric freebsd_committer 2019-03-09 15:14:52 UTC
*** Bug 220822 has been marked as a duplicate of this bug. ***
Comment 8 Dimitry Andric freebsd_committer 2019-03-09 15:15:17 UTC
*** Bug 234976 has been marked as a duplicate of this bug. ***
Comment 9 Jan Beich freebsd_committer 2019-04-27 18:17:43 UTC
Also affects LLVM openmp e.g.,
http://beefy11.nyi.freebsd.org/data/head-i386-default/p499924_s346654/logs/errors/colmap-3.5_7.log

$ cat a.c
#include <stdint.h>

int main(void)
{
	int64_t j, i = 1;
#pragma omp atomic read
	j = i;
	return 0;
}

$ cc -fopenmp a.c
ld: error: undefined symbol: __atomic_load
>>> referenced by foo.c
>>>               /tmp/foo-fdf421.o:(main)
cc: error: linker command failed with exit code 1 (use -v to see invocation)
Comment 10 Jan Beich freebsd_committer 2019-04-27 18:19:22 UTC
Err, I meant Clang -fopenmp, not LLVM openmp (library).
Comment 11 Piotr Kubaj freebsd_committer 2019-07-19 16:17:57 UTC
This also affects powerpc64 elfv2 (which uses clang from base).

E.g. www/node* ports need -latomic:
        # Platforms that don't have Compare-And-Swap (CAS) support need to link atomic library
        # to implement atomic memory access
        ['v8_current_cpu in ["mips", "mipsel", "mips64", "mips64el", "ppc", "ppc64", "s390", "s390x"]', {
          'link_settings': {
            'libraries': ['-latomic', ],
          },
        }],
Comment 12 Dimitry Andric freebsd_committer 2019-08-30 20:48:22 UTC
This issue has now come up again with the import of llvm 9.0.0 (which is ongoing).  Upstream has committed the following change, https://reviews.llvm.org/rL356631 ("[X86] Add CMPXCHG8B feature flag. Set it for all CPUs except i386/i486 including 'generic'. Disable use of CMPXCHG8B when this flag isn't set")

Before that change, clang actually had a bug, in the sense that it *did* sometimes generate cmpxchg8b instructions on i486, which is still our default target CPU.  This saved us from most cases where otherwise __atomic_xxx function calls would be inserted.  However, now this bug has been fixed, so it will always use function calls for atomic 64 bit operations.

Upstream, I have had to report that it cannot even do the unit tests for compiler-rt 9.0.0 anymore, since that already runs into link errors about __atomic_xxx functions: https://bugs.llvm.org/show_bug.cgi?id=42892

Realistically, the best way forward in the short term is to raise the default target CPU for FreeBSD to i586 or even i686, so cmpxchg8b is supported.  Specifically, because I do not think it is very realistic to run modern FreeBSD on real i486-class hardware.  Not to mention that that is basically museum class hardware now. :-)
Comment 13 Conrad Meyer freebsd_committer 2019-08-30 22:52:00 UTC
The project is somewhat schizophrenic about i386.  On the one hand there are calls for tier 3 status or even removal.  On the other hand we *must* support ancient i486 at the cost of usability on slightly less ancient ia32 hardware.

+100, let's drop i486 support and go straight to i686 at a minimum.

I ran a 2.6 Linux on a real i586 with 128 MB of RAM over 10 years ago *and it was dog slow then*.  To run the package manager required disk swap.  People with real museum class ia32 that want to run a recently released BSD slowly can use NetBSD or OpenBSD (lock contention shouldn't be an issue).
Comment 14 Warner Losh freebsd_committer 2019-08-31 01:43:00 UTC
In the past we've kept 486 for two reasons. As a core technology, it was around in the embedded space well into the 686 era, so there were latter-day versions of that technology well past the classic 486s that are being sneered at a bit in this bug (though these too are now quite old). The Soekris box was one example. Now that it's become a burden, I think a good case could be made for its removal.

We can make the default i686, say, and give people that are interested in 486/586 until just before the 13 branch to fix it or we remove it. That blunts the criticism somewhat, and make people put their money where their mouths are...

And the 'let's remove i386' is an outlier position. There's strong support for it at least being a userland ABI that we support as a tier 1 platform, with the kernel dropping to tier 2 for 13. Now, this may change in 14, but that's 5 years off yet :). There's always radical positions within the project... Best not to take what any one person says seriously...

But whatever you do, I'd strongly suggest talking about it in arch@. It helps to have a firm plan and good justification for that plan. If I may be so bold, I'd suggest removing 486 support in the kernel; support for generating new 486 binaries and make the default i686, but allow i586 builds (unless there's a good technical reason for not doing that). I'd justify it with the amount of work to support the 486 has become burdensome and if we're going to change, we might as well go to something a bit newer by default, but allow the folks that need it to build binaries (or not, depending on the technical stuff). I suspect that this will be close enough to what most people want as to make it through an arch@ gauntlet and even though that might be a bit painful, it will get us to buy in.

My own experience is that 600MHz pentium III are still decent enough, though for a desktop with a modern web browser, you really need something quite a bit more modern. I know people are still embedding 686 and maybe 586 boxes still with FreeBSD, though I know of no-one that still needs the 486 stuff. This came up 6 months ago and that was the result of my survey then...
Comment 15 Conrad Meyer freebsd_committer 2019-08-31 02:24:45 UTC
Dropping 586 entirely just gives me a good excuse to throw out this useless heavy monster I've got cluttering my office with the vague idea of eventually running FreeBSD on it.  I don't need another project, help me toss it out :-).
Comment 16 Konstantin Belousov freebsd_committer 2019-08-31 09:16:20 UTC
I can implement emulation of cmpxchg8b where it missed.  It would help at runtime but not at linktime.

If, as mentioned in the discussion, there is already an implementation of doubleCAS in compiler_rt, why not put it into libgcc_s.a (*not .so*) ?
Comment 17 Fernando Apesteguía freebsd_committer 2019-11-18 17:04:07 UTC
Hi there,

cad/openvsp is affected by this problem too. Is there a roadmap to fix this?

Thanks!
Comment 18 Kubilay Kocak freebsd_committer freebsd_triage 2019-11-25 01:25:30 UTC
(In reply to Fernando Apesteguía from comment #17)

Please create separate issues that "depend on" this issue ID, for any ports that depend on it.

We should probably create a separate issue that blocks this one for "Remove $arch support" if it is a pre-requisite/dependency for getting this resolved, assigned to the appropriate people taking care of it.
Comment 19 Jan Beich freebsd_committer 2019-11-26 18:19:07 UTC
What's the plan for /stable/11 and /stable/12? If you're going to keep i486 support by NOT upgrading Clang let's revive bug 215193.
Comment 20 Dimitry Andric freebsd_committer 2019-11-26 18:41:31 UTC
(In reply to Jan Beich from comment #19)
I have no specific plans for this particular bug, but my personal preference would be to merge llvm/clang/libc+/etc 9.0 (and later) into stable/12 and stable/11.

Because in practice, clang 8.0 and earlier have generating code that will only run on i586 and higher anyway, for a long time, even if you *did* pass -march=i486!

That said, I know some people are of the opinion that stable branches should receive far less "feature updates", or even none at all, and only security or "P1" bug fixes.  Therefore, this is not really my call, but I gladly leave it up to the community to decide. :)

Note that this latter scenario (stable branches getting no feature updates) will almost certainly require either stable branches for ports in general, or the ports collection stopping to use the base toolchains and only using their own.