Bug 282471 - databases/mongodb80: Illegal instruction due to use of the AVX instruction vxorps when NOAVX=ON
Status: Closed FIXED
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: Ronald Klop
Reported: 2024-11-01 16:43 UTC by Yuri Victorovich
Modified: 2024-11-16 03:16 UTC
ronald: maintainer-feedback+


Attachments
git diff fixing noavx (1.91 KB, patch)
2024-11-07 15:40 UTC, Ronald Klop

Description Yuri Victorovich freebsd_committer freebsd_triage 2024-11-01 16:43:16 UTC
mongod from the default build crashes with "Illegal instruction". Disassembly at the faulting instruction:

> Dump of assembler code for function _GLOBAL__sub_I_TimeStamp.cpp:
>    0x0000000008a26840 <+0>:     push   %rbp
>    0x0000000008a26841 <+1>:     mov    %rsp,%rbp
>    0x0000000008a26844 <+4>:     push   %rbx
>    0x0000000008a26845 <+5>:     push   %rax
> => 0x0000000008a26846 <+6>:     vxorps %xmm0,%xmm0,%xmm0
>    0x0000000008a2684a <+10>:    vmovups %xmm0,0x166f426(%rip)        # 0xa095c78 <_ZN7mozillaL9sInitOnceE>
>    0x0000000008a26852 <+18>:    lea    0x166f41f(%rip),%rbx        # 0xa095c78 <_ZN7mozillaL9sInitOnceE>
>    0x0000000008a26859 <+25>:    call   0x8a36e30 <_ZN7mozilla9TimeStamp7StartupEv>
>    0x0000000008a2685e <+30>:    mov    $0x1,%edi
>    0x0000000008a26863 <+35>:    call   0x8a371f0 <_ZN7mozilla9TimeStamp3NowEb>
>    0x0000000008a26868 <+40>:    mov    %rax,0x166f409(%rip)        # 0xa095c78 <_ZN7mozillaL9sInitOnceE>

Apparently the NOAVX option wasn't used for at least the TimeStamp.cpp module.
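
One way to confirm that a build is free of AVX code is to scan its disassembly for VEX-encoded instructions. The following is a minimal Python sketch, not part of the port. It assumes AT&T-syntax disassembly like the gdb listing in this report, and `find_avx_mnemonics` with its regular-expression heuristic is a hypothetical helper that may miss or over-report some instructions.

```python
import re

# Heuristic (an assumption, not an exhaustive decoder): AVX instructions have
# mnemonics starting with "v" and operate on xmm/ymm/zmm registers. Other
# "v" mnemonics such as vmcall take no vector registers and are not matched.
AVX_LINE = re.compile(r"\b(v[a-z0-9]+)\s+[^#\n]*%[xyz]mm\d+")


def find_avx_mnemonics(disassembly: str) -> list[str]:
    """Return the distinct AVX-looking mnemonics, in order of first appearance."""
    found: list[str] = []
    for line in disassembly.splitlines():
        match = AVX_LINE.search(line)
        if match and match.group(1) not in found:
            found.append(match.group(1))
    return found


# Excerpt of the listing from this report.
sample = """\
0x0000000008a26845 <+5>:     push   %rax
0x0000000008a26846 <+6>:     vxorps %xmm0,%xmm0,%xmm0
0x0000000008a2684a <+10>:    vmovups %xmm0,0x166f426(%rip)
0x0000000008a26852 <+18>:    lea    0x166f41f(%rip),%rbx
"""

print(find_avx_mnemonics(sample))
```

An empty result over the disassembly of the whole binary would suggest, but not prove, that the NOAVX build contains no AVX code.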
Comment 1 Yuri Victorovich freebsd_committer freebsd_triage 2024-11-01 16:43:52 UTC
Instruction docs:
https://hjlebbink.github.io/x86doc/html/XORPS.html
Comment 2 Ronald Klop freebsd_committer freebsd_triage 2024-11-01 18:10:00 UTC
Interesting. Do you have a URL pointing to the build log of the version you installed?
On what version of FreeBSD did you run it?
Comment 3 Yuri Victorovich freebsd_committer freebsd_triage 2024-11-01 18:27:31 UTC
(In reply to Ronald Klop from comment #2)

I run FreeBSD 14.1 on an older CPU.
The CPU causes the problem because it can't execute this AVX instruction.
The build from the package: 'pkg install mongodb80'.
This port has the NOAVX option, which is evidently intended to build for generic CPUs (assuming only SSE2), but it apparently still builds with AVX instructions.
Comment 4 Ronald Klop freebsd_committer freebsd_triage 2024-11-01 22:42:21 UTC
(In reply to Yuri Victorovich from comment #3)
I think I found the cause. Some files (the mozjs part) are compiled with -mavx2.
mongodb70 probably fails too, while mongodb50 and mongodb60 should work fine, I think.
Could you check that? I don't have non-AVX2 hardware.

I'm testing a patch now.
Comment 5 Yuri Victorovich freebsd_committer freebsd_triage 2024-11-01 22:50:31 UTC
mongodb70 has the same problem.
mongodb60 doesn't have it.
mongodb50 doesn't have it.
Comment 6 commit-hook freebsd_committer freebsd_triage 2024-11-02 07:58:18 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=2dbe4cb866c9380901bf81d0a939a78a731e64b7

commit 2dbe4cb866c9380901bf81d0a939a78a731e64b7
Author:     Ronald Klop <ronald@FreeBSD.org>
AuthorDate: 2024-11-02 07:52:02 +0000
Commit:     Ronald Klop <ronald@FreeBSD.org>
CommitDate: 2024-11-02 07:55:54 +0000

    databases/mongodb[78]0: fix NOAVX option

    bump portrevision as it did build but generated broken executables
    piggyback a small portlint pacifier

    PR:     282471

 databases/mongodb70/Makefile                                  |  6 ++++--
 .../files/extrapatch-src_third__party_mozjs_SConscript (new)  | 11 +++++++++++
 databases/mongodb80/Makefile                                  |  2 ++
 .../files/extrapatch-src_third__party_mozjs_SConscript (new)  | 11 +++++++++++
 4 files changed, 28 insertions(+), 2 deletions(-)
Comment 7 Ronald Klop freebsd_committer freebsd_triage 2024-11-02 10:51:03 UTC
Thanks for the issue report.
I'm pretty sure the package will be fixed. It will take a couple of days for the rebuilt packages to appear in the official pkg repository.
If the issue is not resolved, please re-open or add a comment.

NB: I didn't apply the fix on the quarterly branch. If needed, please add a comment too.
Comment 8 Yuri Victorovich freebsd_committer freebsd_triage 2024-11-02 15:33:04 UTC
The build fails:

ld.lld: error: undefined symbol: mozilla::sse_private::avx2_enabled
>>> referenced by SSE.h:324 (src/third_party/mozjs/include/mozilla/SSE.h:324)
>>>               extract/mozglue/misc/SIMD.o:(mozilla::SupportsAVX2()) in archive build/opt/third_party/mozjs/libmozjs.a
>>> referenced by SSE.h:324 (src/third_party/mozjs/include/mozilla/SSE.h:324)
>>>               extract/mozglue/misc/SIMD.o:(mozilla::SIMD::memchr8(char const*, char, unsigned long)) in archive build/opt/third_party/mozjs/libmozjs.a
>>> referenced by SSE.h:324 (src/third_party/mozjs/include/mozilla/SSE.h:324)
>>>               extract/mozglue/misc/SIMD.o:(mozilla::SIMD::memchr16(char16_t const*, char16_t, unsigned long)) in archive build/opt/third_party/mozjs/libmozjs.a
>>> referenced 1 more times

ld.lld: error: undefined symbol: mozilla::SIMD::memchr8AVX2(char const*, char, unsigned long)
>>> referenced by SIMD.cpp:463 (src/third_party/mozjs/extract/mozglue/misc/SIMD.cpp:463)
>>>               extract/mozglue/misc/SIMD.o:(mozilla::SIMD::memchr8(char const*, char, unsigned long)) in archive build/opt/third_party/mozjs/libmozjs.a

ld.lld: error: undefined symbol: mozilla::SIMD::memchr16AVX2(char16_t const*, char16_t, unsigned long)
>>> referenced by SIMD.cpp:476 (src/third_party/mozjs/extract/mozglue/misc/SIMD.cpp:476)
>>>               extract/mozglue/misc/SIMD.o:(mozilla::SIMD::memchr16(char16_t const*, char16_t, unsigned long)) in archive build/opt/third_party/mozjs/libmozjs.a

ld.lld: error: undefined symbol: mozilla::SIMD::memchr64AVX2(unsigned long const*, unsigned long, unsigned long)
>>> referenced by SIMD.cpp:484 (src/third_party/mozjs/extract/mozglue/misc/SIMD.cpp:484)
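
To see at a glance which symbols the linker is missing, the log can be reduced to its unique undefined symbols. This is a minimal Python sketch under the assumption that the log uses the `ld.lld: error: undefined symbol:` prefix shown above; `undefined_symbols` is a hypothetical helper name.

```python
import re

# Matches the first line of each ld.lld "undefined symbol" diagnostic.
UNDEFINED = re.compile(r"^ld\.lld: error: undefined symbol: (.+)$", re.MULTILINE)


def undefined_symbols(log: str) -> list[str]:
    """Return the unique undefined symbols, in order of first appearance."""
    seen: list[str] = []
    for symbol in UNDEFINED.findall(log):
        if symbol not in seen:
            seen.append(symbol)
    return seen


# Diagnostic lines copied from the build failure in this report.
log = """\
ld.lld: error: undefined symbol: mozilla::sse_private::avx2_enabled
>>> referenced by SSE.h:324 (src/third_party/mozjs/include/mozilla/SSE.h:324)
ld.lld: error: undefined symbol: mozilla::SIMD::memchr8AVX2(char const*, char, unsigned long)
ld.lld: error: undefined symbol: mozilla::SIMD::memchr16AVX2(char16_t const*, char16_t, unsigned long)
ld.lld: error: undefined symbol: mozilla::SIMD::memchr64AVX2(unsigned long const*, unsigned long, unsigned long)
"""

symbols = undefined_symbols(log)
for symbol in symbols:
    print(symbol)

# True when every missing symbol belongs to the AVX2 code path.
print(all("avx2" in symbol.lower() for symbol in symbols))
```

Every missing symbol here is AVX2-related, which is consistent with callers still referencing the AVX2 code path after its sources were left out of the build.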
Comment 9 Ronald Klop freebsd_committer freebsd_triage 2024-11-02 20:01:06 UTC
Thanks for reporting. Apparently committed too soon.

Patches are welcome.
I'm thinking about opening an issue upstream about this.
Comment 10 commit-hook freebsd_committer freebsd_triage 2024-11-02 20:02:47 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=a3bd39f8744c63eba6e345c8cfb220f0c2036840

commit a3bd39f8744c63eba6e345c8cfb220f0c2036840
Author:     Ronald Klop <ronald@FreeBSD.org>
AuthorDate: 2024-11-02 19:55:42 +0000
Commit:     Ronald Klop <ronald@FreeBSD.org>
CommitDate: 2024-11-02 20:01:52 +0000

    databases/mongodb[78]0: NOAVX is broken

    Disable NOAVX as default until further investigation.

    ld.lld: error: undefined symbol: mozilla::sse_private::avx2_enabled
    >>> referenced by SSE.h:324 (src/third_party/mozjs/include/mozilla/SSE.h:324)
    >>>               extract/mozglue/misc/SIMD.o:(mozilla::SupportsAVX2()) in archive build/opt/third_party/mozjs/libmozjs.a
    >>> referenced by SSE.h:324 (src/third_party/mozjs/include/mozilla/SSE.h:324)
    >>>               extract/mozglue/misc/SIMD.o:(mozilla::SIMD::memchr8(char const*, char, unsigned long)) in archive build/opt/third_party/mozjs/libmozjs.a
    >>> referenced by SSE.h:324 (src/third_party/mozjs/include/mozilla/SSE.h:324)
    >>>               extract/mozglue/misc/SIMD.o:(mozilla::SIMD::memchr16(char16_t const*, char16_t, unsigned long)) in archive build/opt/third_party/mozjs/libmozjs.a
    >>> referenced 1 more times

    ld.lld: error: undefined symbol: mozilla::SIMD::memchr8AVX2(char const*, char, unsigned long)
    >>> referenced by SIMD.cpp:463 (src/third_party/mozjs/extract/mozglue/misc/SIMD.cpp:463)
    >>>               extract/mozglue/misc/SIMD.o:(mozilla::SIMD::memchr8(char const*, char, unsigned long)) in archive build/opt/third_party/mozjs/libmozjs.a

    ld.lld: error: undefined symbol: mozilla::SIMD::memchr16AVX2(char16_t const*, char16_t, unsigned long)
    >>> referenced by SIMD.cpp:476 (src/third_party/mozjs/extract/mozglue/misc/SIMD.cpp:476)
    >>>               extract/mozglue/misc/SIMD.o:(mozilla::SIMD::memchr16(char16_t const*, char16_t, unsigned long)) in archive build/opt/third_party/mozjs/libmozjs.a

    ld.lld: error: undefined symbol: mozilla::SIMD::memchr64AVX2(unsigned long const*, unsigned long, unsigned long)
    >>> referenced by SIMD.cpp:484 (src/third_party/mozjs/extract/mozglue/misc/SIMD.cpp:484)
    >>>               extract/mozglue/misc/SIMD.o:(mozilla::SIMD::memchr64(unsigned long const*, unsigned long, unsigned long)) in archive build/opt/third_party/mozjs/libmozjs.a
    c++: error: linker command failed with exit code 1 (use -v to see invocation)

    PR:     282471

 databases/mongodb70/Makefile | 2 +-
 databases/mongodb80/Makefile | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
Comment 11 Ronald Klop freebsd_committer freebsd_triage 2024-11-04 15:42:18 UTC
Hi Borja Marcos,

I'm reaching out to you as you made patches for bug #268510 to add NOAVX support to mongodb50 and mongodb60.
Currently NOAVX is not working for mongodb70 and mongodb80.

Would you have time to see what is needed for a fix? If you don't have time or interest, please let me know. That is fine too, of course.

Regards,
Ronald.
Comment 12 Borja Marcos 2024-11-04 16:37:22 UTC
Which CPU model are you using?

Is it maybe older than Sandybridge?

(please include CPU data from /var/run/dmesg.boot)
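
The CPU data in question can also be checked mechanically. This is a rough Python sketch, assuming the FreeBSD boot messages list CPU capabilities as comma-separated names inside angle brackets on lines containing "Features"; `cpu_flags` and `avx_support` are hypothetical helpers, and both sample texts are made-up illustrations, not data from this report.

```python
import re

# Capability names appear as <NAME,NAME,...> after a hexadecimal mask.
FLAG_LIST = re.compile(r"<([^>]*)>")


def cpu_flags(dmesg_text: str) -> set[str]:
    """Collect all capability names from the 'Features' lines."""
    flags: set[str] = set()
    for line in dmesg_text.splitlines():
        if "Features" in line:
            for group in FLAG_LIST.findall(line):
                flags.update(name for name in group.split(",") if name)
    return flags


def avx_support(dmesg_text: str) -> dict[str, bool]:
    """Report whether AVX and AVX2 are listed (exact name match)."""
    flags = cpu_flags(dmesg_text)
    return {"AVX": "AVX" in flags, "AVX2": "AVX2" in flags}


# Hypothetical samples for illustration only.
OLD_CPU = """\
CPU: Example older CPU
  Features=0x0<FPU,SSE,SSE2>
  Features2=0x0<SSE3,SSSE3,SSE4.1,SSE4.2,POPCNT>
"""
NEW_CPU = """\
CPU: Example newer CPU
  Features2=0x0<SSE3,SSSE3,SSE4.1,SSE4.2,POPCNT,XSAVE,OSXSAVE,AVX>
  Structured Extended Features=0x0<FSGSBASE,BMI1,AVX2,BMI2>
"""

print(avx_support(OLD_CPU))
print(avx_support(NEW_CPU))
```

On a real system the input would be the contents of /var/run/dmesg.boot, for example read with `pathlib.Path("/var/run/dmesg.boot").read_text()`.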
Comment 13 Borja Marcos 2024-11-05 14:35:21 UTC
Looking at the MongoDB 8.0 sources, I see this in the src/third_party/mozjs directory.

There is a file called SConscript with this code line:

SConscript:    env.Append(CCFLAGS=['-mavx2'])

Seems it is getting fashionable to demand certain CPU instruction sets. Of course they are great when you have them. 

-----
if env['TARGET_ARCH'] == 'x86_64' and not env.TargetOSIs('windows'):
    env.Append(CCFLAGS=['-mavx2'])
    sources.extend(["extract/mozglue/misc/SIMD_avx2.cpp", "extract/mozglue/misc/SSE.cpp"])
-----

I would try commenting out those lines and see how it goes. If it works, we can check whether we can define a variable in the port Makefile that actually reaches that SCons file.
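
The idea of gating that block on a variable can be modelled in plain Python, without SCons. This is only a sketch: `mozjs_build_settings` and its `noavx` parameter are hypothetical stand-ins for the SConscript logic and for whatever variable the port Makefile would pass down. As the link errors in comment #8 show, leaving these sources out is not sufficient by itself, because SIMD.cpp still references the AVX2 symbols.

```python
def mozjs_build_settings(
    target_arch: str, is_windows: bool, noavx: bool
) -> tuple[list[str], list[str]]:
    """Return (extra compiler flags, extra sources) for the mozjs build.

    Mirrors the quoted SConscript condition, with one added (hypothetical)
    `noavx` switch that skips the AVX2 flag and sources entirely.
    """
    ccflags: list[str] = []
    sources: list[str] = []
    if target_arch == "x86_64" and not is_windows and not noavx:
        ccflags.append("-mavx2")
        sources.extend(
            [
                "extract/mozglue/misc/SIMD_avx2.cpp",
                "extract/mozglue/misc/SSE.cpp",
            ]
        )
    return ccflags, sources


# Default behaviour: AVX2 flag and sources are added on x86_64.
print(mozjs_build_settings("x86_64", is_windows=False, noavx=False))
# With the hypothetical switch set: nothing AVX2-specific is added.
print(mozjs_build_settings("x86_64", is_windows=False, noavx=True))
```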
Comment 14 Ronald Klop freebsd_committer freebsd_triage 2024-11-06 11:58:17 UTC
(In reply to Borja Marcos from comment #13)
That mozjs -mavx2 is what I tried in https://cgit.freebsd.org/ports/commit/?id=2dbe4cb866c9380901bf81d0a939a78a731e64b7.
But after that change linking fails as shown in comment #10.
Comment 15 Borja Marcos 2024-11-06 12:30:19 UTC
OK sorry.

I see a file that references SIMD_avx2.cpp

(in src/third_party/mozjs/extract/mozglue/misc)

-----
#  if defined(MOZILLA_MAY_SUPPORT_AVX2) && defined(__x86_64__)

bool SupportsAVX2() { return supports_avx2(); }

#  else

bool SupportsAVX2() { return false; }

#  endif
-------

If we undefine MOZILLA_MAY_SUPPORT_AVX2 it should fix it. 


Sorry, can't try right now. Just a quick and dirty look.

I will try later, but right now I only have "modern" CPUs. (Not so old actually!)


Hope it helps!
Comment 16 Borja Marcos 2024-11-06 12:32:59 UTC
Sorry, the code snippet is from SIMD.cpp.

If we undefine MOZILLA_MAY_SUPPORT_AVX2 (together with your previous patch) I think it should work.
Comment 17 borjamar 2024-11-06 19:17:59 UTC
It's me, under a different username.

It worked. I can't run it (no appropriate CPU handy) but I am pretty sure it went well.

It built, there were no AVX / sandybridge flags that I know of, and the SIMD_avx2.cpp file was not compiled at all.

On line 451 I added a False && to the ifdef.

-----
#  if False && defined(MOZILLA_MAY_SUPPORT_AVX2) && defined(__x86_64__)
-----

So that, together with removing the references to the AVX compile flag and the SIMD_avx2.cpp file from the SConscript, should solve it (I think!)

Please give it a try. I can't.
Comment 18 Ronald Klop freebsd_committer freebsd_triage 2024-11-07 15:40:45 UTC
Created attachment 255008 [details]
git diff fixing noavx

This patch should fix the build and hopefully give a working mongodb on non-AVX systems.
yuri@, are you able to build mongodb70 using this patch and test whether mongodb runs on your system?
Comment 19 commit-hook freebsd_committer freebsd_triage 2024-11-13 20:32:51 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=f8060500164aa5cb1b7b5b9d461f8390a0588ac0

commit f8060500164aa5cb1b7b5b9d461f8390a0588ac0
Author:     Ronald Klop <ronald@FreeBSD.org>
AuthorDate: 2024-11-07 15:48:44 +0000
Commit:     Ronald Klop <ronald@FreeBSD.org>
CommitDate: 2024-11-13 20:32:04 +0000

    databases/mongodb[78]0: fix build with NOAVX enabled

    Thanks to Yuri and Borja who helped creating patches and tested.

    PR:     282471

 databases/mongodb70/Makefile                                  |  7 ++++---
 ...src_third__party_mozjs_extract_mozglue_misc_SIMD.cpp (new) | 11 +++++++++++
 databases/mongodb80/Makefile                                  |  7 ++++---
 ...src_third__party_mozjs_extract_mozglue_misc_SIMD.cpp (new) | 11 +++++++++++
 4 files changed, 30 insertions(+), 6 deletions(-)
Comment 20 Ronald Klop freebsd_committer freebsd_triage 2024-11-15 20:32:37 UTC
(In reply to Yuri Victorovich from comment #3)
The official package 8.0.1_2 on 14/amd64 is built.
https://www.freshports.org/databases/mongodb80/#packages
Can you test it so I can hopefully close the issue?
Just a 'pkg install mongodb80' should install it.
Comment 21 Yuri Victorovich freebsd_committer freebsd_triage 2024-11-15 22:20:25 UTC
8.0.1_2 doesn't crash any more.

Thank you for fixing it!
Comment 22 Ronald Klop freebsd_committer freebsd_triage 2024-11-16 03:16:45 UTC
Thanks for all your help and effort.