Bug 237960 - lang/erlang: Does not build on arm with the NATIVE option enabled
Summary: lang/erlang: Does not build on arm with the NATIVE option enabled
Status: Closed Overcome By Events
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: arm Any
Importance: --- Affects Some People
Assignee: freebsd-erlang (Nobody)
URL: https://bugs.erlang.org/projects/ERL/...
Keywords: needs-qa
Depends on:
Blocks:
 
Reported: 2019-05-18 01:30 UTC by Bertrand Petit
Modified: 2023-10-30 15:41 UTC
CC List: 5 users

See Also:
koobs: maintainer-feedback? (erlang)
koobs: merge-quarterly?


Attachments
Build log with Hipe and NATIVE (694.90 KB, text/plain)
2019-05-18 01:32 UTC, Bertrand Petit
Build log without Hipe and without NATIVE (448.81 KB, text/plain)
2019-05-18 01:34 UTC, Bertrand Petit
Path for arm disabling optimization and replacing linux syscall with builtins (2.82 KB, patch)
2019-05-22 05:07 UTC, Bertrand Petit
Patch fixing erlang build on arm by using clang80 and no NATIVE (3.42 KB, patch)
2019-05-28 14:21 UTC, Bertrand Petit
Patch fixing erlang build on arm by using clang80 and no NATIVE (3.53 KB, patch)
2019-05-28 15:48 UTC, Bertrand Petit

Description Bertrand Petit 2019-05-18 01:30:19 UTC
The Erlang/OTP port (lang/erlang) fails to build on an RPi when the NATIVE
option is selected in addition to the pre-selected ones. Compilation halts on
a bad system call:

[...]
erlc -W  -DMERL_NO_TRANSFORM +debug_info -pa ../ebin -pa ./ -I../include +native +nowarn_shadow_vars +warn_unused_import  -o ./ merl_transform.erl
erlc -W +debug_info -pa ../ebin -pa ./ -I../include +native +nowarn_shadow_vars +warn_unused_import  -o../ebin merl_transform.erl
gmake[5]: *** [Makefile:74: ../ebin/merl_transform.beam] Bad system call (core dumped)
gmake[5]: Leaving directory '/usr/obj/ports/usr/ports/lang/erlang/work/otp-OTP-19.3.6.13/lib/syntax_tools/src'


This build was attempted on an RPi2 using an 11-STABLE base system built from svn revision 345570. The core dump does not look readily informative:

# gdb ../../../bin/armv6-portbld-freebsd11.2/beam.smp beam.smp.core
[...]
(gdb) bt
#0  0x001b1e30 in try_alloc ()
#1  0x001b1cf0 in hipe_alloc_code ()
#2  0x001aab04 in hipe_bifs_enter_code_2 ()
#3  0x00043e3c in $a.18 ()
#4  0x00043e3c in $a.18 ()


Since I'm new to BEAM/Erlang/OTP, I have no idea how to investigate this issue further. Having BEAM machines on the inexpensive RPis would be extremely useful for experimenting with the distribution features of Erlang.

As I suspected that something might be amiss with the HiPE compiler on arm,
I also attempted a build without it and without NATIVE. That build also failed,
for another reason, which is outlined below:

erlc -W  +debug_info -I../include -I../../kernel/include -Werror -o../ebin dets_utils.erl
gmake[5]: *** [/usr/obj/ports/usr/ports/lang/erlang/work/otp-OTP-19.3.6.13/make/armv6-portbld-freebsd11.2/otp.mk:119: ../ebin/dets_utils.beam] Segmentation fault (core dumped)
gmake[5]: Leaving directory '/usr/obj/ports/usr/ports/lang/erlang/work/otp-OTP-19.3.6.13/lib/stdlib/src'

Again, the stack trace is uninformative:

# gdb ../../../bin/armv6-portbld-freebsd11.2/beam.smp beam.smp.core
[...]
(gdb) bt
#0  0x00000004 in ?? ()


The only build that did complete and eventually install is with HiPE and without NATIVE. However, at that point, I'm left in a confused state: should I trust this virtual machine? I'm inclined to not give it my trust.
Comment 1 Bertrand Petit 2019-05-18 01:32:58 UTC
Created attachment 204437
Build log with Hipe and NATIVE
Comment 2 Bertrand Petit 2019-05-18 01:34:35 UTC
Created attachment 204438
Build log without Hipe and without NATIVE
Comment 3 Mikael Urankar freebsd_committer freebsd_triage 2019-05-19 15:06:10 UTC
(In reply to Bertrand Petit from comment #1)
erlang uses a Linux syscall to flush the cache [1] :/
you can use __clear_cache instead.

also, armv6 doesn't have the SMP extension (at least on the boards we support), so you should uncheck the SMP option.

I had to use CFLAGS+= -O0 (or -O3) to build erlang; -O1 and -O2 lead to random crashes.

[1] https://github.com/erlang/otp/blob/master/erts/emulator/hipe/hipe_arm.c#L40
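
For illustration, the __clear_cache suggestion could look roughly like the C sketch below. It is neither the patch attached to this PR nor the upstream hipe_arm.c code; the helper names flush_code_range and install_code are hypothetical. It relies on the GCC/Clang builtin __builtin___clear_cache, which is expected to expand to whatever the target requires (nothing on x86, a compiler runtime call on arm) rather than a hard-coded Linux system call number.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not the actual HiPE code: make the bytes in
 * [start, start + len) visible to instruction fetch after they have
 * been written as data. The builtin is provided by GCC and Clang.
 */
static void flush_code_range(void *start, size_t len)
{
    char *begin = (char *)start;
    char *end = begin + len;

    __builtin___clear_cache(begin, end);
}

/*
 * Hypothetical helper: copy generated code into its destination and
 * flush the corresponding cache range. Returns the number of bytes
 * installed.
 */
static size_t install_code(void *dst, const void *src, size_t len)
{
    assert(dst != NULL && src != NULL);

    memcpy(dst, src, len);
    flush_code_range(dst, len);
    return len;
}
```

A loader would call install_code() once per code chunk, before the first jump into the freshly written region. Whether this alone is enough for HiPE on a given board is an assumption that only a build and a test run can confirm.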
Comment 4 Bertrand Petit 2019-05-22 05:05:41 UTC
(In reply to mikael.urankar from comment #3)
Thank you, Mikael, for your valuable feedback. As suggested, I replaced the Linux syscall with the rt builtins, which fixed the SIGSYS issue. I confirm there is an issue with the base clang when using optimization; I had to disable it. I suppose that employing llvm80 may help resolve this issue, but I can't test it as I don't have enough storage on that host.

Please see the attached patch, which permits an installation with both HIPE and NATIVE.
Comment 5 Bertrand Petit 2019-05-22 05:07:27 UTC
Created attachment 204533
Path for arm disabling optimization and replacing linux syscall with builtins
Comment 6 Bertrand Petit 2019-05-24 04:28:52 UTC
Please see also this related PR:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=237465
Comment 7 Bertrand Petit 2019-05-24 05:11:55 UTC
I eventually managed to compile lang/erlang on armv6 with the HiPE and NATIVE options using clang80. The build finished without errors, which again confirms an issue with base clang on 11.2. I launched the test suite to exercise the build; I suppose this could take some days to complete on an RPi2. Meanwhile, I will update the proposed patch.
Comment 8 Mikael Urankar freebsd_committer freebsd_triage 2019-05-24 11:00:47 UTC
(In reply to Bertrand Petit from comment #4)
armv7 supports SMP
Comment 9 Dave Cottlehuber freebsd_committer freebsd_triage 2019-05-24 12:52:21 UTC
Super bug hunting!

I'd strongly recommend submitting this upstream directly (patch or no patch),
they're very receptive to FreeBSD in general and IIRC do test continuously
internally.

https://bugs.erlang.org/

BTW my general advice for both HiPE and native is not to use them unless
you're hitting limits and you can see that your workload benefits from
them. In some cases, they don't help, and their broad community usage is
much less than "plain BEAM". OTP 22, for example, adds new opcodes to the VM
that are not supported by HiPE.
Comment 10 Bertrand Petit 2019-05-28 14:18:43 UTC
(In reply to Dave Cottlehuber from comment #9)
> BTW my general advice for both HiPE and native is not to use them unless
> you're hitting limits 

I concur with you and add that HiPE is unusable on armv6. I prematurely interrupted the test suite when I observed the occurrence of countless beam segfaults in the system log.

After exploring part of the port configuration space on an RPi, I came to the conclusion that base clang is unsuitable to build OTP and that NATIVE should not be used at all.

I propose a patch which includes:
- a fix for the Linux-only system call in HiPE
- constraints on option availability on armv6
- building OTP with clang80 on armv6
- a test target that builds and runs the full test suite

There is still an issue I've not addressed: the test suite directly executes make(1) but it should execute gmake(1) instead.
Comment 11 Bertrand Petit 2019-05-28 14:21:40 UTC
Created attachment 204672
Patch fixing erlang build on arm by using clang80 and no NATIVE
Comment 12 Bertrand Petit 2019-05-28 15:17:20 UTC
The upstream bug report is https://bugs.erlang.org/projects/ERL/issues/ERL-958
Comment 13 Mikael Urankar freebsd_committer freebsd_triage 2019-05-28 15:31:21 UTC
(In reply to Bertrand Petit from comment #11)
can you also disable NATIVE on armv7? thanks.
Comment 14 Bertrand Petit 2019-05-28 15:48:00 UTC
Created attachment 204673
Patch fixing erlang build on arm by using clang80 and no NATIVE

The native option is excluded for both armv6 and armv7.
Comment 15 Kubilay Kocak freebsd_committer freebsd_triage 2019-08-29 04:37:18 UTC
Have we root-caused the issue with clang on 11.x? It would be nice to create a separate issue to resolve this in base.
Comment 16 Dimitry Andric freebsd_committer freebsd_triage 2019-08-29 05:35:10 UTC
(In reply to Kubilay Kocak from comment #15)

stable/11, stable/12 and head now all have clang 8.0.1, so if llvm80 worked previously, this should also have resolved the problem.
Comment 17 Dave Cottlehuber freebsd_committer freebsd_triage 2021-10-01 14:20:39 UTC
NB https://bugs.erlang.org/projects/ERL/issues/ERL-958 should have resolved this upstream now.
Comment 18 Bertrand Petit 2021-10-01 22:13:25 UTC
(In reply to Dave Cottlehuber from comment #17)
I get mixed results. I attempted to build erlang-runtime{22|23|24} on a 12.2 armv7 host (RPi2B) with clang 10. I did not run the test suite, although I can do it if required.

Could the execution of the test suite be integrated into the ports makefiles?

$ clang --version
FreeBSD clang version 10.0.1 (git@github.com:llvm/llvm-project.git llvmorg-10.0.1-0-gef32c611aa2)
Target: armv7-unknown-freebsd12.2-gnueabihf
Thread model: posix
InstalledDir: /usr/bin


erlang-runtime22 (22.3.4.20_1) is built without issue
===============================

Port configuration is

===> The following configuration options are available for erlang-runtime22-22.3.4.20_1:
     CORBA=off: Enable Corba support
     DIRTY=on: Enable Dirty schedulers
     HIPE=on: Build native HiPE compiler
     JAVA=off: Java platform support
     KQUEUE=on: Enable Kernel Poll (kqueue) support
     NATIVE=on: Enable native libraries
     ODBC=off: ODBC database backend
     OPENSSL=on: SSL/TLS support via OpenSSL
     SCTP=on: Enable SCTP support
     THREADS=on: Threading support
     WX=off: Enable WX application



erlang-runtime23 (23.3.4.4) fails to build
============================

gmake[4]: Entering directory '/usr/obj/ports/usr/ports/lang/erlang-runtime23/work/otp-OTP-23.3.4.4/lib/syntax_tools'
=== Entering application syntax_tools
gmake[5]: Entering directory '/usr/obj/ports/usr/ports/lang/erlang-runtime23/work/otp-OTP-23.3.4.4/lib/syntax_tools/src'
erlc -W -Werror +debug_info -DUSE_ESOCK=true -pa ../ebin -pa ./ -I../include +native +nowarn_shadow_vars +warn_unused_import  -o../ebin erl_syntax.erl
<HiPE (v 4.0.1)> EXITED with reason {'trans_fun/2',{bs_start_match4,{atom,no_fail},{u,3},{x,0},{x,3}}} @hipe_beam_to_icode:1221
<HiPE (v 4.0.1)> Error: [hipe:859]: INTERNAL ERROR
while compiling erl_syntax
crash reason: {hipe_beam_to_icode,1221,
                  {'trans_fun/2',
                      {bs_start_match4,{atom,no_fail},{u,3},{x,0},{x,3}}}}
  in function  hipe_beam_to_icode:trans_fun/2 (hipe_beam_to_icode.erl, line 1221)
  in call from hipe_beam_to_icode:trans_fun/2 (hipe_beam_to_icode.erl, line 430)
  in call from hipe_beam_to_icode:trans_fun/2 (hipe_beam_to_icode.erl, line 349)
  in call from hipe_beam_to_icode:trans_fun/2 (hipe_beam_to_icode.erl, line 1123)
  in call from hipe_beam_to_icode:trans_fun/2 (hipe_beam_to_icode.erl, line 290)
  in call from hipe_beam_to_icode:trans_fun/2 (hipe_beam_to_icode.erl, line 312)
  in call from hipe_beam_to_icode:trans_fun/2 (hipe_beam_to_icode.erl, line 297)
  in call from hipe_beam_to_icode:trans_fun/2 (hipe_beam_to_icode.erl, line 334)
erl_syntax.erl: internal error in native_compile;
crash reason: {hipe_beam_to_icode,1221,
    {'trans_fun/2',{bs_start_match4,{atom,no_fail},{u,3},{x,0},{x,3}}}}
gmake[5]: *** [Makefile:74: ../ebin/erl_syntax.beam] Error 1


Port configuration is

===> The following configuration options are available for erlang-runtime23-23.3.4.4_1:
     CORBA=off: Enable Corba support
     DIRTY=on: Enable Dirty schedulers
     HIPE=on: Build native HiPE compiler
     JAVA=off: Java platform support
     KQUEUE=on: Enable Kernel Poll (kqueue) support
     NATIVE=on: Enable native libraries
     ODBC=off: ODBC database backend
     OPENSSL=on: SSL/TLS support via OpenSSL
     SCTP=on: Enable SCTP support
     THREADS=on: Threading support
     WX=off: Enable WX application


erlang-runtime24 (24.0.3) is built without issue
=========================

Port configuration provides neither the HIPE nor the NATIVE knob, as shown below.

===> The following configuration options are available for erlang-runtime24-24.0.3:
     CHUNKS=on: Enable in-line documentation in erlang console
     CORBA=off: Enable Corba support
     JAVA=off: Java platform support
     ODBC=off: ODBC database backend
     OPENSSL=on: SSL/TLS support via OpenSSL
     SCTP=on: Enable SCTP support
     SHARING=on: Enable term copy-and-share support
     WX=off: Enable WX application
Comment 19 Bertrand Petit 2021-10-01 22:16:14 UTC
erlang-runtime23 is still affected
Comment 20 Bertrand Petit 2023-10-30 15:41:15 UTC
erlang19 is no longer part of the ports tree.