Bug 247421 - lang/smlnj fails to build using poudriere on 12.1 Stable amd64
Summary: lang/smlnj fails to build using poudriere on 12.1 Stable amd64
Status: New
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: Mark Linimon
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-06-19 14:30 UTC by Robert Cina
Modified: 2020-07-06 18:55 UTC

See Also:
bugzilla: maintainer-feedback? (joemann)


Attachments
Fix PR 247421 by not building runtime.so when compiler is Clang >= 10 (4.90 KB, patch)
2020-06-26 15:06 UTC, Johannes 5
joemann: maintainer-approval+
update to 110.97, and fix PR 247421 by not building runtime.so when compiler is Clang >= 10 (20.97 KB, patch)
2020-07-06 18:55 UTC, Johannes 5
joemann: maintainer-approval+

Description Robert Cina 2020-06-19 14:30:46 UTC
The port lang/smlnj fails to build for me using poudriere on 12.1 Stable amd64.  

An excerpt of the build log with the error is shown below:

./config/install.sh: Installation complete.
(* Installing man pages. *)
(* Stripping runtime executable: *)
MLARCHOPSYS=`/wrkdirs/usr/ports/lang/smlnj/work/stage/usr/local/smlnj/bin/.arch-n-opsys` &&  ( eval ${MLARCHOPSYS} ;  /usr/bin/strip "/wrkdirs/usr/ports/lang/smlnj/work/stage/usr/local/smlnj/bin/.run/run.${ARCH}-${OPSYS}"  "/wrkdirs/usr/ports/lang/smlnj/work/stage/usr/local/smlnj/bin/.run/run.${ARCH}-${OPSYS}.so" )
strip: open /wrkdirs/usr/ports/lang/smlnj/work/stage/usr/local/smlnj/bin/.run/run.amd64-freebsd.so failed: No such file or directory
*** Error code 1

Stop.
make: stopped in /usr/ports/lang/smlnj
=>> Cleaning up wrkdir
===>  Cleaning for smlnj-110.96
build of lang/smlnj | smlnj-110.96 ended at Fri Jun 19 09:37:44 EDT 2020
build time: 00:03:15
!!! build failure encountered !!!
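
For readers following the log: the strip step evaluates the output of `.arch-n-opsys` and substitutes `ARCH` and `OPSYS` into two file names. The following is a minimal sketch of that expansion, assuming the script prints assignments of the form `ARCH=amd64; OPSYS=freebsd; HEAP_SUFFIX=amd64-bsd` (the form reported by `install.sh` in a later log in this report). The function name `runtime_paths` and its arguments are hypothetical and not part of the port.

```sh
# runtime_paths BINDIR ARCHOPSYS
#   BINDIR:    directory that holds the .run subdirectory
#   ARCHOPSYS: assignments as printed by .arch-n-opsys
# Prints the two file names that the strip step would be given.
runtime_paths() {
    (
        # eval runs in a subshell so ARCH and OPSYS do not leak to the caller.
        eval "$2"
        printf '%s\n' "$1/.run/run.${ARCH}-${OPSYS}"
        printf '%s\n' "$1/.run/run.${ARCH}-${OPSYS}.so"
    )
}

runtime_paths /usr/local/smlnj/bin 'ARCH=amd64; OPSYS=freebsd; HEAP_SUFFIX=amd64-bsd'
```

The second name printed is the kind of file that strip fails to open in the log above.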
Comment 1 Mark Linimon freebsd_committer freebsd_triage 2020-06-19 17:29:49 UTC
With which version of the Makefile?  (I did a recent commit to that port but it should not have broken it).

The poudriere builds on amd64 on FreeBSD.org do not (yet?) show this error.
Comment 2 Robert Cina 2020-06-19 17:42:25 UTC
=>> Building lang/smlnj
build started at Fri Jun 19 09:34:29 EDT 2020
port directory: /usr/ports/lang/smlnj
package name: smlnj-110.96
building for: FreeBSD 12amd64-HEAD-job-03 12.1-STABLE FreeBSD 12.1-STABLE 1201517 amd64
maintained by: joemann@beefree.free.de
Makefile ident:      $FreeBSD: head/lang/smlnj/Makefile 539431 2020-06-17 17:02:35Z linimon $
Poudriere version: 3.3.4
Host OSVERSION: 1201517
Jail OSVERSION: 1201517
Job Id: 03
Comment 3 Johannes 5 2020-06-20 19:43:23 UTC
(In reply to Robert Cina from comment #0)

> The port lang/smlnj fails to build for me using poudriere on 12.1
> Stable amd64.

Thanks for the report and sorry for the inconvenience. The problem was
not present when I submitted PR 242728 in December. But now I can
reproduce it - on a system with Clang 10.0.0:

% cc --version
FreeBSD clang version 10.0.0 (git@github.com:llvm/llvm-project.git llvmorg-10.0.0-0-gd32170dbd5b)
Target: x86_64-unknown-freebsd12.1
Thread model: posix
InstalledDir: /usr/bin

Interestingly, the port builds correctly with Clang 9.0.1 ...

% cc --version
FreeBSD clang version 9.0.1 (git@github.com:llvm/llvm-project.git c1a0a213378a458fbea1a5c77b315c7dce08fd05) (based on LLVM 9.0.1)
Target: x86_64-unknown-freebsd12.1
Thread model: posix
InstalledDir: /usr/bin

Maybe that's the reason why linimon@ mentioned:

> The poudriere builds on amd64 on FreeBSD.org do not (yet?) show this
> error.

If the Clang version is the root cause of the problem, that would be
consistent with failure notices from pkg-fallout@FreeBSD.org when the
build server tries to build lang/smlnj on 13.0-CURRENT:

> #### /usr/ports/Mk/Scripts/ports_env.sh ####
> _CCVERSION_921dbbb2=FreeBSD clang version 10.0.1 (git@github.com:llvm/llvm-project.git llvmorg-10.0.1-rc1-0-gf79cd71e145) Target: x86_64-unknown-freebsd13.0 Thread model: posix InstalledDir: /usr/bin

The resulting failure there is the same as yours:

> An excerpt of the build log with the error is shown below:
> 
> ./config/install.sh: Installation complete.
> (* Installing man pages. *)
> (* Stripping runtime executable: *)
> MLARCHOPSYS=`/wrkdirs/usr/ports/lang/smlnj/work/stage/usr/local/smlnj/bin/.arch-n-opsys` &&  ( eval ${MLARCHOPSYS} ;  /usr/bin/strip "/wrkdirs/usr/ports/lang/smlnj/work/stage/usr/local/smlnj/bin/.run/run.${ARCH}-${OPSYS}"  "/wrkdirs/usr/ports/lang/smlnj/work/stage/usr/local/smlnj/bin/.run/run.${ARCH}-${OPSYS}.so" )
> strip: open /wrkdirs/usr/ports/lang/smlnj/work/stage/usr/local/smlnj/bin/.run/run.amd64-freebsd.so failed: No such file or directory
> *** Error code 1
> Stop.
> make: stopped in /usr/ports/lang/smlnj

The reason for .../run.amd64-freebsd.so not being present is further up
in the log:

[...]
cc -x assembler-with-cpp -E -P -D_ASM_ -DASSERT_ON  -DARCH_AMD64 -DSIZE_64  -DOPSYS_UNIX -DOPSYS_FREEBSD -D_GNU_SOURCE -DGNU_ASSEMBLER -DDLOPEN  -DINDIRECT_CFUNC -I../objs -I../include ../mach-dep/AMD64.prim.asm > prim.s
cc -x assembler -c -fPIC -o prim.o prim.s
[...]
cc -o run.amd64-freebsd.so -O2 -pipe  -fPIC -fstack-protector-strong -fno-strict-aliasing  -shared -Wl,-z,notext main.o c-libraries.o unix-raise-syserr.o ml-options.o  boot.o load-ml.o run-ml.o globals.o ml-state.o  error.o timers.o unix-timers.o  qualify-name.o swap-bytes.o  unix-fault.o signal-util.o unix-signal.o unix-prof.o prim.o     ../c-libs/posix-os/libposix-os.a  ../c-libs/smlnj-runtime/libsmlnj-runt.a  ../c-libs/smlnj-signals/libsmlnj-sig.a  ../c-libs/smlnj-prof/libsmlnj-prof.a  ../c-libs/smlnj-sockets/libsmlnj-sock.a  ../c-libs/smlnj-time/libsmlnj-time.a  ../c-libs/smlnj-date/libsmlnj-date.a  ../c-libs/smlnj-math/libsmlnj-math.a  ../c-libs/posix-process/libposix-process.a  ../c-libs/posix-procenv/libposix-procenv.a  ../c-libs/posix-filesys/libposix-filesys.a  ../c-libs/posix-io/libposix-io.a  ../c-libs/posix-sysdb/libposix-sysdb.a  ../c-libs/posix-signal/libposix-signal.a  ../c-libs/posix-tty/libposix-tty.a  ../c-libs/posix-error/libposix-error.a ../gc/libgc.a  ../memory/libmem.a ../c-libs/dl/libunix-dynload.a -ldl -lm
ld: error: relocation R_X86_64_PC32 cannot be used against symbol saveregs; recompile with -fPIC
>>> defined in prim.o
>>> referenced by prim.o:(.text+0x232)
cc: error: linker command failed with exit code 1 (use -v to see invocation)
*** Error code 1
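
As an aside, the step of looking "further up in the log" can be sketched as a small filter that prints the first line containing `error:`. This is only an illustration; the function name `first_error` is hypothetical, and matching on the literal text `error:` is an assumption that happens to fit the messages shown here.

```sh
# first_error: read a build log on stdin and print the first line that
# contains "error:", prefixed with its line number in the log.
first_error() {
    awk '/error:/ { print NR ": " $0; exit }'
}
```

The later strip message and the `*** Error code 1` lines do not contain `error:`, so the filter stops at the linker message rather than at the follow-on failure.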

This does not happen with Clang 9.0.1, and looking at the prim.o files
from the two versions ...

Clang 9.0.1 (.o without RELOC):
# objdump -h -r prim.o
prim.o:     file format elf64-x86-64-freebsd
Sections:
Idx Name          Size      VMA               LMA               File off  Algn
  0 .text         00000610  0000000000000000  0000000000000000  00000040  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
Clang 10.0.0 (.o with RELOC):
# objdump -h -r prim.o
prim.o:     file format elf64-x86-64-freebsd
Sections:
Idx Name          Size      VMA               LMA               File off  Algn
  0 .text         00000640  0000000000000000  0000000000000000  00000040  2**4
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
RELOCATION RECORDS FOR [.text]:
OFFSET           TYPE              VALUE 
0000000000000020 R_X86_64_PLT32    set_request-0x0000000000000004
0000000000000039 R_X86_64_PLT32    set_request-0x0000000000000004
0000000000000060 R_X86_64_PLT32    set_request-0x0000000000000004
0000000000000079 R_X86_64_PLT32    set_request-0x0000000000000004
0000000000000091 R_X86_64_PLT32    set_request-0x0000000000000004
00000000000000c0 R_X86_64_PLT32    set_request-0x0000000000000004
00000000000000e1 R_X86_64_PLT32    set_request-0x0000000000000004
00000000000000fe R_X86_64_PLT32    saveregs-0x0000000000000004
000000000000010d R_X86_64_PLT32    set_request-0x0000000000000004
000000000000012e R_X86_64_PLT32    saveregs-0x0000000000000004
000000000000013d R_X86_64_PLT32    set_request-0x0000000000000004
000000000000015e R_X86_64_PLT32    saveregs-0x0000000000000004
000000000000016d R_X86_64_PLT32    set_request-0x0000000000000004
0000000000000232 R_X86_64_PC32     saveregs-0x0000000000000004
00000000000002ce R_X86_64_PLT32    saveregs-0x0000000000000004
000000000000033a R_X86_64_PLT32    set_request-0x0000000000000004
000000000000034e R_X86_64_PLT32    saveregs-0x0000000000000004
00000000000003a9 R_X86_64_PLT32    set_request-0x0000000000000004
00000000000003be R_X86_64_PLT32    saveregs-0x0000000000000004
0000000000000421 R_X86_64_PLT32    set_request-0x0000000000000004
000000000000043e R_X86_64_PLT32    saveregs-0x0000000000000004
00000000000004af R_X86_64_PLT32    set_request-0x0000000000000004
00000000000004ce R_X86_64_PLT32    saveregs-0x0000000000000004
0000000000000545 R_X86_64_PLT32    set_request-0x0000000000000004
00000000000005ce R_X86_64_PLT32    saveregs-0x0000000000000004

... reveals the "RELOC" difference. But prim.s, which is the source for
the prim.o files, is the same with both Clang versions (MD5 (prim.s) =
90b1a8d7069ef57256c814b83a7ebc16), so it seems reasonable to assume a
change in Clang's assembler as the root cause of the observed linker
failure, right?
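
Note that the listing contains exactly one R_X86_64_PC32 entry, at offset 0x232, which is the location the linker names ("referenced by prim.o:(.text+0x232)"). A minimal sketch of a filter that picks such entries out of `objdump -r` output follows, assuming the three-column layout shown above; the function name `pc32_relocs` is hypothetical.

```sh
# pc32_relocs: read `objdump -r` output on stdin and print
# "OFFSET SYMBOL" for every R_X86_64_PC32 relocation record.
pc32_relocs() {
    awk '$2 == "R_X86_64_PC32" {
        sub(/[-+]0x[0-9a-f]+$/, "", $3)   # drop the addend, keep the symbol
        print $1, $3
    }'
}
```

It could be used as `objdump -r prim.o | pc32_relocs`, which for the Clang 10.0.0 listing above would report only the `saveregs` reference at 0000000000000232.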

For a direct solution to the problem one probably should have knowledge
about things like assembler, linker, and object file formats. Not me,
unfortunately, as I'm just showing up here because I've heard that SML
is good for "mathematical rigour" [1], and not for diving into assembler
etc. details;-)

But before giving up, I'll point out the final result of my journey into
the toolchain, maybe someone knows how to interpret all that ... Using
# /usr/local/bin/objcopy --version
GNU objcopy (GNU Binutils) 2.33.1
... to remove the relocations from prim.o ...
# /usr/local/bin/objcopy --remove-relocations=.text prim.o
results in an object file ...
# objdump -h -r prim.o
prim.o:     file format elf64-x86-64-freebsd
Sections:
Idx Name          Size      VMA               LMA               File off  Algn
  0 .text         00000640  0000000000000000  0000000000000000  00000040  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
... that looks just like the one produced by the Clang 9.0.1 assembler 
(except for the size) - and the Clang 10.0.0 linker no longer complains
about (this) prim.o, but produces run.amd64-freebsd.so! One might use this
for a workaround on systems with Clang >= 10, but that's ugly and would
introduce a dependency on devel/binutils, because the base system has ...
# /usr/bin/objcopy --version
objcopy (elftoolchain r3769)
# /usr/bin/objcopy --remove-relocations=.text prim.o
objcopy: unrecognized option `--remove-relocations=.text'
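
To make the dependency concern concrete, here is a sketch that classifies an objcopy by the first line of its `--version` output. It is based only on the two version strings shown above; whether other releases of either tool accept `--remove-relocations` is not established here, and the function name `objcopy_flavor` is hypothetical.

```sh
# objcopy_flavor VERSION_LINE
#   VERSION_LINE: first line of `objcopy --version`
# Prints "binutils", "elftoolchain", or "unknown".
objcopy_flavor() {
    case "$1" in
        *"GNU Binutils"*) echo binutils ;;
        *elftoolchain*)   echo elftoolchain ;;
        *)                echo unknown ;;
    esac
}
```

A caller would pass something like `"$(objcopy --version | head -n 1)"` and apply the relocation-stripping step only for `binutils`.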

Suggestions anyone? I'd prefer to have a solution to the problem before
starting work on SML/NJ 110.97 :-)
Johannes

[1] <https://smlfamily.github.io/sml97-defn.pdf>
Comment 4 Mark Linimon freebsd_committer freebsd_triage 2020-06-20 22:33:07 UTC
OK, as of the very latest amd64-13 run, this now shows up.

This may have been something I broke in r539431 and fixed in r539692.  Can you please try the latter to see if it now works?

Thanks.
Comment 5 Robert Cina 2020-06-20 22:56:12 UTC
I updated my ports tree to r539692, but the port still fails to build with the same error.

=>> Building lang/smlnj
build started at Sat Jun 20 18:49:44 EDT 2020
port directory: /usr/ports/lang/smlnj
package name: smlnj-110.96
building for: FreeBSD 12amd64-HEAD-job-01 12.1-STABLE FreeBSD 12.1-STABLE 1201517 amd64
maintained by: joemann@beefree.free.de
Makefile ident:      $FreeBSD: head/lang/smlnj/Makefile 539692 2020-06-20 02:09:12Z linimon $
Poudriere version: 3.3.4
Host OSVERSION: 1201517
Jail OSVERSION: 1201517
Job Id: 01


[code: 229, data: 29, env: 39 bytes]
./config/install.sh: Installation complete.
(* Installing man pages. *)
(* Stripping runtime executable: *)
MLARCHOPSYS=`/wrkdirs/usr/ports/lang/smlnj/work/stage/usr/local/smlnj/bin/.arch-n-opsys` &&  ( eval ${MLARCHOPSYS} ;  /usr/bin/strip "/wrkdirs/usr/ports/lang/smlnj/work/stage/usr/local/smlnj/bin/.run/run.${ARCH}-${OPSYS}"  "/wrkdirs/usr/ports/lang/smlnj/work/stage/usr/local/smlnj/bin/.run/run.${ARCH}-${OPSYS}.so" )
strip: open /wrkdirs/usr/ports/lang/smlnj/work/stage/usr/local/smlnj/bin/.run/run.amd64-freebsd.so failed: No such file or directory
*** Error code 1

Stop.
make: stopped in /usr/ports/lang/smlnj
=>> Cleaning up wrkdir
===>  Cleaning for smlnj-110.96
build of lang/smlnj | smlnj-110.96 ended at Sat Jun 20 18:50:25 EDT 2020
build time: 00:00:41
!!! build failure encountered !!!
Comment 6 Mark Linimon freebsd_committer freebsd_triage 2020-06-20 23:07:53 UTC
(In reply to Robert Cina from comment #5)

I can confirm that after reverting to r512488 of 20190921 the port builds.

I am thinking that neither of my later commits were the source of the error, instead, it seems that the previous commit of r539386 of 20190616 fails when I rolled the port back.

I am adding committer lwhsu of r539386 to the Cc: for any comments.
Comment 7 Johannes 5 2020-06-26 15:06:00 UTC
Created attachment 215961
Fix PR 247421 by not building runtime.so when compiler is Clang >= 10

(In reply to Mark Linimon from comment #6)

> [...]
> I am thinking that neither of my later commits were the source of
> the error,
I agree: the source is the Clang version used to build the port (as
stated in comment #3).

> instead, it seems that the previous commit of r539386 of
> 20190616 fails when I rolled the port back.
> I am adding committer lwhsu of r539386 to the Cc: for any comments.
Well, I'm glad that lwhsu committed PR 242728 and I would prefer that
lang/smlnj is not rolled back to 110.91. Instead I suggest a small
patch based on the analysis of comment #3: it simply does not build the
runtime.so when the compiler is Clang with version >= 10. That's a
workaround, but it won't hurt because by default on FreeBSD the
heap2exec utility of SML/NJ uses runtime.a to link with (in order to
create a standalone executable from a user's program), not runtime.so.
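
The gating idea can be illustrated as follows. This is not the content of the attached patch, which is not reproduced in this thread; it is only a sketch of the decision "skip runtime.so when the compiler is Clang >= 10", with a hypothetical function name and the assumption that the version line has the `clang version X.Y.Z` form quoted in comment #3.

```sh
# want_shared_runtime VERSION_LINE
#   VERSION_LINE: first line of `cc --version`
# Returns 0 (true) when the shared runtime should still be built,
# 1 when the compiler reports itself as Clang with major version >= 10.
want_shared_runtime() {
    case "$1" in
        *"clang version "*) ;;       # Clang: inspect the version below
        *) return 0 ;;               # any other compiler: build as before
    esac
    rest=${1#*"clang version "}      # e.g. "10.0.0 (git@...)"
    major=${rest%%.*}                # e.g. "10"
    case "$major" in
        ''|*[!0-9]*) return 0 ;;     # unparseable version: build as before
    esac
    [ "$major" -lt 10 ]
}
```

A build script could call it as `want_shared_runtime "$(cc --version | head -n 1)"` and leave out the shared-library target when it returns non-zero.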

After applying the attached patch, which is also available here ...

	<ftp://wrap7.free.de/pub/patch/smlnj.patch.20200626>
	MD5 (smlnj.patch.20200626) = 15f7d5439722037ffa482115b52aa87f

..., to the current state of lang/smlnj (Makefile r539692) poudriere
can then build the smlnj-110.96 package on 12.1-STABLE using either
Clang 9.0.1 or 10.0.0. Therefore I assume that Robert and the amd64-13
build server will succeed with the patched version as well. FreeBSD
versions before the import of Clang 10 should not be affected by this
patch, i.e. should continue to produce a package.

I'll continue to build and test the patched port on different machines,
and report failures, if any, immediately. I suggest that you try the
patch too, as your machines are a lot faster than mine:-)

Thanks!
Johannes
Comment 8 Robert Cina 2020-06-26 15:28:07 UTC
I applied the provided patch but building the port with poudriere failed.


The following is the output of the failure:

===>  Building for smlnj-110.96
cd /wrkdirs/usr/ports/lang/smlnj/work/smlnj-110.96 && unset PWD &&  FILESDIR="/usr/ports/lang/smlnj/files" PATCH="/usr/bin/patch" PATCH_ARGS="-d /wrkdirs/usr/ports/lang/smlnj/work/smlnj-110.96 --forward --quiet -E -p0 --batch -V simple --suffix .orig"  MLNORUNTIMECLEAN=yes  MLRUNTIMEPATCHES=`cd /usr/ports/lang/smlnj/files &&  ( /bin/ls do-patch-base_runtime_* 2>&- ||  true )`  MLSTANDARDPATCHES=`cd /usr/ports/lang/smlnj/files &&  ( for srcdir in asdl cml doc heap2asm ml-burg ml-lex ml-lpt ml-yacc  nlffi smlnj-lib ;  do /bin/ls do-patch-${srcdir}_* 2>&- ;  done ) || true`  MLSTANDARDPATCHDIRS=`cd /usr/ports/lang/smlnj/files &&  ( for srcdir in asdl cml doc heap2asm ml-burg ml-lex ml-lpt ml-yacc  nlffi smlnj-lib ;  do if /bin/ls do-patch-${srcdir}_* 1>&- 2>&- ;  then echo -n ${srcdir} " " ; fi ;  done ) || true`  MLSOURCEPATCHES=`true`  CFLAGS='-O2 -pipe  -fPIC -fstack-protector-strong -fno-strict-aliasing ' LDFLAGS=' -fstack-protector-strong '  AS='cc -x assembler -c' ASFLAGS='-fPIC' EXTRA_DEFS=''  ./config/install.sh -default 64
./config/install.sh: Using shell /bin/sh.
./config/install.sh: SML root is /wrkdirs/usr/ports/lang/smlnj/work/smlnj-110.96.
./config/install.sh: Installation directory is /wrkdirs/usr/ports/lang/smlnj/work/smlnj-110.96.
./config/install.sh: Installing version 110.96.
./config/install.sh: URL of source archive is http://smlnj.cs.uchicago.edu/dist/working/110.96/.
./config/install.sh: installing /wrkdirs/usr/ports/lang/smlnj/work/smlnj-110.96/bin/.arch-n-opsys
./config/install.sh: Script /wrkdirs/usr/ports/lang/smlnj/work/smlnj-110.96/bin/.arch-n-opsys reports ARCH=amd64; OPSYS=freebsd; HEAP_SUFFIX=amd64-bsd.
./config/install.sh: installing /wrkdirs/usr/ports/lang/smlnj/work/smlnj-110.96/bin/.run-sml
./config/install.sh: installing /wrkdirs/usr/ports/lang/smlnj/work/smlnj-110.96/bin/.link-sml
./config/install.sh: installing /wrkdirs/usr/ports/lang/smlnj/work/smlnj-110.96/bin/ml-makedepend
./config/install.sh: installing /wrkdirs/usr/ports/lang/smlnj/work/smlnj-110.96/bin/heap2exec
/wrkdirs/usr/ports/lang/smlnj/work/smlnj-110.96/config/unpack: Un-GZIP-ing and un-TAR-ing run-time archive.
Ignoring previously applied (or reversed) patch.
3 out of 3 hunks ignored--saving rejects to base/runtime/objs/mk.amd64-freebsd.rej
./config/install.sh: !!! patch file /usr/ports/lang/smlnj/files/do-patch-base_runtime_objs_mk.amd64-freebsd.orig failed to patch.
*** Error code 1

Stop.
make: stopped in /usr/ports/lang/smlnj
=>> Cleaning up wrkdir
===>  Cleaning for smlnj-110.96
build of lang/smlnj | smlnj-110.96 ended at Fri Jun 26 11:25:59 EDT 2020
build time: 00:00:02
!!! build failure encountered !!!
Comment 9 Johannes 5 2020-06-26 16:09:11 UTC
(In reply to Robert Cina from comment #8)

> I applied the provided patch but building the port with poudriere
> failed.
> [...]
> ./config/install.sh: !!! patch file
> /usr/ports/lang/smlnj/files/do-patch-base_runtime_objs_mk.amd64-freebsd.orig failed to patch.
> *** Error code 1
There should be no patch file ending in ".orig" - maybe you used
`patch` (not `svnlite patch`)? If so, the leftovers from `patch` have
to be removed manually:

	find . \( -name '*.orig' -o -size 0c \) -print -delete
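
The command above matches backup copies ending in `.orig` as well as zero-length files and deletes them. For anyone who prefers to review the matches first, here is a sketch of a list-only variant; the function name `list_patch_leftovers` is hypothetical, and `-type f` is an addition to the original command so that directories are never listed.

```sh
# list_patch_leftovers DIR
# Prints (without deleting) regular files under DIR that are either
# named *.orig or have a size of zero bytes, in sorted order.
list_patch_leftovers() {
    find "$1" -type f \( -name '*.orig' -o -size 0c \) -print | sort
}
```

Running `list_patch_leftovers .` in the port directory shows what the `-delete` form would remove.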

Enjoy!
Johannes
Comment 10 Robert Cina 2020-06-26 16:15:02 UTC
Thanks for the help clarifying how to patch.  I can confirm the patch applies correctly and now builds correctly using poudriere on 12.1 Stable for me.
Comment 11 Johannes 5 2020-07-06 18:55:05 UTC
Created attachment 216263
update to 110.97, and fix PR 247421 by not building runtime.so when compiler is Clang >= 10

(In reply to Robert Cina from comment #10)

> [...]  I can confirm the patch applies correctly and now builds
> correctly using poudriere on 12.1 Stable for me.
Thanks for the good news! If you have the opportunity to also try
the following patch, then let us know whether it worked for you.

The attached patch to lang/smlnj should be identical to

	<ftp://wrap7.free.de/pub/patch/smlnj.patch.20200704>
	MD5 (smlnj.patch.20200704) = bae9bc9abfb55e318815aea3de6c9e77

and does the following:

- Workaround for PR 247421, by not building runtime.so if the
  compiler is Clang >= 10.
- Update to SML/NJ 110.97, upstream release notes [1] and
  changelog [2]. (Mostly bug fixes, additions, and minor changes to the
  SML/NJ Library.)

@committers: output from the poudriere testport runs for the patch is
available [3]. Is there anything else I should do to help with the
resolution of this PR?

The patch includes the one from comment #7 but also updates
lang/smlnj to the latest upstream version 110.97. I've bundled the two
unrelated parts (the workaround for the problem that caused this PR,
and the new upstream version) into one patch mainly because it's a lot
of work for me to run poudriere testport for all architecture / OS
version / option combinations [3]. But now I'm confident that the patch
will work for you and for the FreeBSD build machines. So I hope that a
committer will test it and apply it to the ports tree.

Thanks for considering this patch!
Johannes

[1] <http://www.smlnj.org/dist/working/110.97/110.97-README.html>
[2] <http://smlnj.org/dist/working/110.97/HISTORY.html>
[3] <http://mesh-j-3.free.de/poudriere/smlnj/110.97/>