Bug 269206 - lang/zig: Illegal instruction (core dumped)
Status: Closed FIXED
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: Any Any
Importance: --- Affects Some People
Assignee: Dave Cottlehuber
URL: https://reviews.freebsd.org/D38284
Keywords:
Depends on:
Blocks:
 
Reported: 2023-01-28 16:07 UTC by Yonas Yanfa
Modified: 2023-02-06 12:30 UTC

See Also:
bugzilla: maintainer-feedback? (dch)


Attachments
Core dump - part 1 (900.00 KB, application/octet-stream)
2023-01-28 16:18 UTC, Yonas Yanfa
Core dump - part 2 (443.80 KB, application/octet-stream)
2023-01-28 16:19 UTC, Yonas Yanfa

Description Yonas Yanfa 2023-01-28 16:07:29 UTC
$ zig
Illegal instruction (core dumped)


FreeBSD 13.1-RELEASE-p5
zig-0.10.1
Comment 1 Yonas Yanfa 2023-01-28 16:18:13 UTC
Created attachment 239766
Core dump - part 1
Comment 2 Yonas Yanfa 2023-01-28 16:19:18 UTC
Created attachment 239767
Core dump - part 2
Comment 3 Yonas Yanfa 2023-01-28 16:21:26 UTC
To get the core dump, first join the two parts using `cat zig-core.tar.zst.parta* > zig-core.tar.zst` and then use tar/zstd to decompress.
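The join step can be wrapped in a small helper. This is a minimal sketch: `join_parts` is a hypothetical name, and the decompression command in the trailing comment assumes zstd and tar are installed.

```shell
# Concatenate split(1)-style parts (…aa, …ab, …) back into one file.
# Usage: join_parts <part-prefix> <output-file>
join_parts() {
    prefix=$1
    out=$2
    # The shell expands the glob in sorted order, which matches the part naming.
    cat "$prefix"* > "$out"
}

# Example with the attachment names from this comment (not executed here):
#   join_parts zig-core.tar.zst.parta zig-core.tar.zst
#   zstd -dc zig-core.tar.zst | tar -xf -
```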
Comment 4 Mina Galić freebsd_triage 2023-01-28 16:28:14 UTC
Can you just open the core file in lldb and run bt?
Comment 5 Yonas Yanfa 2023-01-28 17:13:43 UTC
(In reply to Mina Galić from comment #4)

It doesn't produce any backtrace:

$ lldb zig.core 
(lldb) target create "zig.core"
Current executable set to '/home/yonas/zig.core' (x86_64).
(lldb) bt
error: invalid process
(lldb) bt ?
error: bt [<digit> | all]
(lldb) bt all
error: invalid process
(lldb) bt 0
error: invalid process
(lldb) run
error: 'A' packet returned an error: 8
(lldb) bt
error: invalid thread
(lldb) quit
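
Note on the session above: lldb was given only the core file, so it treated the core as the executable and had no process to inspect. Loading the binary together with the core usually gets further. A minimal sketch, assuming the packaged binary is at /usr/local/bin/zig; `core_bt` is a hypothetical helper name, while `--core`, `--batch` and `-o` are standard lldb options.

```shell
# Print backtraces for all threads recorded in a core file.
# Usage: core_bt <executable> <core-file>
core_bt() {
    exe=$1
    core=$2
    if [ ! -r "$exe" ] || [ ! -r "$core" ]; then
        echo "usage: core_bt <executable> <core-file>" >&2
        return 2
    fi
    # --core loads the dump; the positional argument is the binary that produced it.
    lldb --batch --core "$core" -o 'bt all' "$exe"
}

# Example (the binary path is an assumption, not executed here):
#   core_bt /usr/local/bin/zig zig.core
```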
Comment 6 Dave Cottlehuber freebsd_committer freebsd_triage 2023-01-28 20:14:49 UTC
SIGILL may mean that the CPU you're building this on is too old for zig?

My relatively old i7 is OK (running CURRENT, but that shouldn't matter), both on zig 0.10 and the latest 0.10.1.
Can you send the output of these, please, and then run zig under lldb (see https://github.com/ziglang/zig/issues/8562#issue-860482061) and let us know what you come back with?

$ sysctl hw.model
hw.model: Intel(R) Core(TM) i7-7560U CPU @ 2.40GHz

$ grep Features /var/run/dmesg.boot
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x7ffafbbf<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,SDBG,FMA,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND>
  AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
  AMD Features2=0x121<LAHF,ABM,Prefetch>
  Structured Extended Features=0x29c67af<FSGSBASE,TSCADJ,SGX,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,NFPUSG,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PROCTRACE>
  Structured Extended Features3=0x9c002400<MD_CLEAR,TSXFA,IBPB,STIBP,L1DFL,SSBD>
  XSAVE Features=0xf<XSAVEOPT,XSAVEC,XINUSE,XSAVES>
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x7ffafbbf<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,SDBG,FMA,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND>
  AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
  AMD Features2=0x121<LAHF,ABM,Prefetch>
  Structured Extended Features=0x29c67af<FSGSBASE,TSCADJ,SGX,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,NFPUSG,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PROCTRACE>
  Structured Extended Features3=0xbc002e00<MCUOPT,MD_CLEAR,TSXFA,IBPB,STIBP,L1DFL,ARCH_CAP,SSBD>
  XSAVE Features=0xf<XSAVEOPT,XSAVEC,XINUSE,XSAVES>

I'm guessing this is an older laptop or desktop missing a specific instruction set?
Comment 7 Dave Cottlehuber freebsd_committer freebsd_triage 2023-01-28 20:18:36 UTC
perhaps SSE4.1 or similar, see https://github.com/ziglang/zig/issues/8562
Comment 8 Yonas Yanfa 2023-01-29 18:59:20 UTC
(In reply to Dave Cottlehuber from comment #6)

> SIGILL may be that the cpu you're building this on is too old for zig?

> I'm guessing this is an older laptop or desktop missing a specific instruction set?

zig works fine when I compile it myself.

Perhaps the package was compiled to suit the host environment, and may need tweaking to work for a larger audience. ⚡


$ sysctl hw.model
hw.model: Intel(R) Xeon(R) CPU           L5520  @ 2.27GHz


$ grep Features /var/run/dmesg.boot
  Features=0x783fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE,SSE2>
  Features2=0x81b8a221<SSE3,VMX,SSSE3,CX16,PDCM,SSE4.1,SSE4.2,x2APIC,POPCNT,TSCDLT,HV>
  AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM>
  AMD Features2=0x1<LAHF>
  Structured Extended Features=0x2<TSCADJ>
  Structured Extended Features2=0x4<UMIP>
  Structured Extended Features3=0x20000000<ARCH_CAP>
  Features=0x783fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE,SSE2>
  Features2=0x81b8a221<SSE3,VMX,SSSE3,CX16,PDCM,SSE4.1,SSE4.2,x2APIC,POPCNT,TSCDLT,HV>
  AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM>
  AMD Features2=0x1<LAHF>
  Structured Extended Features=0x2<TSCADJ>
  Structured Extended Features2=0x4<UMIP>
  Structured Extended Features3=0x20000000<ARCH_CAP>
Comment 9 Yonas Yanfa 2023-01-29 19:08:32 UTC
(In reply to Dave Cottlehuber from comment #6)

$ doas gdb --args zig
GNU gdb (GDB) 12.1 [GDB v12.1 for FreeBSD]
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-portbld-freebsd13.1".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from zig...
(No debugging symbols found in zig)
(gdb) run
Starting program: /usr/local/bin/zig 
warning: Could not load shared library symbols for [vdso].
Do you need "set solib-search-path" or "set sysroot"?

Program received signal SIGILL, Illegal instruction.
Privileged opcode.
0x0000000000811662 in memcpy ()
(gdb) bt
#0  0x0000000000811662 in memcpy ()
#1  0x00000000006aeb3b in ?? ()
#2  0x00000008011bd0fd in ?? () from /libexec/ld-elf.so.1
#3  0x0000000000000000 in ?? ()
Comment 10 Mina Galić freebsd_triage 2023-01-29 19:11:09 UTC
(In reply to Yonas Yanfa from comment #9)
in memcpy()?

how is zig the only application that's showing this behaviour?
Comment 11 Mina Galić freebsd_triage 2023-01-29 19:17:22 UTC
can you rebuild the port with WITH_DEBUG_PORTS=lang/zig and see if we can resolve those three ?? lines?
Comment 12 Jason W. Bacon freebsd_committer freebsd_triage 2023-01-30 00:10:24 UTC
Also seeing this when attempting to build biology/vcflib:

[  0%] Performing build step for 'ZIG-EXT'
cd /usr/ports/biology/vcflib/work/vcflib-1.0.5/src/zig && /usr/ports/biology/vcflib/work/vcflib-1.0.5/src/zig/compile.sh -Drelease-fast=true -freference-trace
Illegal instruction (core dumped)
gmake[3]: *** [CMakeFiles/ZIG-EXT.dir/build.make:89: ZIG-EXT-prefix/src/ZIG-EXT-stamp/ZIG-EXT-build] Error 132
gmake[3]: Leaving directory '/usr/ports/biology/vcflib/work/.build'
gmake[2]: *** [CMakeFiles/Makefile2:342: CMakeFiles/ZIG-EXT.dir/all] Error 2
gmake[2]: Leaving directory '/usr/ports/biology/vcflib/work/.build'
gmake[1]: *** [Makefile:149: all] Error 2
gmake[1]: Leaving directory '/usr/ports/biology/vcflib/work/.build'
*** Error code 1

/usr/ports/biology/vcflib/work/vcflib-1.0.5/src/zig/compile.sh shows the exact zig commands.
Comment 13 Dave Cottlehuber freebsd_committer freebsd_triage 2023-01-30 21:05:42 UTC
this looks like your CPUs don't have AVX2 (or similar) support

from another PR via IRC:

CPU: Intel(R) Core(TM) i7-2860QM CPU @ 2.50GHz (2492.05-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x206a7  Family=0x6  Model=0x2a  Stepping=7
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x1fbae3ff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX>
  AMD Features=0x28000800<SYSCALL,RDTSCP,LM>
  AMD Features2=0x1<LAHF>
  Structured Extended Features3=0x9c000000<IBPB,STIBP,L1DFL,SSBD>
  XSAVE Features=0x1<XSAVEOPT>
  VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
  TSC: P-state invariant, performance statistics
real memory  = 17179869184 (16384 MB)

After your `bt`, run `disas` to show the instructions; we should see something like:

(gdb) disas
Dump of assembler code for function memset:
   0x00000000007f0100 <+0>:	mov    %rdi,%rax
   0x00000000007f0103 <+3>:	test   %rdx,%rdx
...
   0x00000000007f02e6 <+486>:	lea    (%rax,%r8,1),%rdi
=> 0x00000000007f02ea <+490>:	vpbroadcastb %esi,%xmm0 <====================
   0x00000000007f02f0 <+496>:	vmovdqu %xmm0,(%rax,%r10,1)
...
   0x00000000007f032f <+559>:	ret
End of assembler dump.
(gdb)


https://www.felixcloutier.com/x86/vpbroadcastb:vpbroadcastw:vpbroadcastd:vpbroadcastq

Which appears to be an AVX2 or AVX512 instruction.

#zig on irc suggests using `-DZIG_TARGET_MCPU=baseline` to address this.
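
To check a machine against this diagnosis, the feature lists posted in this PR can be searched for AVX2. A minimal sketch: `has_cpu_feature` is a hypothetical helper name, and it assumes the `Name=0x...<A,B,C>` layout of the dmesg lines shown in the comments above.

```shell
# Succeeds (exit 0) if the named feature appears in any <...> list read from stdin.
# Usage: has_cpu_feature <FEATURE> < feature-lines
has_cpu_feature() {
    want=$1
    # Keep the text between < and >, put one feature per line,
    # then require an exact, literal match (so AVX does not match AVX2).
    sed -n 's/.*<\(.*\)>.*/\1/p' | tr ',' '\n' | grep -Fx -- "$want" > /dev/null
}

# Example on FreeBSD (not executed here):
#   grep Features /var/run/dmesg.boot | has_cpu_feature AVX2 && echo "AVX2 present"
```

The exact-match step matters here: the i7-2860QM list above contains AVX but not AVX2, and a plain substring search for AVX would not tell the two apart.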
Comment 14 Dave Cottlehuber freebsd_committer freebsd_triage 2023-01-30 21:10:54 UTC
Yonas, please try out https://reviews.freebsd.org/D38284 which should address this, assuming your Xeon is not a true antique.
Comment 15 Dave Cottlehuber freebsd_committer freebsd_triage 2023-01-30 21:11:07 UTC
It's based off https://reviews.freebsd.org/D38284
Comment 16 Jason W. Bacon freebsd_committer freebsd_triage 2023-01-30 21:21:41 UTC
(In reply to Dave Cottlehuber from comment #13)

Your deduction is correct:

CPU: Intel(R) Core(TM) i5-3470T CPU @ 2.90GHz (2893.60-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x306a9  Family=0x6  Model=0x3a  Stepping=9
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x7fbae3ff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND>
  AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM>
  AMD Features2=0x1<LAHF>
  Structured Extended Features=0x281<FSGSBASE,SMEP,ERMS>
  XSAVE Features=0x1<XSAVEOPT>
  VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
  TSC: P-state invariant, performance statistics

The baseline knob setting sounds like a reasonable solution, assuming it disables CPU feature detection during build, which is what it sounds like.

I'm not sure if the project has an official minimum set of CPU features, but I think the assumption is that binary packages should work on any 64-bit CPU.  I run into this issue frequently with scientific software that has vector optimizations hard-coded in the upstream build system.  Sometimes I have to just patch them out.  The user can always build the port from source if they want non-portable optimizations.
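
For anyone building zig outside the ports tree, the baseline setting mentioned in comment 13 is passed to CMake at configure time. A minimal sketch: `zig_cmake_args` is a hypothetical helper that only prints the argument, and the cmake invocation in the trailing comment is an assumption about a typical out-of-tree build, not something taken from D38284.

```shell
# Print the CMake argument that selects the CPU model zig is compiled for.
# Defaults to "baseline", the value suggested in comment 13.
zig_cmake_args() {
    mcpu=${1:-baseline}
    printf '%s\n' "-DZIG_TARGET_MCPU=$mcpu"
}

# Example, run from a zig source checkout (not executed here):
#   cmake -S . -B build $(zig_cmake_args)
#   cmake --build build
```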
Comment 17 Yonas Yanfa 2023-01-30 22:28:10 UTC
(In reply to Dave Cottlehuber from comment #14)

> Yonas please try out https://reviews.freebsd.org/D38284

It works, thanks!
Comment 18 Oleh Vinichenko 2023-02-03 14:11:33 UTC
I can confirm that the pkg built with the baseline change is now working fine.
Comment 19 commit-hook freebsd_committer freebsd_triage 2023-02-04 23:29:29 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=7efc3219b1495d2e44ed68bd9edd8d84632267bd

commit 7efc3219b1495d2e44ed68bd9edd8d84632267bd
Author:     Dave Cottlehuber <dch@FreeBSD.org>
AuthorDate: 2023-02-04 23:28:53 +0000
Commit:     Dave Cottlehuber <dch@FreeBSD.org>
CommitDate: 2023-02-04 23:28:53 +0000

    lang/zig: build for lowest common denominator CPU

    Systems that don't have AVX2 or newer CPU instructions will SIGILL. This
    seems particularly cruel for a language with the moniker "Gotta Go Fast".

    PR:             269206
    Differential Revision: https://reviews.freebsd.org/D38284

    Reported by:    Yonas Yanfa
    Tested by:      Oleh Vinichenko

 lang/zig/Makefile                         |  3 ++-
 lang/zig/files/patch-CMakeLists.txt (new) | 10 ++++++++++
 2 files changed, 12 insertions(+), 1 deletion(-)
Comment 20 Dave Cottlehuber freebsd_committer freebsd_triage 2023-02-04 23:32:08 UTC
jwb: for your downstream port, see:

> Does this obsolete -Dcpu=${CPUTYPE:Ubaseline} in consumers like x11-wm/river?
> - jbeich

I think that we would still need this in consumer ports.
Perhaps it's time for a USES=zig that does this sort of thing?

jbeich, jwb, feel free to amend lang/zig with your better
knowledge of cmake etc!

Thanks Oleh and Yonas for testing.
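
For consumer ports, the `${CPUTYPE:Ubaseline}` expression quoted above uses make's `:U` modifier: it expands to CPUTYPE when that variable is defined and to `baseline` otherwise. A rough shell equivalent, as a sketch only. `zig_cpu_flag` is a hypothetical helper, and the shell's `:-` also falls back when the variable is set but empty, which differs slightly from `:U`.

```shell
# Print the -Dcpu flag a consumer port would pass to zig build:
# the user's CPUTYPE if set, otherwise the portable "baseline".
zig_cpu_flag() {
    printf '%s\n' "-Dcpu=${CPUTYPE:-baseline}"
}

# Example (not executed here):
#   zig build $(zig_cpu_flag)
```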
Comment 21 Jason W. Bacon freebsd_committer freebsd_triage 2023-02-06 12:30:22 UTC
(In reply to Dave Cottlehuber from comment #20)

Are you saying that the zig compiler is no longer built with nonportable instructions, but it will still generate them by default?  If so, that should be changed, so the package build servers don't cause illegal instruction errors in consumers as they did with zig itself.

Thanks for your work on this...