Bug 259921

Summary: dns/bind916: Segmentation fault with LLVM13 on 14-CURRENT
Product: Ports & Packages Reporter: O. Hartmann <ohartmann>
Component: Individual Port(s)Assignee: Mathieu Arnold <mat>
Status: Closed FIXED    
Severity: Affects Many People CC: arcade, chris, dim, emaste, jrtc27, nino80, pi
Priority: --- Flags: bugzilla: maintainer-feedback? (mat)
Version: Latest   
Hardware: Any   
OS: Any   

Description O. Hartmann 2021-11-18 17:26:02 UTC
Host is 14.0-CURRENT FreeBSD 14.0-CURRENT #11 main-n250822-4fdc5b8494b: Thu Nov 18 06:02:28 CET 2021 amd64, fresh OS compilation with LLVM13, but some ports still LLVM12 compiled, lates update of dns/916 results in a segmentation fault when starting.

[...] named_flags="-d4 -g" [...]
:~ # service named restart
named not running? (check /var/run/named/pid).
Starting named.
18-Nov-2021 17:21:46.798 starting BIND 9.16.23 (Extended Support Version) <id:fde3b1f>
[...]
18-Nov-2021 17:21:46.873 built with '--disable-linux-caps' '--localstatedir=/var' '--sysconfdir=/usr/local/etc/namedb' '--with-dlopen=yes' '--with-libxml2' '--with-openssl=/usr' '--with-readline=-L/usr/local/lib -ledit' '--with-dlz-filesystem=yes' '--with-dlz-ldap=yes' '--with-dlz-postgres=yes' '--with-dlz-stub=yes' '--disable-dnstap' '--disable-fixed-rrset' '--enable-geoip' '--with-maxminddb' '--with-gssapi=/usr' 'CFLAGS=-I/usr/include -O2 -pipe -DLIBICONV_PLUG -fstack-protector-strong -DLDAP_DEPRECATED -isystem /usr/local/include -fno-strict-aliasing ' 'LDFLAGS= -L/usr/local/lib -ljson-c -L/usr/local/lib -fstack-protector-strong ' 'LIBS=-lkrb5 -lgssapi -lgssapi_krb5 -L/usr/local/lib' 'KRB5CONFIG=/usr/bin/krb5-config' '--with-libidn2=/usr/local' '--with-json-c' '--enable-largefile' '--with-lmdb=/usr/local' '--disable-native-pkcs11' '--without-python' '--disable-querytrace' '--enable-tcp-fastopen' '--disable-symtable' '--prefix=/usr/local' '--mandir=/usr/local/man' '--infodir=/usr/local/share/info/' '--build=amd64-portbld-freebsd14.0' 'build_alias=amd64-portbld-freebsd14.0' 'CC=cc' 'CPPFLAGS=-I/usr/local/include -DLIBICONV_PLUG -isystem /usr/local/include' 'CPP=cpp' 'PKG_CONFIG=pkgconf'
18-Nov-2021 17:21:46.873 running as: named -d4 -g -u bind -c /usr/local/etc/namedb/named.conf
18-Nov-2021 17:21:46.873 compiled by CLANG FreeBSD Clang 13.0.0 (git@github.com:llvm/llvm-project.git llvmorg-13.0.0-0-gd7b669b3a303)
18-Nov-2021 17:21:46.873 compiled with OpenSSL version: OpenSSL 1.1.1l-freebsd  24 Aug 2021
18-Nov-2021 17:21:46.873 linked to OpenSSL version: OpenSSL 1.1.1l-freebsd  24 Aug 2021
18-Nov-2021 17:21:46.873 compiled with libxml2 version: 2.9.12
18-Nov-2021 17:21:46.873 linked to libxml2 version: 20912
18-Nov-2021 17:21:46.873 compiled with json-c version: 0.15
18-Nov-2021 17:21:46.873 linked to json-c version: 0.15
18-Nov-2021 17:21:46.873 compiled with zlib version: 1.2.11
18-Nov-2021 17:21:46.873 linked to zlib version: 1.2.11
18-Nov-2021 17:21:46.873 ----------------------------------------------------
18-Nov-2021 17:21:46.873 BIND 9 is maintained by Internet Systems Consortium,
18-Nov-2021 17:21:46.873 Inc. (ISC), a non-profit 501(c)(3) public-benefit 
18-Nov-2021 17:21:46.873 corporation.  Support and training for BIND 9 are 
18-Nov-2021 17:21:46.873 available at https://www.isc.org/support
18-Nov-2021 17:21:46.873 ----------------------------------------------------
18-Nov-2021 17:21:46.873 found 8 CPUs, using 8 worker threads
18-Nov-2021 17:21:46.874 using 8 UDP listeners per interface
18-Nov-2021 17:21:46.879 using up to 21000 sockets
Segmentation fault
/usr/local/etc/rc.d/named: WARNING: failed to start named
Comment 1 O. Hartmann 2021-11-18 21:31:38 UTC
According  to
To: freebsd-current@freebsd.org
From: Marcin Wojtas <mw@semihalf.com>
Cc: Fabien Thomas <fabien.thomas@stormshield.eu>,   MARECHAL Boris <boris.marechal@stormshield.eu>, Rafal Jaworowski <raj@semihalf.com>,   Damien DEVILLE <damien.deville@stormshield.eu>
Subject: HEADS-UP: ASLR for 64-bit executables enabled by default on main
Date: Tue, 16 Nov 2021 23:30:19 +0100
Sender: owner-freebsd-current@freebsd.org

on list freebsd-current@

disabling ASLR via
 
sysctl kern.elf64.aslr.enable=0
sysctl kern.elf64.aslr.pie_enable=0

doesn't change anything.
Comment 2 O. Hartmann 2021-12-09 19:25:03 UTC
We are still stuck with this nasty problem. Affected are all 14-CURRENT boxes, amd64, running older IvyBridge Intel CPUs (it seems, I have no reference):

14.0-CURRENT #52 main-n251260-156fbc64857: Thu Dec  2 14:45:55 CET 2021 amd64

The recent config of the port snd/bind916 is:

[bind916]: make showconfig
===> The following configuration options are available for bind916-9.16.23:
     DNSTAP=on: Provides fast passive logging of DNS messages
     DOCS=on: Build and/or install documentation
     FIXED_RRSET=off: Enable fixed rrset ordering
     GEOIP=off: GeoIP IP location support
     IDN=on: International Domain Names support
     JSON=on: JSON file/format/parser support
     LARGE_FILE=on: 64-bit file support
     LMDB=on: Use LMDB for zone management
     MANPAGES=on: Build and/or install manual pages
     OVERRIDECACHE=off: Use the override-cache patch
     PORTREVISION=off: Show PORTREVISION in the version string
     QUERYTRACE=off: Enable the very verbose query tracelogging
     START_LATE=off: Start BIND late in the boot process (see help)
     TCP_FASTOPEN=on: RFC 7413 support
====> Dynamically Loadable Zones
     DLZ_BDB=off: DLZ BDB driver
     DLZ_FILESYSTEM=on: DLZ filesystem driver
     DLZ_LDAP=on: DLZ LDAP driver
     DLZ_MYSQL=off: DLZ MySQL driver (no threading)
     DLZ_POSTGRESQL=on: DLZ Postgres driver
     DLZ_STUB=off: DLZ stub driver
====> GSSAPI Security API support: you have to select exactly one of them
     GSSAPI_BASE=on: Using Heimdal in base
     GSSAPI_HEIMDAL=off: Using security/heimdal
     GSSAPI_MIT=off: Using security/krb5
     GSSAPI_NONE=off: Disable
====> Choose which crypto engine to use: you can only select none or one of them
     NATIVE_PKCS11=off: Use PKCS#11 native API (**READ HELP**)
===> Use 'make config' to modify these settings

We also compiled a "vanilla" dns/bind916 after "make rmconfig".


Several runs of "portmaster -df" did not solve the problem so far, today we ran (still running) a portmaster -df on all affected hosts with the most recent ports tree as of today :

[...] (git pull)
From https://git.freebsd.org/ports
   b89ab73481f5..07cbc65d84e6  main       -> origin/main
Updating b89ab73481f5..07cbc65d84e6

So, I think things are up to date.
Comment 3 Dimitry Andric freebsd_committer freebsd_triage 2021-12-10 21:06:39 UTC
I managed to narrow this down to https://github.com/llvm/llvm-project/commit/e96df3e531f5 (aka https://reviews.llvm.org/D94355):

commit e96df3e531f506eea75da0f13d0f8aa9a267f975
Author: Gulfem Savrun Yeniceri <gulfem@google.com>
Date:   Tue Dec 29 21:32:13 2020 +0000

    [Passes] Add relative lookup table converter pass

    Lookup tables generate non PIC-friendly code, which requires dynamic relocation as described in:
    https://bugs.llvm.org/show_bug.cgi?id=45244

    This patch adds a new pass that converts lookup tables to relative lookup tables to make them PIC-friendly.

    Differential Revision: https://reviews.llvm.org/D94355

It turns out that a part of the isc_log_doit() function appears to have this new relative lookup tables inserted, after this commit. The function starts as:

  1490  static void
  1491  isc_log_doit(isc_log_t *lctx, isc_logcategory_t *category,
  1492               isc_logmodule_t *module, int level, bool write_once,
  1493               const char *format, va_list args) {
...
  1539          isc_logconfig_t *lcfg = lctx->logconfig;
  1540
  1541          category_channels = ISC_LIST_HEAD(lcfg->channellists[category->id]);

and it segfaults just after line 1541. It seems there is something in the do-while loop that follows, in particular the [-level] accesses, that confuse the llvm relative lookup table converter pass, e.g:

  1625                          } else {
  1626                                  snprintf(level_string, sizeof(level_string),
  1627                                           "%s: ", log_level_strings[-level]);
  1628                          }

Before llvm commit e96df3e531f5, the code around line 1541 is compiled as:

.Ltmp661:
        #DEBUG_VALUE: isc_log_doit:category_channels <- $r12
        .loc    3 0 58                          # log.c:0:58
        xorl    %eax, %eax
        testl   %r15d, %r15d
        setg    %al
        movl    %r15d, %ecx
        negl    %ecx
        movq    %rcx, -840(%rbp)                # 8-byte Spill
        leaq    8328(%r13), %rcx
        #DEBUG_VALUE: isc_log_doit:matched <- 0
        movq    %rcx, -808(%rbp)                # 8-byte Spill
.Ltmp662:
        .loc    3 1552 25 is_stmt 1             # log.c:1552:25
        testq   %r12, %r12
        sete    %cl
        leal    5(%r15), %edx
        movl    %edx, -824(%rbp)                # 4-byte Spill
        leal    2(%rax,%rax,4), %eax
        movl    %eax, -820(%rbp)                # 4-byte Spill
        movb    $1, %al
        leaq    default_channel(%rip), %rdi
        xorl    %esi, %esi
        movq    %r15, -816(%rbp)                # 8-byte Spill
        movq    %rbx, -752(%rbp)                # 8-byte Spill
        movq    %r13, -744(%rbp)                # 8-byte Spill
        movq    %r9, -736(%rbp)                 # 8-byte Spill

After llvm commit e96df3e531f5, the result becomes:

.Ltmp661:
        #DEBUG_VALUE: isc_log_doit:category_channels <- $r12
        .loc    3 0 58                          # log.c:0:58
        xorl    %eax, %eax
        testl   %r15d, %r15d
        setg    %al
        movl    %r15d, %edx
        negl    %edx
        leaq    reltable.isc_log_doit(%rip), %rcx
        movq    %rdx, -848(%rbp)                # 8-byte Spill
        movslq  (%rcx,%rdx,4), %rdx
        addq    %rcx, %rdx
        movq    %rdx, -840(%rbp)                # 8-byte Spill
        leaq    8328(%r13), %rcx
        #DEBUG_VALUE: isc_log_doit:matched <- 0
        movq    %rcx, -808(%rbp)                # 8-byte Spill
.Ltmp662:
        .loc    3 1552 25 is_stmt 1             # log.c:1552:25
        testq   %r12, %r12
        sete    %cl
        leal    5(%r15), %edx
        movl    %edx, -824(%rbp)                # 4-byte Spill
        leal    2(%rax,%rax,4), %eax
        movl    %eax, -820(%rbp)                # 4-byte Spill
        movb    $1, %al
        leaq    default_channel(%rip), %rdi
        xorl    %esi, %esi
        movq    %r15, -816(%rbp)                # 8-byte Spill
        movq    %rbx, -752(%rbp)                # 8-byte Spill
        movq    %r13, -744(%rbp)                # 8-byte Spill
        movq    %r9, -736(%rbp)                 # 8-byte Spill

with the lookup table looking like:

        .type   reltable.isc_log_doit,@object   # @reltable.isc_log_doit
        .section        .rodata,"a",@progbits
        .p2align        2
reltable.isc_log_doit:
        .long   .L.str.95-reltable.isc_log_doit
        .long   .L.str.96-reltable.isc_log_doit
        .long   .L.str.97-reltable.isc_log_doit
        .long   .L.str.98-reltable.isc_log_doit
        .long   .L.str.99-reltable.isc_log_doit
        .long   .L.str.100-reltable.isc_log_doit
        .size   reltable.isc_log_doit, 24

The problem is that the "movslq  (%rcx,%rdx,4), %rdx" statement is invoked with a negative value of rcx, so it attempts to access data from before reltable.isc_log_doit, and it segfaults.
Comment 4 Jessica Clarke freebsd_committer freebsd_triage 2021-12-11 00:30:44 UTC
Should be fixed by either this simple patch to LLVM:

--- a/llvm/lib/Transforms/Utils/RelLookupTableConverter.cpp
+++ b/llvm/lib/Transforms/Utils/RelLookupTableConverter.cpp
@@ -137,8 +137,8 @@ static void convertToRelLookupTable(GlobalVariable &LookupTable) {
   // Generate an array that consists of relative offsets.
   GlobalVariable *RelLookupTable = createRelLookupTable(Func, LookupTable);
 
-  // Place new instruction sequence before GEP.
-  Builder.SetInsertPoint(GEP);
+  // Place new instruction sequence where the GEP is.
+  Builder.SetInsertPoint(Load);
   Value *Index = GEP->getOperand(2);
   IntegerType *IntTy = cast<IntegerType>(Index->getType());
   Value *Offset =

or by this alternative patch:

--- a/llvm/lib/Transforms/Utils/RelLookupTableConverter.cpp
+++ b/llvm/lib/Transforms/Utils/RelLookupTableConverter.cpp
@@ -137,13 +137,15 @@ static void convertToRelLookupTable(GlobalVariable &LookupTable) {
   // Generate an array that consists of relative offsets.
   GlobalVariable *RelLookupTable = createRelLookupTable(Func, LookupTable);
 
-  // Place new instruction sequence before GEP.
-  Builder.SetInsertPoint(GEP);
+  // Place the new index scaling where the GEP is.
+  Builder.SetInsertPoint(Load);
   Value *Index = GEP->getOperand(2);
   IntegerType *IntTy = cast<IntegerType>(Index->getType());
   Value *Offset =
       Builder.CreateShl(Index, ConstantInt::get(IntTy, 2), "reltable.shift");
 
+  // Place the remaining instruction sequence where the existing load is.
+  Builder.SetInsertPoint(Load);
   Function *LoadRelIntrinsic = llvm::Intrinsic::getDeclaration(
       &M, Intrinsic::load_relative, {Index->getType()});
   Value *Base = Builder.CreateBitCast(RelLookupTable, Builder.getInt8PtrTy());

Don't know which will give better code generation, may not even matter if later passes are happy to hoist/sink the shift as desired.
Comment 5 Jessica Clarke freebsd_committer freebsd_triage 2021-12-11 00:31:40 UTC
Second patch was meant to be:

--- a/llvm/lib/Transforms/Utils/RelLookupTableConverter.cpp
+++ b/llvm/lib/Transforms/Utils/RelLookupTableConverter.cpp
@@ -137,13 +137,15 @@ static void convertToRelLookupTable(GlobalVariable &LookupTable) {
   // Generate an array that consists of relative offsets.
   GlobalVariable *RelLookupTable = createRelLookupTable(Func, LookupTable);
 
-  // Place new instruction sequence before GEP.
+  // Place the new index scaling where the GEP is.
   Builder.SetInsertPoint(GEP);
   Value *Index = GEP->getOperand(2);
   IntegerType *IntTy = cast<IntegerType>(Index->getType());
   Value *Offset =
       Builder.CreateShl(Index, ConstantInt::get(IntTy, 2), "reltable.shift");
 
+  // Place the remaining instruction sequence where the existing load is.
+  Builder.SetInsertPoint(Load);
   Function *LoadRelIntrinsic = llvm::Intrinsic::getDeclaration(
       &M, Intrinsic::load_relative, {Index->getType()});
   Value *Base = Builder.CreateBitCast(RelLookupTable, Builder.getInt8PtrTy());

*sigh*
Comment 6 Dimitry Andric freebsd_committer freebsd_triage 2021-12-11 09:40:59 UTC
And what about upstream's https://reviews.llvm.org/D115571 ? :)
Comment 7 Dimitry Andric freebsd_committer freebsd_triage 2021-12-11 09:41:57 UTC
(In reply to Dimitry Andric from comment #6)
Ah well that is equivalent to your second patch, so I'll probably take that one.
Comment 8 Dimitry Andric freebsd_committer freebsd_triage 2021-12-11 19:55:38 UTC
I can at least confirm that the upstream patch fixes it. I'll wait a little while until upstream commits it, which will probably be on Monday. (Unless the author works on weekends :)
Comment 9 commit-hook freebsd_committer freebsd_triage 2021-12-15 21:28:43 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=5a925e4644665b9a7a5cdd664764fb0a4d1c5797

commit 5a925e4644665b9a7a5cdd664764fb0a4d1c5797
Author:     Dimitry Andric <dim@FreeBSD.org>
AuthorDate: 2021-12-15 19:56:12 +0000
Commit:     Dimitry Andric <dim@FreeBSD.org>
CommitDate: 2021-12-15 21:27:39 +0000

    Apply fix for clang incorrectly optimizing part of dns/bind916

    Merge commit e5a8af7a90c6 from llvm git (by Gulfem Savrun Yeniceri):

      [Passes] Fix relative lookup table converter pass

      This patch fixes the relative table converter pass for the lookup table
      accesses that are resulted in an instruction sequence, where gep is not
      immediately followed by a load, such as gep being hoisted outside the loop
      or another instruction is inserted in between them. The fix inserts the
      call to load.relative.instrinsic in the original place of load instead of gep.
      Issue is reported by FreeBSD via https://bugs.freebsd.org/259921.

      Differential Revision: https://reviews.llvm.org/D115571

    PR:             259921
    Reported by:    O. Hartmann <freebsd@walstatt-de.de>
    MFC after:      3 days

 .../llvm/lib/Transforms/Utils/RelLookupTableConverter.cpp             | 4 ++++
 1 file changed, 4 insertions(+)
Comment 10 Volodymyr Kostyrko 2021-12-17 08:37:35 UTC
That's the thing for STABLE-13 also. Can you please consider MFC?
Comment 11 Dimitry Andric freebsd_committer freebsd_triage 2021-12-17 08:52:22 UTC
(In reply to Volodymyr Kostyrko from comment #10)
Yes, it'll be MFC'd tomorrow, then this ticket can be closed.
Comment 12 commit-hook freebsd_committer freebsd_triage 2021-12-18 11:25:12 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=c272602429d5d7efcea95d97a2ceb48f9d9d2b3c

commit c272602429d5d7efcea95d97a2ceb48f9d9d2b3c
Author:     Dimitry Andric <dim@FreeBSD.org>
AuthorDate: 2021-12-15 19:56:12 +0000
Commit:     Dimitry Andric <dim@FreeBSD.org>
CommitDate: 2021-12-18 11:24:06 +0000

    Apply fix for clang incorrectly optimizing part of dns/bind916

    Merge commit e5a8af7a90c6 from llvm git (by Gulfem Savrun Yeniceri):

      [Passes] Fix relative lookup table converter pass

      This patch fixes the relative table converter pass for the lookup table
      accesses that are resulted in an instruction sequence, where gep is not
      immediately followed by a load, such as gep being hoisted outside the loop
      or another instruction is inserted in between them. The fix inserts the
      call to load.relative.instrinsic in the original place of load instead of gep.
      Issue is reported by FreeBSD via https://bugs.freebsd.org/259921.

      Differential Revision: https://reviews.llvm.org/D115571

    PR:             259921
    Reported by:    O. Hartmann <freebsd@walstatt-de.de>

    (cherry picked from commit 5a925e4644665b9a7a5cdd664764fb0a4d1c5797)

 .../llvm/lib/Transforms/Utils/RelLookupTableConverter.cpp             | 4 ++++
 1 file changed, 4 insertions(+)
Comment 13 commit-hook freebsd_committer freebsd_triage 2021-12-25 11:57:27 UTC
A commit in branch stable/12 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=7276936745949711ca7ee66589728eca167360ba

commit 7276936745949711ca7ee66589728eca167360ba
Author:     Dimitry Andric <dim@FreeBSD.org>
AuthorDate: 2021-12-15 19:56:12 +0000
Commit:     Dimitry Andric <dim@FreeBSD.org>
CommitDate: 2021-12-25 11:51:44 +0000

    Apply fix for clang incorrectly optimizing part of dns/bind916

    Merge commit e5a8af7a90c6 from llvm git (by Gulfem Savrun Yeniceri):

      [Passes] Fix relative lookup table converter pass

      This patch fixes the relative table converter pass for the lookup table
      accesses that are resulted in an instruction sequence, where gep is not
      immediately followed by a load, such as gep being hoisted outside the loop
      or another instruction is inserted in between them. The fix inserts the
      call to load.relative.instrinsic in the original place of load instead of gep.
      Issue is reported by FreeBSD via https://bugs.freebsd.org/259921.

      Differential Revision: https://reviews.llvm.org/D115571

    PR:             259921
    Reported by:    O. Hartmann <freebsd@walstatt-de.de>
    MFC after:      3 days

    (cherry picked from commit 5a925e4644665b9a7a5cdd664764fb0a4d1c5797)

 .../llvm/lib/Transforms/Utils/RelLookupTableConverter.cpp             | 4 ++++
 1 file changed, 4 insertions(+)
Comment 14 Norikatsu Shigemura freebsd_committer freebsd_triage 2023-06-14 08:08:08 UTC
*** Bug 234494 has been marked as a duplicate of this bug. ***