Bug 237640 - dns/bind914 aborts with __c11_atomic_load
Status: Closed FIXED
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: amd64 Any
Importance: --- Affects Some People
Assignee: Mathieu Arnold
Reported: 2019-04-29 08:48 UTC by Dirk Meyer
Modified: 2019-06-03 12:47 UTC
dinoex: maintainer-feedback+


Description Dirk Meyer freebsd_committer 2019-04-29 08:48:21 UTC
FreeBSD 11.2 amd64
Running BIND in a dedicated jail without NAT.


I am trying to upgrade a running bind912 to bind914.

The new version crashes on start.

I see this on 3 of 5 instances,
some with base OpenSSL, others with openssl or openssl111 from ports.


logfile:

starting BIND 9.14.1 (Stable Release) <id:d4c1008>
built with '--localstatedir=/var' '--disable-linux-caps' '--with-libxml2=/usr/local' '--with-readline=-L/usr/local/lib -ledit' '--with-dlopen=yes' '--with-openssl=/usr/local' '--sysconfdir=/usr/local/etc/namedb' '--with-dlz-filesystem=yes' '--disable-dnstap' '--disable-fixed-rrset' '--without-gssapi' '--with-libidn2=/usr/local' '--with-libjson=/usr/local' '--disable-largefile' '--with-lmdb=/usr/local' '--disable-native-pkcs11' '--without-python' '--disable-querytrace' 'STD_CDEFINES=-DDIG_SIGCHASE=1' '--enable-tcp-fastopen' '--with-tuning=default' '--disable-symtable' '--prefix=/usr/local' '--mandir=/usr/local/man' '--infodir=/usr/local/share/info/' '--build=amd64-portbld-freebsd11.2' 'build_alias=amd64-portbld-freebsd11.2' 'CC=cc' 'CFLAGS=-O2 -pipe -DLIBICONV_PLUG -fstack-protector-strong -isystem /usr/local/include -fno-strict-aliasing ' 'LDFLAGS= -Wl,-rpath,/usr/local/lib -fstack-protector-strong ' 'LIBS=-L/usr/local/lib' 'CPPFLAGS=-DLIBICONV_PLUG -isystem /usr/local/includ
compiled by CLANG 4.2.1 Compatible FreeBSD Clang 6.0.0 (tags/RELEASE_600/final 326565)
compiled with OpenSSL version: OpenSSL 1.0.2r  26 Feb 2019
linked to OpenSSL version: OpenSSL 1.0.2r  26 Feb 2019
compiled with libxml2 version: 2.9.8
linked to libxml2 version: 20908
compiled with libjson-c version: 0.13.1
linked to libjson-c version: 0.13.1
compiled with zlib version: 1.2.11
linked to zlib version: 1.2.11
[...]
found 2 CPUs, using 2 worker threads
using 2 UDP listeners per interface
using up to 4096 sockets
[...]
all zones loaded
running
[...]
zone ******.***.org/IN: sending notifies (serial 20********)
zone ******.***.org/IN: sending notifies (serial 20********)
zone ******.***.org/IN: sending notifies (serial 20********)
[...]socket.c:2601: REQUIRE((uint_fast32_t)__c11_atomic_load(&sock->references, memory_order_acquire) == 1) failed, back trace
#0 0x439200 in ??
#1 0x6107da in ??
#2 0x63699d in ??
#3 0x4c2c27 in ??
#4 0x4bf758 in ??

All 15 notifies had been logged.

With bind912 I get this line after the last notify:
    resolver priming query complete
Comment 1 Mathieu Arnold freebsd_committer 2019-05-15 10:34:03 UTC
I have had this problem since the first 9.14.0 release candidate. I have a bug open with the ISC about this, but we are all struggling to reproduce it anywhere other than my one instance where it crashes.

Could you provide the configuration file and the workload for that server?
Comment 2 Mathieu Arnold freebsd_committer 2019-05-16 08:36:28 UTC
Also, is the host running on bare metal or is it virtualized?
And what version of FreeBSD are you running?
Comment 3 wcarson.bugzilla 2019-05-17 00:02:25 UTC
I'm having the same issue on a FreeBSD AWS EC2 instance running 12.0-p3. It is a slave/secondary server hosting ~20 domains under very light load. I'm a little hesitant at this time to provide the config file, as I'd have to do a lot of sanitizing first. Here is the log, though:


May 16 18:53:12 aws pkg[91493]: bind914 reinstalled: 9.14.1 -> 9.14.1
May 16 18:53:16 aws named[91533]: starting BIND 9.14.1 (Stable Release) <id:d4c1008>
May 16 18:53:16 aws named[91533]: running on FreeBSD amd64 12.0-RELEASE-p3 FreeBSD 12.0-RELEASE-p3 GENERIC
May 16 18:53:16 aws named[91533]: built with '--localstatedir=/var' '--disable-linux-caps' '--with-libxml2=/usr/local' '--with-readline=-L/usr/local/lib -ledit' '--with-dlopen=yes' '--with-openssl=/usr/local' '--sysconfdir=/usr/local/etc/namedb' '--with-dlz-filesystem=yes' '--disable-dnstap' '--disable-fixed-rrset' '--without-gssapi' '--with-libidn2=/usr/local' '--with-libjson=/usr/local' '--disable-largefile' '--with-lmdb=/usr/local' '--disable-native-pkcs11' '--without-python' '--disable-querytrace' 'STD_CDEFINES=-DDIG_SIGCHASE=1' '--enable-tcp-fastopen' '--with-tuning=default' '--disable-symtable' '--prefix=/usr/local' '--mandir=/usr/local/man' '--infodir=/usr/local/share/info/' '--build=amd64-portbld-freebsd12.0' 'build_alias=amd64-portbld-freebsd12.0' 'CC=cc' 'CFLAGS=-O2 -pipe -DLIBICONV_PLUG -fstack-protector-strong -isystem /usr/local/include -fno-strict-aliasing ' 'LDFLAGS= -Wl,-rpath,/usr/local/lib -fstack-protector-strong ' 'LIBS=-L/usr/local/lib' 'CPPFLAGS=-DLIBICONV_PLUG -isystem /usr/local/include' 'CPP=cpp'
May 16 18:53:16 aws named[91533]: running as: named -u bind -c /usr/local/etc/namedb/named.conf
May 16 18:53:16 aws named[91533]: compiled by CLANG 4.2.1 Compatible FreeBSD Clang 6.0.1 (tags/RELEASE_601/final 335540)
May 16 18:53:16 aws named[91533]: compiled with OpenSSL version: LibreSSL 2.9.1
May 16 18:53:16 aws named[91533]: linked to OpenSSL version: LibreSSL 2.9.1
May 16 18:53:16 aws named[91533]: compiled with libxml2 version: 2.9.8
May 16 18:53:16 aws named[91533]: linked to libxml2 version: 20908
May 16 18:53:16 aws named[91533]: compiled with libjson-c version: 0.13.1
May 16 18:53:16 aws named[91533]: linked to libjson-c version: 0.13.1
May 16 18:53:16 aws named[91533]: compiled with zlib version: 1.2.11
May 16 18:53:16 aws named[91533]: linked to zlib version: 1.2.11
May 16 18:53:16 aws named[91533]: ----------------------------------------------------
May 16 18:53:16 aws named[91533]: BIND 9 is maintained by Internet Systems Consortium,
May 16 18:53:16 aws named[91533]: Inc. (ISC), a non-profit 501(c)(3) public-benefit
May 16 18:53:16 aws named[91533]: corporation.  Support and training for BIND 9 are
May 16 18:53:16 aws named[91533]: available at https://www.isc.org/support
May 16 18:53:16 aws named[91533]: ----------------------------------------------------
May 16 18:53:16 aws named[91533]: /usr/local/etc/namedb/named.conf:39: dnssec-lookaside 'auto' is no longer supported
May 16 18:53:16 aws named[91533]: command channel listening on 127.0.0.1#953
May 16 18:53:16 aws named[91533]: command channel listening on ::1#953
May 16 18:53:16 aws named[91533]: all zones loaded
May 16 18:53:16 aws named[91533]: running
May 16 18:53:16 aws named[91533]: socket.c:2601: REQUIRE((uint_fast32_t)__c11_atomic_load(&sock->references, memory_order_acquire) == 1) failed, back trace
May 16 18:53:16 aws kernel: pid 91533 (named), uid 53: exited on signal 6
May 16 18:53:16 aws named[91533]: #0 0x2ba860 in ??
May 16 18:53:16 aws named[91533]: #1 0x49229a in ??
May 16 18:53:16 aws named[91533]: #2 0x4b827d in ??
May 16 18:53:16 aws named[91533]: #3 0x344627 in ??
May 16 18:53:16 aws named[91533]: #4 0x341158 in ??
May 16 18:53:16 aws named[91533]: #5 0x3ec7bd in ??
May 16 18:53:16 aws named[91533]: #6 0x3eaf57 in ??
May 16 18:53:16 aws named[91533]: #7 0x3e9136 in ??
May 16 18:53:16 aws named[91533]: #8 0x3ec0b1 in ??
May 16 18:53:16 aws named[91533]: #9 0x4aeb2d in ??
May 16 18:53:16 aws named[91533]: #10 0x800957776 in ??
May 16 18:53:16 aws named[91533]: exiting (due to assertion failure)

And I built it using these options:

# This file is auto-generated by 'make config'.
# Options for bind914-9.14.1
_OPTIONS_READ=bind914-9.14.1
_FILE_COMPLETE_OPTIONS_LIST=DNSTAP DOCS FIXED_RRSET IDN JSON LARGE_FILE LMDB MINCACHE PORTREVISION QUERYTRACE SIGCHASE START_LATE TCP_FASTOPEN TUNING_LARGE GSSAPI_BASE GSSAPI_HEIMDAL GSSAPI_MIT GSSAPI_NONE NATIVE_PKCS11 DLZ_POSTGRESQL DLZ_MYSQL DLZ_BDB DLZ_LDAP DLZ_FILESYSTEM DLZ_STUB
OPTIONS_FILE_UNSET+=DNSTAP
OPTIONS_FILE_SET+=DOCS
OPTIONS_FILE_UNSET+=FIXED_RRSET
OPTIONS_FILE_SET+=IDN
OPTIONS_FILE_SET+=JSON
OPTIONS_FILE_UNSET+=LARGE_FILE
OPTIONS_FILE_SET+=LMDB
OPTIONS_FILE_UNSET+=MINCACHE
OPTIONS_FILE_UNSET+=PORTREVISION
OPTIONS_FILE_UNSET+=QUERYTRACE
OPTIONS_FILE_SET+=SIGCHASE
OPTIONS_FILE_UNSET+=START_LATE
OPTIONS_FILE_SET+=TCP_FASTOPEN
OPTIONS_FILE_UNSET+=TUNING_LARGE
OPTIONS_FILE_UNSET+=GSSAPI_BASE
OPTIONS_FILE_UNSET+=GSSAPI_HEIMDAL
OPTIONS_FILE_UNSET+=GSSAPI_MIT
OPTIONS_FILE_SET+=GSSAPI_NONE
OPTIONS_FILE_UNSET+=NATIVE_PKCS11
OPTIONS_FILE_UNSET+=DLZ_POSTGRESQL
OPTIONS_FILE_UNSET+=DLZ_MYSQL
OPTIONS_FILE_UNSET+=DLZ_BDB
OPTIONS_FILE_UNSET+=DLZ_LDAP
OPTIONS_FILE_SET+=DLZ_FILESYSTEM
OPTIONS_FILE_UNSET+=DLZ_STUB

And this make.conf:

DEFAULT_VERSIONS+=ssl=libressl
Comment 4 wcarson.bugzilla 2019-05-17 00:03:07 UTC
Oh, in case it's relevant, I also upgraded from bind912 -> bind914.
Comment 5 Mathieu Arnold freebsd_committer 2019-05-17 10:20:41 UTC
As for the config file, you could open a bug report on the ISC's bug tracker https://gitlab.isc.org/isc-projects/bind9/issues/ and ask for it to be made confidential. Or you could join the https://gitlab.isc.org/isc-projects/bind9/issues/943 issue, which is also confidential. (But I am unsure how to add you there.)
Comment 6 wcarson.bugzilla 2019-05-19 15:30:01 UTC
If I keep trying to start the daemon, it eventually starts without crashing. I registered at ISC's GitLab as wcarson, but I'm unable to see the existing issue.
Comment 7 Dirk Meyer freebsd_committer 2019-05-22 18:56:32 UTC
(In reply to Mathieu Arnold from comment #2)

I had crashes on both bare metal and virtualized guests (Xen, hypervisor version 4.7).

Affected: 11.2-RELEASE-p3 amd64
Affected: 11.2-RELEASE-p9 amd64

It crashed every time the daemon was started,
so I was forced to downgrade to 9.12.

Then I migrated from openssl-1.0.2r to openssl111-1.1.1b_1.
After upgrading bind912-9.12.4P1 to bind914-9.14.2,
the daemon started again on all my instances.
Comment 8 Dirk Meyer freebsd_committer 2019-05-22 19:01:23 UTC
(In reply to Dirk Meyer from comment #7)

Sorry, one bare-metal host still crashes with bind914-9.14.2 and openssl111-1.1.1b_1.
Comment 9 Dirk Meyer freebsd_committer 2019-05-22 19:08:03 UTC
(In reply to Dirk Meyer from comment #8)

named[75816]: starting BIND 9.14.2 (Stable Release) <id:7a62b30>
named[75816]: running on FreeBSD amd64 11.2-RELEASE-p3 FreeBSD 11.2-RELEASE-p3 #2 r338678: Fri Sep 14 18:53:17 CEST 2018     root@XXX:/usr/obj/usr/src/sys/GENERIC
named[75816]: built with '--localstatedir=/var' '--disable-linux-caps' '--with-libxml2=/usr/local' '--with-readline=-L/usr/local/lib -ledit' '--with-dlopen=yes' '--with-openssl=/usr/local' '--sysconfdir=/usr/local/etc/namedb' '--with-dlz-filesystem=yes' '--disable-dnstap' '--disable-fixed-rrset' '--without-gssapi' '--with-libidn2=/usr/local' '--with-libjson=/usr/local' '--disable-largefile' '--with-lmdb=/usr/local' '--disable-native-pkcs11' '--without-python' '--disable-querytrace' 'STD_CDEFINES=-DDIG_SIGCHASE=1' '--enable-tcp-fastopen' '--with-tuning=default' '--disable-symtable' '--prefix=/usr/local' '--mandir=/usr/local/man' '--infodir=/usr/local/share/info/' '--build=amd64-portbld-freebsd11.2' 'build_alias=amd64-portbld-freebsd11.2' 'CC=cc' 'CFLAGS=-O2 -pipe -DLIBICONV_PLUG -fstack-protector-strong -isystem /usr/local/include -fno-strict-aliasing ' 'LDFLAGS= -Wl,-rpath,/usr/local/lib -fstack-protector-strong ' 'LIBS=-L/usr/local/lib' 'CPPFLAGS=-DLIBICONV_PLUG -isystem /usr/local/includ
named[75816]: running as: named -u bind -c /usr/local/etc/namedb/named.conf
named[75816]: compiled by CLANG 4.2.1 Compatible FreeBSD Clang 6.0.0 (tags/RELEASE_600/final 326565)
named[75816]: compiled with OpenSSL version: OpenSSL 1.1.1b  26 Feb 2019
named[75816]: linked to OpenSSL version: OpenSSL 1.1.1b  26 Feb 2019
named[75816]: compiled with libxml2 version: 2.9.8
named[75816]: linked to libxml2 version: 20908
named[75816]: compiled with libjson-c version: 0.13.1
named[75816]: linked to libjson-c version: 0.13.1
named[75816]: compiled with zlib version: 1.2.11
named[75816]: linked to zlib version: 1.2.11
named[75816]: ----------------------------------------------------
named[75816]: BIND 9 is maintained by Internet Systems Consortium,
named[75816]: Inc. (ISC), a non-profit 501(c)(3) public-benefit 
named[75816]: corporation.  Support and training for BIND 9 are 
named[75816]: available at https://www.isc.org/support
named[75816]: ----------------------------------------------------
named[75816]: found 2 CPUs, using 2 worker threads
named[75816]: using 2 UDP listeners per interface
named[75816]: using up to 4096 sockets
named[75816]: loading configuration from '/usr/local/etc/namedb/named.conf'
named[75816]: reading built-in trust anchors from file '/usr/local/etc/namedb/bind.keys'
named[75816]: using default UDP/IPv4 port range: [49152, 65535]
named[75816]: using default UDP/IPv6 port range: [49152, 65535]
named[75816]: listening on IPv6 interfaces, port 53
named[75816]: listening on IPv4 interface em0, 217.29.33.74#53
named[75816]: generating session key for dynamic DNS
named[75816]: sizing zone task pool based on 21 zones
named[75816]: none:100: 'max-cache-size 90%' - setting to 3654MB (out of 4060MB)
named[75816]: obtaining root key for view _default from '/usr/local/etc/namedb/bind.keys'
named[75816]: set up managed keys zone for view _default, file 'managed-keys.bind'
named[75816]: automatic empty zone: 10.IN-ADDR.ARPA
[...]
named[75816]: automatic empty zone: HOME.ARPA
named[75816]: none:100: 'max-cache-size 90%' - setting to 3654MB (out of 4060MB)
named[75816]: command channel listening on X.X.X.X#953
named[75816]: managed-keys-zone: loaded serial 2
named[75816]: zone XXXXX/IN: loaded serial 2019010301
[....]
named[75816]: all zones loaded
named[75816]: running
named[75816]: zone XXXX/IN: sending notifies (serial 2019010301)
[....]
named[75816]: socket.c:2601: REQUIRE((uint_fast32_t)__c11_atomic_load(&sock->references, memory_order_acquire) == 1) failed, back trace
named[75816]: #0 0x439760 in ??
named[75816]: #1 0x612eaa in ??
named[75816]: #2 0x638cdd in ??
named[75816]: #3 0x4c3577 in ??
named[75816]: #4 0x4c00a8 in ??
named[75816]: #5 0x56c3fd in ??
named[75816]: #6 0x56ab97 in ??
named[75816]: #7 0x568d56 in ??
named[75816]: #8 0x62f5ed in ??
named[75816]: #9 0x80212bc06 in ??
named[75816]: #10 0x0 in ??
named[75816]: exiting (due to assertion failure)
Comment 10 commit-hook freebsd_committer 2019-06-03 12:44:16 UTC
A commit references this bug:

Author: mat
Date: Mon Jun  3 12:43:15 UTC 2019
New revision: 503379
URL: https://svnweb.freebsd.org/changeset/ports/503379

Log:
  Fix a possible race between udp dispatch and socket code.

  PR:		237640
  Obtained from:	https://gitlab.isc.org/isc-projects/bind9/merge_requests/1992
  MFH:		2019Q2

Changes:
  head/dns/bind914/Makefile
  head/dns/bind914/files/patch-lib_isc_unix_socket.c
Comment 11 commit-hook freebsd_committer 2019-06-03 12:47:22 UTC
A commit references this bug:

Author: mat
Date: Mon Jun  3 12:46:25 UTC 2019
New revision: 503383
URL: https://svnweb.freebsd.org/changeset/ports/503383

Log:
  MFH: r503379

  Fix a possible race between udp dispatch and socket code.

  PR:		237640
  Obtained from:	https://gitlab.isc.org/isc-projects/bind9/merge_requests/1992

Changes:
_U  branches/2019Q2/
  branches/2019Q2/dns/bind914/Makefile
  branches/2019Q2/dns/bind914/files/patch-lib_isc_unix_socket.c