Bug 196399 - databases/mariadb100-server: MariaDB daemon segfaults when built with clang 3.4 on 10.1-i386
Status: Closed Overcome By Events
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: i386 Any
Importance: --- Affects Some People
Assignee: Bernard Spil
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-12-31 11:57 UTC by Bernard Spil
Modified: 2015-08-15 11:27 UTC (History)

See Also:


Attachments
Patch multiple CMake files to fix clang detection (15.29 KB, patch)
2015-02-16 19:27 UTC, Dimitry Andric
Disable C11 atomics for i386, and use builtins instead (2.38 KB, patch)
2015-02-16 19:30 UTC, Dimitry Andric
Stack trace from mysqld coredump (11.75 KB, text/plain)
2015-05-14 21:22 UTC, Denis Kasak
Stack trace from mysql_install_db coredump (2.55 KB, text/plain)
2015-05-14 21:23 UTC, Denis Kasak

Description Bernard Spil freebsd_committer freebsd_triage 2014-12-31 11:57:49 UTC
When MariaDB (10.0.14/10.0.15 confirmed by committer, 5.5 reported) is built on FreeBSD 10.1 with base clang 3.4 the daemon will segfault when a client connects.

This does NOT occur on amd64
This does NOT occur when building with clang 3.3 from ports
This DOES occur when building with clang 3.4 from base
This DOES occur when building with clang 3.5 from ports
This does NOT occur on FreeBSD 10.0 with clang 3.3 from base
This does NOT occur when built in Poudriere (10.0 jail)

Coredump, backtrace, binary available.

From: spil.oss@gmail.com
To: maria-developers@lists.launchpad.net
subject: mysqld 10.0.15 segfaults on FreeBSD i386 clang 3.4

>Description:
        When MariaDB is built with clang 3.4 on FreeBSD i386 (which is
        the default compiler) the server will segfault the moment a
        client connects to it. Same behaviour is observed with clang
        3.5. Built with clang 3.3 on 10.1 runs without segfaults.
        Built on FreeBSD 10.0 (which comes with clang 3.3) runs OK.
>How-To-Repeat:
        Use FreeBSD 10.1 i386
        Use port to build MariaDB 10.0 or 5.5
        Connect to server using client

>Fix:
        Build with clang 3.3 or build using Poudriere (uses 10.0 jail)

>Submitter-Id:  <submitter ID>
>Originator:    Bernard Spil
>Organization:
 FreeBSD MariaDB 10.0 port committer
>MySQL support: none
>Synopsis:      MariaDB segfaults on i386 FreeBSD
>Severity:      non-critical
>Priority:      low
>Category:      mysql
>Class:         sw-bug
>Release:       mysql-10.0.15 (FreeBSD Ports)

>C compiler:    clang 3.4
>C++ compiler:  clang 3.4
>Environment:
        FreeBSD 10.1 GENERIC i386 Celeron U4100 Dual Core 4GB
System: FreeBSD i386bsd 10.1-RELEASE FreeBSD 10.1-RELEASE #0 r274401: Tue Nov 11 22:51:51 UTC 2014     root@releng1.nyi.freebsd.org:/usr/obj/usr/src/sys/GENERIC  i386


Some paths:  /usr/bin/perl /usr/bin/make /usr/local/bin/gmake /usr/bin/cc

Compilation info (call): CC='/usr/bin/cc'  CFLAGS='-O2 -pipe  -fstack-protector -fno-strict-aliasing -O2 -g -DNDEBUG -DDBUG_OFF'  CXX='/usr/bin/c++'  CXXFLAGS='-O2 -pipe -fstack-protector -fno-strict-aliasing -O2 -g -DNDEBUG -DDBUG_OFF'  LDFLAGS=''  ASFLAGS=''
Compilation info (used): CC='/usr/bin/cc'  CFLAGS='-O2 -pipe  -fstack-protector -fno-strict-aliasing -O2 -g -DNDEBUG -DDBUG_OFF'  CXX='/usr/bin/c++'  CXXFLAGS='-O2 -pipe -fstack-protector -fno-strict-aliasing -O2 -g -DNDEBUG -DDBUG_OFF'  LDFLAGS=''  ASFLAGS=''
LIBC:
-r--r--r--  1 root  wheel  1427444 Nov 11 23:52 /lib/libc.so.7
-r--r--r--  1 root  wheel  2833712 Nov 11 23:52 /usr/lib/libc.a
-r--r--r--  1 root  wheel  166 Nov 11 23:52 /usr/lib/libc.so

Perl: This is perl 5, version 18, subversion 4 (v5.18.4) built for i386-freebsd-thread-multi-64int
Comment 1 Bernard Spil freebsd_committer freebsd_triage 2014-12-31 12:06:58 UTC
Created an Issue on MariaDB's JIRA
https://mariadb.atlassian.net/browse/MDEV-7398
Comment 2 Mark Linimon freebsd_committer freebsd_triage 2014-12-31 19:59:41 UTC
I'm going to modify the Summary to give it a single portname to hang its hat on.  However, I'm going to notify the maintainers of mariadb-server, mariadb55-server, and mariadb100-server as well.
Comment 3 Steven Hartland freebsd_committer freebsd_triage 2015-02-05 23:33:12 UTC
May or may not be related, but I'm struggling with a segv on connect with murmur when using clang 3.4.1 from 10.1-RELEASE on amd64, and switching to clang 3.3 from ports fixes that issue as well.
Comment 4 Steven Hartland freebsd_committer freebsd_triage 2015-02-07 14:01:29 UTC
Separate PR raised in case it's a different issue:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=197389
Comment 5 Dimitry Andric freebsd_committer freebsd_triage 2015-02-15 18:53:40 UTC
This is most likely not related to bug 197389.  It also dies when built on a freshly built current, using clang 3.6.0rc3:

$ sudo /usr/local/etc/rc.d/mysql-server onestart
Installing MariaDB/MySQL system tables in '/var/db/mysql' ...
150215 19:50:44 [Note] InnoDB: Using mutexes to ref count buffer pool pages
150215 19:50:44 [Note] InnoDB: The InnoDB memory heap is disabled
150215 19:50:44 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
150215 19:50:44 [Note] InnoDB: Memory barrier is not used
150215 19:50:44 [Note] InnoDB: Compressed tables use zlib 1.2.8
150215 19:50:44 [Note] InnoDB: Not using CPU crc32 instructions
150215 19:50:44 [Note] InnoDB: Initializing buffer pool, size = 128.0M
150215 19:50:44 [Note] InnoDB: Completed initialization of buffer pool
150215 19:50:44 [Note] InnoDB: The first specified data file ./ibdata1 did not exist: a new database to be created!
150215 19:50:44 [Note] InnoDB: Setting file ./ibdata1 size to 12 MB
150215 19:50:44 [Note] InnoDB: Database physically writes the file full: wait...
150215 19:50:44 [Note] InnoDB: Setting log file ./ib_logfile101 size to 48 MB
150215 19:50:44 [Note] InnoDB: Setting log file ./ib_logfile1 size to 48 MB
150215 19:50:45 [Note] InnoDB: Renaming log file ./ib_logfile101 to ./ib_logfile0
150215 19:50:45 [Warning] InnoDB: New log files created, LSN=45781
150215 19:50:45 [Note] InnoDB: Doublewrite buffer not found: creating new
150215 19:50:45 [Note] InnoDB: Doublewrite buffer created
150215 19:50:45 [Note] InnoDB: 128 rollback segment(s) are active.
150215 19:50:45 [Warning] InnoDB: Creating foreign key constraint system tables.
150215 19:50:45 [Note] InnoDB: Foreign key constraint system tables created
150215 19:50:45 [Note] InnoDB: Creating tablespace and datafile system tables.
150215 19:50:45 [Note] InnoDB: Tablespace and datafile system tables created.
150215 19:50:45 [Note] InnoDB: Waiting for purge to start
150215 19:50:45 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.22-71.0 started; log sequence number 0
150215 19:50:45 [ERROR] mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.

To report this bug, see http://kb.askmonty.org/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

Server version: 10.0.16-MariaDB
key_buffer_size=134217728
read_buffer_size=131072
max_used_connections=0
max_threads=153
thread_count=1
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 466005 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0x3b7c1008
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0xba69ef68 thread_stack 0x48000

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x0): is an invalid pointer
Connection ID (thread ID): 1
Status: NOT_KILLED

Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on

The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.

Installation of system tables failed!  Examine the logs in
/var/db/mysql for more information.

The problem could be conflicting information in an external
my.cnf files. You can ignore these by doing:

    shell> /usr/local/scripts/scripts/mysql_install_db --defaults-file=~/.my.cnf

You can also try to start the mysqld daemon with:

    shell> /usr/local/libexec/mysqld --skip-grant --general-log &

and use the command line tool /usr/local/bin/mysql
to connect to the mysql database and look at the grant tables:

    shell> /usr/local/bin/mysql -u root mysql
    mysql> show tables;

Try 'mysqld --help' if you have problems with paths.  Using
--general-log gives you a log in /var/db/mysql that may be helpful.

The latest information about mysql_install_db is available at
https://mariadb.com/kb/en/installing-system-tables-mysql_install_db
MariaDB is hosted on launchpad; You can find the latest source and
email lists at http://launchpad.net/maria

Please check all of the above before submitting a bug report
at http://mariadb.org/jira

/usr/local/etc/rc.d/mysql-server: WARNING: failed precmd routine for mysql

Can somebody tell me how to run *just* the mysqld executable under gdb, so I can get a reliable backtrace, instead of relying on the built-in one that doesn't seem to work?
Comment 6 Dimitry Andric freebsd_committer freebsd_triage 2015-02-16 19:27:49 UTC
Created attachment 153052
Patch multiple CMake files to fix clang detection

It turns out this is due to the way MariaDB handles atomic operations.  There are multiple problems with its methods:
1) It does not detect clang properly in the CMake files
2) It mis-detects the existence of atomic operations on i386, and thus:
3) It inserts non-threadsafe hand-rolled versions that "simulate" atomic operations, but these aren't really atomic. These versions seem to be buggy.

For the problems where it doesn't detect clang in CMake files, I'm attaching one patch, which should be reasonably correct.

The other problem is that it's really only possible to do atomic 64 bit operations on i586 and higher, but we still default to i486 on the i386 architecture.
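As an illustration of the i486 versus i586 point above, here is a minimal C sketch of how a build could probe whether the compiler target has native 64-bit atomics. It is not taken from the attached patches or from MariaDB's CMake files; the helper names native_cas64_available() and lockfree_64() are hypothetical. It assumes the compiler defines __GCC_HAVE_SYNC_COMPARE_AND_SWAP_8 only when an 8-byte compare-and-swap is available for the target CPU, as GCC documents, and that clang behaves compatibly.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 when the compiler advertises a native
 * 8-byte compare-and-swap for the selected target CPU, 0 otherwise.
 * On i386 with an i486 target this macro is expected to be undefined. */
static inline int native_cas64_available(void)
{
#if defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_8)
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: asks the compiler whether 8-byte atomic objects
 * are always lock-free on this target. When the answer is 0, C11-style
 * 64-bit atomics may be lowered to library calls instead of inline code. */
static inline int lockfree_64(void)
{
    return __atomic_always_lock_free(sizeof(int64_t), 0) ? 1 : 0;
}
```

Calling both helpers from a small test program built with the same CPUTYPE as the port should show whether inline 64-bit atomics can be expected on that target.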

When MariaDB attempts to define atomic operations via include/atomic/gcc_builtins.h, it will by default use the C11 variants, which causes both clang and gcc to insert calls to the (non-existing) library functions __atomic_load_8() and __atomic_store_8().

These functions are supposed to implement atomic 64 bit load and store, but they do not exist in any of our libraries on i386 currently, and as I understand it they are very difficult (or impossible) to implement without kernel support.

If we disable the use of C11 atomic operations, by changing the test in include/atomic/gcc_builtins.h as in the second patch, it will try to use the builtins __sync_fetch_and_or() and __sync_lock_test_and_set() instead.  If you then use clang, it will emit "lock cmpxchg8b" instructions instead, which are only compatible with i586 or higher CPUs, and even then, some CPUs might not support them.  (This is really a bug in clang, but I'm unsure if upstream will care about CPUs older than the Pentium.)
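To make the difference between the two builtin families concrete, here is a minimal C sketch of a 64-bit load and store written both ways. It is an illustration only, not the contents of include/atomic/gcc_builtins.h or of the second patch; the function names (load64_c11, store64_c11, load64_sync, store64_sync) are hypothetical. The pairing of __sync_fetch_and_or() with loads and __sync_lock_test_and_set() with stores is an assumption based on the description in this report.

```c
#include <stdint.h>

/* C11-style builtins. On a target without inline 8-byte atomics the
 * compiler may lower these to calls to __atomic_load_8() and
 * __atomic_store_8(), which must then be provided by a library. */
static inline int64_t load64_c11(volatile int64_t *p)
{
    return __atomic_load_n(p, __ATOMIC_SEQ_CST);
}

static inline void store64_c11(volatile int64_t *p, int64_t v)
{
    __atomic_store_n(p, v, __ATOMIC_SEQ_CST);
}

/* Legacy __sync builtins. OR-ing with 0 leaves the value unchanged and
 * returns what was stored, so it acts as an atomic read. */
static inline int64_t load64_sync(volatile int64_t *p)
{
    return __sync_fetch_and_or(p, 0);
}

/* An atomic exchange whose previous value is discarded acts as a store. */
static inline void store64_sync(volatile int64_t *p, int64_t v)
{
    (void) __sync_lock_test_and_set(p, v);
}
```

On amd64 both variants compile to inline instructions. Which instructions or library calls are produced on i386 depends on the compiler and the selected CPU type, which is the behaviour described above.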

With both the first and second patch applied, at least the server starts for me, and survives light testing.  I have not run it through any thorough testing at all, though.
Comment 7 Dimitry Andric freebsd_committer freebsd_triage 2015-02-16 19:30:24 UTC
Created attachment 153054
Disable C11 atomics for i386, and use builtins instead

This is the second patch, which disables using __atomic_load_n() and __atomic_store_n(), and chooses __sync_fetch_and_or() and __sync_lock_test_and_set() instead.

Works for clang, but not for gcc, unless CPUTYPE is i586 or higher.
Comment 8 Denis Kasak 2015-05-14 21:21:04 UTC
What is the status on this? Is there something preventing the patches from being merged?

I just ran into what seems to be this bug, but I can't be sure since there are no stack traces supplied here. The symptoms are the same; i.e. the server crashes when connected to. This also prevents mysql_install_db from working properly. Attaching the stack traces produced from running both mysqld and mysql_install_db in case they are of any use.
Comment 9 Denis Kasak 2015-05-14 21:22:37 UTC
Created attachment 156786
Stack trace from mysqld coredump
Comment 10 Denis Kasak 2015-05-14 21:23:15 UTC
Created attachment 156787
Stack trace from mysql_install_db coredump
Comment 11 Mark Linimon freebsd_committer freebsd_triage 2015-08-13 13:43:03 UTC
Over to new maintainer.
Comment 12 Bernard Spil freebsd_committer freebsd_triage 2015-08-15 11:27:13 UTC
Just tested mariadb100-server 10.0.21 on 10.2-RELEASE and this fixed the issue with MariaDB 10.0 on FreeBSD 10 i386.