Bug 255398 - databases/mariadb105-server: deadlocks at start when wsrep is enabled
Status: New
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: Bernard Spil
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-04-25 18:24 UTC by Phillip R. Jaenke
Modified: 2023-04-10 12:12 UTC (History)
0 users

See Also:
bugzilla: maintainer-feedback? (brnrd)


Description Phillip R. Jaenke 2021-04-25 18:24:47 UTC
Reproduction using 10.5.9 built from ports:

DB1:
datadir=/var/db/mysql
sudo -u mysql /usr/local/bin/mysql_install_db --basedir=/usr/local --datadir=$datadir --skip-test-db
sudo -u mysql /usr/local/libexec/mariadbd --wsrep-new-cluster --wsrep-on --wsrep_cluster_address=gcomm://DB1 --datadir=$datadir

DB2:
datadir=/var/db/mysql
sudo -u mysql /usr/local/bin/mysql_install_db --basedir=/usr/local --datadir=$datadir --skip-test-db
sudo -u mysql /usr/local/libexec/mariadbd --wsrep-on --wsrep_cluster_address=gcomm://DB1 --datadir=$datadir

Initial membership will succeed (check the logs to confirm). Once that's confirmed, the cluster is ready to go. Now kill -TERM mariadbd on DB2 first, then on DB1. (Order matters, even though it's multi-master.)

BOTH hosts /usr/local/etc/mysql/conf.d/wsrep.cnf:
[mysqld]
bind-address=0.0.0.0
binlog_format=ROW
wsrep_on=1
wsrep_provider=/usr/local/lib/libgalera_smm.so
wsrep_cluster_name="demo"
wsrep_cluster_address="gcomm://DB1,DB2"

BOTH hosts rc.conf:
mysql_enable="YES"
mysql_dbdir="/var/db/mysql"

Then on DB1: /usr/local/etc/rc.d/mysql-server start
It'll go through the 15-second timeout waiting for the pidfile, and then exit 1, without actually killing the process. The server never writes either the pidfile or the socket. No errors are logged in either the mysql error log or the wsrep error logs. The process just hangs and does not exit on TERM.
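
For anyone trying to reproduce this outside the rc script, the wait-for-pidfile behavior described above can be approximated with a small sh helper. This is only a sketch: wait_for_pidfile is a hypothetical name, and this is not the code from the port's rc.d script.

```shell
#!/bin/sh
# Hypothetical helper: poll for a pidfile and give up after a timeout.
# This is NOT the port's rc.d script, only an illustration of the wait.
wait_for_pidfile() {
    pidfile=$1
    timeout=${2:-15}    # seconds; defaults to the 15s seen in the report
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # -s: the file exists and is non-empty
        if [ -s "$pidfile" ]; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}
```

After starting mariadbd in the background, something like `wait_for_pidfile /var/run/mysql/mysqld.pid 15 || echo "no pidfile after 15s"` would show the same symptom without going through rc.d. The pidfile path is taken from the manual command line in this report and may differ on other setups.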

This also reproduces if the server is started manually with "sudo -u mysql /usr/local/libexec/mariadbd --defaults-extra-file=/usr/local/etc/mysql/my.cnf --user=mysql --datadir=/var/db/mysql/data --pid-file=/var/run/mysql/mysqld.pid". Instead of starting, it just hangs and will not respond to TERM, only to KILL.

Port was built with options:
databases_mariadb105-server_SET+=GSSAPI_HEIMDAL LZ4 WSREP
databases_mariadb105-server_UNSET+=GSSAPI_BASE GSSAPI_MIT GSSAPI_NONE

What is perplexing is that this ONLY reproduces with wsrep being configured by files in /usr/local/etc/mysql/conf.d. If the server is started with "--wsrep-on --wsrep_cluster_address=gcomm://DB1,DB2" then it works as expected. So it is specifically something with reading the wsrep configuration from files. Even putting the wsrep configuration into my.cnf causes the exact same behavior.
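
One way to narrow this down might be to move the file-based options onto the command line one at a time. The command lines quoted in the description pass only --wsrep-on and --wsrep_cluster_address, while wsrep.cnf also sets wsrep_provider, wsrep_cluster_name, binlog_format and bind-address. The helper below is a rough sketch: cnf_to_args is a hypothetical name, and it assumes simple key=value lines like the ones in the wsrep.cnf shown above.

```shell
#!/bin/sh
# Hypothetical helper: print the key=value lines of the [mysqld] section
# of an option file as --key=value arguments, one per line.
cnf_to_args() {
    awk '
        # A section header switches the flag on only for [mysqld]
        /^\[/ { in_section = ($0 == "[mysqld]"); next }
        in_section && /^[A-Za-z0-9_-]+=/ {
            gsub(/"/, "")    # drop the double quotes used in the option file
            print "--" $0
        }
    ' "$1"
}
```

Running `cnf_to_args /usr/local/etc/mysql/conf.d/wsrep.cnf` with that file temporarily moved out of conf.d, and then adding the printed arguments to the working mariadbd command line one by one, could show which option triggers the hang. Whether any single option is responsible is an open question; this only helps bisect.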
Comment 1 Bernard Spil freebsd_committer freebsd_triage 2023-04-10 12:12:28 UTC
Does this behavior persist in 10.6 and 10.11?

Was this reported upstream?