Summary: | databases/galera: MariaDB crashes when using clustering | |
---|---|---|---
Product: | Ports & Packages | Reporter: | ganbold-freebsd
Component: | Individual Port(s) | Assignee: | Walter Schwarzenfeld <w.schwarzenfeld>
Status: | Closed Overcome By Events | |
Severity: | Affects Some People | CC: | ari, brnrd, dpetrov67, horia, lapo, me, w.schwarzenfeld
Priority: | Normal | |
Version: | Latest | |
Hardware: | amd64 | |
OS: | Any | |
Description
ganbold-freebsd
2016-03-18 00:44:32 UTC
Hi,
Can you share the options you've enabled when building? For one, the galera port builds with GCC, not with CLANG; that would lead to errors with boost, see #207094 comment 4.

Thanks for reporting! I've tried updating the port to the latest version, as per the MariaDB documented version 25.3.15.
Can you try the updated port from my GitHub? https://github.com/Sp1l/ports/tree/master/databases/galera
That tarball from galeracluster.com doesn't bundle the documentation, so I had to remove all of that. Not sure about the force config... I don't have a cluster here to test, so I would appreciate it if you could let me know whether this works!
Horia: Haven't cleaned up... Probably a lot of now defunct REINPLACE_CMD in the post_patch target...

(In reply to Bernard Spil from comment #1)
Here are the options:
# more /var/db/ports/databases_mariadb101-server/options
# This file is auto-generated by 'make config'.
# Options for mariadb101-server-10.1.11
_OPTIONS_READ=mariadb101-server-10.1.11
_FILE_COMPLETE_OPTIONS_LIST=FASTMTX MAXKEY GSSAPI_BASE GSSAPI_HEIMDAL GSSAPI_MIT INNOBASE MROONGA OQGRAPH SPHINX SPIDER TOKUDB
OPTIONS_FILE_UNSET+=FASTMTX
OPTIONS_FILE_SET+=MAXKEY
OPTIONS_FILE_SET+=GSSAPI_BASE
OPTIONS_FILE_UNSET+=GSSAPI_HEIMDAL
OPTIONS_FILE_UNSET+=GSSAPI_MIT
OPTIONS_FILE_UNSET+=INNOBASE
OPTIONS_FILE_UNSET+=MROONGA
OPTIONS_FILE_UNSET+=OQGRAPH
OPTIONS_FILE_SET+=SPHINX
OPTIONS_FILE_SET+=SPIDER
OPTIONS_FILE_UNSET+=TOKUDB

(In reply to ganbold-freebsd from comment #3)
more /var/db/ports/databases_galera/options
# This file is auto-generated by 'make config'.
# Options for galera-25.3.15
_OPTIONS_READ=galera-25.3.15
_FILE_COMPLETE_OPTIONS_LIST=BOOSTPOOL BPOSTATIC DEBUG DOCS TEST EPUB JSON LATEX PICKLE
OPTIONS_FILE_UNSET+=BOOSTPOOL
OPTIONS_FILE_UNSET+=BPOSTATIC
OPTIONS_FILE_SET+=DEBUG
OPTIONS_FILE_UNSET+=DOCS
OPTIONS_FILE_UNSET+=TEST
OPTIONS_FILE_UNSET+=EPUB
OPTIONS_FILE_UNSET+=JSON
OPTIONS_FILE_UNSET+=LATEX
OPTIONS_FILE_UNSET+=PICKLE

(In reply to Bernard Spil from comment #2)
Cool, this seems to work for me. The logs follow:
160327 13:25:21 mysqld_safe Starting mysqld daemon with databases from /var/db/mysql
160327 13:25:21 mysqld_safe WSREP: Running position recovery with --log_error='/var/db/mysql/wsrep_recovery.bUFwsE' --pid-file='/var/db/mysql/bsd2-recover.pid'
2016-03-27 13:25:21 34426872832 [Note] /usr/local/libexec/mysqld (mysqld 10.1.11-MariaDB) starting as process 10679 ...
160327 13:25:24 mysqld_safe WSREP: Recovered position 00000000-0000-0000-0000-000000000000:-1
2016-03-27 13:25:24 34426872832 [Note] /usr/local/libexec/mysqld (mysqld 10.1.11-MariaDB) starting as process 10692 ...
2016-03-27 13:25:24 34426872832 [Note] WSREP: Setting wsrep_ready to 0
2016-03-27 13:25:24 34426872832 [Note] WSREP: Read nil XID from storage engines, skipping position init
2016-03-27 13:25:24 34426872832 [Note] WSREP: wsrep_load(): loading provider library '/usr/local/lib/libgalera_smm.so'
2016-03-27 13:25:24 34426872832 [Note] WSREP: wsrep_load(): Galera 3.15(r8459459) by Codership Oy <info@codership.com> loaded successfully.
2016-03-27 13:25:24 34426872832 [Note] WSREP: CRC-32C: using hardware acceleration.
2016-03-27 13:25:24 34426872832 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1
2016-03-27 13:25:24 34426872832 [Note] WSREP: Passing config to GCS: base_dir = /var/db/mysql/; base_host = 192.168.0.90; base_port = 4567; cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/db/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/db/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum = false; pc.ignore_sb = false; pc
2016-03-27 13:25:24 34426875904 [Note] WSREP: Service thread queue flushed.
2016-03-27 13:25:24 34426872832 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
2016-03-27 13:25:24 34426872832 [Note] WSREP: wsrep_sst_grab()
2016-03-27 13:25:24 34426872832 [Note] WSREP: Start replication
2016-03-27 13:25:24 34426872832 [Note] WSREP: 'wsrep-new-cluster' option used, bootstrapping the cluster
2016-03-27 13:25:24 34426872832 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
2016-03-27 13:25:24 34426872832 [Note] WSREP: protonet asio version 0
2016-03-27 13:25:24 34426872832 [Note] WSREP: Using CRC-32C for message checksums.
2016-03-27 13:25:24 34426872832 [Note] WSREP: backend: asio
2016-03-27 13:25:24 34426872832 [Warning] WSREP: access file(/var/db/mysql//gvwstate.dat) failed(No such file or directory)
2016-03-27 13:25:24 34426872832 [Note] WSREP: restore pc from disk failed
2016-03-27 13:25:24 34426872832 [Note] WSREP: GMCast version 0
2016-03-27 13:25:24 34426872832 [Note] WSREP: (ecffafaa, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
2016-03-27 13:25:24 34426872832 [Note] WSREP: (ecffafaa, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
2016-03-27 13:25:24 34426872832 [Note] WSREP: EVS version 0
2016-03-27 13:25:24 34426872832 [Note] WSREP: gcomm: bootstrapping new group 'galera_cluster'
2016-03-27 13:25:24 34426872832 [Note] WSREP: start_prim is enabled, turn off pc_recovery
2016-03-27 13:25:24 34426872832 [Note] WSREP: Node ecffafaa state prim
2016-03-27 13:25:24 34426872832 [Note] WSREP: view(view_id(PRIM,ecffafaa,1) memb { ecffafaa,0 } joined { } left { } partitioned { })
2016-03-27 13:25:24 34426872832 [Note] WSREP: save pc into disk
2016-03-27 13:25:24 34426872832 [Note] WSREP: gcomm: connected
2016-03-27 13:25:24 34426872832 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
2016-03-27 13:25:24 34426872832 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
2016-03-27 13:25:24 34426872832 [Note] WSREP: Opened channel 'galera_cluster'
2016-03-27 13:25:24 34426882048 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 1
2016-03-27 13:25:24 34426882048 [Note] WSREP: Starting new group from scratch: ed006161-f3d3-11e5-9205-3f6cb0392534
2016-03-27 13:25:24 34426882048 [Note] WSREP: STATE_EXCHANGE: sent state UUID: ed0065cb-f3d3-11e5-820d-3e8178981e69
2016-03-27 13:25:24 34426882048 [Note] WSREP: STATE EXCHANGE: sent state msg: ed0065cb-f3d3-11e5-820d-3e8178981e69
2016-03-27 13:25:24 34426882048 [Note] WSREP: STATE EXCHANGE: got state msg: ed0065cb-f3d3-11e5-820d-3e8178981e69 from 0 (node1)
2016-03-27 13:25:24 34426882048 [Note] WSREP: Quorum results: version = 3, component = PRIMARY, conf_id = 0, members = 1/1 (joined/total), act_id = 0, last_appl. = -1, protocols = 0/7/3 (gcs/repl/appl), group UUID = ed006161-f3d3-11e5-9205-3f6cb0392534
2016-03-27 13:25:24 34426882048 [Note] WSREP: Flow-control interval: [16, 16]
2016-03-27 13:25:24 34426882048 [Note] WSREP: Restored state OPEN -> JOINED (0)
2016-03-27 13:25:24 34426882048 [Note] WSREP: Member 0.0 (node1) synced with group.
2016-03-27 13:25:24 34426882048 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 0)
2016-03-27 13:25:24 34426872832 [Note] WSREP: Waiting for SST to complete.
2016-03-27 13:25:24 34426884096 [Note] WSREP: New cluster view: global state: ed006161-f3d3-11e5-9205-3f6cb0392534:0, view# 1: Primary, number of nodes: 1, my index: 0, protocol version 3
2016-03-27 13:25:24 34426872832 [Note] WSREP: SST complete, seqno: 0
2016-03-27 13:25:24 34426872832 [Note] InnoDB: Using mutexes to ref count buffer pool pages
2016-03-27 13:25:24 34426872832 [Note] InnoDB: The InnoDB memory heap is disabled
2016-03-27 13:25:24 34426872832 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2016-03-27 13:25:24 34426872832 [Note] InnoDB: Memory barrier is not used
2016-03-27 13:25:24 34426872832 [Note] InnoDB: Compressed tables use zlib 1.2.8
2016-03-27 13:25:24 34426872832 [Note] InnoDB: Using SSE crc32 instructions
2016-03-27 13:25:24 34426872832 [Note] InnoDB: Initializing buffer pool, size = 128.0M
2016-03-27 13:25:24 34426872832 [Note] InnoDB: Completed initialization of buffer pool
2016-03-27 13:25:24 34426872832 [Note] InnoDB: Highest supported file format is Barracuda.
2016-03-27 13:25:24 34426872832 [Note] InnoDB: 128 rollback segment(s) are active.
2016-03-27 13:25:24 34426872832 [Note] InnoDB: Waiting for purge to start
2016-03-27 13:25:24 34426872832 [Note] InnoDB: Percona XtraDB (http://www.percona.com) 5.6.26-76.0 started; log sequence number 1617439
2016-03-27 13:25:24 34426908672 [Note] InnoDB: Dumping buffer pool(s) not yet started
2016-03-27 13:25:24 34426872832 [Note] Plugin 'FEEDBACK' is disabled.
2016-03-27 13:25:24 34426872832 [Note] Server socket created on IP: '0.0.0.0'.
2016-03-27 13:25:24 34426884096 [Note] WSREP: Set WSREPXid for InnoDB: ed006161-f3d3-11e5-9205-3f6cb0392534:0
2016-03-27 13:25:24 34426884096 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2016-03-27 13:25:24 34426884096 [Note] WSREP: REPL Protocols: 7 (3, 2)
2016-03-27 13:25:24 34426875904 [Note] WSREP: Service thread queue flushed.
2016-03-27 13:25:24 34426884096 [Note] WSREP: Assign initial position for certification: 0, protocol version: 3
2016-03-27 13:25:24 34426875904 [Note] WSREP: Service thread queue flushed.
2016-03-27 13:25:24 34426884096 [Note] WSREP: Synchronized with group, ready for connections
2016-03-27 13:25:24 34426884096 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2016-03-27 13:25:24 34426884096 [Note] WSREP: Nobody is waiting for SST.
2016-03-27 13:25:24 34426872832 [Note] /usr/local/libexec/mysqld: ready for connections.
Version: '10.1.11-MariaDB' socket: '/tmp/mysql.sock' port: 3306 FreeBSD Ports

And status:
MariaDB [(none)]> SHOW STATUS LIKE 'wsrep_%';
+------------------------------+----------------------------------------------+
| Variable_name                | Value                                        |
+------------------------------+----------------------------------------------+
| wsrep_apply_oooe             | 0.000000                                     |
| wsrep_apply_oool             | 0.000000                                     |
| wsrep_apply_window           | 0.000000                                     |
| wsrep_causal_reads           | 0                                            |
| wsrep_cert_deps_distance     | 0.000000                                     |
| wsrep_cert_index_size        | 0                                            |
| wsrep_cert_interval          | 0.000000                                     |
| wsrep_cluster_conf_id        | 1                                            |
| wsrep_cluster_size           | 1                                            |
| wsrep_cluster_state_uuid     | ed006161-f3d3-11e5-9205-3f6cb0392534         |
| wsrep_cluster_status         | Primary                                      |
| wsrep_commit_oooe            | 0.000000                                     |
| wsrep_commit_oool            | 0.000000                                     |
| wsrep_commit_window          | 0.000000                                     |
| wsrep_connected              | ON                                           |
| wsrep_evs_delayed            |                                              |
| wsrep_evs_evict_list         |                                              |
| wsrep_evs_repl_latency       | 3.003e-06/7.2634e-06/1.313e-05/3.51116e-06/5 |
| wsrep_evs_state              | OPERATIONAL                                  |
| wsrep_flow_control_paused    | 0.000000                                     |
| wsrep_flow_control_paused_ns | 0                                            |
| wsrep_flow_control_recv      | 0                                            |
| wsrep_flow_control_sent      | 0                                            |
| wsrep_gcomm_uuid             | ecffafaa-f3d3-11e5-ad3c-fe49aff50f57         |
| wsrep_incoming_addresses     | 192.168.0.90:3306                            |
| wsrep_last_committed         | 0                                            |
| wsrep_local_bf_aborts        | 0                                            |
| wsrep_local_cached_downto    | 18446744073709551615                         |
| wsrep_local_cert_failures    | 0                                            |
| wsrep_local_commits          | 0                                            |
| wsrep_local_index            | 0                                            |
| wsrep_local_recv_queue       | 0                                            |
| wsrep_local_recv_queue_avg   | 0.500000                                     |
| wsrep_local_recv_queue_max   | 2                                            |
| wsrep_local_recv_queue_min   | 0                                            |
| wsrep_local_replays          | 0                                            |
| wsrep_local_send_queue       | 0                                            |
| wsrep_local_send_queue_avg   | 0.500000                                     |
| wsrep_local_send_queue_max   | 2                                            |
| wsrep_local_send_queue_min   | 0                                            |
| wsrep_local_state            | 4                                            |
| wsrep_local_state_comment    | Synced                                       |
| wsrep_local_state_uuid       | ed006161-f3d3-11e5-9205-3f6cb0392534         |
| wsrep_protocol_version       | 7                                            |
| wsrep_provider_name          | Galera                                       |
| wsrep_provider_vendor        | Codership Oy <info@codership.com>            |
| wsrep_provider_version       | 3.15(r8459459)                               |
| wsrep_ready                  | ON                                           |
| wsrep_received               | 2                                            |
| wsrep_received_bytes         | 141                                          |
| wsrep_repl_data_bytes        | 0                                            |
| wsrep_repl_keys              | 0                                            |
| wsrep_repl_keys_bytes        | 0                                            |
| wsrep_repl_other_bytes       | 0                                            |
| wsrep_replicated             | 0                                            |
| wsrep_replicated_bytes       | 0                                            |
| wsrep_thread_count           | 2                                            |
+------------------------------+----------------------------------------------+
57 rows in set (0.02 sec)

MariaDB [(none)]>

I will try joining another node to the cluster and let you know.
Thanks

On stock 10.2-RELEASE, however, it fails to register:
===> Staging for galera-25.3.15
===> Generating temporary packing list
install -s -m 444 /usr/ports/databases/galera/work/galera-3-25.3.15/libgalera_smm.so /usr/ports/databases/galera/work/stage/usr/local/lib/
====> Compressing man pages (compress-man)
===> Installing for galera-25.3.15
===> Checking if galera already installed
===> Registering installation for galera-25.3.15
pkg-static: Unable to access file /usr/ports/databases/galera/work/stage/usr/local/share/doc/galera/AUTHORS: No such file or directory
pkg-static: Unable to access file /usr/ports/databases/galera/work/stage/usr/local/share/doc/galera/README: No such file or directory
*** Error code 74

Stop.
make[1]: stopped in /usr/ports/databases/galera
*** Error code 1

Stop.
make: stopped in /usr/ports/databases/galera

Updated my GitHub repo for the AUTHORS and README files; that should fix the install.

Can you try this with the new 10.1.13 port as well? It is available in my GitHub repo https://github.com/Sp1l/ports/tree/master/databases/mariadb101-server and as a raw patch in https://reviews.freebsd.org/D5751

(In reply to Bernard Spil from comment #7)
pkg-static: mariadb101-server-10.1.13 conflicts with mariadb101-client-10.1.13 (installs files into the same place).
Problematic file: /usr/local/lib/mysql/plugin/daemon_example.ini

(In reply to Bernard Spil from comment #7)
If I remove that from work/.PLIST.mktmp, then I get a similar error again:
Installing mariadb101-server-10.1.13...
pkg-static: mariadb101-server-10.1.13 conflicts with mariadb101-client-10.1.13 (installs files into the same place).
Problematic file: /usr/local/lib/mysql/plugin/dialog.so
*** Error code 70

With the new ports I was able to create a 2-node cluster with Galera.
galera-25.3.15 Synchronous multi-master replication engine
mariadb101-client-10.1.13 Multithreaded SQL database (client)
mariadb101-server-10.1.13 Multithreaded SQL database (server)

However, I seem to have a problem: when I try to create a database, it hangs in the mysql client, and the log only shows:
2016-03-28 0:35:28 34426900480 [Note] WSREP: TO BEGIN: -1, 0 : create database mm_test2
And there is no log entry on the other node. I could be missing something. I will check and let you know.

I cannot seem to run queries like 'create database testing'. The show processlist output looks like this:
MariaDB [(none)]> show processlist;
+----+-------------+-----------+------+---------+-------+----------------------+-------------------------+----------+
| Id | User        | Host      | db   | Command | Time  | State                | Info                    | Progress |
+----+-------------+-----------+------+---------+-------+----------------------+-------------------------+----------+
|  1 | system user |           | NULL | Sleep   | 12094 | wsrep aborter idle   | NULL                    |    0.000 |
|  2 | system user |           | NULL | Sleep   | 12094 | NULL                 | NULL                    |    0.000 |
|  4 | root        | localhost | NULL | Query   | 11693 | checking permissions | create database mm_test |    0.000 |
|  5 | root        | localhost | NULL | Query   |     0 | init                 | show processlist        |    0.000 |
+----+-------------+-----------+------+---------+-------+----------------------+-------------------------+----------+
4 rows in set (0.00 sec)

Since the Galera version in ports is now quite old, should the first step be to upgrade it to a more recent release?
http://galeracluster.com/2016/05/announcing-galera-cluster-5-5-49-and-5-6-30-with-galera-3-16/
We should be up to 25.3.16 rather than the 25.3.5 in FreeBSD ports. I couldn't find a release date for 25.3.5, but it looks to be in late 2014.

For reference, the issue was reported upstream too: https://jira.mariadb.org/browse/MDEV-9757.

The problem is still the same with:
galera-25.3.16 Synchronous multi-master replication engine
mariadb101-client-10.1.14 Multithreaded SQL database (client)

(In reply to ganbold-freebsd from comment #14)
I meant the problem still exists with:
galera-25.3.16 Synchronous multi-master replication engine
mariadb101-server-10.1.14 Multithreaded SQL database (server)
mariadb101-client-10.1.14 Multithreaded SQL database (client)

I'm resetting the bug assignee after https://svnweb.freebsd.org/ports?view=revision&revision=417703.

Created attachment 172370
svn diff for databases/galera
databases/galera: Update to 25.3.16
- Update to 25.3.16
- Take maintainership
- Build with clang
- Fix build with base ssl (LDPATH)
- Remove Docs deps and options
- Move USE_OPENSSL to USES= ssl
Please test this!
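
For anyone testing the updated port, one way to check a node is to compare a few values from SHOW STATUS LIKE 'wsrep_%' against the healthy single-node output posted earlier in this report (wsrep_cluster_status Primary, wsrep_local_state_comment Synced, wsrep_connected ON, wsrep_ready ON). The Python below is only a sketch under that assumption: the helper names parse_status_table and unhealthy_fields are made up for illustration, and it parses pasted mysql client output rather than talking to the server.

```python
# Hypothetical helper, not part of any port: parse pasted mysql client
# output of SHOW STATUS LIKE 'wsrep_%' and flag values that differ from
# the healthy node output shown earlier in this report.

# Expected values are taken from the working single-node status above.
EXPECTED = {
    "wsrep_cluster_status": "Primary",
    "wsrep_local_state_comment": "Synced",
    "wsrep_connected": "ON",
    "wsrep_ready": "ON",
}


def parse_status_table(text):
    """Return {variable: value} from '| name | value |' table rows."""
    status = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip borders, prompts and the row-count footer
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) == 2 and cells[0] != "Variable_name":
            status[cells[0]] = cells[1]
    return status


def unhealthy_fields(status):
    """Return {variable: (expected, actual)} for every mismatch."""
    return {
        name: (want, status.get(name))
        for name, want in EXPECTED.items()
        if status.get(name) != want
    }


if __name__ == "__main__":
    sample = """
    | Variable_name             | Value   |
    | wsrep_cluster_status      | Primary |
    | wsrep_connected           | ON      |
    | wsrep_local_state_comment | Synced  |
    | wsrep_ready               | ON      |
    """
    print(unhealthy_fields(parse_status_table(sample)))
```

This only checks the reported membership state; it says nothing about whether statements such as create database actually complete on the cluster.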
Created attachment 173139
svn diff for databases/mariadb101-server
According to the MariaDB documentation at https://mariadb.com/kb/en/mariadb/getting-started-with-mariadb-galera-cluster/, we need to add some flags when compiling to make replication work.
> -DWITH_WSREP=ON -DWITH_INNODB_DISALLOW_WRITES=1
This patch adds Galera as an option which defaults to ON.

(In reply to Bernard Spil from comment #18)
The problem is still the same with:
galera-25.3.16 Synchronous multi-master replication engine
and Makefile patched:
mariadb101-client-10.1.16_1 Multithreaded SQL database (client)
mariadb101-server-10.1.16_1 Multithreaded SQL database (server)

*** Bug 210209 has been marked as a duplicate of this bug. ***

*** Bug 212492 has been marked as a duplicate of this bug. ***

You can keep the
BUILD_DEPENDS= ${PYTHON_PKGNAMEPREFIX}cloud_sptheme>=0:textproc/py-cloud_sptheme
(now it is sphinx-version>=1.3.) if you want. This is fixed in https://svnweb.freebsd.org/ports?view=revision&revision=421590 and updated to 1.7.1 in https://svnweb.freebsd.org/ports?view=revision&revision=421591

Sorry, this has nothing to do with sphinx-version (I mixed this up; it was in py-cloud_sptheme).

Is this still relevant?

No, no longer for me...

I also recently set up a MariaDB+Galera cluster and this problem didn't happen. (OTOH I opened PR 238360 for another one I found.)

I am closing this as Overcome By Events. If there are still similar problems, please open a new PR.