Bug 135673 - databases/mysql50-server - MySQL query lock-ups on 7.2-RELEASE amd64
Summary: databases/mysql50-server - MySQL query lock-ups on 7.2-RELEASE amd64
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: threads
Version: 7.2-RELEASE
Hardware: Any Any
Importance: Normal Affects Only Me
Assignee: freebsd-threads (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-06-17 19:20 UTC by freebsd-bugs
Modified: 2015-05-14 21:34 UTC

See Also:


Attachments
file.txt (31.00 KB, text/plain)
2009-06-17 19:20 UTC, freebsd-bugs
stacks.txt (45.20 KB, text/plain; charset=GB2312)
2010-09-27 15:19 UTC, Tom Judge

Description freebsd-bugs 2009-06-17 19:20:00 UTC
This report is about a problem occurring on a dedicated database slave server which runs five instances of MySQL inside jails.  The server has run flawlessly for about six months under FreeBSD 7.0.

The server:

Dell R905 w/ 2x Opteron 8347 HE
64 GB of RAM @ 533 MHz
Mysql data on RAID-0 consisting of 4x Intel X25E connected to 256 MB PERC6i
Mysqld processes running in jails
Mysql data directories null mounted from SSD into jail
MySQL 5.0.51a compiled WITH_OPTIMIZED and WITH_PROC_SCOPE_PTH
MyISAM tables

The SSDs are recent, but the server ran for almost two months prior to the 7.2 upgrade without any occurrence of this problem.

Since upgrading this machine to FreeBSD 7.2, on three separate occasions, individual queries against a particular jailed mysqld process have locked up while copying to a temporary table.

dmesg and kernel config are attached.
Only the one query locks up at a time, but since this is a replication slave, the read lock on the table brings replication to a halt.  Other read-only queries proceed normally.  We've seen the lock-up last over a day in our testing before we gave up on it.  The locked thread doesn't go away when KILLed.

We end up having to kill -9 the mysqld and run myisamchk on the tables.  Nothing less seems to break the deadlock.

Fix: Patch attached with submission follows:
How-To-Repeat: Since this is only happening on one of our data sets, under fairly high load, and intermittently at that, it might be more practical for us to collect and provide data which might help diagnose the problem.
Comment 1 nick 2009-06-17 21:24:02 UTC
A few bits of follow-up information.

This machine was using SCHED_ULE under 7.0, at which time it operated  
flawlessly.  We have other machines running ULE under 7.0 and 7.1.  No  
problems at all prior to 7.2.

The first two of these lockups occurred while the machine was running  
custom packages built on 7.0-RELEASE.  Now it's running equivalent  
custom packages built directly on 7.2, but the lockup has recurred  
anyway.

Also, s/WITH_OPTIMIZED/BUILD_OPTIMIZED/.

-nick

--
nick@desert.net - all messages cryptographically signed
Comment 2 Edwin Groothuis freebsd_committer freebsd_triage 2009-06-17 22:42:07 UTC
Responsible Changed
From-To: freebsd-ports-bugs->ale

Over to maintainer (via the GNATS Auto Assign Tool)
Comment 3 nick 2009-06-25 08:50:06 UTC
Upgraded to 7.2-RELEASE-p2, to see if that would help.  Actually, the  
wedge-up happened even sooner after the upgrade, a matter of hours  
rather than a matter of days.  Of course that could be due to other  
factors than the upgrade.

The last two times, the MySQL client thread has been in the state  
"Sending data".  I ran a tcpdump for a few hours on one of the stuck  
connections, and saw literally one packet on that particular  
connection during that time.

It actually seems like this might be more of a kernel threading/locking  
issue.  Should this bug be assigned to a different category?

If we can't find a resolution to this, it'll mean 7.2 is off limits on  
our database servers. :(

-nick

--
nick@desert.net - all messages cryptographically signed
Comment 4 Alex Dupre freebsd_committer freebsd_triage 2009-07-14 08:52:43 UTC
Responsible Changed
From-To: ale->freebsd-threads

FreeBSD's threads problem.
Comment 5 dfilter service freebsd_committer freebsd_triage 2009-09-23 22:39:10 UTC
Author: attilio
Date: Wed Sep 23 21:38:57 2009
New Revision: 197445
URL: http://svn.freebsd.org/changeset/base/197445

Log:
  rwlock implemented from libthr need to fall through the 'hard path' and
  query umtx also if the shared waiters bit is set on a shared lock.
  The writer starvation avoidance technique, infact, can lead to shared
  waiters on a shared lock which can bring to a missed wakeup and thus
  to a deadlock if the right bit is not checked (a notable case is the
  writers counterpart to be handled through expired timeouts).
  
  Fix that by checking for the shared waiters bit also when unlocking the
  shared locks.
  
  That bug was causing a reported MySQL deadlock.
  Many thanks go to Nick Esborn and his employer DesertNet which provided
  time and machines to identify and fix this issue.
  
  PR:		thread/135673
  Reported by:	Nick Esborn <nick at desert dot net>
  Tested by:	Nick Esborn <nick at desert dot net>
  Reviewed by:	jeff

Modified:
  head/lib/libthr/thread/thr_umtx.h

Modified: head/lib/libthr/thread/thr_umtx.h
==============================================================================
--- head/lib/libthr/thread/thr_umtx.h	Wed Sep 23 20:49:14 2009	(r197444)
+++ head/lib/libthr/thread/thr_umtx.h	Wed Sep 23 21:38:57 2009	(r197445)
@@ -171,8 +171,11 @@ _thr_rwlock_unlock(struct urwlock *rwloc
 		for (;;) {
 			if (__predict_false(URWLOCK_READER_COUNT(state) == 0))
 				return (EPERM);
-			if (!((state & URWLOCK_WRITE_WAITERS) && URWLOCK_READER_COUNT(state) == 1)) {
-				if (atomic_cmpset_rel_32(&rwlock->rw_state, state, state-1))
+			if (!((state & (URWLOCK_WRITE_WAITERS |
+			    URWLOCK_READ_WAITERS)) &&
+			    URWLOCK_READER_COUNT(state) == 1)) {
+				if (atomic_cmpset_rel_32(&rwlock->rw_state,
+				    state, state-1))
 					return (0);
 				state = rwlock->rw_state;
 			} else {
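
The condition changed by r197445 can be modeled in isolation to show which lock states are affected. The following is only a sketch, not the libthr code: the SK_* macros and both function names are hypothetical, and the bit values are assumptions chosen to resemble the URWLOCK_* macros named in the diff. Each function returns true when the unlocking reader must skip the atomic decrement and take the else branch (the 'hard path' that queries umtx, per the commit log).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical bit layout for illustration only. The names and values
 * are assumptions modeled on the URWLOCK_* macros seen in the diff.
 */
#define SK_WRITE_WAITERS	0x40000000u
#define SK_READ_WAITERS		0x20000000u
#define SK_MAX_READERS		0x1fffffffu
#define SK_READER_COUNT(s)	((s) & SK_MAX_READERS)

/*
 * Condition before r197445: the last reader takes the hard path only
 * when a writer is waiting.
 */
static bool
last_reader_needs_wakeup_old(uint32_t state)
{
	return ((state & SK_WRITE_WAITERS) != 0 &&
	    SK_READER_COUNT(state) == 1);
}

/*
 * Condition after r197445: the last reader also takes the hard path
 * when other readers are queued behind the lock.
 */
static bool
last_reader_needs_wakeup_new(uint32_t state)
{
	return ((state & (SK_WRITE_WAITERS | SK_READ_WAITERS)) != 0 &&
	    SK_READER_COUNT(state) == 1);
}
```

For a state with one reader and only the read-waiters bit set, the old condition is false, so the reader just decrements the count and nobody wakes the queued readers. The new condition is true for that state. States with more than one reader, or with no waiter bits set, stay on the fast path in both versions.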
Comment 6 Conor McDermottroe 2009-11-24 15:19:38 UTC
Thanks very much for the patch, we were running into it as well. The good
news is that it improves the situation quite a bit. The bad news is that
it does not appear to cure the problem entirely. Under heavy load we see
the same problem re-occurring.

We'll be bringing up another machine with 8.0 in the coming weeks and I
hope to test it then.

We're currently avoiding this bug by under-loading the 7.2 machine and
handling more queries on a different 7.0 machine. Is there any
information I can provide which would help diagnose this bug further?
Comment 7 shane.bester 2011-08-01 10:57:37 UTC
Hi!

Can anybody who frequently hits this hangup with relatively simple queries
against MyISAM tables please let us know if this solves or avoids the issue:
Add to [mysqld] section of my.cnf:
concurrent_insert=0

or SET GLOBAL concurrent_insert=0;
Comment 8 Ed Maste freebsd_committer freebsd_triage 2015-05-14 21:24:11 UTC
Is this still an issue?
Comment 9 freebsd-bugs 2015-05-14 21:32:20 UTC
Attilio's work resolved the problem completely. Thanks!
Comment 10 Ed Maste freebsd_committer freebsd_triage 2015-05-14 21:34:17 UTC
Submitter confirms this is fixed with Attilio's work -- thanks for the follow-up!