Summary:   Deadlock in the networking code, possibly due to a bug in SCHED_ULE
Product:   Base System            Reporter: Eugene Grosbein <ports>
Component: kern                   Assignee: Eugene Grosbein <eugen>
Status:    Closed FIXED
Severity:  Affects Only Me        CC:       eugen, jhb
Priority:  Normal
Version:   8.3-STABLE
Hardware:  Any
OS:        Any
Description
Eugene Grosbein
2012-09-29 18:30:01 UTC
It looks like CPUs 0 - 4 are idle, but CPU 5 has a load of three. One of
those threads is the syslogd thread that holds the lock, but the currently
running thread is the 'ipmi0: kcs' thread with tid 100118. It would be
interesting to examine what it is doing.
--
Andriy Gapon

on 30/09/2012 14:54 Andriy Gapon said the following:
>
> It looks like CPUs 0 - 4 are idle, but CPU 5 has load of three.
> One of those threads is the syslogd thread that holds the lock, but the
> currently running thread is 'ipmi0: kcs' thread with tid 100118.
> It would be interesting to examine what it is doing.
>
Looks like the kcs busy loops in here: kcs_loop -> kcs_read_byte ->
kcs_wait_for_obf.
Since this is a 6-CPU machine, steal threshold is set to 3 so other CPUs don't
try to take any work from CPU5. Not sure if this is smart actually. Maybe it
would make sense to have a lower threshold or to allow stealing of real-time
threads at a lower threshold.
Since the kcs thread is a kernel thread with real-time priority (68) it doesn't
allow any other lower priority thread to run while it's not sleeping.
Also, it looks like rwlock does not take care to propagate waiters' priorities
in all cases. Maybe priority propagation could have helped here, but not sure...
--
Andriy Gapon
on 30/09/2012 16:42 Andriy Gapon said the following:
> on 30/09/2012 14:54 Andriy Gapon said the following:
>>
>> It looks like CPUs 0 - 4 are idle, but CPU 5 has load of three.
>> One of those threads is the syslogd thread that holds the lock, but the
>> currently running thread is 'ipmi0: kcs' thread with tid 100118.
>> It would be interesting to examine what it is doing.
>>
>
> Looks like the kcs busy loops in here: kcs_loop -> kcs_read_byte ->
> kcs_wait_for_obf.
> Since this is a 6-CPU machine, steal threshold is set to 3 so other CPUs don't
> try to take any work from CPU5. Not sure if this is smart actually. Maybe it
> would make sense to have a lower threshold or to allow stealing of real-time
> threads at a lower threshold.
>
> Since the kcs thread is a kernel thread with real-time priority (68) it doesn't
> allow any other lower priority thread to run while it's not sleeping.
>
> Also, it looks like rwlock does not take care to propagate waiters' priorities
> in all cases. Maybe priority propagation could have helped here, but not sure...
>
In any case, the original trigger for this problem seems to be something in IPMI
that keeps that thread running.
--
Andriy Gapon
About rw_lock priority propagation locking(9) tells:
The rw_lock locks have priority propagation like mutexes, but priority
can be propagated only to an exclusive holder. This limitation comes
from the fact that shared owners are anonymous.

What's about idle stealing threshold, it was fixed in HEAD at r239194,
but wasn't merged yet. It should be trivial to merge it.
--
Alexander Motin

02.10.2012 13:58, Alexander Motin writes:
> About rw_lock priority propagation locking(9) tells:
> The rw_lock locks have priority propagation like mutexes, but priority
> can be propagated only to an exclusive holder. This limitation comes
> from the fact that shared owners are anonymous.
>
> What's about idle stealing threshold, it was fixed in HEAD at r239194,
> but wasn't merged yet. It should be trivial to merge it.
Would it fix my problem with 6-CPU box?
Your commit log talks about "8 or more cores".
Eugene Grosbein
On 02.10.2012 10:48, Eugene Grosbein wrote:
> 02.10.2012 13:58, Alexander Motin writes:
>> About rw_lock priority propagation locking(9) tells:
>> The rw_lock locks have priority propagation like mutexes, but priority
>> can be propagated only to an exclusive holder. This limitation comes
>> from the fact that shared owners are anonymous.
>>
>> What's about idle stealing threshold, it was fixed in HEAD at r239194,
>> but wasn't merged yet. It should be trivial to merge it.
>
> Would it fix my problem with 6-CPU box?
> Your commit log talks about "8 or more cores".

Hmm. Then I see no reason why threads were not stolen, unless they are
bound to specific CPU. Check `sysctl kern.sched.steal_thresh` output to
be sure.
--
Alexander Motin

02.10.2012 14:53, Alexander Motin writes:
> On 02.10.2012 10:48, Eugene Grosbein wrote:
>> 02.10.2012 13:58, Alexander Motin writes:
>>> About rw_lock priority propagation locking(9) tells:
>>> The rw_lock locks have priority propagation like mutexes, but priority
>>> can be propagated only to an exclusive holder. This limitation comes
>>> from the fact that shared owners are anonymous.
>>>
>>> What's about idle stealing threshold, it was fixed in HEAD at r239194,
>>> but wasn't merged yet. It should be trivial to merge it.
>>
>> Would it fix my problem with 6-CPU box?
>> Your commit log talks about "8 or more cores".
>
> Hmm. Then I see no reason why threads were not stolen, unless they are
> bound to specific CPU. Check `sysctl kern.sched.steal_thresh` output to
> be sure.
All NIC's threads and dummynet are bound in my boxes.
igb(4) in RELENG_8 bounds its threads by default in very wrong way,
so I rebound them. dummynet(8) in RELENG_8 goes wild under severe load
unless bound to single or two cores.
kern.sched.steal_thresh: 2
On 02.10.2012 10:59, Eugene Grosbein wrote:
> 02.10.2012 14:53, Alexander Motin writes:
>> On 02.10.2012 10:48, Eugene Grosbein wrote:
>>> 02.10.2012 13:58, Alexander Motin writes:
>>>> About rw_lock priority propagation locking(9) tells:
>>>> The rw_lock locks have priority propagation like mutexes, but priority
>>>> can be propagated only to an exclusive holder. This limitation comes
>>>> from the fact that shared owners are anonymous.
>>>>
>>>> What's about idle stealing threshold, it was fixed in HEAD at r239194,
>>>> but wasn't merged yet. It should be trivial to merge it.
>>>
>>> Would it fix my problem with 6-CPU box?
>>> Your commit log talks about "8 or more cores".
>>
>> Hmm. Then I see no reason why threads were not stolen, unless they are
>> bound to specific CPU. Check `sysctl kern.sched.steal_thresh` output to
>> be sure.
>
> All NIC's threads and dummynet are bound in my boxes.
> igb(4) in RELENG_8 bounds its threads by default in very wrong way,
> so I rebound them. dummynet(8) in RELENG_8 goes wild under severe load
> unless bound to single or two cores.

That can be the answer. An active thread can never be stolen, and if it has
a high absolute priority and never sleeps voluntarily -- it will run there
forever. If all other threads are bound to that CPU, they also cannot be
stolen and will wait forever.

> kern.sched.steal_thresh: 2

This should not prevent stealing.

PS: I've just noticed that for some reason I haven't merged my scheduler
improvements to the 8-STABLE branch, so behavior may differ from that in
HEAD or 9-STABLE. I will recheck the commit history to recall what stopped
me from merging, but I don't remember all the details to predict whether
it may affect your problem somehow.
--
Alexander Motin

on 02/10/2012 09:58 Alexander Motin said the following:
> About rw_lock priority propagation locking(9) tells:
> The rw_lock locks have priority propagation like mutexes, but priority can be
> propagated only to an exclusive holder. This limitation comes from the fact that
> shared owners are anonymous.

Yeah... and as we see it has a potential to result in priority inversion.

> What's about idle stealing threshold, it was fixed in HEAD at r239194, but wasn't
> merged yet. It should be trivial to merge it.

And I've also misread the code, confused the 6 CPUs case with the 8 CPUs case.
--
Andriy Gapon

03.10.2012 21:56, Andriy Gapon writes:
> on 02/10/2012 09:58 Alexander Motin said the following:
>> About rw_lock priority propagation locking(9) tells:
>> The rw_lock locks have priority propagation like mutexes, but priority can be
>> propagated only to an exclusive holder. This limitation comes from the fact that
>> shared owners are anonymous.
>
> Yeah... and as we see it has a potential to result in priority inversion.
>
>> What's about idle stealing threshold, it was fixed in HEAD at r239194, but wasn't
>> merged yet. It should be trivial to merge it.
>
> And I've also misread the code, confused 6 CPUs case with 8 CPUs case.
>
>
Can I have any advice/workaround/bugfix on how to reconfigure my routers
to prevent them from locking this way?
on 04/10/2012 09:12 Eugene Grosbein said the following:
> 03.10.2012 21:56, Andriy Gapon writes:
>> on 02/10/2012 09:58 Alexander Motin said the following:
>>> About rw_lock priority propagation locking(9) tells:
>>> The rw_lock locks have priority propagation like mutexes, but priority can be
>>> propagated only to an exclusive holder. This limitation comes from the fact that
>>> shared owners are anonymous.
>>
>> Yeah... and as we see it has a potential to result in priority inversion.
>>
>>> What's about idle stealing threshold, it was fixed in HEAD at r239194, but wasn't
>>> merged yet. It should be trivial to merge it.
>>
>> And I've also misread the code, confused 6 CPUs case with 8 CPUs case.

BTW, I've just noticed that the syslogd thread had td_pinned == 1 and I
can't explain why... But that probably explains why it was not stolen.

> Can I have any advice/workaround/bugfix on how to reconfigure my routers
> to prevent them from locking this way?

As I said, the primary problem here is the ipmi thread going insane.
You can try to remove ipmi driver, if you can afford that.
Or you can try to hack on it, so that
(1) it voluntarily yields even when it thinks that it always has work to do
(2) there is some diagnostic on what keeps it running

You may also try to set the thread's priority to PUSER (using sched_prio), but I
am not sure what bad side-effects may happen because of that.

No magic bullet here, sorry.
--
Andriy Gapon

04.10.2012 17:23, Andriy Gapon writes:
>> Can I have any advice/workaround/bugfix on how to reconfigure my routers
>> to prevent them from locking this way?
>
> As I said, the primary problem here is the ipmi thread going insane.
> You can try to remove ipmi driver, if you can afford that.
> Or you can try to hack on it, so that
> (1) it voluntarily yields even when it thinks that it always has work to do
> (2) there is some diagnostic on what keeps it running
>
> You may also try to set the thread's priority to PUSER (using sched_prio), but I
> am not sure what bad side-effects may happen because of that.
>
> No magic bullet here, sorry.
Thank you. As a workaround, I've unloaded ipmi.ko and edited my scripts
to access IPMI sensors over IP instead of the local interface.
Eugene Grosbein
Author: melifaro
Date:   Mon Mar 25 14:30:34 2013
New Revision: 248705
URL: http://svnweb.freebsd.org/changeset/base/248705

Log:
  Unlock IPMI sc while performing requests via KCS and SMIC interfaces.
  It is already done in SSIF interface code.

  This reduces contention/spinning reported by many users.

  PR:            kern/172166
  Submitted by:  Eric van Gyzen <eric at vangyzen.net>
  MFC after:     2 weeks

Modified:
  head/sys/dev/ipmi/ipmi_kcs.c
  head/sys/dev/ipmi/ipmi_smic.c

Modified: head/sys/dev/ipmi/ipmi_kcs.c
==============================================================================
--- head/sys/dev/ipmi/ipmi_kcs.c	Mon Mar 25 13:58:17 2013	(r248704)
+++ head/sys/dev/ipmi/ipmi_kcs.c	Mon Mar 25 14:30:34 2013	(r248705)
@@ -456,6 +456,7 @@ kcs_loop(void *arg)
 	IPMI_LOCK(sc);
 	while ((req = ipmi_dequeue_request(sc)) != NULL) {
+		IPMI_UNLOCK(sc);
 		ok = 0;
 		for (i = 0; i < 3 && !ok; i++)
 			ok = kcs_polled_request(sc, req);
@@ -463,6 +464,7 @@ kcs_loop(void *arg)
 			req->ir_error = 0;
 		else
 			req->ir_error = EIO;
+		IPMI_LOCK(sc);
 		ipmi_complete_request(sc, req);
 	}
 	IPMI_UNLOCK(sc);

Modified: head/sys/dev/ipmi/ipmi_smic.c
==============================================================================
--- head/sys/dev/ipmi/ipmi_smic.c	Mon Mar 25 13:58:17 2013	(r248704)
+++ head/sys/dev/ipmi/ipmi_smic.c	Mon Mar 25 14:30:34 2013	(r248705)
@@ -362,6 +362,7 @@ smic_loop(void *arg)
 	IPMI_LOCK(sc);
 	while ((req = ipmi_dequeue_request(sc)) != NULL) {
+		IPMI_UNLOCK(sc);
 		ok = 0;
 		for (i = 0; i < 3 && !ok; i++)
 			ok = smic_polled_request(sc, req);
@@ -369,6 +370,7 @@ smic_loop(void *arg)
 			req->ir_error = 0;
 		else
 			req->ir_error = EIO;
+		IPMI_LOCK(sc);
 		ipmi_complete_request(sc, req);
 	}
 	IPMI_UNLOCK(sc);

My PR.

batch change:

For bugs that match the following
- Status Is In progress
AND
- Untouched since 2018-01-01
AND
- Affects Base System OR Documentation

DO:

Reset to open status.

Note: I did a quick pass but if you are getting this email it might be
worthwhile to double check to see if this bug ought to be closed.

Believed to be fixed in supported branches.