Bug 223914 - [PATCH] Fix bug in which the long term ULE load balancer is executed only once
Summary: [PATCH] Fix bug in which the long term ULE load balancer is executed only once
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: CURRENT
Hardware: Any Any
: --- Affects Many People
Assignee: Mark Johnston
Keywords: patch
Depends on:
Reported: 2017-11-27 17:51 UTC by Justinien Bouron
Modified: 2018-07-15 09:08 UTC (History)
13 users (show)

See Also:

Patch for the ULE load balancer solving the issue of this bug report. (462 bytes, patch)
2017-11-27 17:51 UTC, Justinien Bouron
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Justinien Bouron 2017-11-27 17:51:02 UTC
Created attachment 188323 [details]
Patch for the ULE load balancer solving the issue of this bug report.

The load balancer of ULE has a bug which cause it to be executed only once at startup, and it will *never* be executed again.

The reason is simple, when the sched_balance function is called for the first time, the smp is not yet setup, and the function returns.
However the balance_tick variable is not resetted before the return and thus will remains 0 for ever.

Because the condition in sched_clock to trigger the load balancer is :
" balance_ticks && --balance_ticks == 0 ", once balance_ticks is 0, this condition will never be met, and thus the load balancer will never be executed again.

To convince yourself you can even put a panic right before the call to sched_balance_group in sched_balance, it will never panic ...

The fix is rather simple, just move the line resetting the balance_ticks value *before* the if branch "if (smp_started == 0 || rebalance == 0)". That way we are sure that the load balancer is re-scheduled.
Comment 1 Konstantin Belousov freebsd_committer 2017-11-28 18:36:27 UTC
I think the analysis is correct.

Might be, sched_initticks() should set the balance_ticks as well, after the balance_interval is initialized ?
Comment 2 Justinien Bouron 2017-11-28 20:51:23 UTC
sched_setup calls sched_setup_smp which in turns calls sched_balance that will set the balance_ticks.
Comment 3 Ivan Klymenko 2017-12-12 11:47:46 UTC
Maybe, this version use less overhead charges for not SMP and given the sched_random() function.

--- sched_ule.c.orig    2017-11-29 09:23:11.503091000 +0200
+++ sched_ule.c.my1     2017-12-12 13:40:46.330506000 +0200
@@ -880,11 +880,15 @@
        struct tdq *tdq;
-       if (smp_started == 0 || rebalance == 0)
+       if (smp_started == 0)
        balance_ticks = max(balance_interval / 2, 1) +
            (sched_random() % balance_interval);
+       if (rebalance == 0)
+               return;
        tdq = TDQ_SELF();
Comment 4 Ivan Klymenko 2017-12-12 20:53:13 UTC
(In reply to Ivan Klymenko from comment #3)
Sorry, I think I said foolishness.
Comment 5 Mark Felder freebsd_committer 2018-07-13 11:28:22 UTC
This is getting attention due to the recent FreeBSD ULE vs Linux CFS research paper which is what caused this bug to be discovered. Is there further analysis needed here?
Comment 6 Ben Woods freebsd_committer 2018-07-15 00:08:39 UTC
Cc’ing a few people who have made recent changes to the ULE scheduler, and therefore might feel confident reviewing this patch.