Bug 247829 - Constant load of 1 on a recent 12.1-STABLE with 3 ZFS pools (high rates of zfskern{mmp_thread_enter})
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 12.1-STABLE
Hardware: Any Any
Importance: --- Affects Many People
Assignee: freebsd-bugs (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-07-07 17:12 UTC by Gordon Bergling
Modified: 2020-07-21 08:04 UTC (History)
7 users

See Also:


Description Gordon Bergling freebsd_committer 2020-07-07 17:12:44 UTC
On a recent virtualized 12.1-STABLE build I see a constant load of 1. Investigating with 'top -HS' shows relatively high CPU usage for 'zfskern{mmp_thread_enter}', as in the example below.

  PID USERNAME    PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
   11 root        155 ki31     0B    64K CPU2     2  17:28  97.39% idle{idle: cpu2}
   11 root        155 ki31     0B    64K CPU3     3  17:29  96.78% idle{idle: cpu3}
   11 root        155 ki31     0B    64K CPU1     1  17:29  96.40% idle{idle: cpu1}
   11 root        155 ki31     0B    64K RUN      0  17:25  96.13% idle{idle: cpu0}
    8 root         -8    -     0B  1040K mmp->m   2   0:44   4.32% zfskern{mmp_thread_enter}
    8 root         -8    -     0B  1040K mmp->m   1   0:44   4.28% zfskern{mmp_thread_enter}
    8 root         -8    -     0B  1040K mmp->m   3   0:44   4.25% zfskern{mmp_thread_enter}

The problem is that this relatively small CPU usage results in a load of 1.0, which leads the host system to run the VM's assigned CPU cores at the highest possible clock rate.

Trying OpenZFS seems to improve the situation, but I am not sure this is a fair comparison, since only my zroot pool was detected; the other two pools were not.
Comment 1 Jamie Landeg-Jones 2020-07-11 11:47:06 UTC
Yes, I just updated 12.1-STABLE from a previous build (13th May), and I'm now seeing similar behavior.

I don't get quite up to 1.00, but I only have ZFS on an /archive partition that is literally idle.

I have one zfskern{mmp_thread_enter} thread continuously using 3% of one CPU core of an i7-7567U @ 3.5 GHz:

PID   JID USERNAME    PRI NICE   SIZE    RES SWAP STATE    C   TIME    WCPU COMMAND
   11     0 root        155 ki31     0B    64K   0B CPU3     3  38.3H  99.29% [idle{idle: cpu3}]
   11     0 root        155 ki31     0B    64K   0B CPU1     1  38.1H  99.00% [idle{idle: cpu1}]
   11     0 root        155 ki31     0B    64K   0B CPU0     0  38.1H  98.53% [idle{idle: cpu0}]
   11     0 root        155 ki31     0B    64K   0B RUN      2  38.0H  98.34% [idle{idle: cpu2}]
   46     0 root         -8    -     0B   656K   0B RUN      0  66:45   2.90% [zfskern{mmp_thread_enter}]
Comment 2 Jamie Landeg-Jones 2020-07-11 11:49:15 UTC
(mine is a direct install, not virtualised)
Comment 3 Jamie Landeg-Jones 2020-07-11 12:43:27 UTC
I've discovered this relates to a newly added feature (the ZFS multihost/MMP support), but it defaults to off, and it is indeed off here:

13:42 [2] (2) "~/build" jamie@thompson% zpool get multihost
NAME   PROPERTY   VALUE      SOURCE
ZFS:0  multihost  off        default
Comment 4 Dancho Penev 2020-07-13 07:18:06 UTC
Same issue here on recent stable, however with a higher loadavg of 3-4. DTrace shows that the most active stack is:

  zfskern
              kernel`spinlock_exit+0x31
              kernel`_cv_timedwait_sig_sbt+0x11f
              zfs.ko`mmp_thread+0xc1c
              kernel`fork_exit+0x7e
              kernel`0xffffffff8064129e
             2147
Comment 5 Andriy Gapon freebsd_committer 2020-07-14 06:04:40 UTC
I know that you will find very little comfort in this, but the issue does not happen in head (CURRENT). I am trying to find out what's causing the trouble on stable/12.
Comment 6 Andriy Gapon freebsd_committer 2020-07-14 06:17:44 UTC
To anyone having the problem who is able to compile a new kernel: could you please apply base r340664 to your source tree, rebuild the kernel and modules, reinstall, reboot, and see if the situation improves?

P.S.
direct patch link for your convenience:
https://svnweb.freebsd.org/base/head/sys/sys/time.h?view=patch&r1=340664&r2=340663&pathrev=340664
Comment 7 Gordon Bergling freebsd_committer 2020-07-14 07:06:56 UTC
(In reply to Andriy Gapon from comment #6)

I can test this on a recent 12-STABLE. I'll report back once I have some results.
Comment 8 Gordon Bergling freebsd_committer 2020-07-14 10:00:27 UTC
(In reply to Andriy Gapon from comment #6)

The patch doesn't apply cleanly on -STABLE. Would it be possible to just copy sys/sys/time.h over from head, or would this lead to unforeseen side effects?
Comment 9 Andriy Gapon freebsd_committer 2020-07-14 10:12:31 UTC
(In reply to Gordon Bergling from comment #8)
I guess the reason it does not apply is base r340450, which comes earlier.
Copying the file wholesale would probably work as well.

P.S.
base r346176 is not strictly required, but it would probably be good to merge it as well. This is a note for my future self.
Comment 10 Gordon Bergling freebsd_committer 2020-07-14 11:52:57 UTC
(In reply to Andriy Gapon from comment #9)

You were right, I tried the patch on a recent revision.

But now the good news: just copying sys/sys/time.h over from head solves the problem.

$ uptime
 1:51PM  up 4 mins, 1 user, load averages: 0.25, 0.15, 0.06

It would be great if the changes to time.h could be MFC'ed. :)
Comment 11 Andriy Gapon freebsd_committer 2020-07-14 12:29:44 UTC
Warner, would you like to do those MFC-s?
If you are busy elsewhere then I can get to doing that in a couple of days (maybe sooner).
Comment 12 Juraj Lutter 2020-07-14 14:51:28 UTC
FWIW, diff against 12-STABLE r363181: https://freebsd-stable.builder.wilbury.net/patches/time-load1-247829.diff
Comment 13 Juraj Lutter 2020-07-14 19:20:35 UTC
Same results for me:

root@b12:/usr/ports # uptime
 9:19PM  up 18 mins, 5 users, load averages: 0.03, 0.14, 0.13

Load average for an "idle" machine is now way below 1.00.
Comment 14 Jamie Landeg-Jones 2020-07-17 04:47:50 UTC
Sorry for the delay in replying. I copied the 'current' /usr/include/sys/time.h into src/sys/sys/ and /usr/include/sys/ and recompiled the kernel, and it too is now working correctly.

Cheers, Jamie
Comment 15 Oleh Hushchenkov 2020-07-21 05:39:11 UTC
Any news about this?
Comment 16 commit-hook freebsd_committer 2020-07-21 07:59:15 UTC
A commit references this bug:

Author: avg
Date: Tue Jul 21 07:58:39 UTC 2020
New revision: 363384
URL: https://svnweb.freebsd.org/changeset/base/363384

Log:
  MFC r340450,r340664,r346176 by imp: fix time conversions to and from sbt

  Note that the PR is for a change elsewhere (ZFS) that expected the sane
  behavior of nstosbt().

  PR:		247829
  Reported by:	gbe, others
  Tested by:	gbe, others

Changes:
_U  stable/12/
  stable/12/sys/sys/time.h
Comment 17 Oleh Hushchenkov 2020-07-21 08:02:58 UTC
Andriy, thank you!
Comment 18 Andriy Gapon freebsd_committer 2020-07-21 08:03:54 UTC
Tank you, everyone, for reporting and testing!
Comment 19 Andriy Gapon freebsd_committer 2020-07-21 08:04:31 UTC
s/Tank/Thank/ obviously :-)