Bug 173541 - load average 0.60 at 100% idle
Summary: load average 0.60 at 100% idle
Status: In Progress
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: unspecified
Hardware: Any Any
Importance: Normal Affects Some People
Assignee: freebsd-acpi mailing list
URL:
Keywords: regression
Depends on:
Blocks:
 
Reported: 2012-11-10 23:50 UTC by ultramage
Modified: 2018-03-21 08:56 UTC
CC: 9 users

See Also:


Attachments
messages.diff (3.11 KB, patch)
2012-11-11 19:26 UTC, ultramage
messages.diff (3.11 KB, patch)
2012-11-13 15:58 UTC, ultramage
broken loadavgs, a fix, then fixed loadavgs (57.58 KB, text/plain)
2016-03-03 14:45 UTC, smithi
verbose boot Lenovo X200 (53.06 KB, text/plain)
2016-03-03 14:47 UTC, smithi

Description ultramage 2012-11-10 23:50:00 UTC
I run FreeBSD on an Intel Atom D525MWV board. On recent FreeBSD versions, I am observing unusual load averages reported by 'top' even when the system is 100% idle. It happens on a normal system, in single-user mode, and on the official USB install images. I have not yet confirmed whether the load is real or whether the kernel is just reporting the values incorrectly.

Here are the kernels I have tried. Four of them are from the currently available USB memstick installers; the 2011 one is the version I had been using successfully for a year until downgrading recently.

8.3-RELEASE  amd64 2012-04-09           -> load 0.02 after 5 minutes idle
9.0-RELEASE  amd64 2012-01-03           -> load 0.02 after 5 minutes idle
10.0-CURRENT amd64 2011-11-12 (cvs)     -> load 0.02 after 5 minutes idle
9.1-RC3      amd64 2012-10-30 (r242324) -> load 0.60 after 5 minutes idle
10.0-CURRENT amd64 2012-11-02 (r242464) -> load 0.60 after 5 minutes idle

How-To-Repeat:
1) Boot the system on affected hardware(?).
2) Leave 'top' running for a few minutes.
3) Observe the load averages.
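
A quick way to log this unattended (a sketch; any sampling interval will do):

# while true; do uptime; sleep 30; done

On an unaffected kernel the three load averages settle near zero within a few minutes of idling.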
Comment 1 ultramage 2012-11-11 05:50:10 UTC
A night without sleep and a chain of manual bisection steps later:
stable/9 r232058 2012-02-23 - pass
stable/9 r232129 2012-02-24 - fail
(will continue to narrow it down later)
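
For anyone repeating this, the manual bisection loop looks roughly like the following (a sketch; the revision shown is simply the midpoint of the pass/fail range above, and the kernel config name may differ):

# cd /usr/src
# svn update -r 232093
# make buildkernel KERNCONF=GENERIC && make installkernel KERNCONF=GENERIC
# shutdown -r now

After the reboot, leave the machine idle for ~5 minutes, check the load average, then pick the next midpoint revision and repeat.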
Comment 2 ultramage 2012-11-11 19:26:25 UTC
The problem first occurs at r232086 (original head revision r231161):
http://svnweb.freebsd.org/base?view=revision&revision=232086

It contains some ACPI-related and tick-source-related changes to their
probing order/priority. I observed that this does indeed change the way
they are reported in /var/log/messages. In particular, the order changes
and they are initialized much earlier in the startup process. Diff attached.
Comment 3 ultramage 2012-11-13 15:58:03 UTC
The offending revision where the issue can first be observed in stable/9
is r232086 (MFC of r231161) -
http://svnweb.freebsd.org/base?view=revision&revision=232086
The consequence of that change can be observed in /var/log/messages:
the timer sources get initialized much earlier and in a different order.
Diff attached.
Comment 4 je.broman 2013-01-03 22:19:00 UTC
Hi!

I'm getting the exact same issue. I upgraded from FreeBSD 9.0 to FreeBSD 9.1
and my load average jumped from 0.01 to 0.60, while the server is doing pretty
much nothing at all.

I'm also seeing the timer differences in dmesg as described earlier in the
bug report.

# sysctl -a | egrep "kern.event|clock"
kern.clockrate: { hz = 1000, tick = 1000, profhz = 8126, stathz = 127 }
kern.eventtimer.choice: HPET(450) HPET1(440) HPET2(440) LAPIC(400) i8254(100) RTC(0)
kern.eventtimer.et.LAPIC.flags: 15
kern.eventtimer.et.LAPIC.frequency: 0
kern.eventtimer.et.LAPIC.quality: 400
kern.eventtimer.et.i8254.flags: 1
kern.eventtimer.et.i8254.frequency: 1193182
kern.eventtimer.et.i8254.quality: 100
kern.eventtimer.et.RTC.flags: 17
kern.eventtimer.et.RTC.frequency: 32768
kern.eventtimer.et.RTC.quality: 0
kern.eventtimer.et.HPET.flags: 3
kern.eventtimer.et.HPET.frequency: 14318180
kern.eventtimer.et.HPET.quality: 450
kern.eventtimer.et.HPET1.flags: 3
kern.eventtimer.et.HPET1.frequency: 14318180
kern.eventtimer.et.HPET1.quality: 440
kern.eventtimer.et.HPET2.flags: 3
kern.eventtimer.et.HPET2.frequency: 14318180
kern.eventtimer.et.HPET2.quality: 440
kern.eventtimer.periodic: 0
kern.eventtimer.timer: HPET
kern.eventtimer.activetick: 1
kern.eventtimer.idletick: 0
kern.eventtimer.singlemul: 2
debug.acpi.reset_clock: 1
debug.clocktime: 0
hw.clockrate: 2133
machdep.wall_cmos_clock: 0
dev.atrtc.0.%desc: AT realtime clock


# dmesg | more
Copyright (c) 1992-2012 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 9.1-RELEASE #0 r243825: Tue Dec  4 09:23:10 UTC 2012
    root@farrell.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64
CPU: Intel(R) Core(TM)2 CPU          6400  @ 2.13GHz (2133.45-MHz K8-class CPU)
  Origin = "GenuineIntel"  Id = 0x6f2  Family = 6  Model = f  Stepping = 2

Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0xe3bd<SSE3,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM>
  AMD Features=0x20100800<SYSCALL,NX,LM>
  AMD Features2=0x1<LAHF>
  TSC: P-state invariant, performance statistics
real memory  = 4296015872 (4097 MB)
avail memory = 4101558272 (3911 MB)
Event timer "LAPIC" quality 400
ACPI APIC Table: <COMPAQ GLENWOOD>
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
FreeBSD/SMP: 1 package(s) x 2 core(s)
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
ioapic0: Changing APIC ID to 1
ioapic0 <Version 2.0> irqs 0-23 on motherboard
kbd1 at kbdmux0
acpi0: <HPQOEM SLIC-WKS> on motherboard
acpi0: Power Button (fixed)
acpi0: reservation of 0, a0000 (3) failed
acpi0: reservation of 100000, dff00000 (3) failed
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
attimer0: <AT timer> port 0x40-0x43 irq 0 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
atrtc0: <AT realtime clock> port 0x70-0x71 irq 8 on acpi0
Event timer "RTC" frequency 32768 Hz quality 0
hpet1: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
Event timer "HPET" frequency 14318180 Hz quality 450
Event timer "HPET1" frequency 14318180 Hz quality 440
Event timer "HPET2" frequency 14318180 Hz quality 440
hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
device_attach: hpet0 attach returned 12
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
acpi_timer0: <24-bit timer at 3.579545MHz> port 0xf808-0xf80b on acpi0

Regards
Johan
Comment 5 ultramage 2013-01-03 22:44:46 UTC
For my system's kernel I just backed out the one revision I mentioned
earlier. No ill effects observed. Since the change is so low-level and
no rationale is provided in the commit message, I do not know what
improvement it was supposed to achieve. I tried e-mailing the author of
that commit, but got no response.
Also, due to the low-power nature of my system, I can't tell whether the
0.60 load is real or just a by-product of broken time accounting. Either
way, it might be affecting the process scheduler.
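
For reference, backing out a single MFC'd revision in a stable/9 source tree can be done roughly like this (a hedged sketch; the revision number is the one from comment 3, the paths and config name are illustrative):

# cd /usr/src
# svn merge -c -232086 .
# make buildkernel KERNCONF=GENERIC && make installkernel KERNCONF=GENERIC
# shutdown -r now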
Comment 6 je.broman 2013-01-03 23:06:16 UTC
Hi!

I see. I got around the problem in FreeBSD 9.1 by changing the clock
source (the event timer), like this:

# sysctl -w kern.eventtimer.timer=LAPIC

The load avg dropped to 0.00 again. You can try the different clock sources
available to you and see what works best. Some sources might increase the
number of interrupts (like the RTC) and some might cause more context
switching or CPU load. For me the LAPIC works best. You can list your clock
sources using:

# sysctl kern.eventtimer.choice
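
If switching the event timer helps, a minimal way to persist the setting across reboots might be to add the following line to /etc/sysctl.conf (assuming kern.eventtimer.timer remains a writable runtime sysctl, as shown above):

kern.eventtimer.timer=LAPIC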

From what I understand, measuring time can be tricky because new systems
can regulate core frequency on the fly (in HW) and virtualization also
increases the difficulty...

It seems the order and/or weight of the clock sources has changed. When I
have a chance I will reboot into the old kernel and see what has been
changed. I'm totally new to FreeBSD so I don't know the normal handling of
these bugs but I'll put the author of the patch on cc as well :)

Good luck!
Johan
Comment 7 frederique 2014-01-31 06:14:14 UTC
I'm seeing the same issues in FreeBSD 10-RELEASE.

A VM has a load average of about 0.5, while a physical install shows ~0.3.
My tests were even done in single-user mode.

https://forums.freebsd.org/viewtopic.php?f=4&t=44600
Comment 8 Jeremy Chadwick 2015-10-27 04:44:43 UTC
(Warning: below is a long comment with a large amount of detail.  I am CC'ing mav@ because I believe this may have to do with his eventtimer changes made sometime in 2012[1].  I'll provide some information that may greatly help narrow down the root cause near the end).

I reported this problem, specific to stable/10 (i.e. the issue I see may be different than the one reported here), back in June 2014[2].  A visual representation of the difference between stable/10 and stable/9 on said hardware (bare-metal, not virtualised) is available[3].  The system in question still runs stable/9 (r289992) today without any load oddities.

All of my KTR details and so on from that system are available here[4], where Adrian Chadd attempted to help me figure out what was going on.  The mail thread (see [1] again) consisted of many people chiming in with useful details.  At some point during this thread, maybe not publicly (I would need to review my mail archives), someone (John Baldwin?  I'm not certain, my memory is not as good as it used to be) informed me that stable/10 had its load calculation algorithm changed.

Since July 2014, I have been able to reproduce this problem with stable/10 on several different types of systems:

- Supermicro X7SBA (bare metal)
- VMware Workstation 11.x (FreeBSD stable/10 is a guest)
- VPS provider Vultr.com (using QEMU) (FreeBSD stable/10 is a guest)

There are, however, some individuals running stable/10 (e.g. a friend of mine using FreeNAS) on bare-metal hardware that do not experience this problem.

Today I spent some time on a stable/10 (r289827) guest under VMware Workstation 11.x where I was able to reproduce the issue and found something surprising:

kern.eventtimer.periodic=1 completely alleviates the problem -- the instant this sysctl is set to 1 ("periodic mode") the load begins to plummet and eventually hits 0.00, only increasing when there is "real" load.

You can even set kern.eventtimer.periodic=0 ("one-shot") maybe 5-6 minutes after that and still see the system load at 0.00, although over the next 10+ minutes it will begin to gradually increase again.

The downside/problem to using "periodic mode": vmstat -i shows a cpuX:timer rate of almost 2000 on both CPU cores, which, if I understand eventtimers(4) correctly, indicates extremely high interrupt usage (in my case on the LAPIC, i.e. the local APIC).

I just find it very interesting that somehow kern.eventtimer.periodic=1 is able to alleviate the issue.  This leads me to wonder if the issue is with eventtimers(4) changes, or with the load calculation algorithm (or some "intermediary" piece that isn't taking into consideration certain things).
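
For reference, a minimal sketch of the experiment described above (the wait times are arbitrary; the exact vmstat -i counter names vary per system):

# sysctl vm.loadavg kern.eventtimer.timer kern.eventtimer.periodic
# vmstat -i | grep timer
# sysctl kern.eventtimer.periodic=1
(wait ~5 minutes; vm.loadavg falls toward 0.00 while the cpuX:timer rates rise sharply)
# sysctl kern.eventtimer.periodic=0
(wait ~10 minutes; vm.loadavg gradually climbs back up)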

On the VM in question, the eventtimer choices are:

root@test-freebsd:~ # sysctl -a | grep kern.eventtimer
kern.eventtimer.periodic: 0
kern.eventtimer.timer: LAPIC
kern.eventtimer.idletick: 0
kern.eventtimer.singlemul: 2
kern.eventtimer.choice: LAPIC(600) i8254(100) RTC(0)
kern.eventtimer.et.RTC.quality: 0
kern.eventtimer.et.RTC.frequency: 32768
kern.eventtimer.et.RTC.flags: 17
kern.eventtimer.et.i8254.quality: 100
kern.eventtimer.et.i8254.frequency: 1193182
kern.eventtimer.et.i8254.flags: 1
kern.eventtimer.et.LAPIC.quality: 600
kern.eventtimer.et.LAPIC.frequency: 33000101
kern.eventtimer.et.LAPIC.flags: 7

On a Vultr.com VPS (QEMU), the eventtimer choices are (according to stable/9, which I use there):

kern.eventtimer.singlemul: 2
kern.eventtimer.idletick: 0
kern.eventtimer.activetick: 1
kern.eventtimer.timer: LAPIC
kern.eventtimer.periodic: 0
kern.eventtimer.et.i8254.quality: 100
kern.eventtimer.et.i8254.frequency: 1193182
kern.eventtimer.et.i8254.flags: 1
kern.eventtimer.et.RTC.quality: 0
kern.eventtimer.et.RTC.frequency: 32768
kern.eventtimer.et.RTC.flags: 17
kern.eventtimer.et.LAPIC.quality: 400
kern.eventtimer.et.LAPIC.frequency: 500017353
kern.eventtimer.et.LAPIC.flags: 15
kern.eventtimer.choice: LAPIC(400) i8254(100) RTC(0)

And finally on the aforementioned Supermicro X7SBA box, the eventtimer choices (per stable/9) are:

kern.eventtimer.singlemul: 2
kern.eventtimer.idletick: 0
kern.eventtimer.activetick: 1
kern.eventtimer.timer: LAPIC
kern.eventtimer.periodic: 0
kern.eventtimer.et.i8254.quality: 100
kern.eventtimer.et.i8254.frequency: 1193182
kern.eventtimer.et.i8254.flags: 1
kern.eventtimer.et.RTC.quality: 0
kern.eventtimer.et.RTC.frequency: 32768
kern.eventtimer.et.RTC.flags: 17
kern.eventtimer.et.LAPIC.quality: 400
kern.eventtimer.et.LAPIC.frequency: 166681842
kern.eventtimer.et.LAPIC.flags: 15
kern.eventtimer.choice: LAPIC(400) i8254(100) RTC(0)

Per hpet(4) I have tried the following in /boot/loader.conf on all of the above VM systems to no avail (probably because they do not use HPET):

hint.hpet.X.legacy_route=1
hint.attimer.0.clock=0
hint.atrtc.0.clock=0

Another thing I found interesting on stable/10 under the VMware VM was that this message is printed:

Event timer "HPET" frequency 14318180 Hz quality 550

Yet that event timer is nowhere to be found.

Anyway, this problem has come up *repeatedly* on multiple FreeBSD mailing lists and on the forums yet has gotten no traction[5][6][7][8][9][10].  I feel, respectfully, that this has gone on long enough.  This problem greatly diminishes the feasibility of using FreeBSD as a server operating system (when you can no longer trust or rely on the load average, what can you believe?).  End-users and even skilled/senior administrators do not know how to diagnose, troubleshoot, or rectify this problem given where it lies in kernel space.  Please remember: the problem is 100% reproducible.

I am happy to pay for a low-end Vultr.com dedicated VPS (includes console access as long as you have a web browser w/ Javascript) and/or set up a dedicated VMware Workstation VM (w/ remote VNC console) for any developer which wants to take it on, on my own dime.  I cannot provide the aforementioned Supermicro X7SBA box for testing (it's my FreeBSD box and must remain usable).

Alexander (mav@) -- what can we provide you that can help narrow this down?

P.S. -- The Importance field needs to be changed from Affects Only Me to Affects Some People, as this is definitely not specific to just Viktor.  Also, I would strongly suggest changing Depends On to reference bug 192315, which references eventtimer issues (but may be a different problem -- I'll let mav@ decide).

[1]: https://lists.freebsd.org/pipermail/freebsd-net/2012-April/031893.html
[2]: https://lists.freebsd.org/pipermail/freebsd-stable/2014-July/079386.html
[3]: http://jdc.koitsu.org/freebsd/releng10_perf_issue/loadavg_comparison.png
[4]: http://jdc.koitsu.org/freebsd/releng10_perf_issue/
[5]: https://lists.freebsd.org/pipermail/freebsd-virtualization/2014-October/002835.html
[6]: https://forums.freebsd.org/threads/high-idle-load-in-vps.52688/
[7]: https://forums.freebsd.org/threads/high-cpu-utilization-by-average-load-values-while-no-processes-in-top-output-have-high-cpu-load.48933/
[8]: https://lists.freebsd.org/pipermail/freebsd-bugs/2013-January/051385.html
[9]: https://lists.freebsd.org/pipermail/freebsd-xen/2014-January/002006.html
[10]: https://forums.freebsd.org/threads/high-load-average-with-idle-state.38757/
Comment 9 Alexander Motin freebsd_committer 2015-10-27 08:33:02 UTC
I don't know whether it is the only cause of such behavior, but I have one logical explanation.  The problem stems from the sampled nature of our load average calculation.  To reduce power usage, the new clock code tries to aggregate events into bursts to reduce the number of CPU wakeups.  As a result, if the system is idle, a CPU may wake up as few as two times per second.  And if that CPU has some other events waking it up, it is quite likely they will be aggregated with the load average sampling.  As a result, there is a high enough probability that the sampler will see some "load" to account, while it has no idea that that load will disappear within a microsecond and stay away for another second.

Enabling periodic interrupts breaks the event aliasing -- load average sampling will more likely fire in moments when the system is really idle.  But do we want to consume several extra watts of power just for one good-looking number?  To solve this we should probably change our mechanism of load average calculation somehow.  If anybody has ideas -- I am all ears.
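
For illustration only (not kernel code; the 60% hit rate and the decay constant are assumptions): the BSD load average is an exponentially weighted moving average of periodic run-queue samples taken roughly every 5 seconds, so if a brief periodic wakeup is aliased with the sampler such that about 60% of samples happen to catch one runnable thread, the 5-minute average converges to about 0.6 even though the CPU is essentially idle. A throwaway awk simulation of that effect:

awk 'BEGIN {
    decay = exp(-5.0/300);              # 5-second samples, 5-minute average
    load = 0.0; srand(42);
    for (i = 0; i < 720; i++) {         # one simulated hour of samples
        nrun = (rand() < 0.6) ? 1 : 0;  # sampler aliases with the wakeup 60% of the time
        load = load * decay + nrun * (1 - decay);
    }
    printf "steady-state 5-minute load: %.2f\n", load;
}'

This prints a value near 0.6 despite the simulated "work" lasting only an instant at each sample point.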
Comment 10 ultramage 2015-10-27 08:49:59 UTC
On 10.2-RELEASE, my long-term load average is around 0.10, which may be closer to reality because of powerd (225-1800MHz frequency scaling range). The currently chosen timecounter is TSC and the eventtimer is HPET. Unfortunately I forgot to note down the previous values. I can try an older kernel if needed.
Comment 11 amvandemore 2015-10-29 13:13:36 UTC
The load average has been known to be broken for a long time, pre-dating this issue by some time, I think:

https://lists.freebsd.org/pipermail/freebsd-current/2011-September/027138.html

I think mav's observation is basically what's happening above.
Comment 12 smithi 2016-03-03 14:45:47 UTC
Created attachment 167688 [details]
broken loadavgs, a fix, then fixed loadavgs
Comment 13 smithi 2016-03-03 14:47:56 UTC
Created attachment 167689 [details]
verbose boot Lenovo X200
Comment 14 smithi 2016-03-03 14:51:45 UTC
The first 40% of the attachment is a study of the "0.6" erroneous loadavgs
also experienced by others, shown in some detail over long periods at
virtually complete idle and with a number of 100% CPU-bound tasks running.

This shows that the oft-mentioned "0.6" figure (seemingly randomly variable
over a range of ~0.45 to ~0.75, with some outliers even beyond that) is an
ADDITIVE increment to the real loadavg, for real 100% CPU loads of up to 11 here.

About 40% of the way through the report, a pair of cpuset(1) commands is
applied, which on this machine completely fixes the broken loadavgs,
permanently until reboot.

I'll refrain from further speculation on what might cause this to happen.

The rest of the report illustrates the well-behaved nature of HPET-derived
loadavgs (both one-shot and periodic) at idle and at real loads of 1, 2, 4
and 11 CPU-bound tasks, after this particular magic spell has been cast...

Verbose dmesg is also attached.
Comment 15 Jeremy Chadwick 2016-08-15 18:34:32 UTC
Thought I'd add that I checked 11.0-BETA4 for this problem: it still exists.
Comment 16 Aleksander Alekseev 2016-08-15 19:13:58 UTC
Jeremy, you probably shouldn't test the load average on a -BETA, since it's built with INVARIANTS and WITNESS enabled. I'm not an expert, but I wouldn't be surprised if these options affect the load average.

I've just checked 11.0-RC1 (in which the kernel is built without these options) as described above and got a load average of 0.03. Since the system is never doing literally nothing (there is the system clock, the process scheduler, etc.), I would say that's pretty much expected behavior. At least that's what I see on my hardware (Fujitsu Lifebook E733).
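
(For what it's worth, one way to check whether a running kernel was built with those options might be the following, assuming INCLUDE_CONFIG_FILE is set as it is in GENERIC; no output means neither option is compiled in:

# sysctl -n kern.conftxt | grep -E 'INVARIANT|WITNESS'
)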
Comment 17 Jeremy Chadwick 2016-08-15 19:30:49 UTC
The problem varies per hardware/system (see comment #8), including multiple types/implementations of VMs.  Ian Lepore (smithi@nimnet.asn.au) already covered part of this in his "broken loadavg, a fix, then fixed loadavgs" attachment (it has a lengthy explanation).  The issue almost certainly has something to do with eventtimers or timecounters.  Ian and I chatted for a bit off-ticket about some of the possibilities, but neither of us could narrow down the root cause.

I'm happy to wait for 11.0-RELEASE and try that, but I'm inclined to believe the issue is still present unless a committer can point to a commit that actually fixed it; the only commits I follow presently are to stable/9, and a search of stable/10 for the word "load" turns up no commits that look relevant.  So, call it sysadmin + FreeBSD-user-since-1997 gut instinct.  :-)
Comment 18 Jeremy Chadwick 2017-01-20 00:48:34 UTC
The problem described continues to exist on the following present-day FreeBSD releases (I have tested all of them; I did not test x86):

* 10.3-RELEASE amd64
* 11.0-RELEASE amd64
* 12.0-CURRENT amd64 (r309773)

Getting traction on this is important due to the EoL of the 9.x series as of January 1st 2017, compounded by the ports infrastructure strongly advocating ceasing use of 9.x (see ports r431746 for proof).
Comment 19 Mark Linimon freebsd_committer freebsd_triage 2017-01-27 08:21:14 UTC
Apparently ACPI-related.  Notifying the committer of r232086 -- although this commit was made several years ago.
Comment 20 Jeremy Chadwick 2017-01-27 18:15:37 UTC
For what it's worth: below is a spreadsheet of all the relevant systems I've tested, including OS version, timecounters available (and chosen), eventtimers available (and chosen), etc.

https://docs.google.com/spreadsheets/d/1xsKZ6yNMWW6TA_ZaYMwzZ-vI3C-pSQ9QAJk_yjPBOIQ/edit?usp=sharing
Comment 21 Jeremy Chadwick 2017-01-28 10:38:08 UTC
I wanted to add this piece of data (reflected in the spreadsheet):

I was able to reproduce this problem on Amazon EC2 using official FreeBSD AMIs (both 10.3-STABLE and 11.0-RELEASE) on a t2.micro instance.  Launch, wait several minutes, then look at the system.  There, the timecounter is XENTIMER and the eventtimer is XENTIMER.
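
For anyone adding systems to that comparison, a minimal set of commands to collect the same data might be (a sketch; field availability varies slightly by FreeBSD version):

# uname -a
# sysctl kern.timecounter.choice kern.timecounter.hardware
# sysctl kern.eventtimer.choice kern.eventtimer.timer kern.eventtimer.periodic
# sysctl vm.loadavg
# vmstat -i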
Comment 22 Nick 2017-02-19 22:47:56 UTC
Hi,

I have the same problem on pretty much all of the VPSes I use at different providers. On a brand new FreeBSD 11.0 amd64 install at Leaseweb, VULTR, Liteserver.nl, Amazon, etc., I see high loads (sometimes hitting 2.00) on idle virtual servers with nothing installed.

I've seen this high load issue reported many times on various sites and forums. It makes it hard to see accurately how much CPU or load my virtual servers are actually using, since I can't rely on FreeBSD's load reporting.

Thank you,
- Nick