After upgrading my FreeBSD-based firewall from 10.3R to 11.0R, the boot time (from power-on to the login prompt) rocketed from ~1 minute to more than 45 minutes, and even when up, the system was extremely slow. The loadavg was >2, with no process really taking that much CPU time, and the number of IRQ was not too high (200-300 per sec) even if 20% of the time was spent in IRQ. To rule out any oddity with my system config, I reproduced the problem with the bare official FreeBSD-10.3-RELEASE-amd64-memstick.img where the kernel takes 20s to boot, and FreeBSD-11.0-RELEASE-amd64-memstick.img takes more than 8 minutes, from "Booting..." to the login prompt. So, it took me several days of manual bisecting-compiling-memstick-booting, and I finally found the culprit: base r265474 The diff is pretty straightforward, in /etc/defaults/rc.conf: -performance_cx_lowest="HIGH" # Online CPU idle state +performance_cx_lowest="Cmax" # Online CPU idle state -economy_cx_lowest="HIGH" # Offline CPU idle state +economy_cx_lowest="Cmax" # Offline CPU idle state If I override these values by putting HIGH back in /etc/rc.conf.local, the problem disappears. I was able to manually confirm that if I set dev.cpu.N.cx_lowest to C2 (or C3), the system is horribly slow and sluggish, and back to C1 ("HIGH"), it's usable again. The commit message indicates that "This may not stay through 11.0-RELEASE, but at least having it on by default in -HEAD will expose (more) issues with broken hardware", so I guess my hardware is broken, even if I didn't observe such behavior previously under other OSes (namely Linux). I can imagine that anybody else booting FreeBSD 11.0 under such hardware would just give up, so it might be a good idea to detect this CPU on install and keep cx_lowest at C1? root@fw:~ # sysctl hw.machine hw.model dev.cpu.0 hw.machine: amd64 hw.model: Intel(R) Atom(TM) CPU 330 @ 1.60GHz dev.cpu.0.cx_method: C1/hlt C2/io C3/io dev.cpu.0.cx_usage_counters: 10731 0 0 dev.cpu.0.cx_usage: 100.00% 0.00% 0.00% last 4432us dev.cpu.0.cx_lowest: C1 dev.cpu.0.cx_supported: C1/1/0 C2/2/1 C3/3/85 dev.cpu.0.%parent: acpi0 dev.cpu.0.%pnpinfo: _HID=none _UID=0 dev.cpu.0.%location: handle=\_PR_.P001 dev.cpu.0.%driver: cpu dev.cpu.0.%desc: ACPI CPU
Adding the commit author, adrian, in CC
it sounds like something is broken with that sleep state handling. can you check the latest -head and see if it behaves any better/differently?
(In reply to Stéphane Lesimple from comment #0) Please provide a verbose dmesg of the boot, and output of sysctl kern.timecounter kern.eventtimer for 10.3 and 11.0. I am somewhat skeptical WRT the exact commit which you blame, since rc.conf settings cannot affect earliest boot, but you noted that the boot is slow.
(In reply to Adrian Chadd from comment #2) Compiled HEAD this morning, namely base r314326 and booted it. It does behave differently: at boot I couldn't observe the slowdown. However after some digging, I found out that HEAD's /etc/default/rc.conf is different and now contains: > performance_cx_lowest="NONE" Where 11.0R contained > performance_cx_lowest="Cmax" (introduced by base r265474) On my system, NONE translates to C1, as shown by sysctl -a, so that's why it behaves properly: it never tries to use C2 or C3. I tried to manually set dev.cpu.X.cx_lowest to C2, and see if the slowdown was still there, but it wasn't. However sysctl indicates that my setting is completely ignored and the system always uses C1, as seen by cx_usage: dev.cpu.0.cx_method: C1/hlt C2/io C3/io dev.cpu.0.cx_usage_counters: 4617 0 0 dev.cpu.0.cx_usage: 100.00% 0.00% 0.00% last 49467us dev.cpu.0.cx_lowest: C2 dev.cpu.0.cx_supported: C1/1/0 C2/2/1 C3/3/85 dev.cpu.0.%parent: acpi0 dev.cpu.0.%pnpinfo: _HID=none _UID=0 dev.cpu.0.%location: handle=\_PR_.P001 dev.cpu.0.%driver: cpu dev.cpu.0.%desc: ACPI CPU So this *is* indeed different from 11.0R. For some reason the kernel now chooses to keep C1 at all times, even if I tell it that C2 is okay. If you want me to try anything, just ask, I'm keeping HEAD's memstick.img around just in case.
(In reply to Konstantin Belousov from comment #3) You're right, the early stage of the boot is the same between 10.3 and 11.0. The massive slowdown starts somewhat late in the boot process. What I called "boot process" here is timed from the "Booting..." message (right after the kernel is loaded) up to the moment where ttyv0 shows the "login:" prompt. Those two moments are easier to measure with a stopwatch. So this probably includes the start time of most (limited) userland stuff from the bare release memstick image, run with the "livecd" option. Does that sounds more reasonable? In any case, I can for sure provide dmesgs and kern.{eventtimer,timecounter} for 10.3 and 11.0.
(In reply to Stéphane Lesimple from comment #5) Since you have HEAD kernel, most interesting is the verbose dmesg from HEAD boot and sysctls from there.
Created attachment 180350 [details] verbose dmesg of HEAD (r314326) Here you go. This is the dmesg of a verbose boot under HEAD. Note that as stated above, with this kernel, cx_lowest is set at C1 on bootup, and that even if manually set to C2 or more, 100% of the time is still spent in C1 for some reason. If you think a verbose dmesg of 10.3R and/or 11.0R would be of use, just ask.
(In reply to Stéphane Lesimple from comment #7) I also asked for sysctl output. From what I see in the dmesg, your CPU's LAPIC does not work in C2/C3. The fact that TSC is also not invariant forced the kernel to disable C2/C3 entrance, which you noted as well. There is also r309189 which downgrades LAPIC eventtimer quality. I am not sure that the commit is important for your machine, since it refuses to enter C2 at all. Yes, I probably also want to see verbose dmesg and sysctl output for 11.0. I might provide a recipe to restore C2/C3 usage after I see the data.
Created attachment 180369 [details] sysctl from HEAD Here is the previously missing sysctl output from a freshly booted HEAD
Created attachment 180372 [details] verbose dmesg of 11.0R Verbose dmesg of 11.0R (official memstick.img)
Created attachment 180373 [details] sysctl from 11.0R sysctl output from 11.0R, including dev.cpu.0 if it's of any use. Note that due to the bug, the sysctl output is from roughly 8 minutes after boot (as reported by `uptime')
(In reply to Stéphane Lesimple from comment #9) Do you have anything in loader.conf related to HPETs ? You have HPET registered for timecounter, but not for eventtimer, which is weird.
(In reply to Konstantin Belousov from comment #12) I don't, `grep -i hpet /boot/defaults/loader.conf /boot/loader.conf` returns zero lines on my HEAD test build.
Some new data points: The oldest version I could try, 8.0-RELEASE, behaves the same with C2 as in any more recent version (up to HEAD). This problem was hidden before due to the cx_lowest default always being C1 up to 11.0R. I also noticed that when C2 is set as cx_lowest, C1 is not used at all. Another interesting behavior is that the cx_usage_counters increases 10 times faster with C2 than with C1. It might also be interesting to note that C3 is ignored even if advertised as supported (which is probably irrelevant to this bug). # sysctl dev.cpu.0.cx_lowest=C1 ; sleep 10 ; sysctl dev.cpu.0.cx_usage_counters dev.cpu.0.cx_lowest: C1 -> C1 dev.cpu.0.cx_usage_counters: 1192 0 0 # sysctl dev.cpu.0.cx_lowest=C2 ; sleep 10 ; sysctl dev.cpu.0.cx_usage_counters dev.cpu.0.cx_lowest: C1 -> C2 dev.cpu.0.cx_usage_counters: 0 11122 0 # sysctl dev.cpu.0.cx_lowest=C3 ; sleep 10 ; sysctl dev.cpu.0.cx_usage_counters dev.cpu.0.cx_lowest: C2 -> C3 dev.cpu.0.cx_usage_counters: 0 11048 0
A commit references this bug: Author: brnrd Date: Sat Oct 24 11:30:13 UTC 2020 New revision: 553184 URL: https://svnweb.freebsd.org/changeset/ports/553184 Log: databases/mariadb105-server: Security update to 10.5.6 * Partial rc.d fixes from [1] PR: 217247 [1] MFH: 2020Q4 Security: a2565962-1156-11eb-9c9c-d4c9ef517024 Changes: head/databases/mariadb105-server/Makefile head/databases/mariadb105-server/distinfo head/databases/mariadb105-server/files/mysql-server.in head/databases/mariadb105-server/pkg-plist
A commit references this bug: Author: brnrd Date: Sat Oct 24 12:05:19 UTC 2020 New revision: 553189 URL: https://svnweb.freebsd.org/changeset/ports/553189 Log: MFH: r553184 databases/mariadb105-server: Security update to 10.5.6 * Partial rc.d fixes from [1] PR: 217247 [1] Security: a2565962-1156-11eb-9c9c-d4c9ef517024 Approved by: ports-secteam (fluffy) Changes: _U branches/2020Q4/ branches/2020Q4/databases/mariadb105-server/Makefile branches/2020Q4/databases/mariadb105-server/distinfo branches/2020Q4/databases/mariadb105-server/files/mysql-server.in branches/2020Q4/databases/mariadb105-server/pkg-plist