New build with Threadripper 1920X. The kernel reports the following error repeatedly - ranging from a few seconds to a couple minutes between repeats: hwpstate0: set freq failed, err 6 No change if powerd is on or off. uname -av FreeBSD gzy 11.1-RELEASE FreeBSD 11.1-RELEASE #0 r321309: Fri Jul 21 02:08:28 UTC 2017 root@releng2.nyi.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64 ---- sysctl -a | grep hwpstate hwpstate0: set freq failed, err 6 [repeated / fills buffer] debug.hwpstate_verbose: 0 dev.hwpstate.0.freq_settings: 3500/4287 3000/3187 2200/1897 dev.hwpstate.0.%parent: cpu0 dev.hwpstate.0.%pnpinfo: dev.hwpstate.0.%location: dev.hwpstate.0.%driver: hwpstate dev.hwpstate.0.%desc: Cool`n'Quiet 2.0 dev.hwpstate.%parent: ---- sysctl -a | grep cpu kern.smp.cpus: 24 kern.smp.maxcpus: 256 kern.ccpu: 0 <cpu count="24" mask="ffffff,0,0,0">0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23</cpu> <cpu count="6" mask="3f,0,0,0">0, 1, 2, 3, 4, 5</cpu> <cpu count="2" mask="3,0,0,0">0, 1</cpu> <cpu count="2" mask="c,0,0,0">2, 3</cpu> <cpu count="2" mask="30,0,0,0">4, 5</cpu> <cpu count="6" mask="fc0,0,0,0">6, 7, 8, 9, 10, 11</cpu> <cpu count="2" mask="c0,0,0,0">6, 7</cpu> <cpu count="2" mask="300,0,0,0">8, 9</cpu> <cpu count="2" mask="c00,0,0,0">10, 11</cpu> <cpu count="6" mask="3f000,0,0,0">12, 13, 14, 15, 16, 17</cpu> <cpu count="2" mask="3000,0,0,0">12, 13</cpu> <cpu count="2" mask="c000,0,0,0">14, 15</cpu> <cpu count="2" mask="30000,0,0,0">16, 17</cpu> <cpu count="6" mask="fc0000,0,0,0">18, 19, 20, 21, 22, 23</cpu> <cpu count="2" mask="c0000,0,0,0">18, 19</cpu> <cpu count="2" mask="300000,0,0,0">20, 21</cpu> <cpu count="2" mask="c00000,0,0,0">22, 23</cpu> kern.sched.cpusetsize: 32 kern.pin_pcpu_swi: 0 kern.racct.pcpu_threshold: 1 cpu HAMMER device cpufreq kern.vt.splash_cpu_duration: 10 kern.vt.splash_cpu_style: 2 kern.vt.splash_ncpu: 0 kern.vt.splash_cpu: 0 vfs.ncpurgeminvnodes: 2048 net.inet.tcp.per_cpu_timers: 0 debug.cpufreq.verbose: 0 debug.cpufreq.lowest: 0 debug.acpi.cpu_unordered: 0 hw.ncpu: 24 hw.acpi.cpu.cx_lowest: C2 dev.cpufreq.0.%parent: cpu0 dev.cpufreq.0.%pnpinfo: dev.cpufreq.0.%location: dev.cpufreq.0.%driver: cpufreq dev.cpufreq.0.%desc: dev.cpufreq.%parent: dev.hwpstate.0.%parent: cpu0 dev.acpi_perf.23.%parent: cpu23 dev.acpi_perf.22.%parent: cpu22 dev.acpi_perf.21.%parent: cpu21 dev.acpi_perf.20.%parent: cpu20 dev.acpi_perf.19.%parent: cpu19 dev.acpi_perf.18.%parent: cpu18 dev.acpi_perf.17.%parent: cpu17 dev.acpi_perf.16.%parent: cpu16 dev.acpi_perf.15.%parent: cpu15 dev.acpi_perf.14.%parent: cpu14 dev.acpi_perf.13.%parent: cpu13 dev.acpi_perf.12.%parent: cpu12 dev.acpi_perf.11.%parent: cpu11 dev.acpi_perf.10.%parent: cpu10 dev.acpi_perf.9.%parent: cpu9 dev.acpi_perf.8.%parent: cpu8 dev.acpi_perf.7.%parent: cpu7 dev.acpi_perf.6.%parent: cpu6 dev.acpi_perf.5.%parent: cpu5 dev.acpi_perf.4.%parent: cpu4 dev.acpi_perf.3.%parent: cpu3 dev.acpi_perf.2.%parent: cpu2 dev.acpi_perf.1.%parent: cpu1 dev.acpi_perf.0.%parent: cpu0 dev.nvme.0.num_cpus_per_ioq: 4 dev.cpu.23.cx_method: C1/hlt C2/io dev.cpu.23.cx_usage_counters: 2453945 4102905 dev.cpu.23.cx_usage: 37.42% 62.57% last 979us dev.cpu.23.cx_lowest: C2 dev.cpu.23.cx_supported: C1/1/1 C2/2/400 dev.cpu.23.%parent: acpi0 dev.cpu.23.%pnpinfo: _HID=none _UID=0 dev.cpu.23.%location: handle=\_PR_.C017 dev.cpu.23.%driver: cpu dev.cpu.23.%desc: ACPI CPU dev.cpu.22.cx_method: C1/hlt C2/io dev.cpu.22.cx_usage_counters: 2600457 4181710 dev.cpu.22.cx_usage: 38.34% 61.65% last 3711us dev.cpu.22.cx_lowest: C2 dev.cpu.22.cx_supported: C1/1/1 C2/2/400 dev.cpu.22.%parent: acpi0 dev.cpu.22.%pnpinfo: _HID=none _UID=0 dev.cpu.22.%location: handle=\_PR_.C016 dev.cpu.22.%driver: cpu dev.cpu.22.%desc: ACPI CPU dev.cpu.21.cx_method: C1/hlt C2/io dev.cpu.21.cx_usage_counters: 2422602 3977879 dev.cpu.21.cx_usage: 37.85% 62.14% last 3908us dev.cpu.21.cx_lowest: C2 dev.cpu.21.cx_supported: C1/1/1 C2/2/400 dev.cpu.21.%parent: acpi0 dev.cpu.21.%pnpinfo: _HID=none _UID=0 dev.cpu.21.%location: handle=\_PR_.C015 dev.cpu.21.%driver: cpu dev.cpu.21.%desc: ACPI CPU dev.cpu.20.cx_method: C1/hlt C2/io dev.cpu.20.cx_usage_counters: 2536009 4052910 dev.cpu.20.cx_usage: 38.48% 61.51% last 1052us dev.cpu.20.cx_lowest: C2 dev.cpu.20.cx_supported: C1/1/1 C2/2/400 dev.cpu.20.%parent: acpi0 dev.cpu.20.%pnpinfo: _HID=none _UID=0 dev.cpu.20.%location: handle=\_PR_.C014 dev.cpu.20.%driver: cpu dev.cpu.20.%desc: ACPI CPU dev.cpu.19.cx_method: C1/hlt C2/io dev.cpu.19.cx_usage_counters: 2341676 3899865 dev.cpu.19.cx_usage: 37.51% 62.48% last 2us dev.cpu.19.cx_lowest: C2 dev.cpu.19.cx_supported: C1/1/1 C2/2/400 dev.cpu.19.%parent: acpi0 dev.cpu.19.%pnpinfo: _HID=none _UID=0 dev.cpu.19.%location: handle=\_PR_.C013 dev.cpu.19.%driver: cpu dev.cpu.19.%desc: ACPI CPU dev.cpu.18.cx_method: C1/hlt C2/io dev.cpu.18.cx_usage_counters: 2434036 3973339 dev.cpu.18.cx_usage: 37.98% 62.01% last 3489us dev.cpu.18.cx_lowest: C2 dev.cpu.18.cx_supported: C1/1/1 C2/2/400 dev.cpu.18.%parent: acpi0 dev.cpu.18.%pnpinfo: _HID=none _UID=0 dev.cpu.18.%location: handle=\_PR_.C012 dev.cpu.18.%driver: cpu dev.cpu.18.%desc: ACPI CPU dev.cpu.17.cx_method: C1/hlt C2/io dev.cpu.17.cx_usage_counters: 2355975 4002148 dev.cpu.17.cx_usage: 37.05% 62.94% last 1924us dev.cpu.17.cx_lowest: C2 dev.cpu.17.cx_supported: C1/1/1 C2/2/400 dev.cpu.17.%parent: acpi0 dev.cpu.17.%pnpinfo: _HID=none _UID=0 dev.cpu.17.%location: handle=\_PR_.C011 dev.cpu.17.%driver: cpu dev.cpu.17.%desc: ACPI CPU dev.cpu.16.cx_method: C1/hlt C2/io dev.cpu.16.cx_usage_counters: 2534527 4129061 dev.cpu.16.cx_usage: 38.03% 61.96% last 507us dev.cpu.16.cx_lowest: C2 dev.cpu.16.cx_supported: C1/1/1 C2/2/400 dev.cpu.16.%parent: acpi0 dev.cpu.16.%pnpinfo: _HID=none _UID=0 dev.cpu.16.%location: handle=\_PR_.C010 dev.cpu.16.%driver: cpu dev.cpu.16.%desc: ACPI CPU dev.cpu.15.cx_method: C1/hlt C2/io dev.cpu.15.cx_usage_counters: 2298008 3937197 dev.cpu.15.cx_usage: 36.85% 63.14% last 1186us dev.cpu.15.cx_lowest: C2 dev.cpu.15.cx_supported: C1/1/1 C2/2/400 dev.cpu.15.%parent: acpi0 dev.cpu.15.%pnpinfo: _HID=none _UID=0 dev.cpu.15.%location: handle=\_PR_.C00F dev.cpu.15.%driver: cpu dev.cpu.15.%desc: ACPI CPU dev.cpu.14.cx_method: C1/hlt C2/io dev.cpu.14.cx_usage_counters: 2994036 4853700 dev.cpu.14.cx_usage: 38.15% 61.84% last 1567us dev.cpu.14.cx_lowest: C2 dev.cpu.14.cx_supported: C1/1/1 C2/2/400 dev.cpu.14.%parent: acpi0 dev.cpu.14.%pnpinfo: _HID=none _UID=0 dev.cpu.14.%location: handle=\_PR_.C00E dev.cpu.14.%driver: cpu dev.cpu.14.%desc: ACPI CPU dev.cpu.13.cx_method: C1/hlt C2/io dev.cpu.13.cx_usage_counters: 2243953 3861132 dev.cpu.13.cx_usage: 36.75% 63.24% last 2914us dev.cpu.13.cx_lowest: C2 dev.cpu.13.cx_supported: C1/1/1 C2/2/400 dev.cpu.13.%parent: acpi0 dev.cpu.13.%pnpinfo: _HID=none _UID=0 dev.cpu.13.%location: handle=\_PR_.C00D dev.cpu.13.%driver: cpu dev.cpu.13.%desc: ACPI CPU dev.cpu.12.cx_method: C1/hlt C2/io dev.cpu.12.cx_usage_counters: 2372562 3947137 dev.cpu.12.cx_usage: 37.54% 62.45% last 1280us dev.cpu.12.cx_lowest: C2 dev.cpu.12.cx_supported: C1/1/1 C2/2/400 dev.cpu.12.%parent: acpi0 dev.cpu.12.%pnpinfo: _HID=none _UID=0 dev.cpu.12.%location: handle=\_PR_.C00C dev.cpu.12.%driver: cpu dev.cpu.12.%desc: ACPI CPU dev.cpu.11.cx_method: C1/hlt C2/io dev.cpu.11.cx_usage_counters: 2539566 4104573 dev.cpu.11.cx_usage: 38.22% 61.77% last 2991us dev.cpu.11.cx_lowest: C2 dev.cpu.11.cx_supported: C1/1/1 C2/2/400 dev.cpu.11.%parent: acpi0 dev.cpu.11.%pnpinfo: _HID=none _UID=0 dev.cpu.11.%location: handle=\_PR_.C00B dev.cpu.11.%driver: cpu dev.cpu.11.%desc: ACPI CPU dev.cpu.10.cx_method: C1/hlt C2/io dev.cpu.10.cx_usage_counters: 2625097 4190803 dev.cpu.10.cx_usage: 38.51% 61.48% last 2765us dev.cpu.10.cx_lowest: C2 dev.cpu.10.cx_supported: C1/1/1 C2/2/400 dev.cpu.10.%parent: acpi0 dev.cpu.10.%pnpinfo: _HID=none _UID=0 dev.cpu.10.%location: handle=\_PR_.C00A dev.cpu.10.%driver: cpu dev.cpu.10.%desc: ACPI CPU dev.cpu.9.cx_method: C1/hlt C2/io dev.cpu.9.cx_usage_counters: 2476575 4019405 dev.cpu.9.cx_usage: 38.12% 61.87% last 3526us dev.cpu.9.cx_lowest: C2 dev.cpu.9.cx_supported: C1/1/1 C2/2/400 dev.cpu.9.%parent: acpi0 dev.cpu.9.%pnpinfo: _HID=none _UID=0 dev.cpu.9.%location: handle=\_PR_.C009 dev.cpu.9.%driver: cpu dev.cpu.9.%desc: ACPI CPU dev.cpu.8.cx_method: C1/hlt C2/io dev.cpu.8.cx_usage_counters: 2641831 4109578 dev.cpu.8.cx_usage: 39.13% 60.86% last 2049us dev.cpu.8.cx_lowest: C2 dev.cpu.8.cx_supported: C1/1/1 C2/2/400 dev.cpu.8.%parent: acpi0 dev.cpu.8.%pnpinfo: _HID=none _UID=0 dev.cpu.8.%location: handle=\_PR_.C008 dev.cpu.8.%driver: cpu dev.cpu.8.%desc: ACPI CPU dev.cpu.7.cx_method: C1/hlt C2/io dev.cpu.7.cx_usage_counters: 2409144 3945099 dev.cpu.7.cx_usage: 37.91% 62.08% last 11695us dev.cpu.7.cx_lowest: C2 dev.cpu.7.cx_supported: C1/1/1 C2/2/400 dev.cpu.7.%parent: acpi0 dev.cpu.7.%pnpinfo: _HID=none _UID=0 dev.cpu.7.%location: handle=\_PR_.C007 dev.cpu.7.%driver: cpu dev.cpu.7.%desc: ACPI CPU dev.cpu.6.cx_method: C1/hlt C2/io dev.cpu.6.cx_usage_counters: 2489036 4013233 dev.cpu.6.cx_usage: 38.27% 61.72% last 2511us dev.cpu.6.cx_lowest: C2 dev.cpu.6.cx_supported: C1/1/1 C2/2/400 dev.cpu.6.%parent: acpi0 dev.cpu.6.%pnpinfo: _HID=none _UID=0 dev.cpu.6.%location: handle=\_PR_.C006 dev.cpu.6.%driver: cpu dev.cpu.6.%desc: ACPI CPU dev.cpu.5.cx_method: C1/hlt C2/io dev.cpu.5.cx_usage_counters: 2432103 3961948 dev.cpu.5.cx_usage: 38.03% 61.96% last 1532us dev.cpu.5.cx_lowest: C2 dev.cpu.5.cx_supported: C1/1/1 C2/2/400 dev.cpu.5.%parent: acpi0 dev.cpu.5.%pnpinfo: _HID=none _UID=0 dev.cpu.5.%location: handle=\_PR_.C005 dev.cpu.5.%driver: cpu dev.cpu.5.%desc: ACPI CPU dev.cpu.4.cx_method: C1/hlt C2/io dev.cpu.4.cx_usage_counters: 2902090 4284426 dev.cpu.4.cx_usage: 40.38% 59.61% last 1432us dev.cpu.4.cx_lowest: C2 dev.cpu.4.cx_supported: C1/1/1 C2/2/400 dev.cpu.4.%parent: acpi0 dev.cpu.4.%pnpinfo: _HID=none _UID=0 dev.cpu.4.%location: handle=\_PR_.C004 dev.cpu.4.%driver: cpu dev.cpu.4.%desc: ACPI CPU dev.cpu.3.cx_method: C1/hlt C2/io dev.cpu.3.cx_usage_counters: 2403177 3900568 dev.cpu.3.cx_usage: 38.12% 61.87% last 143us dev.cpu.3.cx_lowest: C2 dev.cpu.3.cx_supported: C1/1/1 C2/2/400 dev.cpu.3.%parent: acpi0 dev.cpu.3.%pnpinfo: _HID=none _UID=0 dev.cpu.3.%location: handle=\_PR_.C003 dev.cpu.3.%driver: cpu dev.cpu.3.%desc: ACPI CPU dev.cpu.2.cx_method: C1/hlt C2/io dev.cpu.2.cx_usage_counters: 2492593 3971882 dev.cpu.2.cx_usage: 38.55% 61.44% last 3067us dev.cpu.2.cx_lowest: C2 dev.cpu.2.cx_supported: C1/1/1 C2/2/400 dev.cpu.2.%parent: acpi0 dev.cpu.2.%pnpinfo: _HID=none _UID=0 dev.cpu.2.%location: handle=\_PR_.C002 dev.cpu.2.%driver: cpu dev.cpu.2.%desc: ACPI CPU dev.cpu.1.cx_method: C1/hlt C2/io dev.cpu.1.cx_usage_counters: 2427833 3862553 dev.cpu.1.cx_usage: 38.59% 61.40% last 3042us dev.cpu.1.cx_lowest: C2 dev.cpu.1.cx_supported: C1/1/1 C2/2/400 dev.cpu.1.%parent: acpi0 dev.cpu.1.%pnpinfo: _HID=none _UID=0 dev.cpu.1.%location: handle=\_PR_.C001 dev.cpu.1.%driver: cpu dev.cpu.1.%desc: ACPI CPU dev.cpu.0.cx_method: C1/hlt C2/io dev.cpu.0.cx_usage_counters: 34418398 12688334 dev.cpu.0.cx_usage: 73.06% 26.93% last 803us dev.cpu.0.cx_lowest: C2 dev.cpu.0.cx_supported: C1/1/1 C2/2/400 dev.cpu.0.freq_levels: 3500/4287 3000/3187 2200/1897 dev.cpu.0.freq: 3500 dev.cpu.0.%parent: acpi0 dev.cpu.0.%pnpinfo: _HID=none _UID=0 dev.cpu.0.%location: handle=\_PR_.C000 dev.cpu.0.%driver: cpu dev.cpu.0.%desc: ACPI CPU dev.cpu.%parent: security.jail.param.cpuset.id: 0
Well, I don't know what is actually logging that line. But I expect the problem lies in hwpstates, probably in the family-specific MSR support that does not yet include 17h (Zen): https://svnweb.freebsd.org/base/head/sys/x86/cpufreq/hwpstate.c?view=markup#l397 I don't have this hardware around yet (waiting on a board to come back in stock), but I know a couple others do.
By the way, setting 'debug.hwpstate_verbose="1"' in loader.conf may show a little more of what is going on.
Please try the following patch: --- a/sys/x86/cpufreq/hwpstate.c +++ b/sys/x86/cpufreq/hwpstate.c @@ -83,6 +83,10 @@ __FBSDID("$FreeBSD$"); #define AMD_10H_11H_CUR_DID(msr) (((msr) >> 6) & 0x07) #define AMD_10H_11H_CUR_FID(msr) ((msr) & 0x3F) +#define AMD_17H_CUR_VID(msr) (((msr) >> 14) & 0xFF) +#define AMD_17H_CUR_DID(msr) (((msr) >> 8) & 0x3F) +#define AMD_17H_CUR_FID(msr) ((msr) & 0xFF) + #define HWPSTATE_DEBUG(dev, msg...) \ do{ \ if(hwpstate_verbose) \ @@ -427,6 +431,15 @@ hwpstate_get_info_from_msr(device_t dev) case 0x16: hwpstate_set[i].freq = (100 * (fid + 0x10)) >> did; break; + case 0x17: + did = AMD_17H_CUR_DID(msr); + if (did == 0) { + HWPSTATE_DEBUG(dev, "unexpected did: 0\n"); + did = 1; + } + fid = AMD_17H_CUR_FID(msr); + hwpstate_set[i].freq = (200 * fid) / did; + break; default: HWPSTATE_DEBUG(dev, "get_info_from_msr: AMD family" " 0x%02x CPUs are not supported yet\n", family); It is based on the Family 17h Reference manual and compiles, but obviously I am unable to test it.
(In reply to Conrad Meyer from comment #2) Thanks. Here's what verbose mode returns. I'll take the system down to apply the patch in the morning and report back: 2017-08-19 01:24:03 gzy kernel hwpstate0: setting P1-state on cpu0 2017-08-19 01:24:03 gzy kernel hwpstate0: setting P1-state on cpu1 2017-08-19 01:24:03 gzy kernel hwpstate0: setting P1-state on cpu2 2017-08-19 01:24:03 gzy kernel hwpstate0: setting P1-state on cpu3 2017-08-19 01:24:03 gzy kernel hwpstate0: setting P1-state on cpu4 2017-08-19 01:24:03 gzy kernel hwpstate0: setting P1-state on cpu5 2017-08-19 01:24:03 gzy kernel hwpstate0: setting P1-state on cpu6 2017-08-19 01:24:03 gzy kernel hwpstate0: setting P1-state on cpu7 2017-08-19 01:24:03 gzy kernel hwpstate0: setting P1-state on cpu8 2017-08-19 01:24:03 gzy kernel hwpstate0: setting P1-state on cpu9 2017-08-19 01:24:03 gzy kernel hwpstate0: setting P1-state on cpu10 2017-08-19 01:24:03 gzy kernel hwpstate0: setting P1-state on cpu11 2017-08-19 01:24:03 gzy kernel hwpstate0: setting P1-state on cpu12 2017-08-19 01:24:03 gzy kernel hwpstate0: setting P1-state on cpu13 2017-08-19 01:24:03 gzy kernel hwpstate0: setting P1-state on cpu14 2017-08-19 01:24:03 gzy kernel hwpstate0: setting P1-state on cpu15 2017-08-19 01:24:03 gzy kernel hwpstate0: setting P1-state on cpu16 2017-08-19 01:24:03 gzy kernel hwpstate0: setting P1-state on cpu17 2017-08-19 01:24:03 gzy kernel hwpstate0: setting P1-state on cpu18 2017-08-19 01:24:03 gzy kernel hwpstate0: setting P1-state on cpu19 2017-08-19 01:24:03 gzy kernel hwpstate0: setting P1-state on cpu20 2017-08-19 01:24:03 gzy kernel hwpstate0: setting P1-state on cpu21 2017-08-19 01:24:03 gzy kernel hwpstate0: setting P1-state on cpu22 2017-08-19 01:24:03 gzy kernel hwpstate0: setting P1-state on cpu23 2017-08-19 01:24:03 gzy kernel hwpstate0: result P2-state on cpu0 2017-08-19 01:24:03 gzy kernel hwpstate0: error: loop is not enough. 2017-08-19 01:24:03 gzy kernel hwpstate0: result P2-state on cpu1 2017-08-19 01:24:03 gzy kernel hwpstate0: error: loop is not enough. 2017-08-19 01:24:03 gzy kernel hwpstate0: result P2-state on cpu2 2017-08-19 01:24:03 gzy kernel hwpstate0: error: loop is not enough. 2017-08-19 01:24:03 gzy kernel hwpstate0: result P2-state on cpu3 2017-08-19 01:24:03 gzy kernel hwpstate0: error: loop is not enough. 2017-08-19 01:24:03 gzy kernel hwpstate0: result P2-state on cpu4 2017-08-19 01:24:03 gzy kernel hwpstate0: error: loop is not enough. 2017-08-19 01:24:03 gzy kernel hwpstate0: result P2-state on cpu5 2017-08-19 01:24:03 gzy kernel hwpstate0: error: loop is not enough. 2017-08-19 01:24:03 gzy kernel hwpstate0: result P2-state on cpu6 2017-08-19 01:24:03 gzy kernel hwpstate0: error: loop is not enough. 2017-08-19 01:24:03 gzy kernel hwpstate0: result P2-state on cpu7 2017-08-19 01:24:03 gzy kernel hwpstate0: error: loop is not enough. 2017-08-19 01:24:03 gzy kernel hwpstate0: result P2-state on cpu8 2017-08-19 01:24:03 gzy kernel hwpstate0: error: loop is not enough. 2017-08-19 01:24:03 gzy kernel hwpstate0: result P2-state on cpu9 2017-08-19 01:24:03 gzy kernel hwpstate0: error: loop is not enough. 2017-08-19 01:24:03 gzy kernel hwpstate0: result P2-state on cpu10 2017-08-19 01:24:03 gzy kernel hwpstate0: error: loop is not enough. 2017-08-19 01:24:03 gzy kernel hwpstate0: result P2-state on cpu11 2017-08-19 01:24:03 gzy kernel hwpstate0: error: loop is not enough. 2017-08-19 01:24:03 gzy kernel hwpstate0: result P2-state on cpu12 2017-08-19 01:24:03 gzy kernel hwpstate0: error: loop is not enough. 2017-08-19 01:24:03 gzy kernel hwpstate0: result P2-state on cpu13 2017-08-19 01:24:03 gzy kernel hwpstate0: error: loop is not enough. 2017-08-19 01:24:03 gzy kernel hwpstate0: result P2-state on cpu14 2017-08-19 01:24:03 gzy kernel hwpstate0: error: loop is not enough. 2017-08-19 01:24:03 gzy kernel hwpstate0: result P2-state on cpu15 2017-08-19 01:24:03 gzy kernel hwpstate0: error: loop is not enough. 2017-08-19 01:24:03 gzy kernel hwpstate0: result P2-state on cpu16 2017-08-19 01:24:03 gzy kernel hwpstate0: error: loop is not enough. 2017-08-19 01:24:03 gzy kernel hwpstate0: result P2-state on cpu17 2017-08-19 01:24:03 gzy kernel hwpstate0: error: loop is not enough. 2017-08-19 01:24:03 gzy kernel hwpstate0: result P2-state on cpu18 2017-08-19 01:24:03 gzy kernel hwpstate0: error: loop is not enough. 2017-08-19 01:24:03 gzy kernel hwpstate0: result P2-state on cpu19 2017-08-19 01:24:03 gzy kernel hwpstate0: error: loop is not enough. 2017-08-19 01:24:03 gzy kernel hwpstate0: result P2-state on cpu20 2017-08-19 01:24:03 gzy kernel hwpstate0: error: loop is not enough. 2017-08-19 01:24:03 gzy kernel hwpstate0: result P2-state on cpu21 2017-08-19 01:24:03 gzy kernel hwpstate0: error: loop is not enough. 2017-08-19 01:24:03 gzy kernel hwpstate0: result P2-state on cpu22 2017-08-19 01:24:03 gzy kernel hwpstate0: error: loop is not enough. 2017-08-19 01:24:03 gzy kernel hwpstate0: result P2-state on cpu23 2017-08-19 01:24:03 gzy kernel hwpstate0: error: loop is not enough. 2017-08-19 01:24:03 gzy kernel hwpstate0: set freq failed, err 6
Hm, it's interesting that setting pstates fails like that. The current code should be valid on Zen. I don't see any documentation in the reference manual on transition latency, but given this loop is processed sequentially it seems that it never succeeded even after 24 * 0.01 = 0.24 seconds. That seems like it should be plenty of time to me, but I don't know. Maybe hardware or some other system is controlling the p-state on your system?
(In reply to Conrad Meyer from comment #5) As your last post suspected, that patch didn't change the hwpstate0 error. I ran powerd -v and it returns the following messages, if they're helpful. I only hesitate in saying that you're almost certainly right that something in the hardware or firmware is controlling the power because of the powerd output below (says no frequency adjustments happening). OTOH, the system has definitely been running much (_much_) cooler than I thought it would. I'll install Windows later today to see if its power management changes the temperature,fan speed, etc. in any way that's noticeably different. Any other ideas? And, if not, what's the cleanest way to stop the hwpstate0 messages from filling my log without losing any important functionality? Thanks again. Here's the powerd output: powerd -v powerd: unable to determine AC line status load 7%, current freq 3500 MHz ( 0), wanted freq 3390 MHz load 14%, current freq 3500 MHz ( 0), wanted freq 3284 MHz load 0%, current freq 3500 MHz ( 0), wanted freq 3181 MHz load 6%, current freq 3500 MHz ( 0), wanted freq 3081 MHz load 10%, current freq 3500 MHz ( 0), wanted freq 2984 MHz changing clock speed from 3500 MHz to 3000 MHz powerd: error setting CPU frequency 3000: Device not configured load 83%, current freq 3500 MHz ( 0), wanted freq 7000 MHz load 17%, current freq 3500 MHz ( 0), wanted freq 6781 MHz load 0%, current freq 3500 MHz ( 0), wanted freq 6569 MHz load 19%, current freq 3500 MHz ( 0), wanted freq 6363 MHz load 3%, current freq 3500 MHz ( 0), wanted freq 6164 MHz load 19%, current freq 3500 MHz ( 0), wanted freq 5971 MHz load 2%, current freq 3500 MHz ( 0), wanted freq 5784 MHz load 4%, current freq 3500 MHz ( 0), wanted freq 5603 MHz load 0%, current freq 3500 MHz ( 0), wanted freq 5427 MHz load 0%, current freq 3500 MHz ( 0), wanted freq 5257 MHz load 2%, current freq 3500 MHz ( 0), wanted freq 5092 MHz load 0%, current freq 3500 MHz ( 0), wanted freq 4932 MHz load 9%, current freq 3500 MHz ( 0), wanted freq 4777 MHz load 14%, current freq 3500 MHz ( 0), wanted freq 4627 MHz load 5%, current freq 3500 MHz ( 0), wanted freq 4482 MHz load 0%, current freq 3500 MHz ( 0), wanted freq 4341 MHz load 5%, current freq 3500 MHz ( 0), wanted freq 4205 MHz load 0%, current freq 3500 MHz ( 0), wanted freq 4073 MHz load 6%, current freq 3500 MHz ( 0), wanted freq 3945 MHz load 0%, current freq 3500 MHz ( 0), wanted freq 3821 MHz load 3%, current freq 3500 MHz ( 0), wanted freq 3701 MHz load 0%, current freq 3500 MHz ( 0), wanted freq 3585 MHz load 14%, current freq 3500 MHz ( 0), wanted freq 3472 MHz load 5%, current freq 3500 MHz ( 0), wanted freq 3363 MHz load 0%, current freq 3500 MHz ( 0), wanted freq 3257 MHz load 4%, current freq 3500 MHz ( 0), wanted freq 3155 MHz load 0%, current freq 3500 MHz ( 0), wanted freq 3056 MHz load 6%, current freq 3500 MHz ( 0), wanted freq 2960 MHz changing clock speed from 3500 MHz to 3000 MHz powerd: error setting CPU frequency 3000: Device not configured load 142%, current freq 3500 MHz ( 0), wanted freq 7000 MHz load 0%, current freq 3500 MHz ( 0), wanted freq 6781 MHz load 7%, current freq 3500 MHz ( 0), wanted freq 6569 MHz load 5%, current freq 3500 MHz ( 0), wanted freq 6363 MHz load 8%, current freq 3500 MHz ( 0), wanted freq 6164 MHz load 0%, current freq 3500 MHz ( 0), wanted freq 5971 MHz load 6%, current freq 3500 MHz ( 0), wanted freq 5784 MHz load 3%, current freq 3500 MHz ( 0), wanted freq 5603 MHz load 11%, current freq 3500 MHz ( 0), wanted freq 5427 MHz load 0%, current freq 3500 MHz ( 0), wanted freq 5257 MHz load 20%, current freq 3500 MHz ( 0), wanted freq 5092 MHz load 4%, current freq 3500 MHz ( 0), wanted freq 4932 MHz load 34%, current freq 3500 MHz ( 0), wanted freq 4932 MHz load 6%, current freq 3500 MHz ( 0), wanted freq 4777 MHz load 3%, current freq 3500 MHz ( 0), wanted freq 4627 MHz load 5%, current freq 3500 MHz ( 0), wanted freq 4482 MHz load 0%, current freq 3500 MHz ( 0), wanted freq 4341 MHz load 2%, current freq 3500 MHz ( 0), wanted freq 4205 MHz load 0%, current freq 3500 MHz ( 0), wanted freq 4073 MHz load 6%, current freq 3500 MHz ( 0), wanted freq 3945 MHz load 0%, current freq 3500 MHz ( 0), wanted freq 3821 MHz load 6%, current freq 3500 MHz ( 0), wanted freq 3701 MHz load 4%, current freq 3500 MHz ( 0), wanted freq 3585 MHz load 3%, current freq 3500 MHz ( 0), wanted freq 3472 MHz load 0%, current freq 3500 MHz ( 0), wanted freq 3363 MHz load 10%, current freq 3500 MHz ( 0), wanted freq 3257 MHz load 6%, current freq 3500 MHz ( 0), wanted freq 3155 MHz load 28%, current freq 3500 MHz ( 0), wanted freq 3155 MHz load 8%, current freq 3500 MHz ( 0), wanted freq 3056 MHz load 0%, current freq 3500 MHz ( 0), wanted freq 2960 MHz changing clock speed from 3500 MHz to 3000 MHz powerd: error setting CPU frequency 3000: Device not configured load 103%, current freq 3500 MHz ( 0), wanted freq 7000 MHz load 6%, current freq 3500 MHz ( 0), wanted freq 6781 MHz load 0%, current freq 3500 MHz ( 0), wanted freq 6569 MHz load 14%, current freq 3500 MHz ( 0), wanted freq 6363 MHz ^Ctotal joules used: 71.757
(In reply to Sean P.R. from comment #6) I suspect the repeated logging is coming from powerd's attempts to change the p-state (error 6 == ENXIO == "Device not configured"). If you disable the powerd service, does the log spam go away? It doesn't seem like powerd is doing anything for now anyway, so disabling it seems harmless. It's interesting that powerd is reporting the machine in P0, and attempting to transition to P1. Attempting to set P1 matches what the hwpstate module is logging; however, hwpstate is reporting that the machine is actually in P2 instead! So it seems there is some discrepancy between what hwpstate sees and what powerd sees. Perhaps powerd is just seeing stale cached data from hwpstate. Either way, if the CPU is truly in P2 (2.2GHz base), that would explain the low temperatures.
(In reply to Conrad Meyer from comment #7) The hwpstate0 error messages stay the same regardless whether I have powerd disabled or enabled... Sounds like nothing else to try at the moment then? thanks again.
(In reply to Sean P.R. from comment #8) Yeah, sorry, I don't have any other ideas. I couldn't find that specific log message so I don't know where it's coming from.
The error message is the same as https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=167018 . Unfortunately, the change that fixed it for those processors is already included in CURRENT and 11.1, so it seems we have a different issue here. Perhaps the MSRC001_0071 register mentioned in https://lists.freebsd.org/pipermail/freebsd-stable/2012-May/067758.html would work better on Zen? I don't see it documented in the reference manual, though. Instead, there is MSRC001_0293 "Hardware PState Status," which has HW p-state in bits 22-24? But the text seems to indicate that the existing register is the right one to check: > ยง 2.1.5.2.1.2 Core P-state Visibility > > MSRC001_0063[CurPstate] reflects the current non-boosted P-state number for > each core. For example, if Core::X86::Msr::PStateStat[CurPstate]=001b on core > 1, then core 1 is in the P1 state. Sorry I'm not more help :-).
(In reply to Conrad Meyer from comment #10) Hi, I want to know where the text is. Because in https://support.amd.com/TechDocs/54945_PPR_Family_17h_Models_00h-0Fh.pdf , There is no content of the section "2.1.3 CPU Power Management". the next line is the section 2.1.4. If you have this section's content, you can go forward, I think. I'm waiting for this section's content. By the way, the error messages should be gone away, if powerd is not running. I wonder why :-)?
(In reply to mailsed from comment #11) The updated (in-progress, not finalized) version of the PPR on AMD's developer hub has some content in "CPU Power Management," now numbered 2.1.5 (but that may change).
(In reply to Conrad Meyer from comment #12) After some playing around yesterday I think your earlier suspicious is right - pretty sure the BIOS is controlling the p-state. MB is msi X399 "Gaming Pro Carbon AC." Uses AMI BIOS. Same behavior under BIOS versions 1.2 (shipped), 1.3 (official release, flashed), and 1.43 (beta release, flashed) Since the sensors aren't giving much useful information, I've been focused on watching the fans; I can see that the fans increase and decrease speed according to the load. Very unscientifically, fan behavior looks about the same under ordinary use as in Windows, where I can see from the MB manufacturer's tool that the p-state varies a bit under load. Of course, there are also AMD chipset drivers that allow more fine-grained control under Windows, so BIOS control shouldn't be mutually exclusive. One possible lead: I am pretty sure that the MB LEDs always read "AA" when booted into FreeBSD. According to the manual, that means "System has transitioned into ACPI mode. Interrupt controller is in APIC mode." - while the CPU temperature should be displayed "after the CPU has fully booted into the OS." It's possible I'm misreading "99" as "AA," but I'm pretty sure not. See PDF page 48 of http://download.msi.com/archive/mnu_exe/E7B09v1.0.zip Re error messages- In addition to disabling powerd, I turned everything I could find related to AHCI software control (I'm impatient :P). That fixed it and things are running acceptably* (*in terms of power management; performance is great under FreeBSD and most common tasks are faster than Windows). Happy to help debug more if you like. [public service p.s. in case someone having other trouble with this MB happens across this message in a search: (1) turn off the XHCI hand off or there's a high probability the USB controller will go into a reset loop; and (2) virtualization is turned off by default and BIOS versions < the 1.43 beta inexplicably lack the option to turn it on, so bhyve and vbox will fail until you flash the BIOS and enable "SVM"]
One more data point - OpenBSD's power management appears to work fine with no errors reported, in case their code's helpful in debugging. Set to hw.perfpolicy=auto it switches between "low" (2200MHz) and "high" (3500MHz) based on how hard I'm working the machine, or at least that's what the daemon log reports (in a series of short tests). Manually switching to low keeps things at 2200MHz in the same tests.
(In reply to Sean P.R. from comment #14) If someone wants to investigate what OpenBSD does differently here, I think the relevant files are: https://github.com/openbsd/src/blob/master/sys/arch/amd64/amd64/k1x-pstate.c https://github.com/openbsd/src/blob/master/sys/dev/acpi/acpicpu.c (Grep for cpu_setperf(), which is a callback called by the "auto" hw.perfpolicy policy in sched_bsd.c.)
I've noticed a curious thing on my 1950X. Setting P0 or P2 works fine. But P1 does not: $ sysctl dev.cpu.0.freq=2800 dev.cpu.0.freq: 3400 sysctl: dev.cpu.0.freq=2800: Device not configured $ sysctl dev.cpu.0.freq=2200 dev.cpu.0.freq: 3400 -> 2200 $ sysctl dev.cpu.0.freq=3400 dev.cpu.0.freq: 2200 -> 3400 Do you observe the same (well, similar) thing on your 1920X? OpenBSD's power management might "work" by only toggling between P0 and P2. Or otherwise ignoring the inability to set P1.
By the way, I also note that the failure to set P1 drives the load average up enough that powerd thinks P0 is correct again. So we don't just try P2 eventually because of that, I think. I see the same thing as you (comment 6) (except my P0 is 3400 and P2 is only 2800): load 10%, current freq 3500 MHz ( 0), wanted freq 2984 MHz changing clock speed from 3500 MHz to 3000 MHz powerd: error setting CPU frequency 3000: Device not configured load 83%, current freq 3500 MHz ( 0), wanted freq 7000 MHz ^^^ ^^^^^^^^^^^^^^^^^^^^ ... load 6%, current freq 3500 MHz ( 0), wanted freq 2960 MHz changing clock speed from 3500 MHz to 3000 MHz powerd: error setting CPU frequency 3000: Device not configured load 142%, current freq 3500 MHz ( 0), wanted freq 7000 MHz ^^^^ ^^^^^^^^^^^^^^^^^^^^ ... load 0%, current freq 3500 MHz ( 0), wanted freq 2960 MHz changing clock speed from 3500 MHz to 3000 MHz powerd: error setting CPU frequency 3000: Device not configured load 103%, current freq 3500 MHz ( 0), wanted freq 7000 MHz ^^^^ ^^^^^^^^^^^^^^^^^^^^
I've changed hwpstate's mechanism to wait for CPUs to switch frequency levels from DELAY() to sleep()[0]. This allows the CPU to idle instead of spinning for the duration of the failed transition to P1. This at least allows powerd to switch to P2 eventually: $ powerd -v ... // Load is added ... load 34%, current freq 2200 MHz ( 2), wanted freq 2200 MHz load 100%, current freq 2200 MHz ( 2), wanted freq 6800 MHz changing clock speed from 2200 MHz to 3400 MHz load 100%, current freq 3400 MHz ( 0), wanted freq 6800 MHz ... // Load is removed ... load 0%, current freq 3400 MHz ( 0), wanted freq 2877 MHz load 0%, current freq 3400 MHz ( 0), wanted freq 2787 MHz changing clock speed from 3400 MHz to 2800 MHz powerd: error setting CPU frequency 2800: Device not configured load 2%, current freq 3400 MHz ( 0), wanted freq 2699 MHz changing clock speed from 3400 MHz to 2800 MHz powerd: error setting CPU frequency 2800: Device not configured load 2%, current freq 3400 MHz ( 0), wanted freq 2614 MHz changing clock speed from 3400 MHz to 2800 MHz powerd: error setting CPU frequency 2800: Device not configured load 2%, current freq 3400 MHz ( 0), wanted freq 2532 MHz changing clock speed from 3400 MHz to 2800 MHz powerd: error setting CPU frequency 2800: Device not configured load 2%, current freq 3400 MHz ( 0), wanted freq 2452 MHz changing clock speed from 3400 MHz to 2800 MHz powerd: error setting CPU frequency 2800: Device not configured load 0%, current freq 3400 MHz ( 0), wanted freq 2375 MHz changing clock speed from 3400 MHz to 2800 MHz powerd: error setting CPU frequency 2800: Device not configured load 0%, current freq 3400 MHz ( 0), wanted freq 2300 MHz changing clock speed from 3400 MHz to 2800 MHz powerd: error setting CPU frequency 2800: Device not configured load 0%, current freq 3400 MHz ( 0), wanted freq 2228 MHz changing clock speed from 3400 MHz to 2800 MHz powerd: error setting CPU frequency 2800: Device not configured load 2%, current freq 3400 MHz ( 0), wanted freq 2200 MHz changing clock speed from 3400 MHz to 2200 MHz load 0%, current freq 2200 MHz ( 2), wanted freq 2200 MHz [0]: https://reviews.freebsd.org/D12260
A commit references this bug: Author: cem Date: Thu Sep 7 20:20:12 UTC 2017 New revision: 323286 URL: https://svnweb.freebsd.org/changeset/base/323286 Log: cpufreq(4) hwpstate: Yield CPU awaiting frequency change It doesn't seem necessary to busy the CPU while waiting to transition into a different p-state. PR: 221621 (related, but does not completely address) Reviewed by: truckman Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D12260 Changes: head/sys/x86/cpufreq/hwpstate.c
(In reply to Conrad Meyer from comment #18) Sorry I dropped off for a while. Please let me know if there's something I can help test this weekend.
Resolved by r326383 + r326407. Now we simply do not verify if the value set reads back from MSR.