The Tyan Power8 machines seem to need some kind of tweak to allow any powerd functionality. Starting powerd simply emits this repeatedly on the console: pmcr0: set freq failed, err 22 pmcr1: set freq failed, err 22 pmcr2: set freq failed, err 22 pmcr3: set freq failed, err 22 pmcr0: set freq failed, err 22 pmcr1: set freq failed, err 22 pmcr2: set freq failed, err 22 pmcr3: set freq failed, err 22 pmcr(8) seems to be detecting the following: root@pylon.nyi:~ # sysctl dev.pmcr dev.pmcr.3.freq_settings: 4322/-1 4289/-1 4256/-1 4222/-1 4189/-1 4156/-1 4123/-1 4089/-1 4056/-1 4023/-1 3990/-1 3956/-1 3923/-1 3890/-1 3857/-1 3823/-1 3790/-1 3757/-1 3724/-1 3690/-1 3657/-1 3624/-1 3591/-1 3557/-1 3524/-1 3491/-1 3458/-1 3424/-1 3391/-1 3358/-1 3325/-1 3291/-1 3258/-1 3225/-1 3192/-1 3158/-1 3125/-1 3092/-1 3059/-1 3025/-1 2992/-1 2959/-1 2926/-1 2892/-1 2859/-1 2826/-1 2793/-1 2759/-1 2726/-1 2693/-1 2660/-1 2626/-1 2593/-1 2560/-1 2527/-1 2493/-1 2460/-1 2427/-1 2394/-1 2360/-1 2327/-1 2294/-1 2261/-1 2227/-1 2194/-1 2161/-1 2128/-1 2094/-1 2061/-1 dev.pmcr.3.%parent: cpu3 dev.pmcr.3.%pnpinfo: dev.pmcr.3.%location: dev.pmcr.3.%driver: pmcr dev.pmcr.3.%desc: Power Management Control Register dev.pmcr.2.freq_settings: 4322/-1 4289/-1 4256/-1 4222/-1 4189/-1 4156/-1 4123/-1 4089/-1 4056/-1 4023/-1 3990/-1 3956/-1 3923/-1 3890/-1 3857/-1 3823/-1 3790/-1 3757/-1 3724/-1 3690/-1 3657/-1 3624/-1 3591/-1 3557/-1 3524/-1 3491/-1 3458/-1 3424/-1 3391/-1 3358/-1 3325/-1 3291/-1 3258/-1 3225/-1 3192/-1 3158/-1 3125/-1 3092/-1 3059/-1 3025/-1 2992/-1 2959/-1 2926/-1 2892/-1 2859/-1 2826/-1 2793/-1 2759/-1 2726/-1 2693/-1 2660/-1 2626/-1 2593/-1 2560/-1 2527/-1 2493/-1 2460/-1 2427/-1 2394/-1 2360/-1 2327/-1 2294/-1 2261/-1 2227/-1 2194/-1 2161/-1 2128/-1 2094/-1 2061/-1 dev.pmcr.2.%parent: cpu2 dev.pmcr.2.%pnpinfo: dev.pmcr.2.%location: dev.pmcr.2.%driver: pmcr dev.pmcr.2.%desc: Power Management Control Register dev.pmcr.1.freq_settings: 4322/-1 4289/-1 4256/-1 4222/-1 4189/-1 4156/-1 4123/-1 4089/-1 4056/-1 4023/-1 3990/-1 3956/-1 3923/-1 3890/-1 3857/-1 3823/-1 3790/-1 3757/-1 3724/-1 3690/-1 3657/-1 3624/-1 3591/-1 3557/-1 3524/-1 3491/-1 3458/-1 3424/-1 3391/-1 3358/-1 3325/-1 3291/-1 3258/-1 3225/-1 3192/-1 3158/-1 3125/-1 3092/-1 3059/-1 3025/-1 2992/-1 2959/-1 2926/-1 2892/-1 2859/-1 2826/-1 2793/-1 2759/-1 2726/-1 2693/-1 2660/-1 2626/-1 2593/-1 2560/-1 2527/-1 2493/-1 2460/-1 2427/-1 2394/-1 2360/-1 2327/-1 2294/-1 2261/-1 2227/-1 2194/-1 2161/-1 2128/-1 2094/-1 2061/-1 dev.pmcr.1.%parent: cpu1 dev.pmcr.1.%pnpinfo: dev.pmcr.1.%location: dev.pmcr.1.%driver: pmcr dev.pmcr.1.%desc: Power Management Control Register dev.pmcr.0.freq_settings: 4322/-1 4289/-1 4256/-1 4222/-1 4189/-1 4156/-1 4123/-1 4089/-1 4056/-1 4023/-1 3990/-1 3956/-1 3923/-1 3890/-1 3857/-1 3823/-1 3790/-1 3757/-1 3724/-1 3690/-1 3657/-1 3624/-1 3591/-1 3557/-1 3524/-1 3491/-1 3458/-1 3424/-1 3391/-1 3358/-1 3325/-1 3291/-1 3258/-1 3225/-1 3192/-1 3158/-1 3125/-1 3092/-1 3059/-1 3025/-1 2992/-1 2959/-1 2926/-1 2892/-1 2859/-1 2826/-1 2793/-1 2759/-1 2726/-1 2693/-1 2660/-1 2626/-1 2593/-1 2560/-1 2527/-1 2493/-1 2460/-1 2427/-1 2394/-1 2360/-1 2327/-1 2294/-1 2261/-1 2227/-1 2194/-1 2161/-1 2128/-1 2094/-1 2061/-1 dev.pmcr.0.%parent: cpu0 dev.pmcr.0.%pnpinfo: dev.pmcr.0.%location: dev.pmcr.0.%driver: pmcr dev.pmcr.0.%desc: Power Management Control Register dev.pmcr.%parent:
Created attachment 199712 [details] Kernel Config being used on this machine Just in case I'm missing something obvious to enable this support, I'm attaching the kernel config being used on this machine.
It doesn't fix this issue, but I suspect there is an off-by-one bug in pmcr_set(): --- a/sys/powerpc/cpufreq/pmcr.c +++ b/sys/powerpc/cpufreq/pmcr.c @@ -189,7 +190,7 @@ pmcr_set(device_t dev, const struct cf_setting *set) if (set == NULL) return (EINVAL); - if (set->spec[0] < 0 || set->spec[0] > npstates) + if (set->spec[0] < 0 || set->spec[0] >= npstates) return (EINVAL); pmcr = ((long)pstate_ids[set->spec[0]] << PMCR_LOWERPS_SHIFT) &
My read of pmcr and cpu_freq, based on Sean's supporting CF_DEBUG output, is that probably the pstate_ids table is bogus on this machine. Or that it shouldn't be used as an array index into sets? I.e. maybe pstate_ids is discontiguous while pmcr_set() (and cpufreq's sets) are contiguous.
Wait, aren't we indicing pstate_ids twice? Is the right fix something like this: --- a/sys/powerpc/cpufreq/pmcr.c +++ b/sys/powerpc/cpufreq/pmcr.c @@ -189,13 +189,13 @@ pmcr_set(device_t dev, const struct cf_setting *set) if (set == NULL) return (EINVAL); - if (set->spec[0] < 0 || set->spec[0] > npstates) +#if 0 + if (set->spec[0] < 0 || set->spec[0] >= npstates) return (EINVAL); +#endif - pmcr = ((long)pstate_ids[set->spec[0]] << PMCR_LOWERPS_SHIFT) & - PMCR_LOWERPS_MASK; - pmcr |= ((long)pstate_ids[set->spec[0]] << PMCR_UPPERPS_SHIFT) & - PMCR_UPPERPS_MASK; + pmcr = ((long)set->spec[0] << PMCR_LOWERPS_SHIFT) & PMCR_LOWERPS_MASK; + pmcr |= ((long)set->spec[0] << PMCR_UPPERPS_SHIFT) & PMCR_UPPERPS_MASK; Given that spec[0] was assigned: 176 sets[i].spec[0] = pstate_ids[i]; in pmcr_settings?
Created attachment 199713 [details] Possible fix Please try this patch. I think it should fix the issue.
(In reply to Conrad Meyer from comment #5) Well, that certainly squashes the error and powerd is running happily on archon.nyi.freebsd.org I'm note that powerd is a bit hungry when it comes to CPU as top reports it will hit about 3-5% when it runs. That is, as the turn of phrase goes, "An exercise left for the reader." I turned on debug to see what powerd was doing and in most cases, probably due to how "busy" it is when writing to the console, it left the CPUs at max. I didn't see a sysctl that would tell me "current" frequency, so I'm assuming that its fine.
I'll commit the patch on comment #5; I'd like to get a 'Reviewed by:' someone more familiar with power first, though.
(In reply to Conrad Meyer from comment #7) Looks good to me. The reason it works as-is on POWER9 is that the index directly matches the id that gets inserted. On POWER8 the id that gets inserted is increasingly negative, so will always fail. Your patch looks like the correct fix.
I'm happier with the explanation of why it worked on power9, thanks. :-)
A commit references this bug: Author: cem Date: Sat Dec 1 21:37:47 UTC 2018 New revision: 341389 URL: https://svnweb.freebsd.org/changeset/base/341389 Log: pmcr: Fix pstate setting on Power8 Fix p-state setting on Power8 by removing the accidental double-indirection of the pstate_ids table. The pstate_ids table comes from the OF property "ibm,pstate-ids." On Power9, the values happen to be identical to the indices, so the extra indirection was harmless. On Power8, the values were out of the range [0, npstates], so pmcr_set() would fail the spec[0] range check with EINVAL. While here, include both the value and index in the driver-specific register array as spec[0] and spec[1] respectively. They're redundant, but relatively harmless, and it may aid debugging. While here, fix the range check to exclude the index npstates, which is one past the last valid index. PR: 233693 Reported and tested by: sbruno Reviewed by: jhibbits Changes: head/sys/powerpc/cpufreq/pmcr.c