Bug 233693 - [PowerPC64] Powerd unable to change cpu frequency
Summary: [PowerPC64] Powerd unable to change cpu frequency
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: CURRENT
Hardware: powerpc Any
: --- Affects Some People
Assignee: Conrad Meyer
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-12-01 16:18 UTC by Sean Bruno
Modified: 2018-12-01 21:39 UTC (History)
2 users (show)

See Also:


Attachments
Kernel Config being used on this machine (6.39 KB, text/plain)
2018-12-01 16:29 UTC, Sean Bruno
no flags Details
Possible fix (1.22 KB, patch)
2018-12-01 17:12 UTC, Conrad Meyer
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Sean Bruno freebsd_committer 2018-12-01 16:18:47 UTC
The Tyan Power8 machines seem to need some kind of tweak to allow any powerd functionality.  Starting powerd simply emits this repeatedly on the console:

pmcr0: set freq failed, err 22
pmcr1: set freq failed, err 22
pmcr2: set freq failed, err 22
pmcr3: set freq failed, err 22
pmcr0: set freq failed, err 22
pmcr1: set freq failed, err 22
pmcr2: set freq failed, err 22
pmcr3: set freq failed, err 22

pmcr(8) seems to be detecting the following:
root@pylon.nyi:~ # sysctl dev.pmcr
dev.pmcr.3.freq_settings: 4322/-1 4289/-1 4256/-1 4222/-1 4189/-1 4156/-1 4123/-1 4089/-1 4056/-1 4023/-1 3990/-1 3956/-1 3923/-1 3890/-1 3857/-1 3823/-1 3790/-1 3757/-1 3724/-1 3690/-1 3657/-1 3624/-1 3591/-1 3557/-1 3524/-1 3491/-1 3458/-1 3424/-1 3391/-1 3358/-1 3325/-1 3291/-1 3258/-1 3225/-1 3192/-1 3158/-1 3125/-1 3092/-1 3059/-1 3025/-1 2992/-1 2959/-1 2926/-1 2892/-1 2859/-1 2826/-1 2793/-1 2759/-1 2726/-1 2693/-1 2660/-1 2626/-1 2593/-1 2560/-1 2527/-1 2493/-1 2460/-1 2427/-1 2394/-1 2360/-1 2327/-1 2294/-1 2261/-1 2227/-1 2194/-1 2161/-1 2128/-1 2094/-1 2061/-1
dev.pmcr.3.%parent: cpu3
dev.pmcr.3.%pnpinfo: 
dev.pmcr.3.%location: 
dev.pmcr.3.%driver: pmcr
dev.pmcr.3.%desc: Power Management Control Register
dev.pmcr.2.freq_settings: 4322/-1 4289/-1 4256/-1 4222/-1 4189/-1 4156/-1 4123/-1 4089/-1 4056/-1 4023/-1 3990/-1 3956/-1 3923/-1 3890/-1 3857/-1 3823/-1 3790/-1 3757/-1 3724/-1 3690/-1 3657/-1 3624/-1 3591/-1 3557/-1 3524/-1 3491/-1 3458/-1 3424/-1 3391/-1 3358/-1 3325/-1 3291/-1 3258/-1 3225/-1 3192/-1 3158/-1 3125/-1 3092/-1 3059/-1 3025/-1 2992/-1 2959/-1 2926/-1 2892/-1 2859/-1 2826/-1 2793/-1 2759/-1 2726/-1 2693/-1 2660/-1 2626/-1 2593/-1 2560/-1 2527/-1 2493/-1 2460/-1 2427/-1 2394/-1 2360/-1 2327/-1 2294/-1 2261/-1 2227/-1 2194/-1 2161/-1 2128/-1 2094/-1 2061/-1
dev.pmcr.2.%parent: cpu2
dev.pmcr.2.%pnpinfo: 
dev.pmcr.2.%location: 
dev.pmcr.2.%driver: pmcr
dev.pmcr.2.%desc: Power Management Control Register
dev.pmcr.1.freq_settings: 4322/-1 4289/-1 4256/-1 4222/-1 4189/-1 4156/-1 4123/-1 4089/-1 4056/-1 4023/-1 3990/-1 3956/-1 3923/-1 3890/-1 3857/-1 3823/-1 3790/-1 3757/-1 3724/-1 3690/-1 3657/-1 3624/-1 3591/-1 3557/-1 3524/-1 3491/-1 3458/-1 3424/-1 3391/-1 3358/-1 3325/-1 3291/-1 3258/-1 3225/-1 3192/-1 3158/-1 3125/-1 3092/-1 3059/-1 3025/-1 2992/-1 2959/-1 2926/-1 2892/-1 2859/-1 2826/-1 2793/-1 2759/-1 2726/-1 2693/-1 2660/-1 2626/-1 2593/-1 2560/-1 2527/-1 2493/-1 2460/-1 2427/-1 2394/-1 2360/-1 2327/-1 2294/-1 2261/-1 2227/-1 2194/-1 2161/-1 2128/-1 2094/-1 2061/-1
dev.pmcr.1.%parent: cpu1
dev.pmcr.1.%pnpinfo: 
dev.pmcr.1.%location: 
dev.pmcr.1.%driver: pmcr
dev.pmcr.1.%desc: Power Management Control Register
dev.pmcr.0.freq_settings: 4322/-1 4289/-1 4256/-1 4222/-1 4189/-1 4156/-1 4123/-1 4089/-1 4056/-1 4023/-1 3990/-1 3956/-1 3923/-1 3890/-1 3857/-1 3823/-1 3790/-1 3757/-1 3724/-1 3690/-1 3657/-1 3624/-1 3591/-1 3557/-1 3524/-1 3491/-1 3458/-1 3424/-1 3391/-1 3358/-1 3325/-1 3291/-1 3258/-1 3225/-1 3192/-1 3158/-1 3125/-1 3092/-1 3059/-1 3025/-1 2992/-1 2959/-1 2926/-1 2892/-1 2859/-1 2826/-1 2793/-1 2759/-1 2726/-1 2693/-1 2660/-1 2626/-1 2593/-1 2560/-1 2527/-1 2493/-1 2460/-1 2427/-1 2394/-1 2360/-1 2327/-1 2294/-1 2261/-1 2227/-1 2194/-1 2161/-1 2128/-1 2094/-1 2061/-1
dev.pmcr.0.%parent: cpu0
dev.pmcr.0.%pnpinfo: 
dev.pmcr.0.%location: 
dev.pmcr.0.%driver: pmcr
dev.pmcr.0.%desc: Power Management Control Register
dev.pmcr.%parent:
Comment 1 Sean Bruno freebsd_committer 2018-12-01 16:29:42 UTC
Created attachment 199712 [details]
Kernel Config being used on this machine

Just in case I'm missing something obvious to enable this support, I'm attaching the kernel config being used on this machine.
Comment 2 Conrad Meyer freebsd_committer 2018-12-01 17:02:38 UTC
It doesn't fix this issue, but I suspect there is an off-by-one bug in pmcr_set():

--- a/sys/powerpc/cpufreq/pmcr.c
+++ b/sys/powerpc/cpufreq/pmcr.c
@@ -189,7 +190,7 @@ pmcr_set(device_t dev, const struct cf_setting *set)
        if (set == NULL)
                return (EINVAL);

-       if (set->spec[0] < 0 || set->spec[0] > npstates)
+       if (set->spec[0] < 0 || set->spec[0] >= npstates)
                return (EINVAL);

        pmcr = ((long)pstate_ids[set->spec[0]] << PMCR_LOWERPS_SHIFT) &
Comment 3 Conrad Meyer freebsd_committer 2018-12-01 17:05:03 UTC
My read of pmcr and cpu_freq, based on Sean's supporting CF_DEBUG output, is that probably the pstate_ids table is bogus on this machine.  Or that it shouldn't be used as an array index into sets?  I.e. maybe pstate_ids is discontiguous while pmcr_set() (and cpufreq's sets) are contiguous.
Comment 4 Conrad Meyer freebsd_committer 2018-12-01 17:09:39 UTC
Wait, aren't we indicing pstate_ids twice?   Is the right fix something like this:

--- a/sys/powerpc/cpufreq/pmcr.c
+++ b/sys/powerpc/cpufreq/pmcr.c
@@ -189,13 +189,13 @@ pmcr_set(device_t dev, const struct cf_setting *set)
        if (set == NULL)
                return (EINVAL);

-       if (set->spec[0] < 0 || set->spec[0] > npstates)
+#if 0
+       if (set->spec[0] < 0 || set->spec[0] >= npstates)
                return (EINVAL);
+#endif

-       pmcr = ((long)pstate_ids[set->spec[0]] << PMCR_LOWERPS_SHIFT) &
-           PMCR_LOWERPS_MASK;
-       pmcr |= ((long)pstate_ids[set->spec[0]] << PMCR_UPPERPS_SHIFT) &
-           PMCR_UPPERPS_MASK;
+       pmcr = ((long)set->spec[0] << PMCR_LOWERPS_SHIFT) & PMCR_LOWERPS_MASK;
+       pmcr |= ((long)set->spec[0] << PMCR_UPPERPS_SHIFT) & PMCR_UPPERPS_MASK;


Given that spec[0] was assigned:

  176                 sets[i].spec[0] = pstate_ids[i];

in pmcr_settings?
Comment 5 Conrad Meyer freebsd_committer 2018-12-01 17:12:49 UTC
Created attachment 199713 [details]
Possible fix

Please try this patch.  I think it should fix the issue.
Comment 6 Sean Bruno freebsd_committer 2018-12-01 18:33:07 UTC
(In reply to Conrad Meyer from comment #5)
Well, that certainly squashes the error and powerd is running happily on archon.nyi.freebsd.org

I'm note that powerd is a bit hungry when it comes to CPU as top reports it will hit about 3-5% when it runs.  That is, as the turn of phrase goes, "An exercise left for the reader."

I turned on debug to see what powerd was doing and in most cases, probably due to how "busy" it is when writing to the console, it left the CPUs at max.  I didn't see a sysctl that would tell me "current" frequency, so I'm assuming that its fine.
Comment 7 Conrad Meyer freebsd_committer 2018-12-01 20:37:23 UTC
I'll commit the patch on comment #5; I'd like to get a 'Reviewed by:' someone more familiar with power first, though.
Comment 8 Justin Hibbits freebsd_committer 2018-12-01 20:45:54 UTC
(In reply to Conrad Meyer from comment #7)

Looks good to me.

The reason it works as-is on POWER9 is that the index directly matches the id that gets inserted.  On POWER8 the id that gets inserted is increasingly negative, so will always fail.  Your patch looks like the correct fix.
Comment 9 Conrad Meyer freebsd_committer 2018-12-01 20:46:59 UTC
I'm happier with the explanation of why it worked on power9, thanks. :-)
Comment 10 commit-hook freebsd_committer 2018-12-01 21:38:17 UTC
A commit references this bug:

Author: cem
Date: Sat Dec  1 21:37:47 UTC 2018
New revision: 341389
URL: https://svnweb.freebsd.org/changeset/base/341389

Log:
  pmcr: Fix pstate setting on Power8

  Fix p-state setting on Power8 by removing the accidental double-indirection of
  the pstate_ids table.

  The pstate_ids table comes from the OF property "ibm,pstate-ids."  On Power9,
  the values happen to be identical to the indices, so the extra indirection was
  harmless.  On Power8, the values were out of the range [0, npstates], so
  pmcr_set() would fail the spec[0] range check with EINVAL.

  While here, include both the value and index in the driver-specific register
  array as spec[0] and spec[1] respectively.  They're redundant, but relatively
  harmless, and it may aid debugging.

  While here, fix the range check to exclude the index npstates, which is one
  past the last valid index.

  PR:		233693
  Reported and tested by:	sbruno
  Reviewed by:	jhibbits

Changes:
  head/sys/powerpc/cpufreq/pmcr.c