Bug 236513 - HP Thin clients T620/T730 ACPI: Only CPU core 0 detects C2 state
Summary: HP Thin clients T620/T730 ACPI: Only CPU core 0 detects C2 state
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 12.0-RELEASE
Hardware: amd64 Any
: --- Affects Only Me
Assignee: John Baldwin
URL:
Keywords: patch
Depends on:
Blocks:
 
Reported: 2019-03-13 18:58 UTC by stockhausen
Modified: 2019-05-14 06:37 UTC (History)
2 users (show)

See Also:


Attachments
dmesg (36.18 KB, text/plain)
2019-03-13 18:58 UTC, stockhausen
no flags Details
Screenshot dmesg (579.64 KB, image/jpeg)
2019-03-22 19:11 UTC, stockhausen
no flags Details
Fedora powertop (499.99 KB, image/jpeg)
2019-03-22 20:09 UTC, stockhausen
no flags Details
dmesg of T620 FreeBSD 12 boot (10.31 KB, text/plain)
2019-03-25 06:12 UTC, stockhausen
no flags Details
acpudump -dt of T620 (709.35 KB, text/plain)
2019-03-25 06:13 UTC, stockhausen
no flags Details
pciconf Output (9.02 KB, text/plain)
2019-03-25 09:45 UTC, stockhausen
no flags Details
devinfo -rv (17.60 KB, text/plain)
2019-03-25 13:38 UTC, stockhausen
no flags Details
Original (buggy) ACPIDUMP of HP T620 Plus (83.88 KB, application/octet-stream)
2019-03-27 14:23 UTC, stockhausen
no flags Details
Modified (working) ACPIDUMP of HP T620 Plus (83.88 KB, application/octet-stream)
2019-03-27 14:23 UTC, stockhausen
no flags Details
shareable_acpi_set_resource.patch (1.18 KB, patch)
2019-03-28 19:27 UTC, John Baldwin
no flags Details | Diff
acpi_cpu_noreserve.patch (616 bytes, patch)
2019-03-29 16:16 UTC, John Baldwin
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description stockhausen 2019-03-13 18:58:48 UTC
Created attachment 202845 [details]
dmesg

Hi,

we are running FreeBSD 11.2-RELEASE-p6 on a HP T620 Thin client. Trying to enable power saving through C states we noticed that the system presents different states for the cores:

sysctl -a | grep cx_supported
dev.cpu.3.cx_supported: C1/1/0
dev.cpu.2.cx_supported: C1/1/0
dev.cpu.1.cx_supported: C1/1/0
dev.cpu.0.cx_supported: C1/1/0 C2/2/400

Enabling C2 state on CPU shows that it seems to work quite well:

sysctl -a | grep cx_usage:
dev.cpu.3.cx_usage: 100.00% last 34923us
dev.cpu.2.cx_usage: 100.00% last 10us
dev.cpu.1.cx_usage: 100.00% last 3853us
dev.cpu.0.cx_usage: 5.54% 94.45% last 33us

CPU details:

CPU: AMD GX-420CA SOC with Radeon(tm) HD Graphics    (1996.29-MHz K8-class CPU)
  Origin="AuthenticAMD"  Id=0x700f01  Family=0x16  Model=0x0  Stepping=1
  Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
  Features2=0x3ed8220b<SSE3,PCLMULQDQ,MON,SSSE3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AESNI,XSAVE,OSXSAVE,AVX,F16C>
  AMD Features=0x2e500800<SYSCALL,NX,MMX+,FFXSR,Page1GB,RDTSCP,LM>
  AMD Features2=0x154037ff<LAHF,CMP,SVM,ExtAPIC,CR8,ABM,SSE4A,MAS,Prefetch,OSVW,IBS,SKINIT,WDT,Topology,PNXC,DBE,PL2I>
  Structured Extended Features=0x8<BMI1>
  XSAVE Features=0x1<XSAVEOPT>
  SVM: (disabled in BIOS) NP,NRIP,AFlush,DAssist,NAsids=8
  TSC: P-state invariant, performance statistics

Is this the expected behaviour?

dmesg log attached.
Comment 1 Conrad Meyer freebsd_committer 2019-03-13 20:03:23 UTC
This is a Jaguar-class (Family 16h) AMD CPU.

This information is probed automatically out of the ACPI information presented by your BIOS.  That code hasn't changed appreciably in 4 years.  It seems your BIOS does not provide a _CST for all 4 cores, or it is invalid.

Do you see any logs in dmesg about "skipping invalid Cx state" or similar?

Ah, it might be possible that your CPU is new enough to define the cores as Device objects rather than Processor, but you're running too old of FreeBSD to enumerate the Device objects and find appropriate C-states _CST tables?  I don't know if r326956 has been MFCed to 11.x, but that change may address the issue.

Early AP startup (present in 12.0, I think) may also affect this, but I'm not sure.

C-state enumeration on Family 17h AMD CPUs seems to work just fine on CURRENT.
Comment 2 Conrad Meyer freebsd_committer 2019-03-13 20:12:44 UTC
(In reply to Conrad Meyer from comment #1)
> I don't know if r326956 has been MFCed to 11.x

FWIW, that change is not present in 11.x, but is in 12.0.  Please try running 12.0, if possible.
Comment 3 stockhausen 2019-03-13 20:17:49 UTC
Thanks for the quick response. I'll give 12.0 a try and give feedback asap.
Comment 4 stockhausen 2019-03-13 20:20:28 UTC
Speculative question in advance. If this situation does not change in 12.0:

Might it be that the system only supports C2 state for the whole CPU package?

And if yes: Shouldn't this be reflected in all cores?
Comment 5 stockhausen 2019-03-22 19:07:17 UTC
Finally I got time to retest everything with 12.0-RELEASE.

Situation is the same. Only the first CPU can go to C2 state. All other cores only show up with cx_lowest = C1.

Only difference this time: It is the successor CPU AMD RX-427BB.

Btw. The hardware in my tests is:

First Time: HP Thin client T620 PLUS
Second Time: HP Thin client T730
Comment 6 stockhausen 2019-03-22 19:11:58 UTC
Created attachment 203049 [details]
Screenshot dmesg
Comment 7 stockhausen 2019-03-22 20:08:41 UTC
Just fired up a Fedora 29 on the HP T730. 

Running a cpu intensive task and looking at the powertop output I would say that C2 states are core independent. 

Conclusion: Linux kernel detects C2 states per core. So at least BIOS should not be broken.

Screenshot attached.
Comment 8 stockhausen 2019-03-22 20:09:03 UTC
Created attachment 203050 [details]
Fedora powertop
Comment 9 stockhausen 2019-03-22 20:41:06 UTC
Bug https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=189267 shows other occurences of CPUs where c states for the first core is C1+C2 while all other cores are C1 only.
Comment 10 stockhausen 2019-03-24 20:48:14 UTC
Findings so far:

acpi_cpu_cx_cst() runs fine for CPU core 0. With core 1 something is going wrong:

1. acpi_cpu_cx_cst() finds C2 state (WHAT WE EXPECT!)
2. in the function the valid states loop comes to this sequence:

...
#endif
        {
            cx_ptr->res_rid = sc->cpu_cx_count;
            acpi_PkgGas(sc->cpu_dev, pkg, 0, &cx_ptr->res_type,
                &cx_ptr->res_rid, &cx_ptr->p_lvlx, RF_SHAREABLE);
            if (cx_ptr->p_lvlx) {
                ACPI_DEBUG_PRINT((ACPI_DB_INFO,
                     "acpi_cpu%d: Got C%d - %d latency\n",
...

2a. acpi_PkgGas() is called
2b. sanity check in acpi_PkgGas() is ok
2c. acpi_PkgGas() calls acpi_bus_alloc_gas() that returns with rc=ENOMEM
2d. In main function cx_ptr->p_lvlx is NULL
2e. C2 is not added to the list of C states
Comment 11 stockhausen 2019-03-24 20:51:47 UTC
Memo for myself: This can only occur if acpi_bus_alloc_gas() fails in bus_alloc_resource_any():

acpi_bus_alloc_gas()
{
    ...
    error = ENOMEM;
    ...
    *res = bus_alloc_resource_any(dev, res_type, rid, RF_ACTIVE | flags);
    if (*res != NULL) {
        *type = res_type;
        error = 0;
    } else
        bus_delete_resource(dev, res_type, *rid);

    return (error);
}
Comment 12 Andriy Gapon freebsd_committer 2019-03-24 21:38:21 UTC
Has anyone attached an acpidump from an affected system yet?
Comment 13 stockhausen 2019-03-25 06:12:52 UTC
Created attachment 203116 [details]
dmesg of T620 FreeBSD 12 boot
Comment 14 stockhausen 2019-03-25 06:13:32 UTC
Created attachment 203117 [details]
acpudump -dt of T620
Comment 15 stockhausen 2019-03-25 06:14:45 UTC
(In reply to Andriy Gapon from comment #12)

Attached as requested
Comment 16 stockhausen 2019-03-25 09:45:29 UTC
Created attachment 203120 [details]
pciconf Output
Comment 17 Andriy Gapon freebsd_committer 2019-03-25 13:34:45 UTC
Could you please also add devinfo -rv output?
Comment 18 stockhausen 2019-03-25 13:38:00 UTC
Created attachment 203132 [details]
devinfo -rv
Comment 19 stockhausen 2019-03-25 13:38:19 UTC
(In reply to stockhausen from comment #18)

Attached.
Comment 20 stockhausen 2019-03-25 15:12:05 UTC
Memo for myself: Seems as if port 0x414 is only available for core0 and not the other ones. Maybe that is the source of the registration error.

At least there are other devinfo outputs that show that a port can be bound to different cores: https://people.freebsd.org/~adrian/laptop/lenovo_x230/devinfo-rv.txt
Comment 21 stockhausen 2019-03-25 17:50:41 UTC
And one OpenBSD evidence about an AMD mobile CPU that has exactly our setup: http://openbsd-archive.7691.n7.nabble.com/Disabled-USB-port-td328343.html

acpicpu0 at acpi0: C2(0@400 io@0x414), C1(@1 halt!), PSS 
acpicpu1 at acpi0: C2(0@400 io@0x414), C1(@1 halt!), PSS 
acpicpu2 at acpi0: C2(0@400 io@0x414), C1(@1 halt!), PSS 
acpicpu3 at acpi0: C2(0@400 io@0x414), C1(@1 halt!), PSS
Comment 22 stockhausen 2019-03-25 19:32:05 UTC
After a lot of investigation it seems to boil down to some kind of timing problem.

acpi_resource.c:acpi_res_set_ioport() is called for the ACPI defined ressources. In our case there are several IO ports according to acpidump:

0xcf8, 8 ports
0x0, 944 ports
0x3e0, 2328 ports
... and so on ...

Somewhere inside it calls acpi.c:acpi_set_resource(). But this silently fails exactly here:

    /* Don't reserve resources until the system resources are allocated. */
    if (!sc->acpi_resources_reserved)
        return (0);

Must be some kind of concurrency. Don't know yet where it comes from.
Comment 23 stockhausen 2019-03-25 20:16:53 UTC
ok. ignore comment #22 i try to explain it once again. The problem seems to be the other way round.

CPU wants to register port 0x414 for C state handling. This should be covered by BIOS ACPI tables starting at port 0x3e0 with a total of 2328 (0x918) ports. Due to misconfiguaration this port range is attached to the PCIe bridge. acpi_set_resource() simply ignores this registration

    /*
     * Ignore most resources for PCI root bridges.  Some BIOSes
     * incorrectly enumerate the memory ranges they decode as plain
     * memory resources instead of as ResourceProducer ranges.  Other
     * BIOSes incorrectly list system resource entries for I/O ranges
     * under the PCI bridge.  Do allow the one known-correct case on
     * x86 of a PCI bridge claiming the I/O ports used for PCI config
     * access.
     */
#if defined(__i386__) || defined(__amd64__)
    if (type == SYS_RES_MEMORY || type == SYS_RES_IOPORT) {
        if (ACPI_SUCCESS(AcpiGetObjectInfo(ad->ad_handle, &devinfo))) {
            if ((devinfo->Flags & ACPI_PCI_ROOT_BRIDGE) != 0) {
                if (!(type == SYS_RES_IOPORT && start == CONF1_ADDR_PORT)) {
                    printf("ADBG: acpi_set_resource(%lx,%lu) inside root bridge\n", start,count);
                    AcpiOsFree(devinfo);
                    return (0);
                }
            }
            AcpiOsFree(devinfo);
        }
    }
#endif

So we are left without port 0x414. This gets registered later under CPU0. See devinfo:

    cpu0 pnpinfo _HID=none _UID=0 at handle=\_PR_.P000
        I/O ports:
            0x414

If registragtion works fine this should read "ACPI I/o port" and port 0x414 should be visible below acpi0 node.

To fix that situation in a clean way acpi_cpu_cx_cst() should re-register the port below acpi0 just for safety.

No idea how to do that...
Comment 24 stockhausen 2019-03-26 18:09:56 UTC
One step further:

The wrong ACPI definitions are indeed the culprit of the wrong C2 state registration. I implemented a lousy patch in acpi.c:

/*
 * Pre-allocate/manage all memory and IO resources.  Since rman can't handle
 * duplicates, we merge any in the sysresource attach routine.
 */
static int
acpi_sysres_alloc(device_t dev)
{
  ...
    rl = BUS_GET_RESOURCE_LIST(device_get_parent(dev), dev);
    STAILQ_FOREACH(rle, rl, link) {
        if (rle->res != NULL) {
            device_printf(dev, "duplicate resource for %jx\n", rle->start);
            continue;
        }

        if (rle->start == 0x40b 
         && rle->count == 1 
         && rle->type == SYS_RES_IOPORT) {
          printf("ACPI: fake IO port range 0x40b/1 -> 0x400/32\n");
          rle->start = 0x400;
          rle->count = 32;
        }

So when ACPI wants to register a single port 0x40b we expand the range to 0x400-0x420. 

With that devinfo -rv shows:

nexus0
  ...
  acpi0
      Interrupt request lines:
          0x9
      I/O ports:
          ...
          0x400-0x41f
          ...
    cpu0 pnpinfo _HID=none _UID=0 at handle=\_PR_.P000
        ACPI I/O ports:
            0x414
      acpi_perf0
      acpi_throttle0
      hwpstate0
      cpufreq0
    cpu1 pnpinfo _HID=none _UID=0 at handle=\_PR_.P001
        ACPI I/O ports:
            0x414
      acpi_perf1
      acpi_throttle1
      hwpstate1

Yeah! The second and following cores can register I/O port 0x414. And finally sysctl shows all C2 states:

# sysctl -a | grep cx_suppo
dev.cpu.3.cx_supported: C1/1/0 C2/2/400
dev.cpu.2.cx_supported: C1/1/0 C2/2/400
dev.cpu.1.cx_supported: C1/1/0 C2/2/400
dev.cpu.0.cx_supported: C1/1/0 C2/2/400

Now I need some assistance:

1. Either patch the BIOS. But how?

2. Implement a workaround in CPU registration if port is not yet registered. But how?

Thanks in advance.
Comment 25 Andriy Gapon freebsd_committer 2019-03-27 07:16:14 UTC
(In reply to stockhausen from comment #24)
I still don't understand the problem.
How does that 0x40b 1-byte port interfere with 0x414 port?
Or is it the other way around?  Are you abusing the system resource for 0x40b to compensate for something wrong with 0x414?

It would be helpful [to me] to understand precisely why, on the unpatched system, setting and allocating 0x414 port seems to work fine for cpu0 but not for the other cpus?  That includes the order of acpi cpu devices attachment (relative to each other and to other acpi "things").  An exact place in the code where setting or allocating the resource fails and a reason why.
Comment 26 stockhausen 2019-03-27 08:08:58 UTC
(In reply to Andriy Gapon from comment #25)

Sorry for the confusion. I'm totally new to the topic and try to do my best to explain my observations more clear.

The thin clients I'm talking about have a C2 power saving state that is controlled by port 0x414. This port is the same for all 4 cores. Due to the ACPI cpu startup logic the port reservation must work for each individual core. If it fails C2 is not available for a core.

In the buggy case we can see:

1. System registers all known ports in acpi_sysres_alloc() in resource manager. Port 0x414 is skipped because it is not correctly defined. There exists a BIOS range 0x3e0 with 2328 ports but it is located directly under the PCI bridge. 
2. So we have no range under acpi0 > I/O ports that covers port 0x414.
3. CPUs are fired up and do their initialization. The first CPU wants to register port 0x414. As it is not yet managed the registration process hangs the port as "I/O port" below CPU0.
4. The second CPU tries to do the same but it fails. For whatever reason

Result:

nexus0
  acpi0
    I/O ports:
  cpu0 pnpinfo _HID=none _UID=0 at handle=\_PR_.P000
        I/O ports: <-- Notice no type ACPI!
            0x414  

Now to the fixed case:

1. I found a port range 0x40B-0x40B in the ACPI tables that registers fine in acpi_sysres_alloc() just before the CPUs.
2. I modified the ACPI tables and extended the range to something larger that also covers port 0x414
3. After a restart with the modified table port 0x414 is now covered by a range under acpi0 > I/O ports.
4. During CPU startup all CPUs can correctly allocate port 0x414 and a "copy" named "ACPI I/O port" is placed under each CPU core.
 
Result:

nexus0
  acpi0
    I/O ports:
      0x400-0x41f
  cpu0 pnpinfo _HID=none _UID=0 at handle=\_PR_.P000
        ACPI I/O ports:  <-- Notice type ACPI!
            0x414
Comment 27 stockhausen 2019-03-27 14:23:00 UTC
Created attachment 203191 [details]
Original (buggy) ACPIDUMP of HP T620 Plus
Comment 28 stockhausen 2019-03-27 14:23:35 UTC
Created attachment 203192 [details]
Modified (working) ACPIDUMP of HP T620 Plus
Comment 29 stockhausen 2019-03-27 16:22:30 UTC
Memo for myself

T730: AMD BIOS Developer Guide (Family=0x15  Model=0x30)
https://www.amd.com/system/files/TechDocs/49125_15h_Models_30h-3Fh_BKDG.pdf

T620: AMD BIOS Developer Guide (Family=0x16  Model=0x0)
https://www.amd.com/system/files/TechDocs/48751_16h_bkdg.pdf
Comment 30 Andriy Gapon freebsd_committer 2019-03-27 21:57:52 UTC
(In reply to stockhausen from comment #26)
So, I am mostly interested in point 4 of the buggy case.  The rest was already clear.
Specifically, why something that works for cpu0 does not work for the rest.
The resource is allocated as RF_SHARED, so my expectation is that all cores should be able to set and allocate it.  But apparently not so.
Comment 31 Andriy Gapon freebsd_committer 2019-03-27 23:36:53 UTC
I think I am starting to see what you've been saying.

So, first of all, as you discovered, your BIOS indeed provides a buggy ACPI table.  But I am not sure where exactly the bug is.
It could be that the table does not provide a system resource for I/O port 0x414 by mistake.  Alternatively, it could be that 0x414 is a wrong value and it should be something else (e.g., 0x814).  In that case, even when you get C2 state to appear it actually would not do its job.

The other problem, as you noted, that if 0x414 is not defined as a system resource, then only cpu0 is able to allocate it.  All other cpu devices fail to do it.  I haven't fully understood yet why that happens.  And also I am not sure if that is what should really happen.

The best way to fix this problem is to fix the BIOS / ACPI, of course.
But I wonder if there is anything that FreeBSD could do to work around or "quirk up" such a problem.

P.S.
0x814 I mentioned is not an arbitrary port.
I see it on some AMD systems, although with different processors.
Also, all other processor related ACPI IO ports seem to be in the 0x800 area.
And it seems that PMBS and PMLN constants are supposed to define the processor power management I/O region as a system resource.
The relevant section of the BKDG document for your processor is 3.25.15.5.
So, I guess the only way to learn the true value of PLvl2 register is to somehow read what's in PMx66 register.

P.P.S. The PCI root bridge settings appear to be correct and I do not see them contributing to the problem.
Comment 32 stockhausen 2019-03-28 06:46:16 UTC
Thanks for the helpful discussion. I think I'm getting a better knowledge about it all. Reading across the AMD documents I think everything works as expected. Or at least quite close.

Comparing the T620 plus and the T730 I noticed that the C2 state port is different. 0x1771 and not 0x414. Reason is simple - this port is arbitrary set by BIOS. Explanation comes from the AMD family 0x15 models 0x30-0x03F developer guide 2.5.3.2.7 BIOS Requirements for Initialization:

1. Initialize MSRC001_0073[CstateAddr] with an available IO address. See 2.5.3.2.6.3 [_CRS].

-> HP chose port 0x1770 for the T730 and 0x413 for the T620 as can be seen from cpucontrol output
root@freebsd:~ # cpucontrol -m 0xC0010073 /dev/cpuctl0
MSR 0xc0010073: 0x00000000 0x00001770

2. Initialize D18F4x11[C:8].

-> I don't know how to read that indexed register

3. Generate ACPI objects as follows
3.a The _CST object should be generated for each core as follows:
- Count = 1.
- Register = MSRC001_0073[CstateAddr] + 1.
- Type = 2.
- Latency = 400. 
- Power = 0.

-> We can see that in the decoded ACPI dump
   ResourceTemplate ()
   {
     Register (SystemIO,
               0x08,               // Bit Width
               0x00,               // Bit Offset
               0x0000000000001771, // Address
               0x01,               // Access Size
              )
   },


3.b The _CSD object should be generated for each core as follows:
- NumberOfEntries = 6.
- Revision = 0.
- Domain = CPUID Fn0000_0001_EBX[LocalApicId[7:1]].
- CoordType = FEh. (HW_ALL)
- NumProcessors = 2.
- Index = 0.

-> Also inside the ACPI dump
   Name (_CSD, Package (0x01)  // _CSD: C-State Dependencies
   {
     Package (0x06)
     {
       0x06,
       0x00,
       0x00000000,
       0x000000FE,
       0x00000002,
       0x00000000
     }
   })

3.b BIOS must declare in the root host bridge _CRS object that the IO address range from MSRC001_0073[CstateAddr] to MSRC001_0073[CstateAddr]+7 is consumed by the host bridge.

-> This is missing in the ACPI tables. We need an entry for 8 ports starting from 0x1770. 

4. Fixed ACPI Description Table (FADT) Entries

-> Should not bee neded as FreeBSD is hopefully ACPI 2.0 aware and can read the _CST object from above.

So two question remain

A. Is the current ACPI code able to compensate for the missing port allocation ranges in BIOS? If you say yes we have found an error.

B. My stupid fix may be totally wrong. Where should I place the port registration in the ACPI tables?
Comment 33 stockhausen 2019-03-28 06:52:45 UTC
Btw. I found a sample with C2 state port 0x814

https://mail-index.netbsd.org/current-users/2010/07/21/msg013890.html
Comment 34 Andriy Gapon freebsd_committer 2019-03-28 08:25:54 UTC
(In reply to stockhausen from comment #32)
1. Ah, you are right, I forgot that modern processors intercept C-state I/O accesses at a core level.
Just to clarify, I assume that on your T620 cpucontrol -m 0xC0010073 /dev/cpuctl0 gives 0x414?

3.b(2) The root bridge has a range that covers 0x414, so that's okay.

A. I do not think so.  I think that the code currently expects BIOS to set up everything correctly. We could try to think about how FreeBSD ACPI code could work around not having Cx I/O ports in the system resources...

By the way, I see why acpi cpu devices cannot share a resource if it's not an acpi system resource.  If a resource is a system resource then it is allocated from the nexus (very early) and the acpi bus manages it on its own.  That means that acpi_set_resource() -> resource_list_reserve(flags=0) -> resource_list_alloc() -> BUS_ALLOC_RESOURCE() -> nexus_alloc_resource() -> rman_reserve_resource(flags=0) would always fail.  But that's okay, because acpi_alloc_resource() has a fallback to acpi_alloc_sysres().  And if the resource is consistently allocated with RF_SHARED, then all allocations of it succeed.
If a resource is _not_ a system resource, then the resource is not allocated by the acpi bus in the nexus as the acpi bus is unaware of it.  Then, when the first cpu calls acpi_PkgGas() -> acpi_bus_alloc_gas() -> bus_set_resource() -> acpi_set_resource() -> resource_list_reserve(flags=0) -> resource_list_alloc(flags=0) -> BUS_ALLOC_RESOURCE(flags=0) -> nexus_alloc_resource(flags=0) -> rman_reserve_resource(flags=0), the resource is allocated in the nexus and it is allocated without RF_SHARED flag.  Because the allocation is successful, the resource is added to the device's resource list.  After bus_set_resource(), acpi_bus_alloc_gas() calls bus_alloc_resource_any() -> acpi_alloc_resource(flags=RF_ACTIVE|RF_SHARED) -> resource_list_alloc().
Because the resource already exists in the device's resource list, resource_list_alloc() would simply activate it and return. So, for the first cpu acpi_bus_alloc_gas() is successful.
For any subsequent cpu, this call chain acpi_PkgGas() -> acpi_bus_alloc_gas() -> bus_set_resource() -> acpi_set_resource() -> resource_list_reserve(flags=0) -> resource_list_alloc(flags=0) -> BUS_ALLOC_RESOURCE(flags=0) -> nexus_alloc_resource(flags=0) -> rman_reserve_resource(flags=0) would fail because the resource is already allocated in nexus and it is not shared.  Thus, the resource would not be created in the device's resource list.  Thus, the subsequent call to bus_alloc_resource_any() -> acpi_alloc_resource(flags=RF_ACTIVE|RF_SHARED) -> resource_list_alloc() would not find that resource and it would have to call BUS_ALLOC_RESOURCE() -> nexus_alloc_resource() -> rman_reserve_resource() and that would fail again for the same reason.  And, in this case, acpi_alloc_resource()'s fallback to acpi_alloc_sysres() would also fail, because the resource is not a system resource.  In the end, acpi_PkgGas() -> acpi_bus_alloc_gas() fails just as you have seen.

B. I would add 0x414 to CRS of Device S900.  E.g. using 0x040B as an example, I would add this just after it:
IO (Decode16,
    0x0414,             // Range Minimum
    0x0414,             // Range Maximum
    0x00,               // Alignment
    0x04,               // Length
    )
Comment 35 stockhausen 2019-03-28 08:53:49 UTC
Phew, I see you found the culprit. Although I do not know every single function I saw a lot of the names during my code analysis.

Regarding HP T620 settings: cpucontrol -m 0xC0010073 /dev/cpuctl0 gives 0x413 (The CPU gets Port+1 from BIOS). So the ACPI tables need:

For T620:
IO (Decode16,
    0x0413,             // Range Minimum
    0x0413,             // Range Maximum
    0x00,               // Alignment
    0x08,               // Length
    )

For T730:
IO (Decode16,
    0x1770,             // Range Minimum
    0x1770,             // Range Maximum
    0x00,               // Alignment
    0x08,               // Length
    )

With these settings C2 states come up fine on both machines.

I will check if all of this hard work at least gives some power saving effects.
Comment 36 John Baldwin freebsd_committer freebsd_triage 2019-03-28 19:15:52 UTC
It may be that we need to fix the RF_SHAREABLE to propagate up when the resource is first allocated by resource_list_reserve, etc.  That might fix this without requiring the BIOS to be patched.  I haven't looked to see what that would involve though.
Comment 37 John Baldwin freebsd_committer freebsd_triage 2019-03-28 19:27:01 UTC
Created attachment 203218 [details]
shareable_acpi_set_resource.patch

This is an untested hack that uses RF_SHAREABLE when reserving resources for CPUs.  The other approach is we could just not reserve resources for CPU devices which might be safer as it would then always honor the flags passed to bus_alloc_resource().
Comment 38 stockhausen 2019-03-29 10:37:53 UTC
Sorry not quite yet... The only call to acpi_set_resource() during _CST port setup has the following input:

class of dev: "acpi"
class of child: "cpu"

A slight modification gives:

    if (device_get_devclass(child) == devclass_find("cpu")) {
            flags = RF_SHAREABLE;
    }
    else {
            flags = 0;
    }

Ports are now registered per CPU (which is fine), but devinfo gives different output:

nexus0
      I/O ports:
        < no port 0x1771 here ...>

    cpu0 pnpinfo _HID=none _UID=0 at handle=\_PR_.P000
        I/O ports:  <---- Port without ACPI flag
            0x1771
      acpi_perf0
      acpi_throttle0
      hwpstate0
      cpufreq0
    cpu1 pnpinfo _HID=none _UID=0 at handle=\_PR_.P001
        I/O ports:
            0x1771

Nevertheless CPU C states seem to register correctly:

root@freebsd:~ # sysctl -a | grep cx_supp
dev.cpu.3.cx_supported: C1/1/0 C2/2/400
dev.cpu.2.cx_supported: C1/1/0 C2/2/400
dev.cpu.1.cx_supported: C1/1/0 C2/2/400
dev.cpu.0.cx_supported: C1/1/0 C2/2/400
Comment 39 John Baldwin freebsd_committer freebsd_triage 2019-03-29 16:01:57 UTC
Yes, having the resources be per-CPU in devinfo is the expected effect of this patch.  I think though that I want to modify this a bit more so that we just don't reserve resources for CPU devices rather than forcing RF_SHAREABLE.  This will then honor whatever is passed to bus_alloc_resource().
Comment 40 John Baldwin freebsd_committer freebsd_triage 2019-03-29 16:16:28 UTC
Created attachment 203244 [details]
acpi_cpu_noreserve.patch

This is the new patch that doesn't try to force shareable but skips reserving.  This is more consistent with other workarounds in this function that generally just skip reserving when that has problems.
Comment 41 stockhausen 2019-03-29 19:05:35 UTC
The new patch only works if we use child instead of dev.

old: device_get_devclass(dev)
new: device_get_devclass(child)
Comment 42 Andriy Gapon freebsd_committer 2019-03-29 19:09:33 UTC
(In reply to stockhausen from comment #41)
Correct, it should be 'child' as 'dev' in that context is the acpi bus itself.
Comment 43 Andriy Gapon freebsd_committer 2019-03-29 19:13:23 UTC
(In reply to John Baldwin from comment #40)
John, this is a great simple patch.  I tried to come up with a way to achieve this effect, but didn't think about this approach.
Comment 44 stockhausen 2019-03-29 19:25:12 UTC
(In reply to Andriy Gapon from comment #43)
I can confirm that the patch works fine (with child instead of dev)

devinfo -rv
    cpu0 pnpinfo _HID=none _UID=0 at handle=\_PR_.P000
        I/O ports:
            0x1771
      acpi_perf0
      acpi_throttle0
      hwpstate0
      cpufreq0
    cpu1 pnpinfo _HID=none _UID=0 at handle=\_PR_.P001
        I/O ports:
            0x1771
      acpi_perf1
      acpi_throttle1
      hwpstate1

root@freebsd:~ # sysctl -a | grep cx_supp
dev.cpu.3.cx_supported: C1/1/0 C2/2/400
dev.cpu.2.cx_supported: C1/1/0 C2/2/400
dev.cpu.1.cx_supported: C1/1/0 C2/2/400
dev.cpu.0.cx_supported: C1/1/0 C2/2/400
Comment 45 stockhausen 2019-04-09 16:11:49 UTC
Is there anything I can do, to mainline this patch?
Comment 46 John Baldwin freebsd_committer freebsd_triage 2019-04-09 19:22:42 UTC
I was just wanting to test it locally first, sorry.
Comment 47 commit-hook freebsd_committer 2019-04-09 19:22:52 UTC
A commit references this bug:

Author: jhb
Date: Tue Apr  9 19:22:09 UTC 2019
New revision: 346063
URL: https://svnweb.freebsd.org/changeset/base/346063

Log:
  Don't pre-reserve resources for CPU devices when they are set.

  CPUs can use shared (RF_SHAREABLE) resources for the I/O port used for
  entering and exiting C states.  If this I/O port is included in an ACPI
  system resource device, then this happens to still work, but if the port
  wasn't part of a system resource device, only the first CPU could allocate
  the I/O port and use C states since resource_list_reserve() was always
  allocating the resource from nexus0 without RF_SHAREABLE.  By avoiding
  the reservation, the flags from the bus_alloc_resource() in the CPU driver
  (which include RF_SHAREABLE) are honored.

  PR:		236513
  Reported by:	stockhausen@collogia.de
  Sleuthing by:	avg
  Reviewed by:	avg
  MFC after:	2 weeks

Changes:
  head/sys/dev/acpica/acpi.c
Comment 48 commit-hook freebsd_committer 2019-05-01 17:30:52 UTC
A commit references this bug:

Author: jhb
Date: Wed May  1 17:30:14 UTC 2019
New revision: 346999
URL: https://svnweb.freebsd.org/changeset/base/346999

Log:
  MFC 346063: Don't pre-reserve resources for CPU devices when they are set.

  CPUs can use shared (RF_SHAREABLE) resources for the I/O port used for
  entering and exiting C states.  If this I/O port is included in an ACPI
  system resource device, then this happens to still work, but if the port
  wasn't part of a system resource device, only the first CPU could allocate
  the I/O port and use C states since resource_list_reserve() was always
  allocating the resource from nexus0 without RF_SHAREABLE.  By avoiding
  the reservation, the flags from the bus_alloc_resource() in the CPU driver
  (which include RF_SHAREABLE) are honored.

  PR:		236513

Changes:
_U  stable/11/
  stable/11/sys/dev/acpica/acpi.c
_U  stable/12/
  stable/12/sys/dev/acpica/acpi.c
Comment 49 Andriy Gapon freebsd_committer 2019-05-14 06:37:14 UTC
Just a note that I had the same problem on an APU2C4 box that I recently received. And this change fixed it, of course.