Bug 253915 - pchtherm0: Sensor enable failed
Summary: pchtherm0: Sensor enable failed
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 13.0-STABLE
Hardware: amd64 Any
: --- Affects Only Me
Assignee: Andriy Gapon
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-02-28 15:50 UTC by Michal Vanco
Modified: 2021-03-23 11:06 UTC (History)
2 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Michal Vanco 2021-02-28 15:50:08 UTC
Hi,

pchtherm in stable/13 doesn't work on my Supermicro X10SDV-TLN4F with Xeon D-1541. I'm not sure if it is supposed to work on this hardware but the pchtherm kernel module is being loaded automatically during boot (i.e. no pchtherm_load in my /boot/loader.conf).

# dmesg | grep pchtherm
pchtherm0: <Haswell Thermal Subsystem> mem 0x383ffff10000-0x383ffff10fff irq 18 at device 31.6 on pci1
pchtherm0: Enabling Sensor
pchtherm0: Sensor enable failed

# sysctl dev.pchtherm
dev.pchtherm.0.%parent: pci1
dev.pchtherm.0.%pnpinfo: vendor=0x8086 device=0x8c24 subvendor=0x15d9 subdevice=0x086d class=0x118000
dev.pchtherm.0.%location: slot=31 function=6 dbsf=pci0:0:31:6 handle=\_SB_.PCI0.TERM
dev.pchtherm.0.%driver: pchtherm
dev.pchtherm.0.%desc: Haswell Thermal Subsystem
dev.pchtherm.%parent: 

# pciconf -lvbBecV pci0:0:31:6
pchtherm0@pci0:0:31:6:	class=0x118000 rev=0x05 hdr=0x00 vendor=0x8086 device=0x8c24 subvendor=0x15d9 subdevice=0x086d
    vendor     = 'Intel Corporation'
    device     = '8 Series Chipset Family Thermal Management Controller'
    class      = dasp
    bar   [10] = type Memory, range 64, base rx383ffff10000, size 4096, enabled
    cap 01[50] = powerspec 3  supports D0 D3  current D0
    cap 05[80] = MSI supports 1 message 

regards,
Michal
Comment 1 Michal Vanco 2021-03-01 06:51:17 UTC
I managed to fix this first issue with this simple patch:

# git diff
diff --git a/sys/dev/intel/pchtherm.c b/sys/dev/intel/pchtherm.c
index 13f0abc54b6..4a9edc2da28 100644
--- a/sys/dev/intel/pchtherm.c
+++ b/sys/dev/intel/pchtherm.c
@@ -163,7 +163,7 @@ static int pchtherm_attach(device_t dev)
                        bus_write_1(sc->tbar, PCHTHERM_REG_TSEL,
                                    PCHTHERM_GEN_ENABLE);
                        sc->enable = bus_read_1(sc->tbar, PCHTHERM_REG_TSEL);
-                       if (!(sc->enable & PCHTHERM_REG_TSEL)){
+                       if (!(sc->enable & PCHTHERM_GEN_ENABLE)) {
                                device_printf(dev, "Sensor enable failed\n");
                                return 0;
                        }

Now pchtherm gets enabled as it should:

# dmesg | grep pchtherm
pchtherm0: <Haswell Thermal Subsystem> mem 0x383ffff10000-0x383ffff10fff irq 18 at device 31.6 on pci1
pchtherm0: Enabling Sensor

But reported values are strange:

# sysctl dev.pchtherm
dev.pchtherm.0.ctt: 130.0C
dev.pchtherm.0.talv: -50.0C
dev.pchtherm.0.tahv: -50.0C
dev.pchtherm.0.temperature: 67.5C
dev.pchtherm.0.pch_hot_level: -50.0C
dev.pchtherm.0.t2temp: -50.0C
dev.pchtherm.0.t1temp: -50.0C
dev.pchtherm.0.t0temp: -50.0C
dev.pchtherm.0.pmtime: 1
dev.pchtherm.0.pmtemp: -50.0C
dev.pchtherm.0.%parent: pci1
dev.pchtherm.0.%pnpinfo: vendor=0x8086 device=0x8c24 subvendor=0x15d9 subdevice=0x086d class=0x118000
dev.pchtherm.0.%location: slot=31 function=6 dbsf=pci0:0:31:6 handle=\_SB_.PCI0.TERM
dev.pchtherm.0.%driver: pchtherm
dev.pchtherm.0.%desc: Haswell Thermal Subsystem
dev.pchtherm.%parent: 

regards,
michal
Comment 2 Andriy Gapon freebsd_committer freebsd_triage 2021-03-01 07:47:39 UTC
Thank you for catching this!
Added the author to CC.

Also, I think that the driver could use a pass of style fixes (missing or extra spaces, indentation, etc).
Comment 3 Andriy Gapon freebsd_committer freebsd_triage 2021-03-02 09:00:18 UTC
(In reply to michal from comment #1)
If the author does not follow-up I can commit your fix.
How would you like your name + email to appear in the commit message?

P.S.
Ideally, it's best to submit patches via https://reviews.freebsd.org/
Comment 4 Michal Vanco 2021-03-02 14:53:59 UTC
Hi,

I found another bug regarding the "SMI on alert" value. Value is read from wrong register. Here is the patch fixing both bugs:

diff --git a/sys/dev/intel/pchtherm.c b/sys/dev/intel/pchtherm.c
index 13f0abc54b6..31d06a1bc26 100644
--- a/sys/dev/intel/pchtherm.c
+++ b/sys/dev/intel/pchtherm.c
@@ -163,7 +163,7 @@ static int pchtherm_attach(device_t dev)
                        bus_write_1(sc->tbar, PCHTHERM_REG_TSEL,
                                    PCHTHERM_GEN_ENABLE);
                        sc->enable = bus_read_1(sc->tbar, PCHTHERM_REG_TSEL);
-                       if (!(sc->enable & PCHTHERM_REG_TSEL)){
+                       if (!(sc->enable & PCHTHERM_GEN_ENABLE)) {
                                device_printf(dev, "Sensor enable failed\n");
                                return 0;
                        }
@@ -178,7 +178,7 @@ static int pchtherm_attach(device_t dev)
        if (bootverbose) {
                FLAG_PRINT(dev, "SMBus report", val);
        }
-       val = bus_read_1(sc->tbar, PCHTHERM_REG_TSC);
+       val = bus_read_1(sc->tbar, PCHTHERM_REG_TSMIC);
        if (bootverbose) {
                FLAG_PRINT(dev, "SMI on alert", val);
        }

Honestly, I think this driver is pretty unfinished as of now.

If you decide to commit this, please use "Michal Vanco <michal.vanco@gmail.com>" in the commit message.

There is another issue. pchtherm(4) man page says that it can be compiled into kernel (device pchtherm) which doesn't work. It can only compiled as a kernel module. 

Thanks,
Michal
Comment 5 commit-hook freebsd_committer freebsd_triage 2021-03-05 09:03:54 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=5084dde5f087264cf9a826569d1152c65d88a0fe

commit 5084dde5f087264cf9a826569d1152c65d88a0fe
Author:     Michal Vanco <michal.vanco@gmail.com>
AuthorDate: 2021-03-05 08:55:30 +0000
Commit:     Andriy Gapon <avg@FreeBSD.org>
CommitDate: 2021-03-05 09:01:28 +0000

    pchtherm: fix a wrong bit and a wrong register use

    Probably just copy-paste errors that slipped in.

    PR:             253915
    Reported by:    Michal Vanco <michal.vanco@gmail.com>
    MFC after:      1 week

 sys/dev/intel/pchtherm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
Comment 6 commit-hook freebsd_committer freebsd_triage 2021-03-23 11:03:12 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=395e612b7bb1874811f5bcd6136296dd7a76da1b

commit 395e612b7bb1874811f5bcd6136296dd7a76da1b
Author:     Michal Vanco <michal.vanco@gmail.com>
AuthorDate: 2021-03-05 08:55:30 +0000
Commit:     Andriy Gapon <avg@FreeBSD.org>
CommitDate: 2021-03-23 11:01:20 +0000

    pchtherm: fix a wrong bit and a wrong register use

    Probably just copy-paste errors that slipped in.

    (cherry picked from commit 5084dde5f087264cf9a826569d1152c65d88a0fe)

    PR:             253915
    Reported by:    Michal Vanco <michal.vanco@gmail.com>

 sys/dev/intel/pchtherm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)