Bug 79080 - acpi thermal changes freezes HP nx6110
Summary: acpi thermal changes freezes HP nx6110
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: i386
Version: 5.4-PRERELEASE
Hardware: Any Any
Importance: Normal Affects Only Me
Assignee: freebsd-acpi (Nobody)
Reported: 2005-03-21 11:20 UTC by Juho Vuori
Modified: 2008-01-12 22:36 UTC

Description Juho Vuori 2005-03-21 11:20:00 UTC
On an HP nx6110 (Intel 910GML chipset, Celeron M) with ACPI enabled:
When the system temperature rises above or drops below one of the
hw.acpi.thermal.tz0._ACx levels, the system freezes. The freeze happens
before the event switches the system fan on or off. If the kernel is built
with the 4BSD scheduler, the freeze lasts forever; with the ULE scheduler,
it lasts only a few seconds. It is caused by an ACPI (irq9) interrupt storm.
With ACPI disabled, the system works perfectly.

How-To-Repeat: Take an HP nx6110 laptop. Run FreeBSD with ACPI compiled in and
arrange for the system temperature to rise above or drop below one of the _ACx levels.
Comment 1 Tilman Keskinoz freebsd_committer freebsd_triage 2005-04-22 13:29:16 UTC
Responsible Changed
From-To: freebsd-i386->freebsd-acpi

Over to the ACPI mailing list.
Comment 2 pryd 2005-11-30 13:15:49 UTC
Hello,

I am also using an nx6110 and having the same problem. It also seems that
USB works only if ACPI is enabled (and since the machine has no ports other
than USB, that is quite vital). Another astonishing connection: with ACPI
enabled, the notebook's touchpad became "tappable" (tapping the surface
performs a click). That never worked in FreeBSD with xorg, but it started
when I enabled ACPI and remains even when I disable it again. It is beyond
my imagination, and also off topic...

As was said, the notebook dies (with the SCHED_4BSD scheduler) or freezes
for approx. 2 seconds (with SCHED_ULE) when the temperature rises above or
drops below one of the hw.acpi.thermal.tz0._ACx levels.

A storm of ACPI TZ_NOTIFY_TEMPERATURE messages appears. With SCHED_ULE,
after 2 seconds, one occurrence of TZ_NOTIFY_LEVELS appears, then one more
TZ_NOTIFY_TEMPERATURE, and then the system continues working well until
the next rise/drop happens.

It seems that the acpi_thermal driver should respond to
TZ_NOTIFY_TEMPERATURE with some answer or action, but that never happens
(or happens quite late). It seems to me that the response is only "woken
up" (the acpi_tz_signal function, acpi_thermal.c:640) and is left to be
performed by some other thread. I thought of moving that code "inside"
the current thread; maybe it would help. But I do not understand ACPI at
all, so I do not know WHAT the hardware expects to happen :((.
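
The "signal now, handle in another thread" pattern described here can be sketched in Python. This only illustrates the pattern and is not the acpi_thermal code; the class DeferredZone and its methods are hypothetical names.

```python
import threading


class DeferredZone:
    """Notify handler only sets a flag; a separate worker does the real work."""

    def __init__(self):
        self._wakeup = threading.Event()
        self.handled = 0

    def notify(self):
        # Cheap and non-blocking: just wake the worker.
        self._wakeup.set()

    def worker_once(self, timeout=1.0):
        """Wait for a wakeup and handle it; return False on timeout."""
        if not self._wakeup.wait(timeout):
            return False
        self._wakeup.clear()
        self.handled += 1   # stands in for re-reading the zone and switching fans
        return True


zone = DeferredZone()
for _ in range(3):
    zone.notify()            # a burst of notifications
print(zone.worker_once())    # the whole burst is handled in one pass
print(zone.handled)
```

Repeated notifications coalesce into a single pass, but nothing is handled until the worker gets to run. If the notifying source keeps firing until it is serviced, a delayed worker would be consistent with the storm described above, though that link is only a guess.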

My "backup" idea was to set the _ACx temperatures to values that can never
be crossed, so the freezes would not occur, and also to set some reasonable
fan speed.

There are 4+1 fan speeds (0=fastest, ... 3=slowest; -1=off). They can be
controlled via sysctl hw.acpi.thermal.tz0.active=n, which works well (no
freeze when changing "by hand").

The following list of temperatures is in hw.acpi.thermal.tz0._ACx:

hw.acpi.thermal.tz0._ACx: 80.0C 70.0C 60.0C 45.0C -1 -1 -1 -1 -1 -1

If I could persuade the notebook to use a list such as 80.0C 5.0C 4.0C
3.0C -1 -1 -1..., then the storm would perhaps never happen.
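
To make the _ACx list concrete, here is a minimal Python sketch that parses the sysctl value quoted above and picks the active cooling level for a given temperature. The helper names parse_acx and active_level are hypothetical, and the selection rule (the lowest-numbered level whose setpoint is reached, with -1 meaning unused) is an assumption based on the fan numbering described above, not a copy of the driver logic.

```python
def parse_acx(value):
    """Parse a string such as '80.0C 70.0C 60.0C 45.0C -1 -1' into a list.

    Unused levels (reported as -1) become None.
    """
    levels = []
    for token in value.split():
        if token == "-1":
            levels.append(None)
        else:
            levels.append(float(token.rstrip("C")))
    return levels


def active_level(temperature, levels):
    """Return the lowest-numbered level whose setpoint is reached, or -1 (fan off)."""
    for index, setpoint in enumerate(levels):
        if setpoint is not None and temperature >= setpoint:
            return index
    return -1


acx = parse_acx("80.0C 70.0C 60.0C 45.0C -1 -1 -1 -1 -1 -1")
print(active_level(62.0, acx))   # level 2 under the assumed rule
print(active_level(40.0, acx))   # -1: below every setpoint
```

Under this rule, the proposed list 80.0C 5.0C 4.0C 3.0C maps any realistic temperature below 80.0C to level 1, so no setpoint would be crossed in normal operation.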

It seems that the values in the list can somehow be changed (but I don't
know whether they are dictated by software or hardware): when the
temperature rises above 60.0C, for example, the list is (somehow) smartly
changed to 80.0C 70.0C 55.0C 45.0C. Notice that 60.0 changed to 55.0, so
the temperature does not drop back below the limit immediately...
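
The observed change from 60.0C to 55.0C looks like hysteresis. Below is a minimal Python sketch of that idea, assuming a fixed 5-degree gap taken from the two values above. The class name TripPoint is hypothetical, and where the adjustment actually happens (firmware or driver) is not something this report establishes.

```python
class TripPoint:
    """A threshold that lowers itself once crossed, to avoid rapid toggling."""

    def __init__(self, rising, gap=5.0):
        self.rising = rising          # setpoint while the level is inactive
        self.falling = rising - gap   # setpoint while the level is active
        self.active = False

    @property
    def setpoint(self):
        return self.falling if self.active else self.rising

    def update(self, temperature):
        """Update the state; return True if the level switched on or off."""
        was_active = self.active
        self.active = temperature >= self.setpoint
        return self.active != was_active


trip = TripPoint(60.0)
for temperature in (59.0, 60.5, 59.0, 56.0, 54.5):
    changed = trip.update(temperature)
    print(temperature, trip.active, trip.setpoint, changed)
```

In this run the level switches on at 60.5, stays on at 59.0 and 56.0, and switches off only at 54.5, so a temperature hovering around 60.0C does not generate a new event on every reading.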

I have no experience with or knowledge of ACPI. I just tried to use sysctl
to set _ACx, but it is read-only. Can it be set by software?
Does anybody have an idea how to accomplish this?

Pavel Rydvan

Comment 3 Mark Linimon freebsd_committer freebsd_triage 2007-04-28 10:11:30 UTC
State Changed
From-To: open->feedback

Is this still a problem with 6.2?
Comment 4 Nate Lawson 2007-04-28 19:37:25 UTC
Juho Vuori wrote:
> I don't know. The affected laptop is not running FreeBSD at the moment.
> I'm afraid I can't do anything about this any time soon.
> 
> Juho Vuori
> 
> Mark Linimon wrote:
>> Synopsis: acpi thermal changes freezes HP nx6110
>>
>> State-Changed-From-To: open->feedback
>> State-Changed-By: linimon
>> State-Changed-When: Sat Apr 28 09:11:30 UTC 2007
>> State-Changed-Why: Is this still a problem with 6.2?
>>
>> http://www.freebsd.org/cgi/query-pr.cgi?pr=79080
>

If you could post a link to the ASL, that would be really helpful.  You
can boot a live CD like http://www.freesbie.org/ even if not running
FreeBSD.

acpidump -dt | gzip -c9 > hp-nx6110.asl.gz

-- 
Nate
Comment 5 Mark Linimon freebsd_committer freebsd_triage 2007-05-06 00:37:39 UTC
State Changed
From-To: feedback->suspended

Suspended awaiting any further information from submitter.
Comment 6 Yousif Hassan 2008-01-05 18:19:08 UTC
The problem is still present in the most recent 7.0 RC code as well.
It has something to do with a mutex lock/unlock problem when the thermal
zone change occurs; it no longer appears to be an interrupt storm.

It is assuredly ACPI-related, because disabling ACPI makes the freezes
go away.  However, this laptop does not function well without ACPI, so
that is not a good workaround: USB devices, as well as other hardware,
do not work without ACPI.

I tried several suggested workarounds, none of which resolved the issue.
These included building the kernel with APIC, disabling APIC, manually
changing the hw.acpi.thermal.tz0.active number (my nx6110 seems to want
to keep it at 1 no matter what), and using the ULE scheduler rather than
4BSD.  Again, none of the above workarounds, in any combination, solved
the issue.

INFORMATION
-----------
With debugging turned on, the following appears right before the freeze,
as soon as the temperature rises enough to trigger a change in the zone:

acpi_tz0: _AC3: temperature 68.0 >= setpoint 45.0
acpi_tz0: _AC2: temperature 68.0 >= setpoint 55.0
acpi_tz0: _AC3: temperature 67.0 >= setpoint 45.0
acpi_tz0: _AC2: temperature 67.0 >= setpoint 55.0
...etc...
and then:
ACPI Exception (utmutex-0376): AE_TIME, Thread 28 could not acquire Mutex [0] [20070320]
ACPI Error (exutils-0180): Could not acquire AML Interpreter mutex [20070320]
ACPI Error (utmutex-0421): Mutex [0] is not acquired, cannot release [20070320]
ACPI Error (exutils-0250): Could not release AML Interpreter mutex [20070320]
ACPI Exception (utmutex-0376): AE_TIME, Thread 28 could not acquire Mutex [0] [20070320]
ACPI Error (exutils-0180): Could not acquire AML Interpreter mutex [20070320]
ACPI Error (psparse-0626): Method parse/execution failed [\_TZ_.C242] (Node 0xc321c220), AE_TIME
ACPI Error (psparse-0626): Method parse/execution failed [\_TZ_.TZ1_._TMP] (Node 0xc321b9c0), AE_TIME
ACPI Error (utmutex-0421): Mutex [0] is not acquired, cannot release [20070320]
ACPI Error (exutils-0250): Could not release AML Interpreter mutex [20070320]
ACPI Error (psparse-0626): Method parse/execution failed [\_TZ_.C242] (Node 0xc321c220), AE_TIME
ACPI Error (psparse-0626): Method parse/execution failed [\_TZ_.TZ2_._TMP] (Node 0xc321b8c0), AE_TIME
ACPI Error (utmutex-0421): Mutex [0] is not acquired, cannot release [20070320]
ACPI Error (exutils-0250): Could not release AML Interpreter mutex [20070320]

(the errors continue to repeat ad infinitum, and each TZ reports problems)

As a result, you will eventually see:

acpi_tz0: error fetching current temperature -- AE_TIME
acpi_tz1: error fetching current temperature -- AE_TIME
(..etc...)
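
When triaging output like this, it can help to count which AML methods fail. Below is a minimal Python sketch, assuming the console log has been saved as text with each message on one line. The function name failed_methods is hypothetical, and the regular expression is based only on the message format shown above.

```python
import re
from collections import Counter

# Matches e.g. "Method parse/execution failed [\_TZ_.TZ1_._TMP] (Node 0x...), AE_TIME"
FAILED = re.compile(r"Method parse/execution failed \[([^\]]+)\] \(Node [^)]*\), (\w+)")


def failed_methods(lines):
    """Count (method, status) pairs in ACPI error lines."""
    counts = Counter()
    for line in lines:
        match = FAILED.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


sample = [
    r"ACPI Error (psparse-0626): Method parse/execution failed [\_TZ_.C242] (Node 0xc321c220), AE_TIME",
    r"ACPI Error (psparse-0626): Method parse/execution failed [\_TZ_.TZ1_._TMP] (Node 0xc321b9c0), AE_TIME",
    r"ACPI Error (psparse-0626): Method parse/execution failed [\_TZ_.C242] (Node 0xc321c220), AE_TIME",
    r"ACPI Error (utmutex-0421): Mutex [0] is not acquired, cannot release [20070320]",
]
for (method, status), count in failed_methods(sample).most_common():
    print(count, method, status)
```

Lines that are not method failures, such as the mutex messages, are simply skipped.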

The interesting thing is that THIS PROBLEM DOES NOT APPEAR in FreeBSD
6.2-RELEASE nor in any of the 6.3-RC variants.  It's unique to FreeBSD
7, and it involves some of the new ACPI mutex code.

This is definitely a regression for this particular laptop, since it worked
well in 6.x, so it may be worthwhile to investigate this bug.
It seems general enough that it could affect other laptops' ASLs as well.

The ASL dump AND a sysctl dump can be found at:
http://www.far-far-away.com/~yousif/freebsd/

Please let me know if more information is needed.

--Yousif
Comment 7 njl freebsd_committer freebsd_triage 2008-01-12 22:35:47 UTC
State Changed
From-To: suspended->closed

Patch tested and committed.  Please cvsup if running -current.  The patch
will be MFCed to 7.0.