Bug 194884 - [acpi] Asus UX31E USB hangs during suspend, due to putting the USB controllers into D3 state
Summary: [acpi] Asus UX31E USB hangs during suspend, due to putting the USB controller...
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: CURRENT
Hardware: Any Any
: --- Affects Some People
Assignee: Adrian Chadd
URL: https://twitter.com/kena42/status/681...
Keywords: needs-qa, patch
Depends on:
Blocks:
 
Reported: 2014-11-07 19:08 UTC by Adrian Chadd
Modified: 2019-01-21 09:56 UTC (History)
9 users (show)

See Also:
koobs: mfc-stable10?
koobs: mfc-stable9?


Attachments
Proposed fix (431 bytes, patch)
2014-11-09 06:02 UTC, Adrian Chadd
no flags Details | Diff
A proposed fix + acpi.c patch (1003 bytes, patch)
2014-11-10 20:36 UTC, Jung-uk Kim
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Adrian Chadd freebsd_committer freebsd_triage 2014-11-07 19:08:35 UTC
The laptop won't suspend with the USB drivers (ehci, xhci) loaded.

Suspend bounce works fine - everything gets powered down fine, then comes back up.

Unloading the USB drivers fixes the problem, but that's hiding the underlying causes - the trick is the transition of the USB ports into D3.

The ACPI BIOS has this:

           Device (EHC1)
            {
                Name (_ADR, 0x001D0000)  // _ADR: Address
                OperationRegion (PWKE, PCI_Config, 0x62, 0x04)
                Field (PWKE, DWordAcc, NoLock, Preserve)
                {
                        ,   1, 
                    PWUC,   8
                }

                Method (_PSW, 1, NotSerialized)  // _PSW: Power State Wake
                {
                    If (Arg0)
                    {
                        Store (Ones, PWUC) /* \_SB_.PCI0.EHC1.PWUC */
                    }
                    Else
                    {
                        Store (0x00, PWUC) /* \_SB_.PCI0.EHC1.PWUC */
                    }
                }

                Method (_S3D, 0, NotSerialized)  // _S3D: S3 Device State
                {
                    Return (0x02)
                }

                Method (_S4D, 0, NotSerialized)  // _S4D: S4 Device State
                {
                    Return (0x02)
                }

.. so, we should be trying to put it into D2, not D3.

But then:

        ehci0 pnpinfo vendor=0x8086 device=0x1c26 subvendor=0x1043 subdevice=0x1427 class=0x0c0320 at slot=29 function=0 handle=\_SB_.PCI0.EHC1

ehci0@pci0:0:29:0:	class=0x0c0320 card=0x14271043 chip=0x1c268086 rev=0x05 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '6 Series/C200 Series Chipset Family USB Enhanced Host Controller'
    class      = serial bus
    subclass   = USB
    cap 01[50] = powerspec 2  supports D0 D3  current D0
    cap 0a[58] = EHCI Debug Port at offset 0xa0 in map 0x14
    cap 13[98] = PCI Advanced Features: FLR TP

.. we shouldn't be putting it into D2, because the device doesn't support it.

However, we are actually transitioning it into D3. I don't have a bootverbose output here, but just assume we are.

If I leave the driver loaded but I disable the suspend-to-d3 option, things suspend/resume fine.

If I set the unconfigured-devices-get-d3 option and unload the usb modules, then the ehci controller ends up at d3 when I unload it. Then if I suspend, the laptop hangs.

So, jhb@ asked me to add some debugging. Which I did, and I got this:

acpi_device_pwr_for_sleep: \134_SB_.PCI0: dstate=3
acpi_device_pwr_for_sleep: \134_SB_.PCI0: now dstate=3
acpi_device_pwr_for_sleep: \134_SB_.PCI0: dstate=3
acpi_device_pwr_for_sleep: \134_SB_.PCI0: now dstate=3
acpi_device_pwr_for_sleep: \134_SB_.PCI0: dstate=3
acpi_device_pwr_for_sleep: \134_SB_.PCI0: now dstate=3
acpi_device_pwr_for_sleep: \134_SB_.PCI0.RP01: dstate=3
acpi_device_pwr_for_sleep: \134_SB_.PCI0.RP01: now dstate=3
acpi_device_pwr_for_sleep: \134_SB_.PCI0: dstate=3
acpi_device_pwr_for_sleep: \134_SB_.PCI0: now dstate=3
ath_hal_reg_write: reg=0x000000a0, val=0x00000000, pm=1
ath_hal_reg_write: reg=0x000000ac, val=0x00000000, pm=1
acpi_device_pwr_for_sleep: \134_SB_.PCI0.RP02: dstate=3
acpi_device_pwr_for_sleep: \134_SB_.PCI0.RP02: now dstate=3
acpi_device_pwr_for_sleep: \134_SB_.PCI0.RP02: dstate=3
acpi_device_pwr_for_sleep: \134_SB_.PCI0.RP02: now dstate=3
acpi_device_pwr_for_sleep: \134_SB_.PCI0: dstate=3
acpi_device_pwr_for_sleep: \134_SB_.PCI0: now dstate=3
acpi_device_pwr_for_sleep: \134_SB_.PCI0.RP04: dstate=3
acpi_device_pwr_for_sleep: \134_SB_.PCI0.RP04: now dstate=3
acpi_device_pwr_for_sleep: \134_SB_.PCI0: dstate=3
acpi_device_pwr_for_sleep: \134_SB_.PCI0: now dstate=3
uhub0: at usbus0, port 1, addr 1 (disconnected)
ugen0.2: <vendor 0x8087> at usbus0 (disconnected)
uhub1: at uhub0, port 1, addr 2 (disconnected)
ugen0.3: <Azurewave> at usbus0 (disconnected)
ugen0.4: <Generic> at usbus0 (disconnected)
ugen0.5: <Atheros Communications> at usbus0 (disconnected)
acpi_device_pwr_for_sleep: \134_SB_.PCI0: dstate=3
acpi_device_pwr_for_sleep: \134_SB_.PCI0: now dstate=3
acpi_device_pwr_for_sleep: \134_SB_.PCI0: dstate=3
acpi_device_pwr_for_sleep: \134_SB_.PCI0: now dstate=3
acpi_device_pwr_for_sleep: \134_SB_.PCI0: dstate=3
acpi_device_pwr_for_sleep: \134_SB_.PCI0: now dstate=3
acpi_device_pwr_for_sleep: \134_SB_.PCI0: dstate=0
acpi_device_pwr_for_sleep: \134_SB_.PCI0: now dstate=0
acpi_device_pwr_for_sleep: \134_SB_.PCI0: dstate=0
acpi_device_pwr_for_sleep: \134_SB_.PCI0: now dstate=0
info: [drm] Enabling RC6 states: RC6 off, RC6p off, RC6pp off
acpi_device_pwr_for_sleep: \134_SB_.PCI0: dstate=0
acpi_device_pwr_for_sleep: \134_SB_.PCI0: now dstate=0
acpi_device_pwr_for_sleep: \134_SB_.PCI0: dstate=0
acpi_device_pwr_for_sleep: \134_SB_.PCI0: now dstate=0
acpi_device_pwr_for_sleep: \134_SB_.PCI0.RP02: dstate=0
acpi_device_pwr_for_sleep: \134_SB_.PCI0.RP02: now dstate=0
acpi_device_pwr_for_sleep: \134_SB_.PCI0: dstate=0
acpi_device_pwr_for_sleep: \134_SB_.PCI0: now dstate=0
acpi_device_pwr_for_sleep: \134_SB_.PCI0: dstate=0
acpi_device_pwr_for_sleep: \134_SB_.PCI0: now dstate=0
acpi_device_pwr_for_sleep: \134_SB_.PCI0: dstate=0
acpi_device_pwr_for_sleep: \134_SB_.PCI0: now dstate=0
acpi_device_pwr_for_sleep: \134_SB_.PCI0: dstate=0
acpi_device_pwr_for_sleep: \134_SB_.PCI0: now dstate=0
acpi_device_pwr_for_sleep: \134_SB_.PCI0: dstate=0
acpi_device_pwr_for_sleep: \134_SB_.PCI0: now dstate=0
uhub0: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus0
uhub_attach: Turning port 1 power on
uhub_attach: Turning port 2 power on
uhub0: 2 ports with 2 removable, self powered
ugen0.2: <vendor 0x8087> at usbus0
uhub1: <vendor 0x8087 product 0x0024, class 9/0, rev 2.00/0.00, addr 2> on usbus0
uhub_attach: Turning port 1 power on
uhub_attach: Turning port 2 power on
uhub_attach: Turning port 3 power on
uhub_attach: Turning port 4 power on
uhub_attach: Turning port 5 power on
uhub_attach: Turning port 6 power on
uhub_attach: Turning port 7 power on
uhub_attach: Turning port 8 power on
uhub1: 8 ports with 8 removable, self powered
ugen0.3: <Azurewave> at usbus0
ugen0.4: <Generic> at usbus0
ugen0.5: <Atheros Communications> at usbus0

These printf()s are in acpi_device_pwr_for_sleep(), printing out the acpi_name(handle) for the handle we're doing the lookup on. It turns out that 'dev' here is the underlying bus, not the device it's attached to. So, all these lookups are for PCIx, not the leaf devices - and it never sees the EHC* nodes.

So I'm not sure what to do here. Should acpi_device_pwr_for_sleep() be being called for leaf nodes rather than just the busses?

As a side note - I think the xhci controller does the same thing, but I'm going to worry about that particular problem -after- we solve the ehci problem.

The full files can be found at http://people.freebsd.org/~adrian/asus_zenbook/
Comment 1 Adrian Chadd freebsd_committer freebsd_triage 2014-11-09 06:02:56 UTC
Created attachment 149219 [details]
Proposed fix
Comment 2 Johannes Jost Meixner freebsd_committer freebsd_triage 2014-11-10 07:12:55 UTC
Proposed fix makes Suspend/Resume on this ASUS UX32VD useful:
Before the fix, I could suspend/resume, music would continue playing but the display wouldn't come back on.

With this fix, and acpi_video loaded, I can suspend/resume properly and have everything working as before.
Comment 3 John Baldwin freebsd_committer freebsd_triage 2014-11-10 20:18:34 UTC
Comment on attachment 149219 [details]
Proposed fix

Hah, that's a dumb bug.  Nice fix.
Comment 4 Jung-uk Kim freebsd_committer freebsd_triage 2014-11-10 20:36:25 UTC
Created attachment 149269 [details]
A proposed fix + acpi.c patch

I think you need to fix r214072, too.  It seems I just copied the logic from r211430.
Comment 5 Adrian Chadd freebsd_committer freebsd_triage 2014-11-10 21:50:26 UTC
Ok, that looks sane.

Just need jhb@ to weigh in and OK it and I'll commit both patches to -HEAD.

Thanks!
Comment 6 John Baldwin freebsd_committer freebsd_triage 2014-11-11 17:03:42 UTC
Looks good to me, fire away!
Comment 7 commit-hook freebsd_committer freebsd_triage 2014-11-11 17:14:51 UTC
A commit references this bug:

Author: adrian
Date: Tue Nov 11 17:14:12 UTC 2014
New revision: 274386
URL: https://svnweb.freebsd.org/changeset/base/274386

Log:
  Use the correct device (child) when asking the bus layer about which power
  state said device should go into.

  This was a snafu introduced in the ACPI/PCI awareness separation.

  When putting a device into a power state, the bus (and thus firmware,
  eg ACPI) should be asked before hand to check whether the device
  can indeed go into that power state.

  There's a set of nodes in ACPI under each device - the _SxD nodes - which
  state which ACPI power state to put the device into when the system is
  going into power save state 'x'.  So when going into S3, the existence
  of an _S3D node would override whatever the system was trying to do.

  By default the PCI code wants to put devices into D3 before suspending.

  I have a laptop here (Asus Zenbook - check the PR) whose EHCI controller
  really wants to be in D2 during suspend, not D3.  So if we put it into
  D3 and then try to enter S3, everything hangs.  The device itself
  can go into D3 - it just can't be there when the call to ACPI to enter
  S3 occurs.  The PCI patch fixes this.

  jkim@ noticed that the same is needed for the ACPI child device
  enumeration.

  Thankyou to Matt Dillon (the programmer, not the actor) for buying me
  this particular laptop so I could debug the issues with the Atheros
  AR9485 that is in it.  It's his fault that I ended up with this
  laptop and was sufficiently annoyed by the lack of USB suspend
  to go down this rabbit hole.

  Tested:

  * Thinkpad T400
  * Thinkpad X230
  * Thinkpad T42
  * Thinkpad T60
  * Asus Zenbook (see PR)
  * Asus EEEPC 701
  * Asus EEEPC 1001PX

  TODO:

  * Figure out what we should do about devices we unload drivers for
    that want to be in a specific state when entering S3 / S4 -
    the "put devices into D3 if they're not bound to a driver" option
    may also mess with things.

  PR:		kern/194884
  Reviewed by:	jhb, jkim
  MFC after:	1 week
  Relnotes:	yes
  Sponsored by:	Matt Dillon <dillon@apollo.backplane.com> (hardware)

Changes:
  head/sys/dev/acpica/acpi.c
  head/sys/dev/pci/pci.c
Comment 8 commit-hook freebsd_committer freebsd_triage 2014-11-11 19:43:04 UTC
A commit references this bug:

Author: jkim
Date: Tue Nov 11 19:42:10 UTC 2014
New revision: 274397
URL: https://svnweb.freebsd.org/changeset/base/274397

Log:
  Use the correct device.  Note this commit complements r274386.

  PR:		194884

Changes:
  head/sys/dev/acpica/acpi.c
Comment 9 Adrian Chadd freebsd_committer freebsd_triage 2015-02-02 05:17:50 UTC
Fixed!
Comment 10 Kubilay Kocak freebsd_committer freebsd_triage 2015-12-29 11:50:00 UTC
It appears these commits may not have made stable/*

Can original committers to HEAD (now Assigned/CC) please check to confirm this please, and set mfc_* flags to + once committed
Comment 11 Kubilay Kocak freebsd_committer freebsd_triage 2015-12-29 12:00:53 UTC
Keep freebsd-acpi informed about this issue, so that it makes 10.3-RELEASE
Comment 12 Adrian Chadd freebsd_committer freebsd_triage 2015-12-29 19:37:22 UTC
if someone's going to backport this to stable/10 then please hit up jhb@ for the full story.

IIRC, there were a bunch of things unearthed around this change that required some subsequent commits.

(I don't know if everything has been resolved with it in -HEAD yet; I need to check -acpi and -current..)
Comment 13 Raphael 'kena' Poss 2015-12-30 09:30:42 UTC
Well the combined two patches amount to 3 lines of code changed, the patch applies cleanly to stable/10 and you have an independent confirmation that the bug identified in the title of this PR disappears when applying this patch on stable/10. 

I have also investigated the commit and subsequent changes in pci.c and acpi.c and was unable to find any related changes that also need backporting. Curious to see what the "unearthed bunch of things" becomes.
Comment 14 Ian Smith 2015-12-30 10:44:07 UTC
I applied these patches (both acpi.c and pci.c) to my then 9.2-R system
(Lenovo X200) on 2014-11-14.  I'd never had Adrian's problem with suspend
but the patches did get rid of 3 apparently harmless error messages, and
add one new one, also apparently harmless.  It's been running fine  since. 

Having updated it to stable/9 a month ago, I assumed these would have been
included, but checking dmesg I confirm they weren't, but are safe for 9.x

This is the new one introduced by the patches (that has now gone)
-pcib0: failed to set ACPI power state D2 on \134_SB_.PCI0: AE_BAD_PARAMETER

And these are the ones the two  patches removed (thus, have returned :)
+pci0: failed to set ACPI power state D2 on \134_SB_.PCI0.EXP0: AE_BAD_PARAMETER
+pci0: failed to set ACPI power state D2 on \134_SB_.PCI0.EXP2: AE_BAD_PARAMETER
+pci0: failed to set ACPI power state D2 on \134_SB_.PCI0.EXP3: AE_BAD_PARAMETER

Ian
Comment 15 Oleksandr Tymoshenko freebsd_committer freebsd_triage 2019-01-21 09:56:19 UTC
There is a commit referencing this PR, but it's still not closed and has been inactive for some time. Closing the PR as fixed but feel free to re-open it if the issue hasn't been completely resolved.

Thanks