Bug 238037 - [PATCH] Implement ig4 suspend/resume
Summary: [PATCH] Implement ig4 suspend/resume
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: CURRENT
Hardware: Any Any
: --- Affects Some People
Assignee: freebsd-bugs (Nobody)
Keywords: patch
Depends on:
Reported: 2019-05-22 00:03 UTC by Austin Shafer
Modified: 2019-12-22 00:47 UTC (History)
4 users (show)

See Also:

ig4 suspend/resume patch (4.64 KB, patch)
2019-05-22 00:03 UTC, Austin Shafer
no flags Details | Diff
patch to fix bus_barrier code (1.86 KB, patch)
2019-06-29 18:17 UTC, J.R. Oldroyd
no flags Details | Diff
patch to fix bus_barrier code (1.48 KB, patch)
2019-06-30 09:11 UTC, J.R. Oldroyd
no flags Details | Diff
ig4 suspend/resume patch (4.64 KB, patch)
2019-07-01 01:15 UTC, Austin Shafer
no flags Details | Diff
patch with ig4_iic.c ig4iic_attach() updates from DragonflyBSD (1.13 KB, patch)
2019-07-01 13:25 UTC, J.R. Oldroyd
no flags Details | Diff
ig4 suspend with devidle set [needs testing] (4.97 KB, patch)
2019-07-07 22:11 UTC, Austin Shafer
no flags Details | Diff
ig4 suspend/resume latest patch [needs testing] (6.30 KB, patch)
2019-07-18 14:14 UTC, Austin Shafer
no flags Details | Diff
ig4 suspend/resume with other improvements (40.94 KB, patch)
2019-09-18 00:44 UTC, Vladimir Kondratyev
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Austin Shafer 2019-05-22 00:03:32 UTC
Created attachment 204527 [details]
ig4 suspend/resume patch

ig4(4) does not support suspend/resume at the moment. Fixes Asus i2c touchpads not working after resuming.

This patch saves all writable registers to a field in the device's ig4iic_softc struct, and restores them after resuming. It seems like iicbus and iic should also have the bus_generic_* devmethods added to them in case i2c devices want to have their suspend/resume methods called? I didn't include it in this patch but I can easily add it.

A week ago I took a stab at running -CURRENT on my roommate's Asus Q405U. I used the following review to get the touchpad to work. The touchpad stops working after resuming (although aside from this the laptop resumes wonderfully). I don't have physical access to the machine anymore so I'm submitting this in the hopes that someone else will find it useful.

iichid support: https://reviews.freebsd.org/D16698

(A weird quirk with the above patch on the Q405U is I sometimes had to run 'i2c -s -f /dev/iic1' like someone so nicely mentioned in the comments. I think this is the last issue before the newer Asus laptops work fully on -CURRENT)

I'm new to this, so please let me know if there is anything I need to fix.

Comment 1 J.R. Oldroyd 2019-06-29 18:17:39 UTC
Created attachment 205417 [details]
patch to fix bus_barrier code

Looks like there is an error in the bus_barrier code in the new suspend and resume functions.  Patch attached.
Comment 2 J.R. Oldroyd 2019-06-29 18:20:35 UTC
By the way, thanks for this patch, Austin.  I posted on the D16698 phabricator, where I am using this too, that this works great.
Comment 3 J.R. Oldroyd 2019-06-30 09:11:11 UTC
Created attachment 205432 [details]
patch to fix bus_barrier code

read barrier should indeed be before read, but write barrier needed after writes
Comment 4 Andriy Gapon freebsd_committer 2019-06-30 20:47:31 UTC
(In reply to J.R. Oldroyd from comment #3)
Do those barriers make any difference at all (given x86 platform) ?
Also, it would make sense -- if only for consistency -- to pass IG4_REG_RESETS_SKL to the barrier when accessing that register.
Comment 5 Austin Shafer 2019-07-01 01:15:36 UTC
Created attachment 205447 [details]
ig4 suspend/resume patch

Thanks for the feedback! I'm glad I could help.

You guys are definitely right, I updated my patch with the write barrier changes, and made sure to use IG4_REG_RESETS_SKL for skylake.

As for if the barriers are needed, I never really tested but assumed it was required because the `reg_write` wrapper in ig4_iic.c uses it. I added the barriers just to be extra sure that the reset went through to avoid any nastiness on suspend or resume.

Please let me know if you have any other advice for things I should change.
Comment 6 J.R. Oldroyd 2019-07-01 08:42:50 UTC
The barriers don't seem to make much difference, no, but since the code is there it should be correct.  It may be significant on other platforms.
Comment 7 J.R. Oldroyd 2019-07-01 08:48:03 UTC
There may be another bug.

After a suspend/resume on a Skylake system, the new ims driver (in phabricator D16698) does not reset properly.  Reads fail with the message from the ims driver "no data received".  A bus scan using:
    i2c -f /dev/iic1 -s
resets things after which it works ok again.

I am trying to figure out if we are still not saving and restoring all needed registers on Skylake, or if the problem is that the bus needs some reinitialization.  I suspect the latter given that the scan clears things, but I am mentioning it here too.
Comment 8 Austin Shafer 2019-07-01 12:11:24 UTC
(In reply to J.R. Oldroyd from comment #7)

I remember running into this problem while working on this patch. I think I know what's causing it, but couldn't find a way to fix it. I also don't think it's related to this bug's patch, as it would happen to me when first booting.

Basically because of the way that the iichid attaches, I think it is possible for the iichid device to be initialized before the i2c bus. The iichid will try to read interrupts before they are available and find no data until things are reset.

From D16698:
"I tried a few approaches to identify and attach an iichid device by querying ACPI directly from the iichid module. That failed on two details: parent association (in general HID over I2C devices appear below ACPI0 in NewBus but it is necessary to have them below the iicbus*/iic* nodes, moving these node in NewBus failed for me with a panic) and also interrupt handling (apparently, interrupts must be handled by the nodes the IRQs are assigned to (i.e. the node below ACPI0). Trying to attach to an IRQ from another node (i.e. a node below iicbus*/i2c*) fails)"

So the iichid is below ACPI0, and I think that gets attached before iicbus. I have a vague memory of adding printf's to show that this was happening. I'll try to dig around and see if I have any patches of what I was trying to do to fix this.

I'm no expert on newbus device ordering, so if I'm wrong I'd love an explanation.
Comment 9 J.R. Oldroyd 2019-07-01 13:25:48 UTC
Created attachment 205461 [details]
patch with ig4_iic.c ig4iic_attach() updates from DragonflyBSD

The iichid/ims init problem on boot is fixed for me.  I had it too some time back.  The fix was (I think) to load it using:

	kld_load="iic ig4 acpi_iichid"

in /etc/rc.conf.

Now I did also pull in from Dragonfly a handful of updates to the ig4_iic.c attach code and I am uploading that patch here.  At this point I don't recall if these were also needed in order for iichid to init properly on boot.

At any rate, iichid inits properly on boot 100% of the time for me now.

It's just on resume that it fails.  It fails after the system has been suspended for a while, say 10 seconds or more.  If I suspend and then immediately resume after the power goes off, it does actually resume ok most of the time.  That's what led me to think we might be missing some registers on resume.
Comment 10 Austin Shafer 2019-07-01 13:36:07 UTC
(In reply to J.R. Oldroyd from comment #9)

Well loading the modules manually in that order (ig4 before acpi_iichid) would get rid of the dependency issues I mentioned. They won't get resumed in the same order because it's done automatically according to the newbus node location, and I think you'd still run into the same issue.

With that being said, it's definitely possible I'm missing registers. It is suspicious that it works fine when it immediately resumes. Does it work 100% of the time when immediately resuming? Maybe if the dragonfly updates are what actually fixes it I can just add them to the resume methods.

I wish I could test this myself. I don't have access to the laptop I originally used, and I'm not sure if I'll be able to find another.
Comment 11 marc.priggemeyer 2019-07-06 20:23:14 UTC
Since I did not yet have any success with suspend/resume on my laptop (I have to hard reset it because my display fails), I would like to point out this comment:

It's something I proposed in #221777 to correctly initialize the chipset (ig4 to be exact). The Intel PCH datasheet is very specific on how to handle initialization sequences. I can imagine that you will need to check the IG4_REG_DEVIDLE_CTRL register for the idle state and clear the reset flag.
It should basically be the same code snippet as in the referenced comment.

Going one step further, it might even be useful to actively put the controller into idle state by setting the DEVICE_IDLE bit in the IG4_REG_DEVIDLE_CTRL register during suspend.
Comment 12 Austin Shafer 2019-07-07 22:11:23 UTC
Created attachment 205573 [details]
ig4 suspend with devidle set [needs testing]

Thanks for the input. Placing the device in the idle state before suspending sounds like a good idea. The PCH datasheet does have a note saying this should be done before any resets so I think you are right. Maybe I missed something trivial while reading the datasheets, but why in `ig4iic_attach` is this only done for skylake and not also done for haswell?

I went ahead and added this to a new patch, it would be amazing if someone could try it out and let me know if it makes any difference. Thanks!
Comment 13 J.R. Oldroyd 2019-07-08 08:42:07 UTC
I have already tried this and, unfortunately, it does not help.  I have also tried it with the DELAY(1000) in there too (same as in the iic_attach() code).  Same problem: it still requires the "iic -f /dev/iic1 -s" after resume to reset it.  Even though it doesn't solve the problem, I think this code should be there per the spec.

While we're doing this, I suggest moving the definitions of reg_read() and reg_write() from where they are in ig4_iic.c to ig4_var.h so that they can also be used here in ig4_pci.c.  That makes the code here easier to read.
Comment 14 Austin Shafer 2019-07-08 13:56:42 UTC
(In reply to J.R. Oldroyd from comment #13)

I agree on adding reg_read/write to clean things up. It's looking a little cluttered as is. I'll throw that in the patch soon.

As of now it's a little difficult to tell if this suspend/resume patch is the reason the touchpad needs a reset, or if it is the iichid initialization order. It may be a good idea to wait for iichid to be rewritten to use the ACPI namespace walking that was mentioned in D16698. If that works, then it'll be easier to verify this patch does. 

It sounds like markj is already working on this. I can't test anything, but if help/manpower is needed I am happy to do the legwork.
Comment 15 marc.priggemeyer 2019-07-08 21:11:20 UTC
I updated the diff in https://reviews.freebsd.org/D16698 and added a sysctl to send the human interface device to sleep and wake it up again. I would kindly ask you to check whether sending the (HID) device to sleep manually before suspending and waking it up manually after resuming helps. I'm hoping that putting the device into controlled sleep will make the bus scan obsolete.

go to sleep:
sysctl dev.iichid.0.power_state=1

wake up:
sysctl dev.iichid.0.power_state=0
Comment 16 Vladimir Kondratyev freebsd_committer 2019-07-17 00:43:04 UTC
The suspend/resume patch is obviously broken in a way it handles IG4_REG_I2C_EN.
It should set IG4_REG_I2C_EN to 0 before restoring of other registers and than set it to saved value only *after* all other registers have been restored. That is due to writes to some registers can succeed only when IC_ENABLE=0. 
Note: setting IG4_REG_I2C_EN to 0 should be acknowledged by controller. See set_controller() source for example.
Comment 17 Vladimir Kondratyev freebsd_committer 2019-07-17 01:02:06 UTC
(In reply to Vladimir Kondratyev from comment #16)
> It should set IG4_REG_I2C_EN to 0 before restoring of other registers and than set it to saved value only *after* all other registers have been restored.
The simplest hack that works for me is just move IG4_REG_I2C_EN to the end of register list.
Comment 18 Austin Shafer 2019-07-18 14:14:16 UTC
Created attachment 205867 [details]
ig4 suspend/resume latest patch [needs testing]

Thanks for catching this, I made an updated diff that I hope fixes the issue. Please let me know if I made any mistakes. A couple things were changed:

* cleaned things up by using reg_write/reg_read. These were moved into ig4_var.h.
* also moved set_controller to ig4_var.h to avoid duplicating its code.
* disabled controller, and made sure IG4_REG_I2C_EN is the last register to be restored

If all goes well I should be getting my hands on another laptop with an i2c trackpad in the next few days. Then I should finally be able to test things myself.
Comment 19 Vladimir Kondratyev freebsd_committer 2019-09-18 00:44:30 UTC
Created attachment 207593 [details]
ig4 suspend/resume with other improvements

It is appeared that just save/restore controller registers is not enough to do a suspend/resume. Software context should be preserved too (it contains current slave address) and resume path requires polling support from ig4 as interrupts are not working at this moment at least on my laptop.
I tried to address these and some others issues in my version of ig4 driver which is attached.
It requires at least post r352243 current to be applied cleanly, but can be easily modified to be applicable to FBSD12
Comment 20 commit-hook freebsd_committer 2019-11-03 21:01:46 UTC
A commit references this bug:

Author: wulf
Date: Sun Nov  3 21:00:57 UTC 2019
New revision: 354308
URL: https://svnweb.freebsd.org/changeset/base/354308

  [ig4] Add suspend/resume support

  That is done with re-execution of controller initialization procedure
  from resume handler.

  PR:		238037

Comment 21 Austin Shafer 2019-11-04 00:08:53 UTC
It's great to see a commit pushing this through! I wish I could have had a machine to play around with this on, but I've been following the phabricator thread and the other ig4 commits.

Unless there are any remaining concerns it looks like this can be resolved?
Comment 22 Vladimir Kondratyev freebsd_committer 2019-11-04 23:13:04 UTC
(In reply to Austin Shafer from comment #21)
> Unless there are any remaining concerns it looks like this can be resolved?

Yes. I have got several reports that suspend/resume is working.
Comment 23 commit-hook freebsd_committer 2019-12-22 00:47:04 UTC
A commit references this bug:

Author: wulf
Date: Sun Dec 22 00:46:08 UTC 2019
New revision: 355994
URL: https://svnweb.freebsd.org/changeset/base/355994

  MFC r354291 - r354322, r354327, r355596

  [ig4] Give common name to PCI and ACPI device drivers

  [ig4] Handle controller startup errors

  Obtained from:	DragonflyBSD (509820b)

  [ig4] Only enable interrupts when we want them. Otherwise keep mask at 0.

  Obtained from:	DragonflyBSD (d7c8555)

  [ig4] Drop driver's internal RX FIFO

  [ig4] Do not wait for interrupts in set_controller() routine

  [ig4] Reduce scope of io_lock

  [ig4] Ignore stray interrupts

  [ig4] We actually need to set the Rx threshold register one smaller.

  Obtained from:	DragonflyBSD (02f0bf2)


  [ig4] Stop I2C controller after checking that it's kind of functional.

  Obtained from:	DragonfliBSD (0b3eedb)

  [ig4] disable controller before initialization of clock counters

  [ig4] Add support for polled mode

  [ig4] Allow enabling of polled mode from iicbus allocation callback

  [ig4] Do not wait until interrupts are enabled at attach stage

  [cyapa] Postpone start of the polling thread until sleep is available

  [ig4] dump IG4_REG_COMP_PARAM1 and IG4_REG_COMP_VER registers unconditionally

  [ig4] Set clock registers based on controller model

  [ig4] Implement burst mode for data reads

  [ig4] Add suspend/resume support

  PR:		238037

  [ig4] Remove dead code inherited from DragonflyBSD

  [ig4] Rewrite ig4iic_write routine to use TX_EMPTY status flag

  [ig4] Convert last remaining usage of TX_NOTFULL status to TX_EMPTY

  [ig4] Use interrupts for waiting for empty TX FIFO

  [ig4] Convert polling loop from status-based to interrupt-based

  [ig4] Improve error detection

  [ig4] Set STOP condition and flush TX/RX FIFOs on error

  [ig4] On SkyLake controllers issue reset on attach unconditionally.

  [ig4] wait for bus stop condition after stop command issued

  [ig4] Minor improvement of write pipelining

  [ig4] Add generic resource methods to bus interface

  [ig4] Add support for CannonLake controllers

  PR:		240485
  Submitted by:	Neel Chauhan <neel@neelc.org>

  [ig4] Enable additional registers support on Appolo Lake controllers

  [ig4] Convert ithread interrupt handler to filter based one.

  [ig4] Try to workaround MIPS namespace pollution issue

  [ig4] Remove unused methods from bus interface

  Suggested by:	jhb

  Reviewed by:		imp (previous version)
  Differential Revision:	https://reviews.freebsd.org/D22016

_U  stable/12/