Bug 238037 - [PATCH] Implement ig4 suspend/resume
Summary: [PATCH] Implement ig4 suspend/resume
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: CURRENT
Hardware: Any Any
: --- Affects Some People
Assignee: freebsd-bugs mailing list
Keywords: patch
Depends on:
Reported: 2019-05-22 00:03 UTC by Austin Shafer
Modified: 2019-07-18 14:14 UTC (History)
3 users (show)

See Also:

ig4 suspend/resume patch (4.64 KB, patch)
2019-05-22 00:03 UTC, Austin Shafer
no flags Details | Diff
patch to fix bus_barrier code (1.86 KB, patch)
2019-06-29 18:17 UTC, J.R. Oldroyd
no flags Details | Diff
patch to fix bus_barrier code (1.48 KB, patch)
2019-06-30 09:11 UTC, J.R. Oldroyd
no flags Details | Diff
ig4 suspend/resume patch (4.64 KB, patch)
2019-07-01 01:15 UTC, Austin Shafer
no flags Details | Diff
patch with ig4_iic.c ig4iic_attach() updates from DragonflyBSD (1.13 KB, patch)
2019-07-01 13:25 UTC, J.R. Oldroyd
no flags Details | Diff
ig4 suspend with devidle set [needs testing] (4.97 KB, patch)
2019-07-07 22:11 UTC, Austin Shafer
no flags Details | Diff
ig4 suspend/resume latest patch [needs testing] (6.30 KB, patch)
2019-07-18 14:14 UTC, Austin Shafer
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Austin Shafer 2019-05-22 00:03:32 UTC
Created attachment 204527 [details]
ig4 suspend/resume patch

ig4(4) does not support suspend/resume at the moment. Fixes Asus i2c touchpads not working after resuming.

This patch saves all writable registers to a field in the device's ig4iic_softc struct, and restores them after resuming. It seems like iicbus and iic should also have the bus_generic_* devmethods added to them in case i2c devices want to have their suspend/resume methods called? I didn't include it in this patch but I can easily add it.

A week ago I took a stab at running -CURRENT on my roommate's Asus Q405U. I used the following review to get the touchpad to work. The touchpad stops working after resuming (although aside from this the laptop resumes wonderfully). I don't have physical access to the machine anymore so I'm submitting this in the hopes that someone else will find it useful.

iichid support: https://reviews.freebsd.org/D16698

(A weird quirk with the above patch on the Q405U is I sometimes had to run 'i2c -s -f /dev/iic1' like someone so nicely mentioned in the comments. I think this is the last issue before the newer Asus laptops work fully on -CURRENT)

I'm new to this, so please let me know if there is anything I need to fix.

Comment 1 J.R. Oldroyd 2019-06-29 18:17:39 UTC
Created attachment 205417 [details]
patch to fix bus_barrier code

Looks like there is an error in the bus_barrier code in the new suspend and resume functions.  Patch attached.
Comment 2 J.R. Oldroyd 2019-06-29 18:20:35 UTC
By the way, thanks for this patch, Austin.  I posted on the D16698 phabricator, where I am using this too, that this works great.
Comment 3 J.R. Oldroyd 2019-06-30 09:11:11 UTC
Created attachment 205432 [details]
patch to fix bus_barrier code

read barrier should indeed be before read, but write barrier needed after writes
Comment 4 Andriy Gapon freebsd_committer 2019-06-30 20:47:31 UTC
(In reply to J.R. Oldroyd from comment #3)
Do those barriers make any difference at all (given x86 platform) ?
Also, it would make sense -- if only for consistency -- to pass IG4_REG_RESETS_SKL to the barrier when accessing that register.
Comment 5 Austin Shafer 2019-07-01 01:15:36 UTC
Created attachment 205447 [details]
ig4 suspend/resume patch

Thanks for the feedback! I'm glad I could help.

You guys are definitely right, I updated my patch with the write barrier changes, and made sure to use IG4_REG_RESETS_SKL for skylake.

As for if the barriers are needed, I never really tested but assumed it was required because the `reg_write` wrapper in ig4_iic.c uses it. I added the barriers just to be extra sure that the reset went through to avoid any nastiness on suspend or resume.

Please let me know if you have any other advice for things I should change.
Comment 6 J.R. Oldroyd 2019-07-01 08:42:50 UTC
The barriers don't seem to make much difference, no, but since the code is there it should be correct.  It may be significant on other platforms.
Comment 7 J.R. Oldroyd 2019-07-01 08:48:03 UTC
There may be another bug.

After a suspend/resume on a Skylake system, the new ims driver (in phabricator D16698) does not reset properly.  Reads fail with the message from the ims driver "no data received".  A bus scan using:
    i2c -f /dev/iic1 -s
resets things after which it works ok again.

I am trying to figure out if we are still not saving and restoring all needed registers on Skylake, or if the problem is that the bus needs some reinitialization.  I suspect the latter given that the scan clears things, but I am mentioning it here too.
Comment 8 Austin Shafer 2019-07-01 12:11:24 UTC
(In reply to J.R. Oldroyd from comment #7)

I remember running into this problem while working on this patch. I think I know what's causing it, but couldn't find a way to fix it. I also don't think it's related to this bug's patch, as it would happen to me when first booting.

Basically because of the way that the iichid attaches, I think it is possible for the iichid device to be initialized before the i2c bus. The iichid will try to read interrupts before they are available and find no data until things are reset.

From D16698:
"I tried a few approaches to identify and attach an iichid device by querying ACPI directly from the iichid module. That failed on two details: parent association (in general HID over I2C devices appear below ACPI0 in NewBus but it is necessary to have them below the iicbus*/iic* nodes, moving these node in NewBus failed for me with a panic) and also interrupt handling (apparently, interrupts must be handled by the nodes the IRQs are assigned to (i.e. the node below ACPI0). Trying to attach to an IRQ from another node (i.e. a node below iicbus*/i2c*) fails)"

So the iichid is below ACPI0, and I think that gets attached before iicbus. I have a vague memory of adding printf's to show that this was happening. I'll try to dig around and see if I have any patches of what I was trying to do to fix this.

I'm no expert on newbus device ordering, so if I'm wrong I'd love an explanation.
Comment 9 J.R. Oldroyd 2019-07-01 13:25:48 UTC
Created attachment 205461 [details]
patch with ig4_iic.c ig4iic_attach() updates from DragonflyBSD

The iichid/ims init problem on boot is fixed for me.  I had it too some time back.  The fix was (I think) to load it using:

	kld_load="iic ig4 acpi_iichid"

in /etc/rc.conf.

Now I did also pull in from Dragonfly a handful of updates to the ig4_iic.c attach code and I am uploading that patch here.  At this point I don't recall if these were also needed in order for iichid to init properly on boot.

At any rate, iichid inits properly on boot 100% of the time for me now.

It's just on resume that it fails.  It fails after the system has been suspended for a while, say 10 seconds or more.  If I suspend and then immediately resume after the power goes off, it does actually resume ok most of the time.  That's what led me to think we might be missing some registers on resume.
Comment 10 Austin Shafer 2019-07-01 13:36:07 UTC
(In reply to J.R. Oldroyd from comment #9)

Well loading the modules manually in that order (ig4 before acpi_iichid) would get rid of the dependency issues I mentioned. They won't get resumed in the same order because it's done automatically according to the newbus node location, and I think you'd still run into the same issue.

With that being said, it's definitely possible I'm missing registers. It is suspicious that it works fine when it immediately resumes. Does it work 100% of the time when immediately resuming? Maybe if the dragonfly updates are what actually fixes it I can just add them to the resume methods.

I wish I could test this myself. I don't have access to the laptop I originally used, and I'm not sure if I'll be able to find another.
Comment 11 marc.priggemeyer 2019-07-06 20:23:14 UTC
Since I did not yet have any success with suspend/resume on my laptop (I have to hard reset it because my display fails), I would like to point out this comment:

It's something I proposed in #221777 to correctly initialize the chipset (ig4 to be exact). The Intel PCH datasheet is very specific on how to handle initialization sequences. I can imagine that you will need to check the IG4_REG_DEVIDLE_CTRL register for the idle state and clear the reset flag.
It should basically be the same code snippet as in the referenced comment.

Going one step further, it might even be useful to actively put the controller into idle state by setting the DEVICE_IDLE bit in the IG4_REG_DEVIDLE_CTRL register during suspend.
Comment 12 Austin Shafer 2019-07-07 22:11:23 UTC
Created attachment 205573 [details]
ig4 suspend with devidle set [needs testing]

Thanks for the input. Placing the device in the idle state before suspending sounds like a good idea. The PCH datasheet does have a note saying this should be done before any resets so I think you are right. Maybe I missed something trivial while reading the datasheets, but why in `ig4iic_attach` is this only done for skylake and not also done for haswell?

I went ahead and added this to a new patch, it would be amazing if someone could try it out and let me know if it makes any difference. Thanks!
Comment 13 J.R. Oldroyd 2019-07-08 08:42:07 UTC
I have already tried this and, unfortunately, it does not help.  I have also tried it with the DELAY(1000) in there too (same as in the iic_attach() code).  Same problem: it still requires the "iic -f /dev/iic1 -s" after resume to reset it.  Even though it doesn't solve the problem, I think this code should be there per the spec.

While we're doing this, I suggest moving the definitions of reg_read() and reg_write() from where they are in ig4_iic.c to ig4_var.h so that they can also be used here in ig4_pci.c.  That makes the code here easier to read.
Comment 14 Austin Shafer 2019-07-08 13:56:42 UTC
(In reply to J.R. Oldroyd from comment #13)

I agree on adding reg_read/write to clean things up. It's looking a little cluttered as is. I'll throw that in the patch soon.

As of now it's a little difficult to tell if this suspend/resume patch is the reason the touchpad needs a reset, or if it is the iichid initialization order. It may be a good idea to wait for iichid to be rewritten to use the ACPI namespace walking that was mentioned in D16698. If that works, then it'll be easier to verify this patch does. 

It sounds like markj is already working on this. I can't test anything, but if help/manpower is needed I am happy to do the legwork.
Comment 15 marc.priggemeyer 2019-07-08 21:11:20 UTC
I updated the diff in https://reviews.freebsd.org/D16698 and added a sysctl to send the human interface device to sleep and wake it up again. I would kindly ask you to check whether sending the (HID) device to sleep manually before suspending and waking it up manually after resuming helps. I'm hoping that putting the device into controlled sleep will make the bus scan obsolete.

go to sleep:
sysctl dev.iichid.0.power_state=1

wake up:
sysctl dev.iichid.0.power_state=0
Comment 16 Vladimir Kondratyev freebsd_committer 2019-07-17 00:43:04 UTC
The suspend/resume patch is obviously broken in a way it handles IG4_REG_I2C_EN.
It should set IG4_REG_I2C_EN to 0 before restoring of other registers and than set it to saved value only *after* all other registers have been restored. That is due to writes to some registers can succeed only when IC_ENABLE=0. 
Note: setting IG4_REG_I2C_EN to 0 should be acknowledged by controller. See set_controller() source for example.
Comment 17 Vladimir Kondratyev freebsd_committer 2019-07-17 01:02:06 UTC
(In reply to Vladimir Kondratyev from comment #16)
> It should set IG4_REG_I2C_EN to 0 before restoring of other registers and than set it to saved value only *after* all other registers have been restored.
The simplest hack that works for me is just move IG4_REG_I2C_EN to the end of register list.
Comment 18 Austin Shafer 2019-07-18 14:14:16 UTC
Created attachment 205867 [details]
ig4 suspend/resume latest patch [needs testing]

Thanks for catching this, I made an updated diff that I hope fixes the issue. Please let me know if I made any mistakes. A couple things were changed:

* cleaned things up by using reg_write/reg_read. These were moved into ig4_var.h.
* also moved set_controller to ig4_var.h to avoid duplicating its code.
* disabled controller, and made sure IG4_REG_I2C_EN is the last register to be restored

If all goes well I should be getting my hands on another laptop with an i2c trackpad in the next few days. Then I should finally be able to test things myself.