Bug 244436

Summary: LinuxKPI never runs the suspend and resume callback functions
Product: Base System
Reporter: Raffeale <dcp2k>
Component: kern
Assignee: Hans Petter Selasky <hselasky>
Status: Closed FIXED
Severity: Affects Only Me
CC: hselasky, kib, markj, zeising
Priority: ---
Version: 12.1-STABLE
Hardware: Any
OS: Any

Description Raffeale 2020-02-26 16:19:31 UTC
I found a bug: the LinuxKPI suspend and resume code never runs when I run acpiconf -s 3.

I put pr_debug calls into the suspend and resume functions in linux_pci.c and added DRM debug output to the amdgpu driver.

The LinuxKPI suspend and resume functions were not called after running acpiconf -s 3. As a result, the DRM suspend and resume functions were not called either.

I have already set compat.linuxkpi.debug=1 in loader.conf.

I did not find that debug output in /var/log/messages.
Comment 1 Hans Petter Selasky freebsd_committer 2020-03-03 15:57:59 UTC
Hi,

Have you had a look at "linux_pci_suspend()" in the LinuxKPI?

It might be that the DRM KMS driver in FreeBSD has been patched to not run these callbacks.

Do you observe any missing functionality?

Thank you!

--HPS
Comment 2 Raffeale 2020-03-04 10:30:58 UTC
I added some pr_debug calls to the linux_pci_resume function.
I used strings to inspect linuxkpi.ko and found none of my debug strings in it, so I copied the pr_debug definition into linux_pci.c and recompiled linuxkpi. Now I can find the debug strings in linuxkpi.ko, but none of my debug output appears in dmesg when I run acpiconf -s 3.
I have already put debug messages into the suspend and resume functions.
Comment 3 Raffeale 2020-03-04 10:37:43 UTC
kldstat shows that linuxkpi.ko is loaded.
I had already rebooted before running acpiconf.

I am pretty sure that linux_pci_suspend and linux_pci_resume were not run when I ran acpiconf -s 3.
Comment 4 Hans Petter Selasky freebsd_committer 2020-03-04 10:42:58 UTC
I think you need to set:

sysctl compat.linuxkpi.debug=1

before those prints start working.

--HPS
Comment 5 Raffeale 2020-03-04 15:20:34 UTC
(In reply to Hans Petter Selasky from comment #4)
I am really sorry about this, it was my fault. I found that linuxkpi was installed to /boot/modules, but /boot/kernel has a linuxkpi.ko too, so I copied /boot/modules/linuxkpi.ko to /boot/kernel/. It works now, thanks!

But I found that the DRM driver's suspend was still not called.
Why are the DRM driver's suspend and resume not called?
How does suspend work? Does it call each device driver's suspend function?

I really want to understand this in detail. Please help me.
Comment 6 Raffeale 2020-03-05 15:27:43 UTC
Today I tested it again and found that the suspend and resume callbacks are NULL in the linux_pci_register function. This causes suspend not to work. Please fix it.
Comment 7 Hans Petter Selasky freebsd_committer 2020-03-06 11:35:03 UTC
Hi,

The suspend/resume method callbacks are defined by:

const struct dev_pm_ops i915_pm_ops = {
        /*
         * S0ix (via system suspend) and S3 event handlers [PMSG_SUSPEND,
         * PMSG_RESUME]
         */
        .prepare = i915_pm_prepare,
        .suspend = i915_pm_suspend,
        .suspend_late = i915_pm_suspend_late,
        .resume_early = i915_pm_resume_early,
        .resume = i915_pm_resume,

static const struct dev_pm_ops amdgpu_pm_ops = {
        .suspend = amdgpu_pmops_suspend,
        .resume = amdgpu_pmops_resume,
        .freeze = amdgpu_pmops_freeze,
        .thaw = amdgpu_pmops_thaw,
        .poweroff = amdgpu_pmops_poweroff,
        .restore = amdgpu_pmops_restore,
        .runtime_suspend = amdgpu_pmops_runtime_suspend,
        .runtime_resume = amdgpu_pmops_runtime_resume,
        .runtime_idle = amdgpu_pmops_runtime_idle,
};


The LinuxKPI calls these functions, but maybe something is missing:

        else if (pmops != NULL && pmops->suspend != NULL) {
                error = -pmops->suspend(&pdev->dev);
                if (error == 0 && pmops->suspend_late != NULL)
                        error = -pmops->suspend_late(&pdev->dev);
        }

--HPS
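
A minimal user-space sketch of the branch quoted above may help show why NULL callbacks go unnoticed. The types below are simplified stand-ins, not the real LinuxKPI definitions, and sketch_pci_suspend, demo_suspend and demo_suspend_late are hypothetical names used only for illustration. It models only the quoted branch: when the pm table or its suspend pointer is NULL, nothing is called and the function still reports success.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the LinuxKPI types (assumption, not the real ones). */
struct device {
	int suspended;		/* 0 = running, 1 = suspend ran, 2 = suspend_late ran */
};

struct dev_pm_ops {
	int (*suspend)(struct device *dev);
	int (*suspend_late)(struct device *dev);
};

struct pci_dev {
	struct device dev;
	const struct dev_pm_ops *pm;	/* NULL if no callbacks were registered */
};

/* Hypothetical driver callbacks; Linux convention is 0 or a negative errno. */
static int
demo_suspend(struct device *dev)
{
	dev->suspended = 1;
	return (0);
}

static int
demo_suspend_late(struct device *dev)
{
	dev->suspended = 2;
	return (0);
}

/*
 * Models the quoted branch only: call suspend, then suspend_late,
 * negating the Linux-style return value into a positive errno.
 */
static int
sketch_pci_suspend(struct pci_dev *pdev)
{
	const struct dev_pm_ops *pmops;
	int error = 0;

	assert(pdev != NULL);
	pmops = pdev->pm;
	if (pmops != NULL && pmops->suspend != NULL) {
		error = -pmops->suspend(&pdev->dev);
		if (error == 0 && pmops->suspend_late != NULL)
			error = -pmops->suspend_late(&pdev->dev);
	}
	return (error);
}
```

With a populated table both callbacks run in order; with pm set to NULL the call returns 0 and the device state is untouched, which would look like "suspend was never called" from the driver's side.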
Comment 8 Niclas Zeising freebsd_committer 2020-03-06 13:08:56 UTC
I'm not sure what the issue is, but at least with i915, suspend and resume work without issues. I have not tested with ATI/AMD hardware though.
Comment 9 Raffeale 2020-03-06 14:54:34 UTC
(In reply to Hans Petter Selasky from comment #7)
Yes, I knew that; I have put debug prints there.

When the DRM driver calls linux_register_driver or linux_register_drm_driver, I found that the LinuxKPI register function did not get the suspend and resume callbacks.

Can you solve this issue?
Comment 10 Raffeale 2020-03-06 14:56:10 UTC
It's the linux_pci_register_driver function.
Comment 11 Hans Petter Selasky freebsd_committer 2020-03-06 15:02:10 UTC
Could you upload the prints you've added as a diff?

Yes, we can solve this when the problem is nailed down!
Comment 12 Hans Petter Selasky freebsd_committer 2020-03-06 15:06:54 UTC
I think this problem might be that the parent device does not forward the suspend callback to its children.

Can you show the output from:

devinfo

when the DRM driver is loaded?

--HPS
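
The forwarding hypothesis above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: struct node, leaf_suspend, forwarding_suspend and nonforwarding_suspend are hypothetical names, and the real FreeBSD bus code is more involved than this.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device-tree node; not a FreeBSD kernel type. */
struct node {
	const char *name;
	int (*suspend)(struct node *self);	/* NULL: no suspend handler */
	struct node **children;
	size_t nchildren;
	int suspended;
};

/* A leaf driver's suspend handler. */
static int
leaf_suspend(struct node *self)
{
	self->suspended = 1;
	return (0);
}

/* A parent that forwards suspend to every child before suspending itself. */
static int
forwarding_suspend(struct node *self)
{
	size_t i;
	int error;

	assert(self != NULL);
	for (i = 0; i < self->nchildren; i++) {
		struct node *child = self->children[i];

		if (child->suspend == NULL)
			continue;
		error = child->suspend(child);
		if (error != 0)
			return (error);	/* stop at the first failing child */
	}
	self->suspended = 1;
	return (0);
}

/* A parent that suspends itself but never tells its children. */
static int
nonforwarding_suspend(struct node *self)
{
	assert(self != NULL);
	self->suspended = 1;
	return (0);
}
```

If the parent behaves like nonforwarding_suspend, the child's handler is never reached even though it is registered correctly, which is why the devinfo output (who is the parent of the DRM device) is worth checking.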
Comment 13 Raffeale 2020-03-06 15:18:56 UTC
(In reply to Hans Petter Selasky from comment #12)
static int
_linux_pci_register_driver(struct pci_driver *pdrv, devclass_t dc)
{
        int error;

        pr_debug1("linux_pci_register_driver device_name:%s pmops->suspend:%x ,pmops->resume:%x \r\n",pdrv->name,(unsigned int)(pdrv->driver.pm->suspend),(unsigned int)(pdrv->driver.pm->resume));
        linux_set_current(curthread);
        spin_lock(&pci_lock);
        list_add(&pdrv->links, &pci_drivers);
        spin_unlock(&pci_lock);
        pdrv->bsddriver.name = pdrv->name;
        pdrv->bsddriver.methods = pci_methods;
        pdrv->bsddriver.size = sizeof(struct pci_dev);

        pr_debug1("linux_pci_register device_name:%s pdrv->suspend:%x , pdrv->resume:%x \r\n",pdrv->name,(unsigned int)(pdrv->suspend),(unsigned int)(pdrv->resume));
        pr_debug1("linux_pci_register device_name:%s pdrv->driver.pm->suspend:%x , pdrv->driver.pm->resume:%x \r\n",pdrv->name,(unsigned int)(pdrv->driver.pm->suspend),(unsigned int)(pdrv->driver.pm->resume));
        mtx_lock(&Giant);
        error = devclass_add_driver(dc, &pdrv->bsddriver,
            BUS_PASS_DEFAULT, &pdrv->bsdclass);
        mtx_unlock(&Giant);
        return (-error);
}

I added some debug prints to this function.
Comment 14 Hans Petter Selasky freebsd_committer 2020-03-06 15:32:58 UTC
And can you show the resulting prints too?
Comment 15 Raffeale 2020-03-09 00:58:46 UTC
(In reply to Hans Petter Selasky from comment #14)
Mar  9 08:56:24 Raffeale kernel: [drm:drm_core_init] Initialized
Mar  9 08:56:24 Raffeale kernel: uhub0: 8 ports with 8 removable, self powered
Mar  9 08:56:24 Raffeale kernel: uhub2: 1 port with 1 removable, self powered
Mar  9 08:56:24 Raffeale kernel: [drm] amdgpu kernel modesetting enabled.
Mar  9 08:56:24 Raffeale kernel: [drm] amdgpu register atpx handler will run<6>[drm] linux_pci_register_drm_driver will run!
Mar  9 08:56:24 Raffeale kernel: linux_pci_register_driver device_name:drmn pmops->suspend:0 ,pmops->resume:0
Mar  9 08:56:24 Raffeale kernel: linux_pci_register device_name:drmn pdrv->suspend:0 , pdrv->resume:0
Mar  9 08:56:24 Raffeale kernel: linux_pci_register device_name:drmn pdrv->driver.pm->suspend:0 , pdrv->driver.pm->resume:0
Mar  9 08:56:24 Raffeale kernel: linux_pci_find found a match device driver_name:drmn , is_drm:1 ,device_id:15d8
Mar  9 08:56:24 Raffeale kernel: drmn0: <drmn> on vgapci0
Mar  9 08:56:24 Raffeale kernel: linux_pci_attach will find a match device with linux_pci_find
Mar  9 08:56:24 Raffeale kernel: <7>linux_pci_find found a match device driver_name:drmn , is_drm:1 ,device_id:15d8
Mar  9 08:56:24 Raffeale kernel: <7>linux_pci_attach found device pdrv->suspend:0 , pdrv->resume:0
Mar  9 08:56:24 Raffeale kernel: <7>linux_pci_attach will run
Mar  9 08:56:24 Raffeale kernel: <7>linux_pci_attach pdrv->prob will run!
Mar  9 08:56:24 Raffeale kernel: [drm] amdgpu_amdkfd_init  will runvgapci0: child drmn0 requested pci_enable_io
Comment 16 Hans Petter Selasky freebsd_committer 2020-03-09 07:57:30 UTC
Hi,

Which DRM driver version are you using?

Before your print can you do this:

#include <sys/kdb.h>

kdb_backtrace();
printf("....");

Then send the resulting backtrace!

Here is a patch you can try:

diff --git a/drivers/gpu/drm/drm_os_config.h b/drivers/gpu/drm/drm_os_config.h
index f8e67bae6..8bc271f01 100644
--- a/drivers/gpu/drm/drm_os_config.h
+++ b/drivers/gpu/drm/drm_os_config.h
@@ -139,6 +139,9 @@
 // Frame buffer compression on AMD DC
 #define        CONFIG_DRM_AMD_DC_FBC 1
 
+// Enable AMD power saving
+#define CONFIG_DRM_AMD_ACP 1
+
 // KMS framebuffer on VMware
 #define        CONFIG_DRM_VMWGFX_FBCON 1
Comment 17 Raffeale 2020-03-09 15:25:38 UTC
(In reply to Hans Petter Selasky from comment #16)
I use drm 5.0 for FreeBSD 12. I got it from the FreeBSDDesktop GitHub.
I will try your patch tomorrow and put the debug code in the function.
Comment 18 Niclas Zeising freebsd_committer 2020-03-09 15:29:26 UTC
(In reply to Raffeale from comment #17)

That branch is not supported and not actively developed.  It is missing some stuff to make it work, and it's not a priority to get it working.  Please stick to the supported branch for FreeBSD 12.1, or try the 5.0 on CURRENT.
Comment 19 Raffeale 2020-03-09 15:59:16 UTC
(In reply to Niclas Zeising from comment #18)
But the amdgpu DRM driver does not work on FreeBSD 12.1 with ports. I have already tested it; the amdgpu module can't detect the GPU correctly. Tomorrow I will try the DRM ports again.
Comment 20 Niclas Zeising freebsd_committer 2020-03-09 16:01:42 UTC
(In reply to Raffeale from comment #19)

It might be that your graphics card isn't supported by the driver in FreeBSD 12.  If that's the case, then you have to run CURRENT.
Comment 21 Raffeale 2020-03-10 00:08:17 UTC
(In reply to Niclas Zeising from comment #20)
The FreeBSD 12.1 DRM driver has some bugs with the AMD Vega 8, but drm 5.0 from the FreeBSDDesktop GitHub works fine.
I added my device to amdgpu_drv.c and recompiled it. It goes down when I load the amdgpu driver.
So I have to use drm 5.0 from GitHub.

Now I want to know why the LinuxKPI linux_pci_register_driver does not get the suspend and resume callback functions.
Comment 22 Raffeale 2020-03-12 11:48:47 UTC
Thanks, the issue has been solved. The problem is in LinuxKPI: its default compilation version is lower than 50000, so linuxkpi.ko drops some struct member variables, which causes this issue. Just put LINUXKPI_VERSION=50000 in the linux Makefile.
Thanks a lot!
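
For readers wondering how a version macro can turn valid callbacks into NULL, here is a minimal sketch of the general mechanism, assuming a struct with one version-gated member. The thread does not say which members were affected, so the two struct views and the member gated_member below are hypothetical; they stand for the same struct as seen by two modules built with different version settings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the power-management callback table. */
struct dev_pm_ops {
	int (*suspend)(void *dev);
	int (*resume)(void *dev);
};

/* Layout seen by code built WITHOUT the version-gated member. */
struct driver_old_view {
	const char *name;
	const struct dev_pm_ops *pm;
};

/* Layout seen by code built WITH it; "gated_member" is hypothetical. */
struct driver_new_view {
	const char *name;
	void *gated_member;
	const struct dev_pm_ops *pm;
};

/* Returns 1 when both builds agree on where "pm" lives, else 0. */
static int
pm_offsets_agree(void)
{
	return (offsetof(struct driver_old_view, pm) ==
	    offsetof(struct driver_new_view, pm));
}

/*
 * Read "pm" out of an object laid out by the new build, using the
 * offset the old build believes in.
 */
static const void *
read_pm_as_old_build(const struct driver_new_view *drv)
{
	const void *pm;

	assert(drv != NULL);
	memcpy(&pm, (const char *)drv + offsetof(struct driver_old_view, pm),
	    sizeof(pm));
	return (pm);
}
```

On typical platforms the old-layout read lands on gated_member instead of pm, so a driver that filled in pm correctly can still appear to have a NULL callback table to the other module. Building both modules with the same version setting makes the two views identical again.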