Bug 267421 - graphics/drm-510-kmod i915kms freeze with Skylake (Intel HD graphics 520) on 13.1-STABLE n252850-6b2bbf4ecaa
Status: Open
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: freebsd-x11 (Nobody)
URL: https://www.freshports.org/graphics/d...
Keywords: needs-qa
Depends on:
Blocks:
 
Reported: 2022-10-29 15:01 UTC by babz
Modified: 2023-06-12 22:17 UTC

Attachments
kldload Freeze after 30 minutes (879.69 KB, image/jpeg)
2022-12-21 13:53 UTC, Dominic Fandrey

Description babz 2022-10-29 15:01:24 UTC
On Sept. 7, support for 13.0 was removed from the ports tree.

As a result, drm-fbsd13-kmod was removed, and drm-kmod now depends on drm-510-kmod instead (for 13.x versions).

On a stable/13 system, drm-510-kmod is broken (inserting i915kms.ko freezes the system). The previous port still works fine.

I'm using
FreeBSD 13.1-STABLE #27 n252850-6b2bbf4ecaa
on an Intel Skylake box (Intel HD Graphics 520)
Comment 1 Graham Perrin freebsd_committer freebsd_triage 2022-10-29 16:05:50 UTC
Contexts: 

1. <https://github.com/freebsd/drm-kmod/issues/>

2. 6b2bbf4ecaa committed 2022-10-27 15:50:58 +0000


(In reply to babz from comment #0)

> … freezes the system) …

A kernel panic?
Comment 2 babz 2022-10-29 17:02:00 UTC
The screen goes blank immediately and the system stops responding.

I'm forced to reset the computer, and when I reboot, there is no log past the kldload.
Comment 3 Taka 2022-11-04 10:41:28 UTC
I have the same issue on my ThinkPad X230 with Intel HD Graphics 4000.
I have seen the issue at least since the 13.1-STABLE ISO of Oct 20.
When I kldload /boot/modules/i915kms.ko, the display goes blank immediately.
Comment 4 Serge Volkov 2022-12-18 18:31:18 UTC
I have the same issue on my HP 255 G3 (AMD E1-6010 APU with AMD Radeon R2 Graphics). I'm now using FreeBSD 13.1-RELEASE-p5. I have been forced to use the VESA driver since the drm-fbsd13-kmod port was removed, because when I kldload /boot/modules/amdgpu.ko, the display goes blank immediately.
Comment 5 Philippe Michel 2022-12-19 09:39:09 UTC
I see a similar problem on a Lenovo X250 / HD Graphics 5500.

The system freezes (the screen doesn't go blank) at boot, or when I kldload i915kms by hand instead of through rc.conf.

I noticed this somewhat later than reported in the earlier entries: on Dec 7 stable/13-abc542e347 with drm-510-kmod-5.10.113_8 was ok, stable/13-a3c07a933d with the same drm port is not.
Comment 6 Philippe Michel 2022-12-20 18:25:54 UTC
I now think the problem I see is different / not in the drm-kmod port.

I opened https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=268492 with a workaround for my case.
Comment 7 Dominic Fandrey freebsd_committer freebsd_triage 2022-12-21 13:53:06 UTC
Created attachment 238957
kldload Freeze after 30 minutes

The attached picture is on an external monitor connected to the secondary Nvidia GPU (that cannot drive the internal screen).

What you see is a `tail -F /var/log/messages` on the left and the `kldload i915kms` on the right.

Upon the kldload call the primary notebook screen turns itself off immediately.
The Xorg session still reacts to a degree. I can still interact with the mouse, switch window focus, mark text, etc.
I cannot get any keyboard input (is that still Giant-locked?).

You may also note that the Powerline clock is 30 minutes past the i3status clock (that's how long I waited after the kldload call). I'm assuming i3status does a system call that runs afoul of the deadlock that keeps kldload from completing.

The system runs on an i7-9750H with Coffeelake-H UHD Graphics 630, according to `pciconf -lv`.
Comment 8 Emmanuel Vadot freebsd_committer freebsd_triage 2022-12-22 09:15:48 UTC
(In reply to babz from comment #0)

What kind of monitors are attached?
If you have multiple monitors or a 4K/8K screen, please try with only one smaller monitor.
I have no problems on my Skylake, FYI.
Comment 9 Emmanuel Vadot freebsd_committer freebsd_triage 2022-12-22 09:16:56 UTC
(In reply to Serge Volkov from comment #4)

Likely not the same bug, please open a detailed new PR.
Comment 10 Emmanuel Vadot freebsd_committer freebsd_triage 2022-12-22 09:18:13 UTC
(In reply to Taka from comment #3)

Did it work for you before?
You said 13.1-STABLE; does that mean that you recompiled the drm-510-kmod port, or did you use the packages?
There was an ABI break that was fixed recently, so kernel modules compiled for 13.1 will work on 13.1-STABLE.
Comment 11 Emmanuel Vadot freebsd_committer freebsd_triage 2022-12-22 09:18:59 UTC
(In reply to Dominic Fandrey from comment #7)

Did it work before?
It's unlikely to be the same bug, as your hardware setup is very different.
I don't even know if this is supported right now.
Comment 12 Dominic Fandrey freebsd_committer freebsd_triage 2022-12-22 09:28:12 UTC
(In reply to Emmanuel Vadot from comment #11)

I reverted commit 50f61166f7b911a7807b3cb76d0f382a13fbafcd as Philippe suggested, and that restores it to working condition. I'm running stable/13 as of yesterday plus the revert commit.

I always do a:

# pkg which -q /boot/modules/* | sort -u | xargs -o portmaster -DB

run after installing a new kernel. So drm-kmod is rebuilt every time.
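
A rough breakdown of that rebuild pipeline, as a sketch: the function names below are hypothetical, and the flag descriptions are my reading of the xargs and portmaster manual pages rather than anything stated in this report.

```shell
#!/bin/sh
# Sketch only: the function names are hypothetical, not part of pkg or portmaster.

# Stages 1 and 2: ask pkg which package owns each given file, then collapse
# duplicates so a package that installs several modules is listed once.
kmod_packages() {
    pkg which -q "$@" | sort -u
}

# Stage 3: hand the package list to portmaster for a rebuild.
#   xargs -o  reopens stdin as /dev/tty so portmaster can still prompt
#   -D        does not clean distfiles
#   -B        skips the backup package of the currently installed port
rebuild_kmod_ports() {
    kmod_packages /boot/modules/* | xargs -o portmaster -DB
}
```

Running `kmod_packages /boot/modules/*` on its own first shows which ports would be rebuilt without touching anything.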

I don't know what's not supposed to be supported; it's a notebook with an IGPU driving the built-in panel and an RTX 2070 driving all external monitor outputs. That's not an unusual setup AFAIK.
Comment 13 Emmanuel Vadot freebsd_committer freebsd_triage 2022-12-22 09:38:25 UTC
(In reply to Dominic Fandrey from comment #12)

Then it's not related to this PR.
Jump on https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=268492

I honestly didn't know that i915kms and the nvidia driver could work together.
Comment 14 Dominic Fandrey freebsd_committer freebsd_triage 2022-12-22 10:57:24 UTC
(In reply to Emmanuel Vadot from comment #13)

They cannot, they can just coexist, so you get one on DISPLAY :0.0 and DISPLAY :0.1. You have to pick what you run on which display (you can even run different DMs on each one), there is no moving stuff between displays. It's mightily inconvenient. It's like having two Xorg instances in a single process. You can even bind different input devices to each one etc.

What doesn't work is to use one as a GPUDevice for the other and thus use their inputs and outputs as sources/sinks, which would be much more convenient. I.e. if that worked (like it does on Linux supposedly), I'd relegate the IGPU to being a sink just to drive the internal display and have all the rendering performed by the NVIDIA GPU.
Comment 15 Taka 2022-12-23 09:31:40 UTC
(In reply to Emmanuel Vadot from comment #10)

Yes, it worked.
It worked in 13.1-RELEASE and in a 13.1-STABLE snapshot at some point.
I used the latest packages. Was that the cause of the problem?
Should I build drm-kmod and so on from ports?
Comment 16 O. Hartmann 2023-05-21 08:36:09 UTC
I hope this PR is still "hot", otherwise I need to open a new one.

On a Lenovo T560 running 13-STABLE, the HD520 graphics isn't working any more. For technical specifications, see below.

Background:

Customized kernel (but it also crashes with GENERIC). The port graphics/drm-510-kmod is rebuilt every time the kernel is rebuilt. If the module(s) is (are) installed via pkg from the official FreeBSD package repository instead, loading the module into the kernel produces an error message like:

[...]
login: link_elf_obj: symbol __lkpi_fpu_ctx_level undefined
linker_load_file: /boot/modules/i915kms.ko - unsupported file type
link_elf_obj: symbol __lkpi_fpu_ctx_level undefined
linker_load_file: /boot/modules/i915kms.ko - unsupported file type
link_elf_obj: symbol __lkpi_fpu_ctx_level undefined
linker_load_file: /boot/modules/i915kms.ko - unsupported file type
link_elf_obj: symbol __lkpi_fpu_ctx_level undefined
linker_load_file: /boot/modules/i915kms.ko - unsupported file type
link_elf_obj: symbol __lkpi_fpu_ctx_level undefined
linker_load_file: /boot/modules/i915kms.ko - unsupported file type
link_elf_obj: symbol __lkpi_fpu_ctx_level undefined
linker_load_file: /boot/modules/i915kms.ko - unsupported file type
link_elf_obj: symbol __lkpi_fpu_ctx_level undefined
linker_load_file: /boot/modules/i915kms.ko - unsupported file type
Apr 22 11:47:42 <4.5> hermann login[1792]: ROOT LOGIN (root) ON ttyv1
link_elf_obj: symbol __lkpi_fpu_ctx_level undefined
linker_load_file: /boot/modules/i915kms.ko - unsupported file type                       
[...]
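
Since the same pair of messages repeats many times, it can help to reduce such a log to the distinct missing symbols. A minimal sketch, assuming the log text is available on stdin; the function name is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper (not from the report): print each undefined symbol once,
# followed by the number of times the kernel linker complained about it.
# Reads log text on stdin.
undefined_symbols() {
    sed -n 's/.*link_elf_obj: symbol \([^ ]*\) undefined.*/\1/p' |
        sort | uniq -c | awk '{ print $2 " " $1 }'
}
```

For example, `dmesg | undefined_symbols` would summarize the current kernel message buffer.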

The kernel has been compiled with debugging facilities on, as is the default in GENERIC.

Phenomenon:

Starting FreeBSD and loading the drm-kmod module i915kms.ko works fine. I also get as far as the xdm login screen. On starting windowmaker (x11-wm/windowmaker), the laptop immediately goes blank/dark and reboots. There is no trace of a core dump or of the debugger console that I would expect.

Changing the window manager from wmaker to, say, blackbox (x11-wm/blackbox) or twm (x11-wm/twm) mitigates the problem, meaning that the OS does start the GUI. But in the case of twm, opening xterm results in the very same crash behaviour as with wmaker. Running blackbox is somewhat more forgiving, but starting any(!) larger X11 application, such as LibreOffice (soffice), Firefox or Thunderbird, triggers the crash. Even with clients with an obviously smaller memory footprint, a crash occurs after a while.

This behaviour is seen on a couple of T560s around here with 13.2-RELENG (not well tested, just a quick check), 13-STABLE as shown below.

I cannot pinpoint exactly when the problems started, but it was around the transition from 13-STABLE to 13.2-STABLE, before the official release of 13.2 (I run 13-STABLE and compile the whole source tree on a regular basis).

While keeping the kernel and the way it is compiled invariant, I changed from self-made poudriere-built ports (with a recent 13-STABLE jail) to the official FreeBSD package repository and back - with no effect on the crash.

As reported, I cannot provide a core dump or any kernel debugger output, since no such post-mortem information exists.

The T560 is running the latest firmware/BIOS available from Lenovo (V1.45).



[...]
FreeBSD 13.2-STABLE #12 stable/13-n255406-69ce8ed3c650: Thu May 18 08:03:48 CEST 2023 amd64.

All ZFS (zfsroot). UEFI boot.


[... TECH SPEC ...]
mptable_probe: MP Config Table has bad signature:
Copyright (c) 1992-2021 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 13.2-STABLE #12 stable/13-n255406-69ce8ed3c650: Thu May 18 08:03:48 CEST 2023
    root@hermann:/usr/obj/usr/src/amd64.amd64/sys/HERMANN amd64
FreeBSD clang version 15.0.7 (https://github.com/llvm/llvm-project.git llvmorg-15.0.7-0-g8dfdcc7b7bf6)
VT(efifb): resolution 1920x1080
module zfsctrl already present!
CPU microcode: no matching update found
CPU: Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz (2800.00-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x406e3  Family=0x6  Model=0x4e  Stepping=3
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x7ffafbff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND>
  AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
  AMD Features2=0x121<LAHF,ABM,Prefetch>
  Structured Extended Features=0x29c6fbf<FSGSBASE,TSCADJ,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,NFPUSG,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PROCTRACE>
  Structured Extended Features3=0xbc002e00<MCUOPT,MD_CLEAR,TSXFA,IBPB,STIBP,L1DFL,ARCH_CAP,SSBD>
  XSAVE Features=0xf<XSAVEOPT,XSAVEC,XINUSE,XSAVES>
  IA32_ARCH_CAPS=0xc04<RSBA>
  VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
  TSC: P-state invariant, performance statistics
real memory  = 17179869184 (16384 MB)
avail memory = 16464908288 (15702 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table: <LENOVO TP-N1K  >
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 1 package(s) x 2 core(s) x 2 hardware threads
random: registering fast source Intel Secure Key RNG
random: fast provider: "Intel Secure Key RNG"
random: unblocking device.
Security policy loaded: MAC/ntpd (mac_ntpd)
Security policy loaded: TrustedBSD MAC/BSD Extended (mac_bsdextended)
ioapic0 <Version 2.0> irqs 0-119
Launching APs: 1 2 3
random: entropy device external interface
kbd1 at kbdmux0
efirtc0: <EFI Realtime Clock>
efirtc0: registered as a time-of-day clock, resolution 1.000000s
smbios0: <System Management BIOS> at iomem 0xb7064000-0xb706401e
smbios0: Version: 2.8, BCD Revision: 2.8
aesni0: <AES-CBC,AES-CCM,AES-GCM,AES-ICM,AES-XTS>
acpi0: <LENOVO TP-N1K>
acpi_ec0: <Embedded Controller: GPE 0x16, ECDT> port 0x62,0x66 on acpi0
acpi0: Power Button (fixed)
cpu0: <ACPI CPU> on acpi0
attimer0: <AT timer> port 0x40-0x43 irq 0 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 24000000 Hz quality 950
Event timer "HPET" frequency 24000000 Hz quality 550
Event timer "HPET1" frequency 24000000 Hz quality 440
Event timer "HPET2" frequency 24000000 Hz quality 440
Event timer "HPET3" frequency 24000000 Hz quality 440
Event timer "HPET4" frequency 24000000 Hz quality 440
atrtc0: <AT realtime clock> port 0x70-0x71 irq 8 on acpi0
atrtc0: registered as a time-of-day clock, resolution 1.000000s
Event timer "RTC" frequency 32768 Hz quality 0
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1808-0x180b on acpi0
acpi_lid0: <Control Method Lid Switch> on acpi0
acpi_button0: <Sleep Button> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
vgapci0: <VGA-compatible display> port 0xe000-0xe03f mem 0xe0000000-0xe0ffffff,0xc0000000-0xdfffffff irq 16 at device 2.0 on pci0
vgapci0: Boot video device
xhci0: <Intel Sunrise Point-LP USB 3.0 controller> mem 0xe1220000-0xe122ffff at device 20.0 on pci0
xhci0: 32 bytes context size, 64-bit DMA
usbus0 on xhci0
usbus0: 5.0Gbps Super Speed USB v3.0
pchtherm0: <Skylake PCH Thermal Subsystem> mem 0xe124b000-0xe124bfff at device 20.2 on pci0
pci0: <simple comms> at device 22.0 (no driver attached)
ahci0: <Intel Sunrise Point-LP AHCI SATA controller> port 0xe080-0xe087,0xe088-0xe08b,0xe060-0xe07f mem 0xe1248000-0xe1249fff,0xe124f000-0xe124f0ff,0xe124d000-0xe124d7ff at device 23.0 on pci0
ahci0: AHCI v1.31 with 1 6Gbps ports, Port Multiplier not supported
ahcich1: <AHCI channel> at channel 1 on ahci0
pcib1: <ACPI PCI-PCI bridge> at device 28.0 on pci0
pci1: <ACPI PCI bus> on pcib1
rtsx0: <2.1g Realtek RTS522A PCIe SD Card Reader> mem 0xe1100000-0xe1100fff at device 0.0 on pci1
rtsx0: Interrupt card inserted/removed
rtsx0: Card absent
rtsx0: No card is detected
pcib2: <ACPI PCI-PCI bridge> at device 28.2 on pci0
pci2: <ACPI PCI bus> on pcib2
pci2: <network> at device 0.0 (no driver attached)
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
pci0: <memory> at device 31.2 (no driver attached)
hdac0: <Intel Sunrise Point-LP HDA Controller> mem 0xe1240000-0xe1243fff,0xe1230000-0xe123ffff at device 31.3 on pci0
ichsmb0: <Intel Sunrise Point-LP SMBus controller> port 0xefa0-0xefbf mem 0xe124e000-0xe124e0ff at device 31.4 on pci0
[...]
Comment 17 O. Hartmann 2023-05-21 08:38:22 UTC
See also Bug 271391.
Comment 18 O. Hartmann 2023-06-12 22:17:59 UTC
In my case, the option

Option "AccelMethod" "sna"

in the driver configuration for the Intel iGPU was the culprit. Commenting out this option returns X11 to a working state.
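
To check whether a given Xorg configuration still sets this option, something like the following sketch could be used. The function name is hypothetical, and it only does a simple line match on text passed via stdin:

```shell
#!/bin/sh
# Hypothetical helper (not from the report): print every
# Option "AccelMethod" line that is not commented out.
# xorg.conf keywords are case-insensitive, hence -i.
active_accel_method() {
    grep -Ei '^[[:space:]]*Option[[:space:]]+"AccelMethod"'
}
```

Assuming the configuration lives under /usr/local/etc/X11/xorg.conf.d/, `cat /usr/local/etc/X11/xorg.conf.d/*.conf | active_accel_method` prints nothing once the option has been commented out.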