Bug 278930 - graphics/drm-kmod: blank screen on all TTYs until reboot if PulseAudio is installed (always) or in certain scenarios of playing YouTube videos in Firefox
Status: New
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: amd64 Any
Importance: --- Affects Some People
Assignee: Jesper Schmitz Mouridsen
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-05-12 09:29 UTC by Rostislav Krasny
Modified: 2024-10-29 18:40 UTC
CC List: 6 users

See Also:
bugzilla: maintainer-feedback? (x11)


Attachments
Xorg.0.log (64.56 KB, text/plain)
2024-05-15 18:40 UTC, Rostislav Krasny

Description Rostislav Krasny 2024-05-12 09:29:04 UTC
When a monitor is connected to the computer over DisplayPort and is in DisplayPort 1.2 mode, logging into any desktop environment (I tried LXQT and XFCE) makes the screen go completely blank, and it stays blank even after switching to another virtual console (Ctrl+Alt+F1, F2 ... Fn). The system isn't hung and can be rebooted cleanly by pressing the power button (most likely over SSH too). Before logging into the DE, display managers such as sddm and lightdm work properly even when DisplayPort 1.2 is enabled.

Hardware in my case: an i7-4790 CPU whose integrated graphics is the only graphics adapter; a Dell U2414H monitor.

Windows 10 Pro works properly on this hardware with that monitor connected through Mini DisplayPort, both with and without DisplayPort 1.2 mode enabled in the monitor itself.

System: FreeBSD 14.1-BETA2 x86_64.

The drm-kmod driver set from ports is installed and i915kms is added to kld_list in /etc/rc.conf.
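
For readers reproducing this setup, the relevant /etc/rc.conf line would look roughly like the fragment below. This is a minimal sketch that assumes i915kms is the only module in kld_list; if other modules are already listed there, append to the existing value instead of overwriting it (running "sysrc kld_list+=i915kms" as root does that).

```sh
# /etc/rc.conf fragment (assumes no other modules are listed in kld_list):
# load the Intel DRM kernel module at boot.
kld_list="i915kms"
```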

For more information, read my emails on the freebsd-stable mailing list:
https://lists.freebsd.org/archives/freebsd-stable/2024-May/002132.html
Comment 1 Marcin Cieślak 2024-05-12 09:41:23 UTC
I understand from your emails that the X environment starts (the login display manager, then the window manager), but after about 5 seconds the screen goes blank.
Can you attach /var/log/Xorg.0.log? Also the output of "dmesg" and /var/log/messages? (You might need to script something to get the logs off the machine, for example using at(1) to copy dmesg and the logs while you cannot see the screen.)
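
One possible way to script that log collection is sketched below. The function name collect_logs and the output directory are made up for illustration; only dmesg, cp and mkdir are assumed to exist.

```sh
# collect_logs OUTDIR [FILE...]: copy the given log files and the current
# dmesg output into OUTDIR so they can be read after a reboot or over SSH.
# (collect_logs is a hypothetical helper name, not an existing tool.)
collect_logs() {
    outdir=$1
    shift
    mkdir -p "$outdir" || return 1
    # dmesg may be restricted or missing on some systems; keep going if it fails.
    dmesg > "$outdir/dmesg.txt" 2>&1 || true
    for f in "$@"; do
        # Skip files that do not exist instead of aborting.
        [ -r "$f" ] && cp "$f" "$outdir/"
    done
    return 0
}
```

Saved in a script that ends with a call such as "collect_logs /root/blank-screen /var/log/messages /var/log/Xorg.0.log" (paths chosen only as an example), it could be scheduled with at(1) from the text console before running startx, or simply started in the background after a sleep.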

Can you disable the login manager, just log in to the text console and see if "startx" provides a working environment? Please start something very simple like "twm" (available as x11-wm/twm from ports). 

The following $HOME/.xinitrc should start twm:

exec /usr/local/bin/twm
Comment 2 Rostislav Krasny 2024-05-13 17:52:20 UTC
Hi Marcin and thank you for your quick response.

After further investigation it turned out that the problem is not related to DisplayPort 1.2 mode at all; I just got confused by the circumstances. Let me tell the whole story again.

1. I installed FreeBSD 14.1-BETA2.
2. I installed LXQT, sddm, Xorg and video drivers. Then I tried to log into the LXQT session and ran into the blank screen problem for the first time.
3. Without removing LXQT and sddm (but with sddm disabled in /etc/rc.conf) I installed XFCE and lightdm. When I tried XFCE I ran into the blank screen problem a second time and thought this was some global issue.

4. I decided to try something different and installed the latest snapshot of DragonFlyBSD, with Xorg and some dm, in place of the previously installed FreeBSD. DragonFlyBSD has the blank screen problem as well, even on the dm login page. This was the first time I opened my monitor's own settings menu and disabled DisplayPort 1.2 support. This fixed the problem in DragonFlyBSD and I decided to go back to FreeBSD 14.1-BETA2. That "fix" confused me and is why I thought the problem in FreeBSD was the same.

5. I installed FreeBSD 14.1-BETA2 again.
6. Installed Xorg, video drivers, XFCE and lightdm, but not LXQT, and video works fine even with DisplayPort 1.2 enabled.

Installing LXQT, even without sddm, brings this problem back. Running "pkg delete lxqt && pkg autoremove" fixes it. So this is something in LXQT, and it breaks all other installed DEs.

I will change the title of this bug report.
Comment 3 Rostislav Krasny 2024-05-14 12:58:24 UTC
Now I've tried Cinnamon and it has the same destructive impact on my system as LXQT. Unlike LXQT, the Cinnamon DE is based on GTK and not on Qt. XFCE is also based on GTK but works properly, so it looks like the root of this bug is not in one widget toolkit or the other but in something common to Cinnamon and LXQT. Maybe in PulseAudio?
Comment 4 Rostislav Krasny 2024-05-14 13:29:05 UTC
Indeed, this is PulseAudio - the final culprit!

I completely removed the previously installed Cinnamon:
1. pkg delete cinnamon
2. pkg autoremove
3. rmuser pulse  (removes both the user and its main group)
4. pw groupdel pulse-access && pw groupdel pulse-rt
And restarted.

I checked that the remaining XFCE works properly again and logged out.
Then I installed only pulseaudio (it actually pulls in a few other dependencies) with the "pkg install pulseaudio" command, logged in as a regular user and ran "startx" with XFCE again. This time my screen went blank after about 5 seconds and showed nothing even on other virtual consoles (Ctrl+Alt+Fn), just as with LXQT and Cinnamon. But this time neither LXQT nor Cinnamon was installed.

Then I properly rebooted by pressing Ctrl+Alt+Del, logged in as root in the first console, ran the "pkg delete pulseaudio" and without removing anything else (users, groups, dependencies) logged in as a regular user in the second console and started XFCE by running startx. Now XFCE works properly again.

So the PulseAudio package is the root of this issue!

How could it be related to video? The Intel HD 4600 video controller integrated into my i7-4790 CPU can also act as a sound controller that outputs audio through the monitor's HDMI or DisplayPort connectors.

% cat /dev/sndstat
Installed devices:
pcm0: <Intel Haswell (HDMI/DP 8ch)> (play)
pcm1: <Realtek ALC221 (Analog)> (play/rec) default
pcm2: <Realtek ALC221 (Analog 2.0+HP)> (play)
No devices installed from userspace.

Even when the pcm0 device isn't used and isn't marked as default, PulseAudio probably still interacts with it. And because this is almost the same hardware that is responsible for video, something goes wrong and my screen goes blank. This is just a theory.
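
For anyone checking whether their machine has the same layout, the two facts that matter in the /dev/sndstat listing (which pcm unit is the HDMI/DP device, and which unit is the default) can be extracted with a small sketch like the one below. The function names are invented for illustration, and the patterns assume the line format shown in the listing above.

```sh
# Both helpers read /dev/sndstat-format text on stdin.
# (hdmi_units and default_unit are hypothetical names.)

# hdmi_units: print the unit number of every pcm device whose description
# mentions HDMI/DP.
hdmi_units() {
    sed -n 's/^pcm\([0-9][0-9]*\): <[^>]*HDMI\/DP[^>]*>.*/\1/p'
}

# default_unit: print the unit number of the device marked "default".
default_unit() {
    sed -n 's/^pcm\([0-9][0-9]*\): .* default$/\1/p'
}
```

Typical use would be "hdmi_units < /dev/sndstat" and "default_unit < /dev/sndstat"; if the two print the same number, audio is being routed through the graphics hardware by default.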
Comment 5 Rostislav Krasny 2024-05-14 14:09:08 UTC
Now I've found a way to reproduce this or at least a very similar issue even without PulseAudio installed.

1. Log out of the XFCE session.
2. Change the default sound device from pcm1 to pcm0 by running "sysctl hw.snd.default_unit=0".
3. Log in to XFCE again.
4. Open YouTube in Firefox and start playing a video.
5. Seek backward and/or forward in the video.
6. While the video is still playing, go back to the YouTube front page by clicking its logo in the top left corner of the site.

In step 5 there is a pause of 2-3 seconds every time I seek backward or forward. During each pause my screen is blank and the sound is paused too.

In step 6 I got an infinite pause, i.e. my screen went blank permanently (until reboot) and sound stopped working as well.

Without steps 5 and 6 XFCE works properly. It seems that PulseAudio constantly triggers something similar to what steps 5 and 6 do.

Also take a look at the following excerpt from my /var/log/messages:

May 14 16:04:20 pluto pkg[1128]: pulseaudio-16.1_4 deinstalled
May 14 16:39:06 pluto devd[790]: notify_clients: send() failed; dropping unresponsive client
May 14 16:39:06 pluto syslogd: last message repeated 3 times
May 14 16:40:23 pluto kernel: drmn0: [drm] *ERROR* uncleared fifo underrun on pipe A
May 14 16:40:23 pluto kernel: drmn0: [drm] *ERROR* CPU pipe A FIFO underrun
May 14 16:41:21 pluto kernel: .
May 14 16:41:21 pluto syslogd: exiting on signal 15
May 14 16:42:03 pluto syslogd: kernel boot file is /boot/kernel/kernel
May 14 16:42:03 pluto kernel: Waiting (max 60 seconds) for system process `vnlru' to stop... done
May 14 16:42:03 pluto kernel: Waiting (max 60 seconds) for system process `syncer' to stop...
May 14 16:42:03 pluto kernel: Syncing disks, vnodes remaining... 7 0 done
May 14 16:42:03 pluto kernel: All buffers synced.

According to my shell commands history I changed the default sound device at 16:38 when PulseAudio was already deinstalled.

   208	16:38	sysctl hw.snd.default_unit=0

So messages from devd and from kernel about drmn0 could be related to this issue.
The 16:41:21 time is probably when I pressed Ctrl+Alt+Del to reboot after the screen became blank.
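
To pull just the DRM error lines out of a long /var/log/messages, a filter along these lines may help. It is only a sketch: the function name is made up, and the pattern assumes the "drmnN: [drm] *ERROR*" prefix seen in the excerpt above.

```sh
# drm_errors: read syslog-format text on stdin and print only the kernel
# lines that the DRM driver flagged as errors.
# (drm_errors is a hypothetical helper name.)
drm_errors() {
    grep -E 'kernel: drmn[0-9]+: \[drm\] \*ERROR\*'
}
```

Running "drm_errors < /var/log/messages" and comparing the timestamps with the shell history makes it easier to see whether the underruns line up with the default sound device change.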

I think I saw something very similar previously with LXQT but didn't pay enough attention then.
Comment 6 Marcin Cieślak 2024-05-14 23:32:52 UTC
1. Can you post /var/log/Xorg.0.log as captured during the 2-3 second pause?
2. Can you start Firefox from the terminal, log its output (it should print lots of messages while running) and post the output from around the pause here?
Comment 7 Marcin Cieślak 2024-05-14 23:41:23 UTC
Do any pauses happen if you play a video from disk using mplayer or mpv from the command line?

https://github.com/freebsd/drm-kmod/issues/14#issuecomment-663682796 suggests adding the following line to /boot/loader.conf

hw.i915kms.enable_psr=0

and rebooting.
Comment 8 Rostislav Krasny 2024-05-15 18:31:36 UTC
Warner Losh on the freebsd-stable mailing list asked me to try drm-kmod drivers built from source. I did so after installing the sources from:

1. http://ftp.freebsd.org/pub/FreeBSD/releases/arm64/14.1-BETA2/src.txz
2. http://ftp.freebsd.org/pub/FreeBSD/releases/arm64/14.1-BETA2/ports.txz

Now the problem is different. When the default sound device is not the Intel HDMI/DP one, everything works properly. But with sysctl hw.snd.default_unit=0, Firefox can't play YouTube videos properly: they play sometimes at 1 fps, sometimes at 0 fps, and with no sound.

I see in the first console many lines like this:

hdac0: Unexpected unsolicited response from address 0: 00000000

And after a long block of such lines I see lines like this:

hdac0: Command 0x00270d01 timeout on address 0

but with different "Command" numbers.

I see the following in /var/log/messages:

May 15 21:10:05 pluto devd[778]: check_clients:  dropping disconnected client
May 15 21:10:05 pluto syslogd: last message repeated 3 times
May 15 21:10:47 pluto kernel: hdac0: Unexpected unsolicited response from address 0: 00000000
May 15 21:10:47 pluto syslogd: last message repeated 220 times
May 15 21:10:47 pluto kernel: hdac0: Command 0x00270d01 timeout on address 0
May 15 21:10:47 pluto kernel: hdac0: Command 0x00270610 timeout on address 0
May 15 21:10:47 pluto kernel: hdac0: Command 0x00272d01 timeout on address 0
May 15 21:10:47 pluto kernel: hdac0: Command 0x00373400 timeout on address 0
May 15 21:10:47 pluto kernel: hdac0: Command 0x00373411 timeout on address 0
May 15 21:10:47 pluto kernel: hdac0: Command 0x003734f2 timeout on address 0
May 15 21:10:47 pluto kernel: hdac0: Command 0x003734f3 timeout on address 0
May 15 21:10:47 pluto kernel: hdac0: Command 0x003734f4 timeout on address 0
May 15 21:10:47 pluto kernel: hdac0: Command 0x003734f5 timeout on address 0
May 15 21:10:47 pluto kernel: hdac0: Command 0x003734f6 timeout on address 0
May 15 21:10:47 pluto kernel: hdac0: Command 0x003734f7 timeout on address 0
May 15 21:10:47 pluto kernel: hdac0: Command 0x00370740 timeout on address 0
May 15 21:10:47 pluto kernel: hdac0: Command 0x00373000 timeout on address 0
May 15 21:10:47 pluto kernel: hdac0: Command 0x00373200 timeout on address 0
May 15 21:10:47 pluto kernel: hdac0: Command 0x00373000 timeout on address 0
May 15 21:10:47 pluto kernel: hdac0: Command 0x00373100 timeout on address 0
May 15 21:10:47 pluto syslogd: last message repeated 31 times
May 15 21:10:47 pluto kernel: hdac0: Command 0x00373000 timeout on address 0
May 15 21:10:47 pluto kernel: hdac0: Command 0x00373184 timeout on address 0
May 15 21:10:47 pluto kernel: hdac0: Command 0x00373101 timeout on address 0
May 15 21:10:47 pluto kernel: hdac0: Command 0x0037310a timeout on address 0
May 15 21:10:47 pluto kernel: hdac0: Command 0x00373170 timeout on address 0
May 15 21:10:47 pluto kernel: hdac0: Command 0x00373101 timeout on address 0
May 15 21:10:47 pluto kernel: hdac0: Command 0x00373100 timeout on address 0
May 15 21:10:47 pluto syslogd: last message repeated 2 times
May 15 21:10:47 pluto kernel: hdac0: Command 0x00373000 timeout on address 0
May 15 21:10:47 pluto kernel: hdac0: Command 0x003732c0 timeout on address 0
May 15 21:10:47 pluto kernel: hdac0: Reset setting timeout
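
Because syslogd collapses repeats into "last message repeated N times" lines, the raw line count understates how many command timeouts occurred. A rough way to total them is sketched below. The function name is made up, and it assumes a repeat line refers to the timeout line directly before it.

```sh
# count_hdac_timeouts: read syslog-format text on stdin and print how many
# hdac0 command timeouts it records, expanding syslogd's
# "last message repeated N times" lines that directly follow a timeout.
# (count_hdac_timeouts is a hypothetical helper name.)
count_hdac_timeouts() {
    awk '
        /hdac0: Command 0x[0-9a-f]+ timeout/ { n++; prev_timeout = 1; next }
        /last message repeated [0-9]+ times/ {
            if (prev_timeout) {
                for (i = 1; i < NF; i++) {
                    if ($i == "repeated") { n += $(i + 1) }
                }
            }
            next
        }
        { prev_timeout = 0 }
        END { print n + 0 }
    '
}
```

It would be used as "count_hdac_timeouts < /var/log/messages", which gives a single number to compare between driver versions.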
Comment 9 Rostislav Krasny 2024-05-15 18:40:45 UTC
Created attachment 250683 [details]
Xorg.0.log
Comment 10 Rostislav Krasny 2024-05-15 18:43:45 UTC
Hi Marcin,

I just attached my Xorg.0.log; it contains no errors (no (EE) lines).
This is with the locally built drm-kmod.
With the previously installed drm-kmod from packages there were no errors either.
Comment 11 Rostislav Krasny 2024-05-15 18:54:18 UTC
Hi Warner,

Actually, the drm-kmod installed from source and the one installed from packages are different versions. After "make deinstall" I tried to install the package version again and noticed this:

# pkg install drm-kmod
Updating FreeBSD repository catalogue...
FreeBSD repository is up to date.
All repositories are up to date.
Updating database digests format: 100%
Checking integrity... done (1 conflicting)
  - drm-515-kmod-5.15.118_4 conflicts with drm-61-kmod-6.1.69_2 on /boot/modules/dmabuf.ko
Checking integrity... done (0 conflicting)
The following 3 package(s) will be affected (of 0 checked):

Installed packages to be REMOVED:
	drm-61-kmod: 6.1.69_2

New packages to be INSTALLED:
	drm-515-kmod: 5.15.118_4

Installed packages to be REINSTALLED:
	drm-kmod-20220907_3 (direct dependency changed: drm-515-kmod)

Number of packages to be removed: 1
Number of packages to be installed: 1
Number of packages to be reinstalled: 1

The operation will free 3 MiB.

Proceed with this action? [y/N]:
Comment 12 Rostislav Krasny 2024-05-15 19:07:51 UTC
Unfortunately, adding the "hw.i915kms.enable_psr=0" line to my /boot/loader.conf doesn't help.

Another line that I have there is "screen.textmode=0", and changing it back to the default value of "screen.textmode=1" has no influence on this issue either.

It's also puzzling why "pkg install drm-kmod" gave me a very different version of the drivers from the one installed from the locally built port.
Comment 13 Rostislav Krasny 2024-05-21 22:57:45 UTC
Edited the title of this bug report again since this now obviously looks like a driver bug. I will try again with a freshly installed 14.1-BETA3. The version difference between the drm-kmod installed by pkg and the one installed from ports was due to a wrong Git branch I used in ports. Sorry about that, I will use the latest quarterly branch next time.
Comment 14 Rostislav Krasny 2024-05-22 20:38:07 UTC
Rebuilt and reinstalled world and kernel today:

% uname -a
FreeBSD pluto.lan 14.1-BETA3 FreeBSD 14.1-BETA3 releng/14.1-n267657-0b367134dd92 GENERIC amd64

Installing drm-kmod from ports (git branch 2024Q2) still gives me version drm-61-kmod-6.1.69_2 of the driver instead of drm-515-kmod-5.15.118_4:

% pkg info | grep "[0-9]-kmod"
drm-61-kmod-6.1.69_2           DRM drivers modules

And this driver has a completely different issue: I can't play any YouTube video in Firefox when the default sound device is my Intel (HDMI/DP 8ch) one. Since I can't seek backward or forward in videos, I can't tell whether the original blank screen issue exists with this driver.

There are also the error messages I already described in comment #8, from when I installed this driver from the main ports branch instead of the 2024Q2 one.

I still don't understand why pkg and ports give me different versions of this driver.

Anyway, it seems that the sound capabilities of the drm-kmod driver were never tested by anybody on FreeBSD, and this is why there are these two weird and nasty bugs in two completely different versions of the driver.
Comment 15 Rostislav Krasny 2024-05-28 00:24:29 UTC
Any update about this bug? Is this bug really in drm-kmod or maybe in LinuxKPI? Will this be fixed soon or will it take longer?