Bug 270509 - Various x11-drivers/xf86-video drivers erroneously assume that no kernel driver will attach to graphics cards (need local patches)
Status: Open
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: freebsd-x11 (Nobody)
URL:
Keywords:
Depends on: 270869
Blocks:
Reported: 2023-03-29 00:25 UTC by George Mitchell
Modified: 2024-05-31 04:09 UTC (History)
CC List: 12 users

See Also:
bugzilla: maintainer-feedback? (x11)


Attachments
Hack to delete the call to pci_device_has_kernel_driver(dev) (540 bytes, patch)
2023-03-29 23:04 UTC, George Mitchell
xf86-video-vesa patch (2.12 KB, patch)
2023-03-30 08:10 UTC, Emmanuel Vadot

Description George Mitchell 2023-03-29 00:25:05 UTC
After today's update, vesa mode is no longer able to detect my graphics card:

[    31.110] (WW) Falling back to old probe method for scfb
[    31.110] scfb trace: probe start
[    31.111] scfb trace: probe done
[    31.111] vesa: Ignoring device with a bound kernel driver
[    31.111] (WW) VGA arbiter: cannot open kernel arbiter, no multi-card support
[    31.111] (EE) Screen 0 deleted because of no matching config section.
[    31.111] (II) UnloadModule: "modesetting"
[    31.111] (EE) Screen 0 deleted because of no matching config section.
[    31.111] (II) UnloadModule: "vesa"

The "device with a bound kernel driver" message rings a bell.  Didn't something like this happen a few years ago?  Anyway, nothing else changed today other than the xorg-server update, so I can't account for why this would happen.
Comment 1 Mark Millard 2023-03-29 18:50:57 UTC
Which should it be: "vesa fails" (not UEFI) vs. "for scfb" (UEFI)?

vesa is documented to not be appropriate for UEFI. scfb is
documented as being appropriate for UEFI . . .

https://docs.freebsd.org/en/books/handbook/x11/ reports:

"VESA module must be used when booting in BIOS mode and SCFB module must be used when booting in UEFI mode"

There are separate ports:

xf86-video-scfb
xf86-video-vesa

The little bit of material that you present references both.

(Be warned: I'm no expert in the area. I'm just repeating
material that I read.)
Comment 2 George Mitchell 2023-03-29 19:12:42 UTC
I am booting with BIOS mode.  I have no explicit xorg.conf file; xorg-server figures out on its own to use the vesa module.  So far, I have avoided UEFI like the plague.
Comment 3 George Mitchell 2023-03-29 20:46:43 UTC
The "vesa: Ignoring device with a bound kernel driver" message comes from the x11-drivers/xf86-video-vesa port, the source file vesa.c.  However, the ChangeLog file (the file literally named ChangeLog in the port work directory after "make patch") shows no changes since September 2020.  Probably doesn't help debug the problem, but I thought I'd mention it.  ChangeLog says that the check for a kernel driver bound to the device was introduced December 8, 2010.
Comment 4 Mark Millard 2023-03-29 22:04:47 UTC
(In reply to George Mitchell from comment #2)

Hmm. Did your historical logs also show the lines that
contain "scfb" that you show here (or contain other
such scfb lines)? If xf86-video-scfb is installed,
maybe it would be good to uninstall it, given your
BIOS use? That might stop it from trying scfb (beyond
a possible set of not-found activity).
Comment 5 George Mitchell 2023-03-29 23:04:30 UTC
Created attachment 241194
Hack to delete the call to pci_device_has_kernel_driver(dev)

This can not in any way be considered to be a fix, but it does let me temporarily work around the problem.  I'm eagerly awaiting someone doing some actual analysis of the problem.  Since there have not been any recent changes to xf86-video-vesa, perhaps there's something new in the pci_device_has_kernel_driver(dev) routine, wherever that may live.
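
To make the check under discussion concrete, here is a minimal sketch in C of a probe routine that skips any device reporting a bound kernel driver. It is not the actual vesa.c code: `fake_pci_device`, `stub_has_kernel_driver`, and `vesa_like_probe` are hypothetical names invented for illustration, and the real driver calls libpciaccess's `pci_device_has_kernel_driver(dev)` where the stub appears.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for libpciaccess's struct pci_device. */
struct fake_pci_device {
    const char *attached_driver; /* NULL when no kernel driver is bound */
};

/* Stub playing the role of pci_device_has_kernel_driver(dev). */
int stub_has_kernel_driver(const struct fake_pci_device *dev)
{
    assert(dev != NULL);
    return dev->attached_driver != NULL;
}

/* Returns true when the driver would go on to claim the device. */
bool vesa_like_probe(const struct fake_pci_device *dev)
{
    if (stub_has_kernel_driver(dev)) {
        /* Same wording as the message seen in Xorg.0.log. */
        fprintf(stderr, "vesa: Ignoring device with a bound kernel driver\n");
        return false;
    }
    return true;
}
```

With a device whose `attached_driver` is set (for example to "vgapci"), the probe refuses the device, which matches the symptom in the description; deleting the call, as the hack does, makes the probe always continue.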
Comment 6 George Mitchell 2023-03-30 00:18:32 UTC
(In reply to Mark Millard from comment #4)
Here are the references to scfb, and I think they are pretty consistent in every Xorg.0.log I have:

[...]
[    24.700] (==) Matched scfb as autoconfigured driver 2
[    24.700] (==) Matched vesa as autoconfigured driver 3
[...]
[    24.854] (II) LoadModule: "scfb"
[    24.854] (II) Loading /usr/local/lib/xorg/modules/drivers/scfb_drv.so
[    24.863] (II) Module scfb: vendor="X.Org Foundation"
[    24.863]    compiled for 1.21.1.4, module version = 0.0.5
[    24.863]    ABI class: X.Org Video Driver, version 25.2
[...]
[    24.876] (WW) Falling back to old probe method for scfb
[    24.876] scfb trace: probe start
[    24.876] scfb trace: probe done
[...]
[    25.066] (II) UnloadModule: "scfb"
[    25.066] (II) Unloading scfb
[...]

On a separate part of this issue, the title of the bug report probably needs to be changed.  pci_device_has_kernel_driver is defined in the port devel/libpciaccess.  One of the changes from three days ago seems to have removed a FreeBSD change to that subroutine.  I think I may know what to do next.
Comment 7 George Mitchell 2023-03-30 00:21:23 UTC
devel/libpciaccess: I think the deletion of files/patch-src_freebsd__pci.c might be a mistake.  It causes x11-drivers/xf86-video-vesa to fail.
Comment 8 George Mitchell 2023-03-30 00:35:20 UTC
Okay, the correct fix for this is definitely in libpciaccess.  The patch in files/patch-src_freebsd__pci.c, just deleted 3 days ago, was created in August 2019 to fix this exact same problem I am having again!  That's why the "device with a bound kernel driver" seemed familiar to me!
Comment 9 Mark Millard 2023-03-30 00:41:46 UTC
(In reply to George Mitchell from comment #6)

That you have /usr/local/lib/xorg/modules/drivers/scfb_drv.so
indicates that xf86-video-scfb is installed for some reason.
Unless something is forcing it to be around, you likely could
delete the installation and so simplify things a bit.

Cool that you found the vesa related problem. Nice work.
Comment 10 George Mitchell 2023-03-30 00:45:49 UTC
The correct fix is to revert the part of commit df10dcefa427fcc3a4b1405387b12466dc5a9cdc that deleted files/patch-src_freebsd__pci.c.  The Makefile, distinfo, and pkg-plist changes are correct.  My attachment 241194, which was a hack from the word go, is wrong and should be ignored.  I have verified that making this fix allows my system to run the VESA driver without problems once again.
Comment 11 George Mitchell 2023-03-30 01:40:01 UTC
I'm adding Niclas Zeising <zeising@FreeBSD.org> (the person who originally committed the fix for this bug) and Emmanuel Vadot <manu@FreeBSD.org> (the person who committed the removal of the fix) to the CC list in the hope of expediting the restoration of the fix.
Comment 12 George Mitchell 2023-03-30 01:40:58 UTC
Comment on attachment 241194
Hack to delete the call to pci_device_has_kernel_driver(dev)

Obsoleting my hack fix.
Comment 13 Emmanuel Vadot freebsd_committer freebsd_triage 2023-03-30 05:31:02 UTC
Mhm, OK, I'll have a look at that.
I didn't see any problems running without the "fix", and since there was no explanation for it I simply removed it.
Comment 14 Emmanuel Vadot freebsd_committer freebsd_triage 2023-03-30 08:10:48 UTC
Created attachment 241202
xf86-video-vesa patch

Can you check this patch against xf86-video-vesa?
The commit itself explains the situation.
Comment 15 Emmanuel Vadot freebsd_committer freebsd_triage 2023-03-30 08:11:53 UTC
Oh, I see that this is the hack that you posted earlier. Does it work?
Comment 16 George Mitchell 2023-03-30 18:48:35 UTC
The xf86-video-vesa patch definitely works.  My initial gut reaction was that it was odd that it should be required in this widely distributed package from outside the FreeBSD world, and that it would probably never be adopted upstream.  But on further reflection I find myself asking why the driver cares whether a kernel driver is attached, especially since in the FreeBSD world it seems unavoidable that the vgapci kernel driver will be attached.  Plus, among the only five ports I use that depend on libpciaccess, only xf86-video-vesa calls pci_device_has_kernel_driver.  So now I'm convinced that your version of my hack is the correct patch to apply.
Comment 17 commit-hook freebsd_committer freebsd_triage 2023-03-30 19:04:19 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=b6d89731d542582d2cc751e7a56b78df76a38ce7

commit b6d89731d542582d2cc751e7a56b78df76a38ce7
Author:     Emmanuel Vadot <manu@FreeBSD.org>
AuthorDate: 2023-03-30 08:05:52 +0000
Commit:     Emmanuel Vadot <manu@FreeBSD.org>
CommitDate: 2023-03-30 19:03:20 +0000

    x11-drivers/xf86-video-vesa: Add patch for ignoring if kernel as a driver

    The vesa driver checks if the kernel as a driver attached to the pci device.
    This used to work before df10dcefa427 ("devel/libpciaccess: Update to 0.17")
    because we had a patch in libpciaccess that always said that the kernel didn't
    had any driver attached. This is obviously not a correct way.
    The problem is that vgapci is always attached for us so for pci video devices
    we always have a driver attached.
    Ignoring the check in xf86-video-vesa seems the best way for us.

    PR:     270509
    Sponsored by:   Beckhoff Automation GmbH & Co. KG

 x11-drivers/xf86-video-vesa/Makefile                     |  2 +-
 x11-drivers/xf86-video-vesa/files/patch-src_vesa.c (new) | 15 +++++++++++++++
 2 files changed, 16 insertions(+), 1 deletion(-)
Comment 18 George Mitchell 2023-03-30 19:43:13 UTC
I hope I'm doing this right, but it seems to me this bug is now fixed.  Was I right to close it?
Comment 19 Emmanuel Vadot freebsd_committer freebsd_triage 2023-03-31 05:08:27 UTC
(In reply to George Mitchell from comment #18)

Yup :)
Comment 20 Tijl Coosemans freebsd_committer freebsd_triage 2023-04-01 17:04:21 UTC
The following drivers also need to be patched:

xf86-video-ast
xf86-video-cirrus
xf86-video-mga
xf86-video-nv

(Or the original libpciaccess patch could be restored.)
Comment 21 wbe 2023-04-15 02:26:33 UTC
As Tijl Coosemans noted, the problem also affects nv and mga drivers.
Here the problem reported in the Xorg.#.log file is
"The PCI device [...] has a kernel module claiming it."
followed by "This driver cannot operate until it has been unloaded."
Even "Xorg -configure" doesn't work.

What pkg calls libpciaccess version 16 works.  The current version (17) does not, for me, with any driver (mga, nv, nvidia, vesa, modesetting, or any
others I tried) and any xorg.conf file or lack thereof I tried, though the
various choices (such as vesa) and settings sometimes got different complaints.

Manually reverting libpciaccess to version 16 fixed the problem.

Maybe it's the same issue as bug 239065?
In that case, 13.5 worked, 14 didn't, and newly released 16 did.
 -WBE
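
The two symptoms reported so far in this thread use different wording: vesa prints "Ignoring device with a bound kernel driver", while mga and nv print "has a kernel module claiming it." As a rough aid for anyone checking their own Xorg.N.log, here is a small C sketch that classifies a log line by plain substring matching. The enum and function names are hypothetical, and the matched strings are only those quoted in this bug; other drivers may word the message differently.

```c
#include <assert.h>
#include <string.h>

enum kernel_driver_complaint {
    COMPLAINT_NONE = 0,
    COMPLAINT_VESA_BOUND_DRIVER, /* "Ignoring device with a bound kernel driver" */
    COMPLAINT_MODULE_CLAIMING    /* "has a kernel module claiming it" */
};

/* Classify one Xorg log line using the message texts quoted in this bug. */
enum kernel_driver_complaint classify_xorg_log_line(const char *line)
{
    assert(line != NULL);
    if (strstr(line, "Ignoring device with a bound kernel driver") != NULL)
        return COMPLAINT_VESA_BOUND_DRIVER;
    if (strstr(line, "has a kernel module claiming it") != NULL)
        return COMPLAINT_MODULE_CLAIMING;
    return COMPLAINT_NONE;
}
```

Feeding each line of the log through `classify_xorg_log_line` and stopping at the first non-zero result is enough to tell whether a failure belongs to this bug or to something unrelated.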
Comment 22 Alexey Dokuchaev freebsd_committer freebsd_triage 2023-04-15 07:21:29 UTC
(In reply to wbe from comment #21)
> Manually reverting libpciaccess to version [0.]16 fixed the problem.
After removing useless and auto-generated files, the difference between two versions looks manageable:

$ diff -rdupbw libpciaccess-0.1* | wc -l
    1127

Shouldn't be too hard to find the upstream change (commit) which broke things, so current version could be fixed instead of downgrading.
Comment 23 George Mitchell 2023-04-15 15:11:28 UTC
(In reply to wbe from comment #21)
I'm the one who put this misleading title on the bug.  The problem really seems to be that the video drivers incorrectly expect that no kernel driver will attach to the graphics card, rather than a problem in libpciaccess.  I'll fix the title now.
Comment 24 Sergiy 2023-04-15 20:22:21 UTC
Sorry for my stupid question.
Will the solution suggested in this thread help resolve bug #267606?
Comment 25 George Mitchell 2023-04-15 21:16:05 UTC
(In reply to Sergiy from comment #24)
Seems very unlikely to me: this bug never caused core dumps.  It merely made the video drivers erroneously ignore the video card.
Comment 26 commit-hook freebsd_committer freebsd_triage 2023-04-16 08:35:37 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=9fa263c21fb287ab1edc085498d8ecb36b494614

commit 9fa263c21fb287ab1edc085498d8ecb36b494614
Author:     Tijl Coosemans <tijl@FreeBSD.org>
AuthorDate: 2023-04-15 15:31:17 +0000
Commit:     Tijl Coosemans <tijl@FreeBSD.org>
CommitDate: 2023-04-16 08:34:17 +0000

    x11-drivers/xf86-video-nv: Update to 2.1.22

    Also fix regression from df10dcefa427 [1]

    PR:             270869, 270509 [1]
    Approved by:    x11 (manu)

 x11-drivers/xf86-video-nv/Makefile                    |  5 ++---
 x11-drivers/xf86-video-nv/distinfo                    |  6 +++---
 x11-drivers/xf86-video-nv/files/patch-src-nv_driver.c | 19 ++++++++++++++++++-
 .../xf86-video-nv/files/patch-src_riva__xaa.c (gone)  | 14 --------------
 4 files changed, 23 insertions(+), 21 deletions(-)
Comment 27 Sergiy 2023-04-17 21:22:04 UTC
OK. Thanks for the answer.
Comment 28 wbe 2023-05-28 16:59:20 UTC
I'm pleased to report that the new xf86-video-nv 2.1.22 indeed now works with libpciaccess 0.17 for me.

It would be helpful if the mga driver (xf86-video-mga) were similarly upgraded to work with libpciaccess 0.17.

I don't recall whether the vesa driver had this particular problem ("... has a kernel module claiming it") or not, but I do recall that it didn't work for me with 0.17 as the driver for mga or nv hardware.
Comment 29 NAKAJI Hiroyuki 2023-06-28 05:25:57 UTC
(In reply to wbe from comment #28)

How about this patch for x11-drivers/xf86-video-mga?

diff -urN /usr/ports/x11-drivers/xf86-video-mga/files/patch-src_mga__driver.c ./x11-drivers/xf86-video-mga/files/patch-src_mga__driver.c
--- /usr/ports/x11-drivers/xf86-video-mga/files/patch-src_mga__driver.c	1970-01-01 09:00:00.000000000 +0900
+++ ./x11-drivers/xf86-video-mga/files/patch-src_mga__driver.c	2023-06-28 14:09:39.602063000 +0900
@@ -0,0 +1,18 @@
+--- src/mga_driver.c.orig	2018-12-08 10:08:01.000000000 +0900
++++ src/mga_driver.c	2023-06-28 14:08:34.513506000 +0900
+@@ -702,6 +702,7 @@
+     ScrnInfoPtr pScrn = NULL;
+     MGAPtr pMga;
+ 
++#ifndef __FreeBSD__
+     if (pci_device_has_kernel_driver(dev)) {
+ 	/* If it's a G200 server chip, it's probably on KMS, so bail; if not,
+ 	 * it might be using matroxfb, which is ok. */
+@@ -721,6 +722,7 @@
+ 	        return FALSE;
+ 	}
+     }
++#endif
+ 
+     /* Allocate a ScrnInfoRec and claim the slot */
+     pScrn = xf86ConfigPciEntity(pScrn, 0, entity_num, MGAPciChipsets,
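
For readers unfamiliar with the pattern, the patch above simply compiles the kernel-driver check out on FreeBSD. The following C sketch shows the same compile-time guard in isolation. It is a simplification under stated assumptions: `probe_may_continue` and `SKIP_KERNEL_DRIVER_CHECK` are hypothetical names, the result of `pci_device_has_kernel_driver(dev)` is passed in as a plain integer so that no libpciaccess is needed, and the G200/matroxfb distinction made by the real mga code is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* 1 when the kernel-driver check is compiled out, 0 otherwise. */
#ifdef __FreeBSD__
#define SKIP_KERNEL_DRIVER_CHECK 1
#else
#define SKIP_KERNEL_DRIVER_CHECK 0
#endif

/*
 * has_kernel_driver: the value the real driver would obtain from
 * pci_device_has_kernel_driver(dev).
 */
bool probe_may_continue(int has_kernel_driver)
{
    assert(has_kernel_driver == 0 || has_kernel_driver == 1);
#ifndef __FreeBSD__
    if (has_kernel_driver)
        return false; /* elsewhere, a bound kernel driver blocks the probe */
#else
    (void)has_kernel_driver; /* FreeBSD: vgapci is always attached, so ignore it */
#endif
    return true;
}
```

Because the decision is made by the preprocessor, the FreeBSD build never consults the libpciaccess result at all, which is why the patch keeps working regardless of which libpciaccess version is installed.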
Comment 30 wbe 2023-12-18 17:21:14 UTC
I haven't tried the patch because the system in question runs entirely on pkg packages.  I've watched for a new version of xf86-video-mga, but I haven't seen one so far.  I then hoped the update would be included with FreeBSD 14.0, but it wasn't.

I haven't had occasion to use git (ever).  My experience from the portsnap days is that trying to build a small piece (such as this driver) may try to drag in and build the whole hierarchy, and I'm not interested in building all of X11 just to get this piece, so I've shied away from trying.

If my worries are unfounded, and it's possible to build just xf86-video-mga using the otherwise existing 14.0-RELEASE pkg installed X11 headers and libraries, point me to a description of how to use git to do that and I'll give it a try.
Comment 31 NAKAJI Hiroyuki 2024-03-06 08:41:41 UTC
(In reply to wbe from comment #30)
Handbook explains how to use git: https://docs.freebsd.org/en/books/handbook/mirrors/#git

I updated my FUJITSU TX1330 M2 Server from 13.2-RELEASE-p10 to 13.3-RELEASE using freebsd-update. And my patch in comment #29 is still helpful with x11-drivers/xf86-video-mga and libpciaccess 0.18.

pciconf -lv says:

vgapci0@pci0:2:0:0:     class=0x030000 rev=0x05 hdr=0x00 vendor=0x102b device=0x0522 subvendor=0x1734 subdevice=0x11cc
    vendor     = 'Matrox Electronics Systems Ltd.'
    device     = 'MGA G200e [Pilot] ServerEngines (SEP1)'
    class      = display
    subclass   = VGA

Thanks.
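
The `vgapci0@` prefix in the pciconf output above is what shows that the vgapci kernel driver is attached to the card; pciconf prints `none` as the name when no driver is attached. As a hedged illustration, this C sketch extracts the driver name from the first column of a `pciconf -l` line. The function names are hypothetical, and the parsing assumes the `name<unit>@selector` layout shown in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copies the driver name (e.g. "vgapci") from the first column of a
 * `pciconf -l` line such as "vgapci0@pci0:2:0:0: ..." into out.
 * Returns 0 on success, -1 if the line has no '@' or out is too small.
 */
int pciconf_driver_name(const char *line, char *out, size_t outsz)
{
    assert(line != NULL && out != NULL);
    const char *at = strchr(line, '@');
    if (at == NULL)
        return -1;
    size_t len = (size_t)(at - line);
    /* Strip the trailing unit number ("vgapci0" -> "vgapci"). */
    while (len > 0 && line[len - 1] >= '0' && line[len - 1] <= '9')
        len--;
    if (len == 0 || len >= outsz)
        return -1;
    memcpy(out, line, len);
    out[len] = '\0';
    return 0;
}

/* Returns 1 when a kernel driver is attached, 0 for "none" or bad input. */
int pciconf_has_attached_driver(const char *line)
{
    char name[32];
    if (pciconf_driver_name(line, name, sizeof name) != 0)
        return 0;
    return strcmp(name, "none") != 0;
}
```

For the Matrox line quoted above, `pciconf_has_attached_driver` reports an attached driver, which is exactly the condition the unpatched X drivers refuse to work with.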
Comment 32 George Mitchell 2024-03-06 20:59:10 UTC
To the best of my knowledge, this bug is fixed and I would be happy to close it.  But I have just the one specific type of video hardware.  Does anyone know of some hardware that still has a problem with this bug?  If not, I will close it as overcome by events.
Comment 33 wbe 2024-03-06 23:08:00 UTC
Ah, so you were waiting for me to try it.  I've been busy doing something else I wanted to finish before trying to fight building this from git (see Comment 30).  

If the mga change is basically the same as the nvidia change, I would expect it to work.  If any other drivers (e.g., the VESA driver) need the same fix, I would hope they get it, too. 

If the xf86-video-mga change appears as a pkg upgrade, I can try it pretty quickly and then follow up here.
Comment 34 George Mitchell 2024-03-07 20:09:05 UTC
Regrettably, there's no entry in the xf86-video-mga log that would indicate that this patch (comment #29) has been committed, and I'm not a committer.  So I guess it's too soon to close.
Comment 35 Mark Millard 2024-03-07 21:55:56 UTC
(In reply to George Mitchell from comment #34)

The same may be true of comment #20 's items:

xf86-video-ast
xf86-video-cirrus

since no commit reports are listed here yet for them either.
Comment 36 Josmar Calin De Pierri 2024-05-08 12:18:15 UTC
(In reply to wbe from comment #30)

This is my exact situation too.
When xf86-video-mga appears as a pkg, I'll test it right away.
Comment 37 Paul Telles (Starcat) 2024-05-31 04:09:24 UTC
I encountered the same issue on an IBM server.

# dmidecode -t system
# dmidecode 3.5
# SMBIOS entry point at 0x7e7bf000
Found SMBIOS entry point in EFI, reading table from /dev/mem.
SMBIOS 2.7 present.

Handle 0x0015, DMI type 1, 27 bytes
System Information
	Manufacturer: IBM
	Product Name: IBM System x3300 M4 -[7382PBC]-
[...]
# pciconf -lv | grep -B3 display
vgapci0@pci0:4:0:0:	class=0x030000 rev=0x00 hdr=0x00 vendor=0x102b device=0x0534 subvendor=0x1014 subdevice=0x0405
    vendor     = 'Matrox Electronics Systems Ltd.'
    device     = 'G200eR2'
    class      = display
# pkg search xf86-video-mga
xf86-video-mga-2.0.0_4,3       X.Org mga display driver
# cat /var/log/Xorg.0.log
[   208.073] 
X.Org X Server 1.21.1.13
X Protocol Version 11, Revision 0
[   208.073] Current Operating System: FreeBSD bluey 14.1-RC1 FreeBSD 14.1-RC1 releng/14.1-n267678-4de43de58f51 GENERIC amd64
[   208.073]  
[   208.073] Current version of pixman: 0.42.2
[   208.073] 	Before reporting problems, check http://wiki.x.org
	to make sure that you have the latest version.
[   208.073] Markers: (--) probed, (**) from config file, (==) default setting,
	(++) from command line, (!!) notice, (II) informational,
	(WW) warning, (EE) error, (NI) not implemented, (??) unknown.
[   208.074] (==) Log file: "/var/log/Xorg.0.log", Time: Thu May 30 22:21:32 2024
[   208.134] (==) Using config directory: "/usr/local/etc/X11/xorg.conf.d"
[   208.134] (==) Using system config directory "/usr/local/share/X11/xorg.conf.d"
[   208.148] (==) No Layout section.  Using the first Screen section.
[   208.148] (==) No screen section available. Using defaults.
[   208.148] (**) |-->Screen "Default Screen Section" (0)
[   208.148] (**) |   |-->Monitor "<default monitor>"
[   208.149] (==) No device specified for screen "Default Screen Section".
	Using the first device section listed.
[   208.149] (**) |   |-->Device "Card0"
[   208.149] (==) No monitor specified for screen "Default Screen Section".
	Using a default monitor configuration.
[   208.149] (**) Allowing byte-swapped clients
[   208.149] (==) Automatically adding devices
[   208.149] (==) Automatically enabling devices
[   208.149] (==) Automatically adding GPU devices
[   208.149] (==) Automatically binding GPU devices
[   208.149] (==) Max clients allowed: 256, resource mask: 0x1fffff
[   208.307] (==) FontPath set to:
	/usr/local/share/fonts/misc/,
	/usr/local/share/fonts/TTF/,
	/usr/local/share/fonts/OTF/,
	/usr/local/share/fonts/Type1/,
	/usr/local/share/fonts/100dpi/,
	/usr/local/share/fonts/75dpi/,
	catalogue:/usr/local/etc/X11/fontpath.d
[   208.307] (==) ModulePath set to "/usr/local/lib/xorg/modules"
[   208.307] (II) The server relies on udev to provide the list of input devices.
	If no devices become available, reconfigure udev or disable AutoAddDevices.
[   208.307] (II) Module ABI versions:
[   208.307] 	X.Org ANSI C Emulation: 0.4
[   208.307] 	X.Org Video Driver: 25.2
[   208.307] 	X.Org XInput driver : 24.4
[   208.307] 	X.Org Server Extension : 10.0
[   208.310] (--) PCI:*(4@0:0:0) 102b:0534:1014:0405 rev 0, Mem @ 0x92000000/16777216, 0x917fc000/16384, 0x91800000/8388608, BIOS @ 0x????????/65536
[   208.311] (II) LoadModule: "glx"
[   208.339] (II) Loading /usr/local/lib/xorg/modules/extensions/libglx.so
[   208.407] (II) Module glx: vendor="X.Org Foundation"
[   208.407] 	compiled for 1.21.1.13, module version = 1.0.0
[   208.407] 	ABI class: X.Org Server Extension, version 10.0
[   208.407] (II) LoadModule: "mga"
[   208.408] (II) Loading /usr/local/lib/xorg/modules/drivers/mga_drv.so
[   208.423] (II) Module mga: vendor="X.Org Foundation"
[   208.423] 	compiled for 1.21.1.13, module version = 2.0.0
[   208.423] 	Module class: X.Org Video Driver
[   208.423] 	ABI class: X.Org Video Driver, version 25.2
[   208.423] (II) MGA: driver for Matrox chipsets: mga2064w, mga1064sg, mga2164w,
	mga2164w AGP, mgag100, mgag100 PCI, mgag200, mgag200 PCI,
	mgag200 SE A PCI, mgag200 SE B PCI, mgag200 EV Maxim,
	mgag200 ER SH7757, mgag200 eW Nuvoton, mgag200 eW3 Nuvoton,
	mgag200eH, mgag200eH3, mgag400, mgag550
[   208.424] (--) Using syscons driver with X support (version 2.0)
[   208.424] (--) using VT number 9

[   208.424] (EE) mga: The PCI device 0x534 at 04@00:00:0 has a kernel module claiming it.
[   208.424] (EE) mga: This driver cannot operate until it has been unloaded.
[   208.424] (EE) No devices detected.
[   208.424] (EE) 
Fatal server error:
[   208.424] (EE) no screens found(EE) 
[   208.424] (EE) 
Please consult the The X.Org Foundation support 
	 at http://wiki.x.org
 for help. 
[   208.424] (EE) Please also check the log file at "/var/log/Xorg.0.log" for additional information.
[   208.424] (EE) 
[   208.424] (EE) Server terminated with error (1). Closing log file.
#