Bug 250700 - drm-kmod i915kms binary package not working on 12.2-RELEASE
Summary: drm-kmod i915kms binary package not working on 12.2-RELEASE
Status: Open
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: amd64 Any
Importance: --- Affects Some People
Assignee: freebsd-x11 (Nobody)
Duplicates: 250678
Reported: 2020-10-28 18:59 UTC by R. Dash
Modified: 2020-12-31 02:29 UTC
CC List: 18 users

Attachments
Xorg.0.log in 12.2 Release with Intel HD 2000 graphics (5.44 KB, text/plain)
2020-10-29 06:16 UTC, mailto1979@rediffmail.com

Description R. Dash 2020-10-28 18:59:01 UTC
Xorg complains that it cannot find mesa when i915kms is installed using the binary package, but runs just fine when built locally out of ports.  Known bad on at least two systems.
Comment 1 Li-Wen Hsu freebsd_committer 2020-10-28 22:16:12 UTC
This is, unfortunately, a known issue.  The root cause is that all 12.x packages are currently built on 12.1, because it is the oldest supported version in the 12 branch.  In theory, those packages should work on 12.2 and later, but if a port uses internal KPIs/KBIs, its binary package may not be usable across minor versions.

The workaround is, as you said, to build from ports so that the module uses the 12.2 interfaces. A proper solution is still being worked on.
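
The condition described above can be sketched in a few lines of Python. This is only an illustration, not anything the package tooling actually does: the function names `parse_release` and `kmod_needs_rebuild` are hypothetical, and the rule "any difference in major.minor means rebuild" is a conservative assumption for ports that use internal kernel interfaces.

```python
import re
from typing import Tuple


def parse_release(release: str) -> Tuple[int, int]:
    """Split a string such as '12.2-RELEASE' or '12.1-RELEASE-p10' into (major, minor)."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised release string: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kmod_needs_rebuild(built_on: str, running: str) -> bool:
    """Return True when a kernel-module package built on one release
    is running on a different major.minor release (hypothetical rule)."""
    return parse_release(built_on) != parse_release(running)
```

For the case in this report, `kmod_needs_rebuild("12.1-RELEASE", "12.2-RELEASE")` flags the package, while a patch-level difference such as `-p1` on the same minor version does not.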
Comment 2 mailto1979@rediffmail.com 2020-10-29 01:20:40 UTC
I can confirm. I am using Intel integrated graphics, and it stopped working after the upgrade from 12.1 to 12.2-RELEASE.
[    31.518] (EE) open /dev/dri/card0: No such file or directory
Comment 3 Teran McKinney 2020-10-29 03:37:38 UTC
I had tested i915kms on one of the 12.2 pre-releases (I forget if beta or RC). I didn't have any issues with it. Mine was compiled (by me) on 12.1 with the latest 12.1 patches.

Will need to try again.

Does X fail to start, or is this a logged error message?
Comment 4 mailto1979@rediffmail.com 2020-10-29 06:13:37 UTC
(In reply to Teran McKinney from comment #3)
Yes. X fails to start. I have attached the last Xorg.0.log.
Comment 5 mailto1979@rediffmail.com 2020-10-29 06:16:34 UTC
Created attachment 219191 [details]
Xorg.0.log in 12.2 Release with Intel HD 2000 graphics

I did not add any line that loads the intel driver module myself, but this is what the log shows.
Comment 6 Niclas Zeising freebsd_committer 2020-10-29 08:39:40 UTC
As Li-Wen already stated, please build the driver locally from ports.
Comment 7 Masayoshi Fujimoto 2020-11-01 02:06:45 UTC
(In reply to mailto1979@rediffmail.com from comment #2)

Intel(R) HD Graphics 620 (Kaby Lake GT2) 
I got the same error.
Comment 8 Matías Pizarro 2020-11-01 10:42:06 UTC
When building locally make sure you have the 12.2 sources in /usr/src.

It might be an edge case but it caught me :D (I still had the 12.1 sources)
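
A rough way to catch this before building, sketched in Python. It assumes the source tree records its version as a `REVISION="X.Y"` line in `sys/conf/newvers.sh` (the caller reads that file and passes its text in); the function names are hypothetical and the sample text in the usage note is illustrative only.

```python
import re


def source_tree_revision(newvers_text: str) -> str:
    """Extract the X.Y revision from the text of sys/conf/newvers.sh.

    Assumes a line of the form REVISION="12.2" is present.
    """
    match = re.search(r'^REVISION="([^"]+)"', newvers_text, re.MULTILINE)
    if match is None:
        raise ValueError("REVISION line not found")
    return match.group(1)


def sources_match_kernel(newvers_text: str, running_release: str) -> bool:
    """Compare the source tree revision with the X.Y prefix of a release
    string such as '12.2-RELEASE-p1' (patch level is ignored)."""
    running_revision = running_release.split("-")[0]
    return source_tree_revision(newvers_text) == running_revision
```

With a tree whose file contains `REVISION="12.1"`, `sources_match_kernel(text, "12.2-RELEASE")` returns `False`, which is the situation described in this comment.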
Comment 9 Ryan Moeller freebsd_committer 2020-11-03 17:16:13 UTC
*** Bug 250678 has been marked as a duplicate of this bug. ***
Comment 10 stephan 2020-11-03 17:37:14 UTC
I understand that this problem will potentially occur each time there is an update from x.y to x.y+1 while packages are still built for x.y (because x.y is not yet EOL).

In my opinion, this issue will bite many users that are not so experienced with FreeBSD and will stand in the way of increasing desktop usage, so maybe we should find a generic solution beyond fixing it for 12.2?
Comment 11 Niclas Zeising freebsd_committer 2020-11-03 19:46:08 UTC
(In reply to stephan from comment #10)
Yes, that would be quite nice, but so far no one has stepped up to work on this.
Comment 12 Teran McKinney 2020-11-10 16:28:39 UTC
I recompiled drm-kmod (and the fb12.0 one). kldload i915kms works fine and startx is fine, however anything OpenGL does not work.

dmesg shows some "Hangcheck timer elapsed... GPU hung" errors.

I can run OpenGL stuff inside Xephyr, though, which is what I'd expect.
Comment 13 P Kern 2020-12-12 23:25:17 UTC
(In reply to Niclas Zeising from comment #11)
Just spent 2 days bashing my head on bricks until I stumbled across this thread.
The rebuild-the-port workaround did the trick for a macmini5,1. Thanks!
What sort of "stepping up" work would be needed to save someone else from having to flail in the wilderness until the port is in synch?
Comment 14 rkoberman 2020-12-13 07:48:26 UTC
(In reply to P Kern from comment #13)
We need a mechanism to tag ports that build kmods and have a poudriere build of them for every minor release of FreeBSD instead of just the oldest. This has bitten me over the years for several ports, but virtualbox-ose and drm-*-kmod ports are the ones that have hit repeatedly.
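
As a sketch of what such a mechanism might compute (not how poudriere or the package cluster is actually configured), the Python below expands a tagged port list into build jobs. The `builds_kmod` flag, the `plan_builds` name, and the example port list are all hypothetical.

```python
from typing import Dict, List, Tuple


def plan_builds(ports: Dict[str, bool], releases: List[str]) -> List[Tuple[str, str]]:
    """Return (port origin, release) build jobs.

    ports maps a port origin to a hypothetical builds_kmod tag;
    releases lists supported minor releases, oldest first.
    """
    oldest = releases[0]
    jobs = []
    for origin, builds_kmod in ports.items():
        # Kernel-module ports are built once per minor release;
        # everything else is built only on the oldest supported release.
        targets = releases if builds_kmod else [oldest]
        for release in targets:
            jobs.append((origin, release))
    return jobs
```

Called with `{"graphics/drm-fbsd12.0-kmod": True, "editors/vim": False}` and `["12.1", "12.2"]`, this yields two jobs for the kmod port and one for the other port.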
Comment 15 Bruce Lilly 2020-12-13 16:31:29 UTC
First, other things are affected: even after rebuilding drm-fbsd12.0-kmod, sddm dumps core and gdm just doesn't work.  Rebuilding both (and their many prerequisites) from source appears to work.  There are other programs that appear to have problems (e.g. a couple from recent logs):
kernel: pid 47952 (mate-terminal), jid 0, uid 1000: exited on signal 11 (core dumped)
kernel: pid 51154 (conftest), jid 0, uid 0: exited on signal 11 (core dumped)
It appears that I'll be dealing with update-related bugs for quite some time.

Second, this is an epic failure of release engineering: [as of this writing] neither the Release notes (https://www.freebsd.org/releases/12.2R/relnotes.html) nor Hardware notes (https://www.freebsd.org/releases/12.2R/hardware.html) mention or even hint at this set of problems.  Per contra, the Release notes state "[amd64,i386] Binary upgrades between RELEASE versions (and snapshots of the various security branches) are supported using the freebsd-update(8) utility."

In fact, update from 12.1 via freebsd-update didn't work.  `freebsd-update rollback` also didn't work, resulting in "ld-elf.so.1: /lib/libc.so.7: Unsupported relocation type 37 in non-PLT relocations".  System was left utterly unusable (couldn't even log in).

I then had to install from scratch, and of course installation of 12.2-RELEASE has the same problem with non-working display (even the console is unsatisfactory, because default 800x600 VESA resolution doesn't work well on a 1366x768 laptop display).  N.B. without another working OS or device, that would have been a complete disaster: no X essentially means no browser and no practical way to even search for a solution.

When it's working, FreeBSD is almost as usable as a desktop OS as a systemd{ebacle,isaster,estroyer}-less Linux distribution (e.g. Devuan).  "Almost" because of lack of some things (gparted, for example) and non-POSIX utilities (e.g. ps).  But as long as disasters such as the above utter failure of update/rollback/installation exist, I cannot in good conscience recommend FreeBSD to others.  So it's also an epic failure on the advocacy front.
Comment 16 Niclas Zeising freebsd_committer 2020-12-13 17:10:05 UTC
(In reply to Bruce Lilly from comment #15)

Hi!
The drm-fbsd12.0-kmod issue is unfortunate, but it is a known issue.  Ranting about it does not help much; what helps is supporting development.

That said, it looks like something else is hosed on your system apart from the issue with drm-fbsd12.0-kmod.  I have had no problem upgrading both test and production systems to 12.2 (I have compiled drm-fbsd12.0-kmod locally though), and an issue with ld-elf.so.1 would have been all over the mailing lists.  I also don't think that downgrading a system with freebsd-update is supported.  Using ZFS snapshots and rolling back those works though (as long as you don't do zpool upgrade), or using boot environments.

The coredump from conftest usually comes from autotools (configure scripts) probing the system, so that should be quite alright, and even expected if you are building ports on the system.

ps not being POSIX is for historical reasons.  While it might be unfortunate, it's just the way it is.
Comment 17 Bruce Lilly 2020-12-13 18:28:36 UTC
(In reply to Niclas Zeising from comment #16)
Known issues affecting a release ought to be documented clearly in the relevant release notes and hardware notes, no?

Pointing out the failure in documenting supposedly-known issues isn't a rant; it's pointing out bugs (in documentation and procedure) that ought to be corrected; release notes is exactly where "known issues" are supposed to be documented.

A statement in the release notes that freebsd-update is expected to work for amd64[,i386] binary upgrades of kernel (presumably including kernel modules) and unmodified userland utilities, if/when in fact it is known *not* to work is at best misleading and just plain wrong.  Instead, the release notes should clearly state (with at least as much emphasis as the final "Important" "This change does not affect[...]" note) that binary updates in fact won't work, at least for systems where kldstat lists "i915kms.ko" (note that I've given a hint about how to identify such systems).  Additionally, the release notes could provide some details about possible work-arounds.  Ideally, there wouldn't be a release until it was known to work (either a point release that is truly binary compatible, or a version bump with compatible binaries).
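
The kldstat hint can be turned into a small check. The Python sketch below assumes only that the module file name is the last whitespace-separated field on a line of kldstat output; the function name `has_module` is hypothetical, and the caller is expected to supply the output text.

```python
def has_module(kldstat_output: str, module: str = "i915kms.ko") -> bool:
    """Return True if any line of the given kldstat output names the module.

    Assumes the module file name is the last field on its line.
    """
    for line in kldstat_output.splitlines():
        fields = line.split()
        if fields and fields[-1] == module:
            return True
    return False
```

A system for which this returns `True` is one where, per the discussion above, the binary package of the DRM module may need a local rebuild after the upgrade.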

"Supporting development" is tied to advocacy, and undocumented or poorly-documented major "gotchas" are an impediment to advocacy (among other things). I'm not the first or only person to point this out in comments here.

The "Unsupported relocation type" resulted after attempting to rollback the failed update.  Had the update worked, I would not have had to rollback. Major incompatibilities (such that binary upgrades and rollbacks in fact do not work) make what is supposedly a point release into what is effectively a major version change; distributed binaries aren't compatible, and there's no path back short of reinstallation.  Note that the release notes specifically recommend freebsd-update, and freebsd-update(8) specifically addresses rolling back binary updates.

ZFS boot environments aren't applicable to UFS/FFS installations, and ZFS isn't typically used on laptops (or even most desktops); ZFS is targeted at large server applications.  Moreover, it isn't a solution to the upgrade problem; it's similar to having multiple installations of different versions, e.g. on different disks or partitions [and in fact I have other installations of 12.1, e.g. on a USB stick].  Abandoning an unusable ZFS boot environment is slightly easier (in the moment, not taking into account amortized administrative overhead of ZFS) than wiping a partition of a failed installation, but a non-working installation remains non-working, whether it's on ZFS or UFS.
Comment 18 Nicholas Edward Named 2020-12-30 21:25:27 UTC
(In reply to Niclas Zeising from comment #16)
> The drm-fbsd12.0-kmod issue is unfortunate, but it is a known issue.

It wasn't known to me, a humble user. I read the Release Notes, Errata, Hardware Notes, UPDATING and the Handbook before I upgraded, but I was still caught off guard and left without a working system until I stumbled across a Forum post, which solved the problem.

If this is not considered a software bug, it must surely be considered a documentation bug.

Why not add a line or two about the problem to one, two or all of the official documents listed above?

While we're at it, why not add a list of ALL kernel modules which must be rebuilt from ports after a point-release upgrade?
Comment 19 rob2g2 2020-12-30 23:47:20 UTC
This bug also hit me on my oldish Lenovo X220. Fully agree with Nicholas Edward Named in comment 18 - this is a documentation bug.
Comment 20 Denis Polygalov 2020-12-31 02:29:56 UTC
Changes to the documentation are the minimum of what should be done, in my opinion. Also, hoping that the number of unhappy users has now reached critical mass, I dare to mention my proposal for mitigating this and many other potentially related problems, described here 10 months ago:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=241787#c9

Happy New Year everybody!
Denis.