Bug 235487

Summary: graphics/mesa-dri 18.3.2 plus xorg-server-1.18.4_11,1 results in segmentation fault
Product: Ports & Packages
Component: Individual Port(s)
Version: Latest
Hardware: Any
OS: Any
Status: Closed Overcome By Events
Severity: Affects Only Me
Priority: ---
Reporter: Philip Homburg <pch-freebsd-bugs-1>
Assignee: freebsd-x11 (Nobody) <x11>
CC: marko.cupac, martymac, rkoberman, zeising
Keywords: regression
Flags: bugzilla: maintainer-feedback? (x11)
Bug Depends on:
Bug Blocks: 233034
Attachments:
Description          Flags
Xorg.0.log           none
possible workaround  none

Description Philip Homburg 2019-02-04 12:00:28 UTC
Created attachment 201715
Xorg.0.log

On a Thinkpad X201 running 11.2-RELEASE-p8, starting X with xinit fails:

[   264.922] (II) GLX: Initialized DRI2 GL provider for screen 0
[   264.923] (EE) 
[   264.923] (EE) Backtrace:
[   264.924] (EE) 0: /usr/local/bin/X (OsInit+0x37a) [0x5ace1a]
[   264.925] (EE) 1: /lib/libthr.so.3 (_pthread_sigmask+0x544) [0x802607db4]
[   264.926] (EE) 2: /lib/libthr.so.3 (_pthread_getspecific+0xe12) [0x802607b82]
[   264.926] (EE) 3: ? (?+0xe12) [0x7fffffffffa5]
[   264.927] (EE) 4: /lib/libc.so.7 (memcpy+0x20) [0x80297d940]
[   264.928] (EE) 5: /usr/local/lib/dri/i965_dri.so (__driDriverGetExtensions_i965+0x2c3ac2) [0x807e78952]
[   264.929] (EE) 6: /usr/local/lib/dri/i965_dri.so (__driDriverGetExtensions_i915+0x64f16) [0x807953d46]
[   264.930] (EE) 7: /usr/local/lib/dri/i965_dri.so (__driDriverGetExtensions_i915+0x64094) [0x807952174]
[   264.930] (EE) 8: /usr/local/lib/dri/i965_dri.so (__driDriverGetExtensions_i965+0x3e81) [0x8078f9301]
[   264.931] (EE) 9: /usr/local/lib/dri/i965_dri.so (_init+0x17eac2) [0x80776dbf2]
[   264.932] (EE) 10: /usr/local/lib/dri/i965_dri.so (_init+0x17eb8e) [0x80776e0ae]
[   264.933] (EE) 11: /usr/local/lib/dri/i965_dri.so (__driDriverGetExtensions_i965+0x5896) [0x8078fc3d6]
[   264.933] (EE) 12: /usr/local/lib/dri/i965_dri.so (__driDriverGetExtensions_i965+0x46e4) [0x8078fa384]
[   264.934] (EE) 13: /usr/local/lib/dri/i965_dri.so (_init+0x1731f3) [0x807756c53]
[   264.935] (EE) 14: /usr/local/lib/dri/i965_dri.so (_init+0x17bff1) [0x807768861]
[   264.936] (EE) 15: /usr/local/lib/dri/i965_dri.so (_init+0x17a60e) [0x80776549e]
[   264.936] (EE) 16: /usr/local/lib/dri/i965_dri.so (_init+0x14ef29) [0x80770e789]
[   264.937] (EE) 17: /usr/local/lib/dri/i965_dri.so (_init+0x14f4ce) [0x80770f36e]
[   264.938] (EE) 18: /usr/local/lib/dri/i965_dri.so (_init+0x169f89) [0x807744889]
[   264.939] (EE) 19: /usr/local/lib/xorg/modules/libglamoregl.so (glamor_egl_create_argb8888_based_texture+0x182) [0x80692e5c2]
[   264.939] (EE) 20: /usr/local/lib/xorg/modules/libglamoregl.so (glamor_egl_create_textured_pixmap_from_gbm_bo+0xbf) [0x80692e92f]
[   264.940] (EE) 21: /usr/local/lib/xorg/modules/drivers/modesetting_drv.so (_init+0x3e67) [0x806316397]
[   264.941] (EE) 22: /usr/local/lib/xorg/modules/drivers/modesetting_drv.so (_init+0x3455) [0x806314f25]
[   264.942] (EE) 23: /usr/local/bin/X (xf86CrtcScreenInit+0x143) [0x4adc53]
[   264.943] (EE) 24: /usr/local/bin/X (remove_fs_handlers+0x446) [0x43b546]
[   264.943] (EE) 25: /usr/local/bin/X (_start+0x95) [0x425155]
[   264.944] (EE) 26: ? (?+0x95) [0x800836095]
[   264.944] (EE) 
[   264.944] (EE) Segmentation fault at address 0x0
[   264.944] (EE)
Comment 1 Jan Beich freebsd_committer freebsd_triage 2019-02-04 12:29:46 UTC
*** Bug 235488 has been marked as a duplicate of this bug. ***
Comment 2 Jan Beich freebsd_committer freebsd_triage 2019-02-04 12:39:59 UTC
mesa-dri was updated twice in short succession. Can you check whether the regression occurs after 18.2.8 -> 18.3.2 or after 18.1.9 -> 18.2.8?

(In reply to Philip Homburg from comment #0)
> [   263.582] Build Operating System: FreeBSD 11.2-RELEASE-p8 amd64 

Do you use i915kms from the base system or from graphics/drm-kmod? If the latter, is it the latest version, i.e., drm-fbsd11.2-kmod-4.9.g20181023?

> [   263.583] (--) PCI:*(0:0:2:0) 8086:0046:17aa:215a rev 2, Mem @ 0xf2000000/4194304, 0xd0000000/268435456, I/O @ 0x00001800/8, BIOS @ 0x????????/65536

0x0046 is Ironlake (Arrandale), 1 generation earlier than SandyBridge which was confirmed to work.
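
As an aside, the vendor and device IDs discussed above can be pulled out of the Xorg log mechanically. The following is only a sketch: the helper name pci_ids is made up for illustration, and the pattern assumes the "PCI:*(bus) vvvv:dddd:..." layout of the log line quoted above.

```sh
# pci_ids: hypothetical helper. Reads Xorg log lines on stdin and prints
# the PCI vendor and device IDs found on each "PCI:" line.
pci_ids() {
    sed -n 's/.*PCI:[^ ]* \([0-9a-f]\{4\}\):\([0-9a-f]\{4\}\):.*/vendor=\1 device=\2/p'
}

line='[   263.583] (--) PCI:*(0:0:2:0) 8086:0046:17aa:215a rev 2, Mem @ 0xf2000000/4194304'
printf '%s\n' "$line" | pci_ids
# prints: vendor=8086 device=0046
```

Vendor 8086 is Intel, and device 0046 is the Ironlake part mentioned above. On a live system the same helper could be fed from /var/log/Xorg.0.log.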
Comment 3 Philip Homburg 2019-02-04 13:00:34 UTC
I have drm-fbsd11.2-kmod-4.11g2018121, but dmesg says

info: [drm] Initialized i915 1.6.0 20080730 for drmn0 on minor 0

So I think it is the base system one that got loaded.

I'll see if I can downgrade mesa-dri.
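
The dmesg check above can be scripted. This is a minimal sketch, not an official tool: drm_init_date is a name invented here, and reading an old date stamp as "base system driver" is the same inference made in this comment, not something the script can prove.

```sh
# drm_init_date: hypothetical helper. Reads dmesg output on stdin and prints
# the date stamp from the "[drm] Initialized i915 <version> <date>" line.
drm_init_date() {
    sed -n 's/.*\[drm\] Initialized i915 [0-9.]* \([0-9]\{8\}\) .*/\1/p'
}

echo 'info: [drm] Initialized i915 1.6.0 20080730 for drmn0 on minor 0' | drm_init_date
# prints: 20080730
```

On a running system this would be used as "dmesg | drm_init_date".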
Comment 4 Philip Homburg 2019-02-04 18:07:17 UTC
It works with 18.1.9_3. I'll see if I can build 18.2.8
Comment 5 Philip Homburg 2019-02-07 10:29:41 UTC
mesa-dri 18.2.8 works as well.
Comment 6 Jan Beich freebsd_committer freebsd_triage 2019-02-07 11:02:09 UTC
Let's check if upstream has a fix then. Can you try mesa-dri 18.3.3 / 19.0.0-rc2?

(Ignore patch conflicts if *.rej files only affect PORTREVISION)

$ fetch -o /tmp/mesa-18.3.3.diff 'https://reviews.freebsd.org/D19099?download=true'
$ patch -Efsp0 -i /tmp/mesa-18.3.3.diff -d /usr/ports
$ make all deinstall install clean -C /usr/ports/graphics/mesa-dri

$ fetch -o /tmp/mesa-llvm7.diff 'https://bugs.freebsd.org/bugzilla/attachment.cgi?id=199665'
$ fetch -o /tmp/libdrm-2.4.97.diff 'https://bugs.freebsd.org/bugzilla/attachment.cgi?id=201643'
$ fetch -o /tmp/mesa-19.0.0.diff 'https://reviews.freebsd.org/D19100?download=true'
$ patch -Efsp1 -i /tmp/mesa-llvm7.diff -d /usr/ports
$ patch -Efsp1 -i /tmp/libdrm-2.4.97.diff -d /usr/ports
$ patch -Efsp0 -i /tmp/mesa-19.0.0.diff -d /usr/ports
$ make all deinstall install clean -C /usr/ports/graphics/libdrm
$ make all deinstall install clean -C /usr/ports/graphics/mesa-libs
$ make all deinstall install clean -C /usr/ports/graphics/mesa-dri
Comment 7 Philip Homburg 2019-02-07 12:18:01 UTC
Where do I find 18.3.3? Just change the version number?
Comment 8 Jan Beich freebsd_committer freebsd_triage 2019-02-07 12:42:45 UTC
(In reply to Philip Homburg from comment #7)
> Where do I find 18.3.3?

In review D19099 but I've posted instructions in comment 6.

> Just change the version number?

Only if you don't use a patch provided by someone. For patch-level updates bumping version and "make makesum" is often enough.
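
For the patch-level case, the "bump the version and run make makesum" step might look like the sketch below. It is illustrative only: bump_version is a helper invented here, DISTVERSION is an assumed variable name (check which variable the port's Makefile actually sets), and the demo runs against a throwaway file rather than a real port.

```sh
# bump_version FILE VAR NEW: rewrite the "VAR=<whitespace>old" line in FILE
# so that it carries NEW. Hypothetical helper; it writes through a temp file
# instead of using sed -i, whose syntax differs between BSD and GNU sed.
bump_version() {
    tmp="$1.tmp.$$"
    sed "s/^\($2=[[:space:]]*\).*/\1$3/" "$1" > "$tmp" && mv "$tmp" "$1"
}

# Demo on a scratch Makefile (not a real port).
demo=$(mktemp)
printf 'PORTNAME=\tdemo\nDISTVERSION=\t18.3.2\n' > "$demo"
bump_version "$demo" DISTVERSION 18.3.3
cat "$demo"

# In a real port directory the next step would be:
#   make makesum    # fetch the new distfile and regenerate distinfo
```

If a patch such as the one in comment 6 is available, applying it is still the safer route, since it may carry more than the version change.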
Comment 9 Philip Homburg 2019-02-08 10:13:05 UTC
19.0.0.rc2 fails the same way as 18.3.2
Comment 10 Martin Birgmeier 2019-02-17 08:41:54 UTC
I have been seeing crashes for a while which also seem to be related to 3D acceleration.

They are usually triggered by activities in Firefox, sometimes seemingly also by VirtualBox clients even though I run them mostly headless.

The setup is as follows:
- Thinkpad W520
- GF108 [Quadro 1000M] graphics card
- releng/12.0 with latest patches
- mesa-dri-18.3.2
- nvidia-driver-390.87_2
- xorg-server-1.18.4_11,1

Is this related although using the NVidia driver in this case?

I have another machine with similar crashes since about the same time, this time using Intel 82945GM (945GM GMCH). A third machine using an ATI Radeon HD 4290 does not seem to be affected. Both these machines use drm-fbsd12.0-kmod-4.16.g20190213, also with mesa-dri-18.3.2. The second machine does not even run any 3D clients, just a simple X server with oclock running on top of it (no graphical login manager).

-- Martin
Comment 11 Jan Beich freebsd_committer freebsd_triage 2019-02-18 00:05:01 UTC
Created attachment 202111
possible workaround

Philip, can you try the attached patch? Found from https://github.com/DragonFlyBSD/DeltaPorts/commit/630507ebacd5
If it doesn't help, bisecting upstream commits (time-consuming due to patch churn) may be required, as I don't have hardware old enough to be compatible with drm-legacy-kmod.

(In reply to Martin Birgmeier from comment #10)
> Is this related although using the NVidia driver in this case?

nvidia-driver has its own DDX/DRI/libGL implementation, so its stability depends only on xorg-server and kernel changes. mesa-dri is only used for <GL/internal/dri_interface.h>. As GLVND isn't in ports yet, the libGL implementation cannot be switched at runtime.

> I have another machine with similar crashes since about the same
> time, this time using Intel 82945GM (945GM GMCH).

Maybe related if using iGPU. 945GM is Calistoga, even older than Ironlake.
https://en.wikipedia.org/wiki/List_of_Intel_graphics_processing_units

> A third machine using an ATI Radeon HD 4290 does not seem to be affected.

Only Intel iGPUs on drm-legacy-kmod or in-base drm2 appear to be affected.

> The second machine does not even run any 3D clients, just a simple X
> server with oclock running on top of it (no graphical login
> manager).

mesa-dri is used for DRI2 and AIGLX. However, according to bug 235203 it only affects modesetting(4x).
Comment 12 Jan Beich freebsd_committer freebsd_triage 2019-02-19 05:14:17 UTC
Comment on attachment 202111
possible workaround

Never mind. This patch is a no-op on gen < 8 (Haswell or older).
Comment 13 Niclas Zeising freebsd_committer freebsd_triage 2019-02-19 08:54:13 UTC
(In reply to Martin Birgmeier from comment #10)

Hi!
It looks like this is a separate issue.  Can you create a separate PR for it, to avoid the confusion?
Thanks!
Comment 14 Marko Cupać 2019-03-28 10:16:12 UTC
X just crashed on me; I don't know if it's related to this PR. I'm using drm-legacy-kmod-g20190213 on 12.0-RELEASE-p3. I'm loading the kernel module with kld_list="/boot/modules/i915kms.ko" in rc.conf.

pciconf -lv lists my display adapter as Xeon E3-1200 v2/3rd Gen Core processor Graphics Controller

Here's Xorg.0.log:

[171900.678] (EE) 
[171900.679] (EE) Backtrace:
[171900.681] (EE) 0: /usr/local/lib/xorg/modules/input/mouse_drv.so (?+0x32ad000) [0x8036ada20]
[171900.684] (EE) 1: /lib/libthr.so.3 (pthread_sigmask+0x544) [0x800aec924]
[171900.687] (EE) 2: /lib/libthr.so.3 (pthread_getspecific+0xe12) [0x800aec6f2]
[171900.688] (EE) 3: ? (?+0xe12) [0x7fffffffffa5]
[171900.690] (EE) 4: /usr/local/lib/dri/i965_dri.so (?+0xe12) [0x80274ee22]
[171900.692] (EE) 5: /usr/local/lib/dri/i965_dri.so (?+0xe12) [0x80274e852]
[171900.694] (EE) 6: /usr/local/lib/dri/i965_dri.so (?+0xe12) [0x8024887f2]
[171900.696] (EE) 7: /usr/local/lib/dri/i965_dri.so (?+0xe12) [0x802487aa2]
[171900.698] (EE) 8: /usr/local/lib/dri/i965_dri.so (?+0xe12) [0x80248ee42]
[171900.699] (EE) 9: /usr/local/lib/dri/i965_dri.so (?+0xe12) [0x802188392]
[171900.701] (EE) 10: /usr/local/lib/dri/i965_dri.so (?+0xe12) [0x802188782]
[171900.702] (EE) 11: /usr/local/lib/dri/i965_dri.so (?+0xe12) [0x802490502]
[171900.703] (EE) 12: /usr/local/lib/dri/i965_dri.so (?+0xe12) [0x80248f7c2]
[171900.705] (EE) 13: /usr/local/lib/dri/i965_dri.so (?+0xe12) [0x80217a862]
[171900.706] (EE) 14: /usr/local/lib/dri/i965_dri.so (?+0xe12) [0x802173ee2]
[171900.708] (EE) 15: /usr/local/lib/dri/i965_dri.so (?+0xe12) [0x8021740f2]
[171900.709] (EE) 16: /usr/local/lib/xorg/modules/libglamoregl.so (?+0xe12) [0x801af7662]
[171900.711] (EE) 17: /usr/local/lib/xorg/modules/libglamoregl.so (?+0xe12) [0x801af79d2]
[171900.712] (EE) 18: /usr/local/lib/xorg/modules/libglamoregl.so (?+0xe12) [0x801aebad2]
[171900.714] (EE) 19: /usr/local/bin/X (?+0xe12) [0x37ad12]
[171900.715] (EE) 20: /usr/local/bin/X (?+0xe12) [0x330ed2]
[171900.717] (EE) 21: /usr/local/bin/X (?+0xe12) [0x286692]
[171900.718] (EE) 22: /usr/local/bin/X (?+0xe12) [0x28ff42]
[171900.719] (EE) 23: /usr/local/bin/X (?+0xe12) [0x279e12]
[171900.721] (EE) 24: ? (?+0xe12) [0x80043fe12]
[171900.721] (EE) 
[171900.721] (EE) Segmentation fault at address 0x39030
[171900.721] (EE) 
Fatal server error:
[171900.721] (EE) Caught signal 11 (Segmentation fault). Server aborting
[171900.721] (EE) 
[171900.721] (EE) 
Please consult the The X.Org Foundation support 
	 at http://wiki.x.org
 for help. 
[171900.721] (EE) Please also check the log file at "/var/log/Xorg.0.log" for additional information.
[171900.721] (EE) 
[171900.721] (II) AIGLX: Suspending AIGLX clients for VT switch
[171900.721] (EE) Server terminated with error (1). Closing log file.
Comment 15 Niclas Zeising freebsd_committer freebsd_triage 2019-07-29 12:44:29 UTC
Is this still an issue, even with updated graphics drivers?
Comment 16 Philip Homburg 2019-07-29 16:32:22 UTC
I just tried mesa-dri 18.3.2_3 on 11.2-RELEASE-p12 and that failed.
Comment 17 Marko Cupać 2019-07-29 18:27:39 UTC
(In reply to Niclas Zeising from comment #15)

I haven't experienced a crash in a long time.

12.0-RELEASE-p6 amd64
drm-legacy-kmod-g20190523
mesa-dri-18.3.2_2
Comment 18 Niclas Zeising freebsd_committer freebsd_triage 2019-07-29 20:21:16 UTC
(In reply to Philip Homburg from comment #16)
> I just tried mesa-dri 18.3.2_3 on 11.2-RELEASE-p12 and that failed.

Which graphics driver are you using: the one in the base system, or one from drm-kmod?  For the latter to work, you need to add kld_list="/boot/modules/i915kms.ko" to /etc/rc.conf.  Can you try using that driver instead? Alternatively, is there a chance you can update to FreeBSD 12.0?

Thank you!
Comment 19 Philip Homburg 2019-07-29 20:48:40 UTC
(In reply to Niclas Zeising from comment #18)

12.0 breaks IPv6 for me, so that's not an option.

Loading /boot/modules/i915kms.ko (from drm-fbsd11.2-kmod-4.11g20190424) gives a kernel panic:

Jul 29 22:39:20 brainpad kernel: IOPL = 0
Jul 29 22:39:20 brainpad kernel: current process                = 1100 (kldload)
Jul 29 22:39:20 brainpad kernel: trap number            = 12
Jul 29 22:39:20 brainpad kernel: panic: page fault
Jul 29 22:39:20 brainpad kernel: cpuid = 2
Jul 29 22:39:20 brainpad kernel: KDB: stack backtrace:
Jul 29 22:39:20 brainpad kernel: #0 0xffffffff80b3d5b7 at kdb_backtrace+0x67
Jul 29 22:39:20 brainpad kernel: #1 0xffffffff80af6b57 at vpanic+0x177
Jul 29 22:39:20 brainpad kernel: #2 0xffffffff80af69d3 at panic+0x43
Jul 29 22:39:20 brainpad kernel: #3 0xffffffff80f7827f at trap_fatal+0x35f
Jul 29 22:39:20 brainpad kernel: #4 0xffffffff80f782d9 at trap_pfault+0x49
Jul 29 22:39:20 brainpad kernel: #5 0xffffffff80f77aa7 at trap+0x2c7
Jul 29 22:39:20 brainpad kernel: #6 0xffffffff80f5810c at calltrap+0x8
Jul 29 22:39:20 brainpad kernel: #7 0xffffffff8249c51e at i915_ggtt_probe_hw+0x3e
Jul 29 22:39:20 brainpad kernel: #8 0xffffffff82486c78 at i915_driver_load+0x11b8
Jul 29 22:39:20 brainpad kernel: #9 0xffffffff82600f45 at linux_pci_attach+0x405
Jul 29 22:39:20 brainpad kernel: #10 0xffffffff80b2fce8 at device_attach+0x3b8
Jul 29 22:39:20 brainpad kernel: #11 0xffffffff80b314d9 at bus_generic_driver_added+0x89
Jul 29 22:39:20 brainpad kernel: #12 0xffffffff80b2d9da at devclass_driver_added+0x7a
Jul 29 22:39:20 brainpad kernel: #13 0xffffffff80b2d945 at devclass_add_driver+0x135
Jul 29 22:39:20 brainpad kernel: #14 0xffffffff8260099b at _linux_pci_register_driver+0xcb
Jul 29 22:39:20 brainpad kernel: #15 0xffffffff80aca0cf at linker_load_module+0xb3f
Jul 29 22:39:20 brainpad kernel: #16 0xffffffff80acb8c1 at kern_kldload+0xc1
Jul 29 22:39:20 brainpad kernel: #17 0xffffffff80acb9db at sys_kldload+0x5b
Comment 20 Niclas Zeising freebsd_committer freebsd_triage 2019-07-29 20:55:17 UTC
(In reply to Philip Homburg from comment #19)
> (In reply to Niclas Zeising from comment #18)
> 
> 12.0 breaks IPv6 for me, so that's not an option.

In which way? Does it still break even with the latest set of errata applied?
> 
> Loading /boot/modules/i915kms.ko (from drm-fbsd11.2-kmod-4.11g20190424)
> gives a kernel panic:
> 

I believe this panic is fixed in FreeBSD 12, but I'm not sure.  Are you building a custom kernel?  If so, it might be worth trying to build drm-fbsd11.2-kmod from ports, rather than using packages.  Any chance you can give FreeBSD 11.3 a try?
Thanks!
Regards
Niclas
Comment 21 Philip Homburg 2019-07-29 21:00:15 UTC
(In reply to Niclas Zeising from comment #20)

I have a stock kernel, but I build ports myself using poudriere.

I'll try 11.3 soon, probably this week.
Comment 22 Niclas Zeising freebsd_committer freebsd_triage 2019-07-29 21:05:25 UTC
(In reply to Philip Homburg from comment #21)
> (In reply to Niclas Zeising from comment #20)
> 
> I have a stock kernel, but I build ports myself using poudriere.


Then you should be fine, insofar as the kernel module should match your running kernel (and sources).

> 
> I'll try 11.3 soon, probably this week.

Please do that, and report back.  It might be that this issue is solved there.
Comment 23 rkoberman 2019-07-29 22:24:45 UTC
(In reply to Philip Homburg from comment #19)
Would running 12-stable be out of the question? The problem was fixed in stable just a few days after 12.0 was released. I run stable on my laptop and IPv6 works fine. For the same reason I'm stuck on 11 on my server where I need to stick to releases.
Comment 24 Philip Homburg 2019-08-05 15:47:57 UTC
(In reply to Niclas Zeising from comment #22)
I see exactly the same behavior in 11.3
Comment 25 rkoberman 2019-08-05 18:47:43 UTC
FWIW, I am running mesa-dri-19.0.0_3 (never committed) and xorg-server-1.18.4_11,1 on my Thinkpad with no problems. Will mesa ever get updated to something not way past EOL? Actually, I think 19.0 is also EOL at this time, but I'm not sure.

Giving 19.0 or 19.1 a shot may be worthwhile. Jan has patches for 19.0, but they may no longer be available. There is an open ticket that references building 19.1 from the git sources, but I have not tried that.
Comment 26 Niclas Zeising freebsd_committer freebsd_triage 2020-04-15 19:54:04 UTC
Is this an issue with xorg-server 1.20?
Comment 27 Philip Homburg 2020-04-18 13:52:04 UTC
(In reply to Niclas Zeising from comment #26)

Same for xorg-server-1.20.8,1 and mesa-dri-18.3.2_10

Note that for the most part I have moved on to 12.1, so 11.3 is not that important to me anymore.
Comment 28 Niclas Zeising freebsd_committer freebsd_triage 2020-07-11 14:37:13 UTC
Closing this: it works with 12.1, and there are other graphics regressions on 11, such as no Qt packages by default.