Bug 217664 - [DRI3] OpenGL applications crash in brw_workaround_depthstencil_alignment()
Summary: [DRI3] OpenGL applications crash in brw_workaround_depthstencil_alignment()
Status: Closed FIXED
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: amd64 Any
Importance: --- Affects Some People
Assignee: freebsd-x11 (Nobody)
URL:
Keywords: crash, i915
Depends on:
Blocks:
 
Reported: 2017-03-09 15:36 UTC by Sergei Akhmatdinov
Modified: 2017-04-11 14:18 UTC

See Also:
jbeich: maintainer-feedback+


Description Sergei Akhmatdinov 2017-03-09 15:36:31 UTC
Current port of PPSSPP fails to build properly on several of my machines
running STABLE.

Attempting to run PPSSPP results in a segfault.

Here is the trace from the terminal:

---------------------
OpenGL 2.0 or higher.
41:07:129 Core/Config.cpp:957 I[LOAD]: Loading controller config: /home/sakhmatd/.config/ppsspp/PSP/SYSTEM/controls.ini
41:07:129 Core/Config.cpp:1313 E[LOAD]: Failed to read /home/sakhmatd/.config/ppsspp/PSP/SYSTEM/controls.ini. Setting controller config to default.
Pixels: 960 x 544
Virtual pixels: 960 x 544
I: gpu_features.cpp:126: GPU Vendor : Intel Open Source Technology Center ; renderer: Mesa DRI Intel(R) Ivybridge Mobile version str: 3.0 Mesa 13.0.5 ; GLSL version str: 1.30
[1] 5907 segmentation fault (core dumped) ppsspp
---------------------

Dumped core can be found here:
http://rgho.st/private/7P67Cds99/e84024231c1b30de32d0d915f0d0c75b

GDB Backtrace:

---------------------
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 80da16000 (LWP 100658/ppsspp)]
0x000000080f7dcd4e in __driDriverGetExtensions_i915 () from /usr/local/lib/dri/i965_dri.so
Current language: auto; currently minimal
(gdb) bt
#0 0x000000080f7dcd4e in __driDriverGetExtensions_i915 () from /usr/local/lib/dri/i965_dri.so
#1 0x000000080f7c8a1d in __driDriverGetExtensions_i915 () from /usr/local/lib/dri/i965_dri.so
#2 0x00000000006c8281 in ?? ()
#3 0x00000000006c6932 in ?? ()
#4 0x00000000004118fd in ?? ()
#5 0x000000000069aa85 in ?? ()
#6 0x000000000040d12f in ?? ()
#7 0x0000000800c92000 in ?? ()
#8 0x0000000000000000 in ?? ()
---------------------

I asked upstream and the author suspects that PPSSPP builds without symbols.

I have tried the binary package from the FreeBSD repos, building it with Synth,
building it while marked unsafe, etc.

This affects both of my FreeBSD machines, so I imagine this would affect somebody else too.
Comment 1 Jan Beich freebsd_committer freebsd_triage 2017-03-09 16:32:50 UTC
(In reply to Sergei Akhmatdinov from comment #0)
> Current port of PPSSPP fails to build properly on several of my machines
> running STABLE.

Can you show build error and make.conf? Which /stable branch, what revision, what architecture?

> I asked upstream and the author suspects that PPSSPP builds without symbols.

Correct, see https://www.freebsd.org/doc/en/books/porters-handbook/install.html#install-strip

> I: gpu_features.cpp:126: GPU Vendor : Intel Open Source Technology Center ; renderer: Mesa DRI Intel(R) Ivybridge Mobile version str: 3.0 Mesa 13.0.5 ; GLSL version str: 1.30

Are you using i915.ko or i915kms.ko? If the latter, try the kernel from https://github.com/FreeBSDDesktop/freebsd-base-graphics/tree/drm-next but ignore world.

> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 80da16000 (LWP 100658/ppsspp)]
> 0x000000080f7dcd4e in __driDriverGetExtensions_i915 () from
> /usr/local/lib/dri/i965_dri.so
> Current language: auto; currently minimal
> (gdb) bt
> #0 0x000000080f7dcd4e in __driDriverGetExtensions_i915 () from
> /usr/local/lib/dri/i965_dri.so
> #1 0x000000080f7c8a1d in __driDriverGetExtensions_i915 () from
> /usr/local/lib/dri/i965_dri.so

Looks like either a DRM or a Mesa bug. Try adding WITH_DEBUG=1 and NO_CPU_CFLAGS=1 to make.conf, then rebuild at least libGL, dri and ppsspp. If you still see ?? instead of symbols, check the binary with "ldd -a" and "pkg which", then repeat for more dependencies.
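
The make.conf step could be scripted roughly as follows. This is only a sketch: the helper name add_make_conf is made up for illustration, and the rebuild commands in the comments assume a ports tree in /usr/ports and are one common way to force a rebuild, not the only one.

```shell
#!/bin/sh
# Hypothetical helper (not part of the ports tree): append a line to a
# make.conf-style file unless that exact line is already present.
add_make_conf() {
    conf=$1
    line=$2
    if ! grep -qxF -- "$line" "$conf" 2>/dev/null; then
        printf '%s\n' "$line" >> "$conf"
    fi
}

# Intended use, as root (assumes the ports tree lives in /usr/ports):
#   add_make_conf /etc/make.conf WITH_DEBUG=1
#   add_make_conf /etc/make.conf NO_CPU_CFLAGS=1
#   for p in graphics/libGL graphics/dri emulators/ppsspp; do
#       make -C /usr/ports/$p clean deinstall reinstall
#   done
# If frames still show "??", look for the library that lacks symbols:
#   ldd -a "$(command -v ppsspp)"
#   pkg which /usr/local/lib/dri/i965_dri.so
```

Calling the helper twice with the same setting leaves a single copy in the file, so it is safe to re-run.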

Does glxgears work? Can you try attachment 180618 [details]? Can you try SNA in xorg.conf per intel(4x)?
Comment 2 Sergei Akhmatdinov 2017-03-09 17:43:18 UTC
(In reply to Jan Beich (mail not working) from comment #1)
> Can you show build error and make.conf? Which /stable branch, what revision,
> what architecture?
I don't get a build error, the package builds fine, but doesn't work properly.
I don't think anything in my make.conf might interfere, but here is all I have:
WRKDIRPREFIX=/tmp
DEFAULT_VERSIONS+=ssl=openssl
DWM_CONF=/home/sakhmatd/.config/dwm/config.h

Arch is AMD64. I am not sure how to look up the exact revision/branch that I am using, but I updated world/kernel two days ago.

> Are you using i915.ko or i915kms.ko? If the latter try the kernel from 
> https://github.com/FreeBSDDesktop/freebsd-base-graphics/tree/drm-next but 
> ignore world
I am using the former.

> Looks like either DRM or Mesa bug. Try adding WITH_DEBUG=1 and
> NO_CPU_CFLAGS=1 to make.conf then rebuild at least libGL, dri and ppsspp. If
> you still see ?? instead of symbols check the binary with "ldd -a" and "pkg
> which" then repeat for more dependencies.

Here is the new backtrace.
-------------------------
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 80e016000 (LWP 100778/ppsspp)]
get_stencil_miptree (irb=0x80e03fa00) at brw_misc_state.c:229
229     brw_misc_state.c: No such file or directory.
        in brw_misc_state.c
#0  get_stencil_miptree (irb=0x80e03fa00) at brw_misc_state.c:229
#1  0x000000080ffced4f in brw_workaround_depthstencil_alignment (brw=0x80f80d6a8, clear_mask=50)
    at brw_misc_state.c:245
#2  0x000000080ffa6e93 in brw_clear (ctx=0x80f80d6a8, mask=50) at brw_clear.c:271
#3  0x000000080faa4913 in _mesa_Clear (mask=17664) at clear.c:224
#4  0x0000000804a7bfcc in glClear (mask=17664) at glapi_mapi_tmp.h:3084
#5  0x0000000000b35f82 in Thin3DGLContext::Clear (this=0x80e03fb40, mask=15, colorval=4278190080,
    depthVal=0, stencilVal=0)
    at /tmp/usr/ports/emulators/ppsspp/work/ppsspp-1.3/ext/native/thin3d/thin3d_gl.cpp:954
#6  0x0000000000b32f00 in Thin3DContext::Begin (this=0x80e03fb40, clear=true, colorval=4278190080,
    depthVal=0, stencilVal=0) at thin3d.h:379
#7  0x0000000000b4ad08 in UIScreen::preRender (this=0x80e390940)
    at /tmp/usr/ports/emulators/ppsspp/work/ppsspp-1.3/ext/native/ui/ui_screen.cpp:64
#8  0x0000000000b45426 in ScreenManager::render (this=0x80e395030)
    at /tmp/usr/ports/emulators/ppsspp/work/ppsspp-1.3/ext/native/ui/screen.cpp:123
#9  0x000000000041c13c in NativeRender (graphicsContext=0x80e044b88)
    at /tmp/usr/ports/emulators/ppsspp/work/ppsspp-1.3/UI/NativeApp.cpp:739
#10 0x00000000005908df in UpdateRunLoop (input_state=0x1812b08)
    at /tmp/usr/ports/emulators/ppsspp/work/ppsspp-1.3/Core/Core.cpp:175
#11 0x0000000000ab4735 in main (argc=1, argv=0x7fffffffea18)
    at /tmp/usr/ports/emulators/ppsspp/work/ppsspp-1.3/ext/native/base/PCMain.cpp:879
-------------------------

> Does glxgears work? Can you try attachment 180618 [details]? Can you try SNA
> in xorg.conf per intel(4x)?
glxgears didn't work, also segfaulted. How do I go about applying the patch or applying SNA?

Just in case, here is a backtrace for glxgears:
-------------------------
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 805416000 (LWP 100169/glxgears)]
get_stencil_miptree (irb=0x8054650e0) at brw_misc_state.c:229
229     brw_misc_state.c: No such file or directory.
        in brw_misc_state.c
Current language:  auto; currently minimal
(gdb) bt
#0  get_stencil_miptree (irb=0x8054650e0) at brw_misc_state.c:229
#1  0x0000000805bced4f in brw_workaround_depthstencil_alignment (brw=0x806c0d3e8, clear_mask=18)
    at brw_misc_state.c:245
#2  0x0000000805ba6e93 in brw_clear (ctx=0x806c0d3e8, mask=18) at brw_clear.c:271
#3  0x00000008056a4913 in _mesa_Clear (mask=16640) at clear.c:224
#4  0x0000000800ddefcc in glClear (mask=16640) at glapi_mapi_tmp.h:3084
#5  0x0000000000404e72 in draw () at glxgears.c:254
#6  0x0000000000404e4c in draw_gears () at glxgears.c:316
#7  0x0000000000404b87 in draw_frame (dpy=0x805420000, win=25165826) at glxgears.c:341
#8  0x0000000000402d07 in event_loop (dpy=0x805420000, win=25165826) at glxgears.c:703
#9  0x0000000000402114 in main (argc=1, argv=0x7fffffffe9f0) at glxgears.c:798
-------------------------
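
For readers of the two backtraces, the mask values passed to glClear() (17664 for ppsspp, 16640 for glxgears) can be decoded with the standard OpenGL bit values GL_DEPTH_BUFFER_BIT (0x0100), GL_STENCIL_BUFFER_BIT (0x0400) and GL_COLOR_BUFFER_BIT (0x4000). A minimal sketch follows; the function name decode_clear_mask is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print which standard OpenGL buffer bits are set
# in a glClear() mask given as a decimal or 0x-prefixed number.
decode_clear_mask() {
    mask=$1
    out=""
    if [ $(( mask & 0x4000 )) -ne 0 ]; then out="$out COLOR"; fi    # GL_COLOR_BUFFER_BIT
    if [ $(( mask & 0x0100 )) -ne 0 ]; then out="$out DEPTH"; fi    # GL_DEPTH_BUFFER_BIT
    if [ $(( mask & 0x0400 )) -ne 0 ]; then out="$out STENCIL"; fi  # GL_STENCIL_BUFFER_BIT
    echo "${out# }"
}

decode_clear_mask 17664   # ppsspp:   COLOR DEPTH STENCIL
decode_clear_mask 16640   # glxgears: COLOR DEPTH
```

Both are ordinary clear requests, which fits with the crash happening inside the driver rather than in the applications.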
Comment 3 Jan Beich freebsd_committer freebsd_triage 2017-03-09 18:13:55 UTC
(In reply to Sergei Akhmatdinov from comment #2)
> > Can you try attachment 180618 [details]? Can you try SNA in xorg.conf per intel(4x)?
>
> How do I go about applying the patch

$ fetch -o /tmp/Mesa-17.0.1.diff 'https://bugs.freebsd.org/bugzilla/attachment.cgi?id=180618'
$ cd /usr/ports
$ svn patch /tmp/Mesa-17.0.1.diff || patch -Efsp0 -i /tmp/Mesa-17.0.1.diff

After that upgrade packages installed from the following ports

  graphics/dri
  graphics/gbm
  graphics/libEGL
  graphics/libglapi
  graphics/libglesv2
  graphics/libosmesa

> or applying SNA?

If you don't have /etc/X11/xorg.conf and /usr/local/etc/X11/xorg.conf then create the latter and put something like the following. Otherwise, adjust AccelMethod in the existing Device section but make sure Driver is "intel".

  Section "Device"
	  Identifier "integrated_card"
	  Driver     "intel"
	  Option     "AccelMethod" "sna"
  EndSection

>
> Just in case, here is a backtrace for glxgears:
> -------------------------
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 805416000 (LWP 100169/glxgears)]
> get_stencil_miptree (irb=0x8054650e0) at brw_misc_state.c:229
> 229     brw_misc_state.c: No such file or directory.
>         in brw_misc_state.c
> Current language:  auto; currently minimal
> (gdb) bt
> #0  get_stencil_miptree (irb=0x8054650e0) at brw_misc_state.c:229
> #1  0x0000000805bced4f in brw_workaround_depthstencil_alignment (brw=0x806c0d3e8, clear_mask=18)
>     at brw_misc_state.c:245
> #2  0x0000000805ba6e93 in brw_clear (ctx=0x806c0d3e8, mask=18) at brw_clear.c:271
> #3  0x00000008056a4913 in _mesa_Clear (mask=16640) at clear.c:224
> #4  0x0000000800ddefcc in glClear (mask=16640) at glapi_mapi_tmp.h:3084

I can reproduce the crash on drm-next (i915kms, Skylake GT2, SNA) with Mesa 17.0.1. DRI3 appears to be broken, which affects modesetting because it cannot work without DRI3.

For now make sure to use "intel" in xorg.conf and remove Option "DRI" "3" or set it to "2".
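
Put together, the workaround Device section could be generated like this. It is a sketch only: the function name print_intel_device is made up, the Identifier is the arbitrary one used earlier in this comment, and the DRI level defaults to 2.

```shell
#!/bin/sh
# Hypothetical helper: print a Device section for the "intel" driver
# with SNA acceleration and an explicit DRI level (default 2).
print_intel_device() {
    dri=${1:-2}
    cat <<EOF
Section "Device"
    Identifier "integrated_card"
    Driver     "intel"
    Option     "AccelMethod" "sna"
    Option     "DRI" "$dri"
EndSection
EOF
}

# Review the output first:
print_intel_device 2
# Then, as root and only if no xorg.conf exists yet:
#   print_intel_device 2 > /usr/local/etc/X11/xorg.conf
```

X has to be restarted for the change to take effect.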
Comment 4 Jan Beich freebsd_committer freebsd_triage 2017-03-09 18:16:20 UTC
For reference, "intel" driver is from x11-drivers/xf86-video-intel while "modesetting" is part of x11-servers/xorg-server.
Comment 5 Sergei Akhmatdinov 2017-03-09 18:37:48 UTC
(In reply to Jan Beich (mail not working) from comment #3)
>If you don't have /etc/X11/xorg.conf and /usr/local/etc/X11/xorg.conf then
>create the latter and put something like the following. Otherwise, adjust
>AccelMethod in existing Device but make sure Driver is "intel".
>
>  Section "Device"
>	  Identifier "integrated_card"
>	  Driver     "intel"
>	  Option     "AccelMethod" "sna"
>  EndSection

Cheers. I created the xorg.conf and added Option "DRI" "2" to it just in case
and it fixed it. glxgears and ppsspp don't crash anymore.
Comment 6 Jan Beich freebsd_committer freebsd_triage 2017-03-09 20:00:07 UTC
While chasing the crash with ASan, I noticed that it disappeared after rebuilding x11/libxshmfence. Perhaps Clang miscompiled it at some point, but due to infrequent updates the package was never rebuilt. Can you confirm?
Comment 7 Matthew Rezny freebsd_committer freebsd_triage 2017-03-09 20:52:04 UTC
(In reply to Sergei Akhmatdinov from comment #2)

>> Are you using i915.ko or i915kms.ko? If the latter try the kernel from 
>> https://github.com/FreeBSDDesktop/freebsd-base-graphics/tree/drm-next but 
>> ignore world
>>
>I am using the former.

You should be using the latter. The former only supports DRI1 and UMS. The latter is required for DRI2 and KMS. While there is still UMS support in the intel DDX, it is only intended for use on very old hardware for which there is no KMS support, i.e. i8xx chipsets.

If you are manually loading i915.ko, cease doing so. Intel DDX will try to load i915kms.ko first and only tries i915.ko if the first fails (or you are not on the current version as there was a bug that would cause it to try loading both).
Comment 8 Sergei Akhmatdinov 2017-03-09 22:19:10 UTC
(In reply to Jan Beich (mail not working) from comment #6)
> Trying to chase the crash with ASan I've noticed it disappeared after
> x11/libxshmfence rebuild. Perhaps, Clang miscompiled it at some point but due
> to infrequent updates the packages was never rebuilt. Can you confirm?
Should I disable the xorg.conf bandaid first?

(In reply to Matthew Rezny from comment #7)
> If you are manually loading i915.ko, cease doing so. Intel DDX will try to 
> load i915kms.ko first and only tries i915.ko if the first fails (or you are 
> not on the current version as there was a bug that would cause it to try 
> loading both).
In that case it probably fails, because I am not loading it manually.
I never really did anything to change X or how the Intel drivers are loaded.
Comment 9 Sergei Akhmatdinov 2017-03-09 22:20:55 UTC
Actually, disregard the other comment.
kldstat shows both i915.ko and i915kms.ko in use.
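
A quick way to list which of the two modules are loaded, as a sketch: the function name i915_modules is made up, and it assumes the module file name is the last column of kldstat output.

```shell
#!/bin/sh
# Hypothetical helper: read kldstat output on stdin and print any
# i915.ko / i915kms.ko entries (module name assumed to be the last field).
i915_modules() {
    awk '$NF ~ /^i915(kms)?\.ko$/ { print $NF }'
}

# Intended use on the affected machine:
#   kldstat | i915_modules
```

Per comment 7, seeing both names suggests the double-load bug described there; only i915kms.ko is expected on KMS-capable hardware.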
Comment 10 Jan Beich freebsd_committer freebsd_triage 2017-03-09 23:37:28 UTC
(In reply to Sergei Akhmatdinov from comment #8)
> Should I disable the xorg.conf bandaid first?

Yep. Restore the file from backup or just remove it. Either Driver "modesetting" or Driver "intel" + Option "DRI" "3" should work.

I'm not sure why you had DRI3 by default with intel, though.
Comment 11 Jan Beich freebsd_committer freebsd_triage 2017-03-10 05:53:03 UTC
(In reply to Jan Beich (mail not working) from comment #6)
> Perhaps, Clang miscompiled it at some point but due to infrequent updates the packages was never rebuilt

x11/libxshmfence requires PTHREAD_PROCESS_SHARED, but FreeBSD has only implemented it since 11.0-RELEASE (base r296162). Essentially, DRI3 is broken on anything older, such as the /stable/10 branch. x11-drivers/xf86-video-intel still defaults to DRI2, so the only concern is the modesetting DDX, which is enabled by default if xorg.conf lacks a Device section.

FWIW, NetBSD has patches to use POSIX named semaphores aka sem_open() et al.
Comment 12 Jung-uk Kim freebsd_committer freebsd_triage 2017-03-10 07:40:53 UTC
(In reply to Jan Beich (mail not working) from comment #11)
> x11/libxshmfence requires PTHREAD_PROCESS_SHARED...

No, we use umtx for FreeBSD.
Comment 13 Sergei Akhmatdinov 2017-03-10 13:58:52 UTC
(In reply to Jan Beich (mail not working) from comment #11)
> Essentially, DRI3 is broken on anything below like /stable/10 branch.
But I am using /stable/11. Does that mean that DRI3 should work on my machine?
Rebuilding the package didn't seem to help.
Comment 14 Jung-uk Kim freebsd_committer freebsd_triage 2017-03-10 16:05:09 UTC
(In reply to Sergei Akhmatdinov from comment #13)
> But I am using /stable/11. Does that mean that DRI3 should work on my machine?
> Rebuilding the package didn't seem to help.

x11/libxshmfence should work on all supported FreeBSD branches.  However, I don't know anything about intel+DRI3.
Comment 15 Matthew Rezny freebsd_committer freebsd_triage 2017-03-10 16:26:14 UTC
DRI3 is only expected to possibly work with a drm-next kernel. There is definitely no DRI3 support in stock FreeBSD kernels.

The intel driver should automatically use DRI2 without an explicit option in xorg.conf. At least, on the machines where I am testing, both intel and radeon operate correctly with DRI2 although built with DRI3 support enabled. The radeon driver prints a message when DRI3 init fails which may look troubling but is benign.
Comment 16 Matthew Rezny freebsd_committer freebsd_triage 2017-04-11 14:18:45 UTC
It turns out the fallback to DRI2 only works for GLX; users of EGL fail when DRI3 is enabled but not supported by the system. DRI3 now requires a runtime switch to enable, in order to avoid such problems.