Bug 247416 - x11-servers/xorg-server: DRI2 fails on AMD Polaris 10
Summary: x11-servers/xorg-server: DRI2 fails on AMD Polaris 10
Status: Closed Not A Bug
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: freebsd-x11 (Nobody)
URL:
Keywords: crash, needs-qa
Depends on:
Blocks:
 
Reported: 2020-06-19 13:16 UTC by Adam Jimerson
Modified: 2020-07-20 11:52 UTC (History)

See Also:
jbeich: maintainer-feedback+


Attachments
Xorg.0.log (52.54 KB, text/plain)
2020-06-19 16:40 UTC, Adam Jimerson
Dmesg output (26.48 KB, text/plain)
2020-06-19 18:05 UTC, Adam Jimerson
New Xorg.0.log with just xf86-video-amdgpu installed (5.62 KB, text/plain)
2020-06-19 20:12 UTC, Adam Jimerson
Xorg.0.log (52.04 KB, text/plain)
2020-06-20 11:17 UTC, Adam Jimerson
dmesg output (77.60 KB, text/plain)
2020-06-20 11:18 UTC, Adam Jimerson
Xorg.0.log from Apr 28 (52.04 KB, text/plain)
2020-06-20 17:50 UTC, Adam Jimerson
Dmesg log from May 2 (22.09 KB, text/plain)
2020-06-20 17:56 UTC, Adam Jimerson

Description Adam Jimerson 2020-06-19 13:16:11 UTC
On FreeBSD 12.1-RELEASE-p6, picom v8 crashes with a segmentation fault for me. I have tested the version that is available from pkg and also rebuilt it locally from the ports tree, with the same result.

I don't know how useful the backtrace from the core dump will be, as picom appears to be built without debugging symbols and I'm not sure how to compile it from the ports tree without them being stripped out.

[/u/p/x/picom]─> lldb -c /home/vendion/picom.core -- /usr/local/bin/picom 
(lldb) target create "/usr/local/bin/picom" --core "/home/vendion/picom.core"
Core file '/home/vendion/picom.core' (x86_64) was loaded.
(lldb) thread backtrace all
* thread #1, name = 'picom', stop reason = signal SIGSEGV
  * frame #0: 0x00000008003925e2 libX11.so.6`___lldb_unnamed_symbol17$$libX11.so.6 + 258
    frame #1: 0x00000008003942cc libX11.so.6`_XSetImage + 204
    frame #2: 0x000000080038ebc4 libX11.so.6`XGetSubImage + 52
    frame #3: 0x00000008006ac7a2 libGL.so.1`___lldb_unnamed_symbol835$$libGL.so.1 + 242
    frame #4: 0x00000008006ac673 libGL.so.1`___lldb_unnamed_symbol833$$libGL.so.1 + 19
    frame #5: 0x00000008016339a5 swrast_dri.so`___lldb_unnamed_symbol12$$swrast_dri.so + 245
    frame #6: 0x000000000023dfbf picom`___lldb_unnamed_symbol255$$picom + 671
    frame #7: 0x0000000000231dd5 picom`___lldb_unnamed_symbol184$$picom + 485
    frame #8: 0x0000000000232436 picom`___lldb_unnamed_symbol185$$picom + 1526
    frame #9: 0x0000000000222c68 picom`___lldb_unnamed_symbol32$$picom + 4680
    frame #10: 0x0000000000222be7 picom`___lldb_unnamed_symbol32$$picom + 4551
    frame #11: 0x00000000002217d9 picom`___lldb_unnamed_symbol26$$picom + 25
    frame #12: 0x00000008002cf9b1 libev.so.4`ev_invoke_pending + 113
    frame #13: 0x00000008002d02f0 libev.so.4`ev_run + 2336
    frame #14: 0x0000000000220768 picom`___lldb_unnamed_symbol20$$picom + 6856
    frame #15: 0x000000000021e10f picom`___lldb_unnamed_symbol1$$picom + 271
Comment 1 Kubilay Kocak freebsd_committer freebsd_triage 2020-06-19 13:18:17 UTC
(In reply to Adam Jimerson from comment #0)

Thanks for the report Adam

A port build WITH_DEBUG should stop the port from stripping symbols
Comment 2 Adam Jimerson 2020-06-19 13:46:45 UTC
(In reply to Kubilay Kocak from comment #1)

Thanks, I'll rebuild the port with that flag and upload a new backtrace later today.
Comment 3 Jan Beich freebsd_committer 2020-06-19 14:18:28 UTC
Also rebuild library dependencies WITH_DEBUG after consulting ldd(1). For example, frame #0 to #5 are not even in picom code.

(In reply to Adam Jimerson from comment #0)
>    frame #5: 0x00000008016339a5 swrast_dri.so`___lldb_unnamed_symbol12$$swrast_dri.so + 245

OpenGL hardware acceleration has failed. Can you show /var/log/Xorg.0.log ? Also try

  $ pkg install mesa-demos
  $ LIBGL_DEBUG=verbose glxgears
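
A minimal sketch of the ldd(1) step mentioned above, assuming the usual `name => path (address)` line format. The helper name `ported_libs` is hypothetical; it only filters ldd output down to libraries under /usr/local, which are the ones that come from ports or packages and are therefore candidates for a WITH_DEBUG rebuild.

```shell
#!/bin/sh
# Hypothetical helper: read ldd(1) output on stdin and print the paths of
# shared libraries installed under /usr/local (i.e. from ports/packages).
# Lines without a "=>" in the second field (such as the header line naming
# the binary) are skipped, as are base-system libraries under /lib or /usr/lib.
ported_libs() {
    awk '$2 == "=>" && $3 ~ /^\/usr\/local\// { print $3 }'
}
```

It could be used as `ldd /usr/local/bin/picom | ported_libs`. Each resulting path can then be mapped to its port (for example with pkg-which(8)) and that port rebuilt with WITH_DEBUG set.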
Comment 4 Adam Jimerson 2020-06-19 16:40:32 UTC
Created attachment 215793
Xorg.0.log

Attached is my Xorg log file.

Below is the output of glxgears with the LIBGL_DEBUG environment variable set to verbose:

libGL: screen 0 does not appear to be DRI2 capable
libGL: Can't open configuration file /usr/local/etc/drirc: No such file or directory.
libGL: Can't open configuration file /home/vendion/.drirc: No such file or directory.
709 frames in 5.0 seconds = 141.730 FPS
714 frames in 5.0 seconds = 142.607 FPS
716 frames in 5.0 seconds = 143.080 FPS
715 frames in 5.0 seconds = 142.983 FPS
674 frames in 5.0 seconds = 134.769 FPS
716 frames in 5.0 seconds = 143.142 FPS

If the drivers matter I have the following installed:

xf86-video-amdgpu-19.1.0_1
gpu-firmware-kmod-g20200503
drm-kmod-g20190710
drm-fbsd12.0-kmod-4.16.g20200221
Comment 5 Adam Jimerson 2020-06-19 17:05:36 UTC
New stacktrace with debug symbols for picom, libX11, and mesa-libs:

[/u/h/vendion]─> lldb -c /home/vendion/picom.core -- /usr/local/bin/picom 
(lldb) target create "/usr/local/bin/picom" --core "/home/vendion/picom.core"
Core file '/home/vendion/picom.core' (x86_64) was loaded.
(lldb) thread backtrace all
* thread #1, name = 'picom', stop reason = signal SIGSEGV
  * frame #0: 0x00000008003ba5e2 libX11.so.6`XListFonts(dpy=0x0000000800d91aa0, pattern="z\xb8PÕ, maxNames=8, actualCount=0x00000008004456f0) at FontNames.c:97:14
    frame #1: 0x00000008003bc2cc libX11.so.6`XGetAtomNames(dpy=0x0000000800d2b000, atoms=0x0000000800d91a00, count=0, names_return=0x0000000800d91a00) at GetAtomNm.c:169:6
    frame #2: 0x00000008003b6bc4 libX11.so.6`XFillArc(dpy=0x0000000800ced300, d=4, gc=0x00000008003b6bc4, x=32767, y=-12320, width=8, height=4294967295, angle1=2, angle2=14227968) at FillArc.c:64:25
    frame #3: 0x00000008006d47a2 libGL.so.1`__glElementsPerGroup(format=232, type=8) at compsize.c:65:4
    frame #4: 0x00000008006d4673 libGL.so.1`__glX_send_client_info(glx_dpy=0x0000000800ced300) at clientinfo.c:143:7
    frame #5: 0x00000008018339a5 swrast_dri.so`___lldb_unnamed_symbol12$$swrast_dri.so + 245
    frame #6: 0x000000000025b1ea picom`glx_bind_pixmap(ps=0x0000000800d96000, pptex=0x0000000800d961d8, pixmap=10486301, width=1, height=1, repeat=true, fbcfg=0x0000000800dc1c00) at opengl.c:589:2
    frame #7: 0x0000000000245788 picom`paint_bind_tex(ps=0x0000000800d96000, ppaint=0x0000000800d961d0, wid=0, hei=0, repeat=true, depth=0, visual=33, force=false) at render.c:93:10
    frame #8: 0x0000000000248e99 picom`get_root_tile(ps=0x0000000800d96000) at render.c:509:10
    frame #9: 0x0000000000246f9a picom`paint_root(ps=0x0000000800d96000, reg_paint=0x00007fffffffd4b8) at render.c:521:38
    frame #10: 0x00000000002460b6 picom`paint_all(ps=0x0000000800d96000, t=0x0000000800deef80, ignore_damage=false) at render.c:844:2
    frame #11: 0x0000000000227aad picom`_draw_callback(loop=0x0000000800300328, ps=0x0000000800d96000, revents=8192) at picom.c:1444:4
    frame #12: 0x000000000022794f picom`_draw_callback(loop=0x0000000800300328, ps=0x0000000800d96000, revents=8192) at picom.c:1425:10
    frame #13: 0x0000000000226ec2 picom`draw_callback(loop=0x0000000800300328, w=0x0000000800d960c0, revents=8192) at picom.c:1466:2
    frame #14: 0x00000008002f79b1 libev.so.4`ev_invoke_pending + 113
    frame #15: 0x00000008002f82f0 libev.so.4`ev_run + 2336
    frame #16: 0x0000000000225107 picom`session_run(ps=0x0000000800d96000) at picom.c:2325:2
    frame #17: 0x00000000002226bb picom`main(argc=4, argv=0x00007fffffffd848) at picom.c:2426:3
    frame #18: 0x000000000022110f picom`_start(ap=<unavailable>, cleanup=<unavailable>) at crt1.c:76:7
Comment 6 Jan Beich freebsd_committer 2020-06-19 17:49:41 UTC
(In reply to Adam Jimerson from comment #4)
> libGL: screen 0 does not appear to be DRI2 capable
> 709 frames in 5.0 seconds = 141.730 FPS

VSync is disabled, so you get more than 60 FPS. Looks generic to OpenGL under X11 -> reassigning. I've never used an AMD GPU myself, so may not be aware of quirks.

> [   141.324] (--) PCI:*(8@0:0:0) 1002:67df:1da2:e353 rev 231, Mem @ 0xe0000000/268435456, 0xf0000000/2097152, 0xfe900000/262144, I/O @ 0x0000f000/256, BIOS @ 0x????????/65536

67df (Polaris10) is listed in both drm-kmod and mesa-dri, so OpenGL acceleration should work.

> [   141.333] (==) Matched ati as autoconfigured driver 0
> [   141.333] (==) Matched modesetting as autoconfigured driver 1
> [   141.333] (==) Matched scfb as autoconfigured driver 2
> [   141.333] (==) Matched vesa as autoconfigured driver 3
[...]
> [   141.335] (EE) open /dev/dri/card0: Operation not supported by device
> [   141.335] (WW) Falling back to old probe method for modesetting
> [   141.335] (EE) open /dev/dri/card0: Operation not supported by device
> [   141.335] (WW) VGA arbiter: cannot open kernel arbiter, no multi-card support
> [   141.335] (EE) Screen 0 deleted because of no matching config section.
> [   141.335] (II) UnloadModule: "modesetting"

Can you show "dmesg" output? If you have amdgpu_load=YES in /boot/loader.conf convert to kld_list=amdgpu in /etc/rc.conf. Modern GPUs need to load firmware which may not work from loader(8).

> [   141.337] (II) VESA(0): initializing int10
> [   141.337] (II) VESA(0): Primary V_BIOS segment is: 0xc000
> [   141.337] (II) VESA(0): VESA BIOS detected
> [   141.337] (II) VESA(0): VESA VBE Version 3.0
> [   141.337] (II) VESA(0): VESA VBE Total Mem: 49152 kB
[...]
> [   141.656] (II) AIGLX: Screen 0 is not DRI2 capable
> [   141.676] (II) IGLX: Loaded and initialized swrast
> [   141.676] (II) GLX: Initialized DRISWRAST GL provider for screen 0

xf86-video-vesa doesn't support OpenGL acceleration. Maybe deinstall all xf86-video-* except xf86-video-amdgpu to facilitate debugging.
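
The three log symptoms quoted in this comment can be checked mechanically. The following is only a sketch, with a hypothetical helper name (`xorg_gl_summary`), and it assumes the log wording matches the lines quoted above; other Xorg versions may word these messages differently.

```shell
#!/bin/sh
# Hypothetical helper: summarise an Xorg log read from stdin.
# Each flag is 1 if the corresponding message (as quoted in this thread)
# appears at least once, otherwise 0.
xorg_gl_summary() {
    awk '
        /\(EE\) open \/dev\/dri\/card0:/        { dri_fail = 1 }
        /is not DRI2 capable/                   { no_dri2 = 1 }
        /Initialized DRISWRAST GL provider/     { swrast = 1 }
        END {
            print "dri_open_failed=" (dri_fail + 0)
            print "not_dri2_capable=" (no_dri2 + 0)
            print "swrast_provider=" (swrast + 0)
        }'
}
```

For example, `xorg_gl_summary < /var/log/Xorg.0.log`. All three flags being 1 corresponds to the situation described here: the DRM device could not be opened and GLX fell back to software rendering.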
Comment 7 Jan Beich freebsd_committer 2020-06-19 17:57:24 UTC
(In reply to Adam Jimerson from comment #5)
I'm postponing debugging the x11-wm/picom crash until you get OpenGL accelerated. The failure mode in graphics/mesa-* is kinda fragile, so when DRI2 fails it may not gracefully fall back to swrast (compared to LIBGL_ALWAYS_SOFTWARE).
Comment 8 Adam Jimerson 2020-06-19 18:04:34 UTC
Thanks for your assistance in looking into this, Jan. picom was working up until yesterday, when I did a `pkg upgrade`, so I know that my hardware and driver support OpenGL acceleration. Worst case, I roll back my ZFS pool to a snapshot from before that update and go from there...

(In reply to Jan Beich from comment #6)

> Can you show "dmesg" output?

Sure I'll grab that and attach it here

> If you have amdgpu_load=YES in /boot/loader.conf convert to kld_list=amdgpu in /etc/rc.conf. Modern GPUs need to load firmware which may not work from loader(8).

That is already set in my /etc/rc.conf per the post-install message of drm-fbsd12.0-kmod and the kernel modules are getting loaded on boot

 6    1 0xffffffff82f11000   25b3cc amdgpu.ko
12    1 0xffffffff8321b000     8160 amdgpu_polaris10_mc_bin.ko
13    1 0xffffffff83224000     4420 amdgpu_polaris10_pfp_2_bin.ko
14    1 0xffffffff83229000     4418 amdgpu_polaris10_me_2_bin.ko
15    1 0xffffffff8322e000     2418 amdgpu_polaris10_ce_2_bin.ko
16    1 0xffffffff83231000     5d40 amdgpu_polaris10_rlc_bin.ko
17    1 0xffffffff83237000    40430 amdgpu_polaris10_mec_2_bin.ko
18    1 0xffffffff83278000    40430 amdgpu_polaris10_mec2_2_bin.ko
19    1 0xffffffff832b9000     3318 amdgpu_polaris10_sdma_bin.ko
20    1 0xffffffff832bd000     3320 amdgpu_polaris10_sdma1_bin.ko
21    1 0xffffffff832c1000    5bc00 amdgpu_polaris10_uvd_bin.ko
22    1 0xffffffff8331d000    28d20 amdgpu_polaris10_vce_bin.ko
23    1 0xffffffff83346000    1fe50 amdgpu_polaris10_k_smc_bin.ko

> xf86-video-vesa doesn't support OpenGL acceleration. Maybe deinstall all xf86-video-* except xf86-video-amdgpu to facilitate debugging.

Sure, I can do that. I only had it installed temporarily anyway, for basic tests when I got this graphics card, and meant to remove it afterwards.
Comment 9 Adam Jimerson 2020-06-19 18:05:39 UTC
Created attachment 215794
Dmesg output
Comment 10 Jan Beich freebsd_committer 2020-06-19 18:43:41 UTC
(In reply to Adam Jimerson from comment #9)
> [drm:gfx_v8_0_ring_test_ring] amdgpu: ring 0 test failed (scratch(0xC040)=0xCAFEDEAD)
> [drm:amdgpu_device_ip_init] hw_init of IP block <gfx_v8_0> failed -22
> drmn0: amdgpu_device_ip_init failed

No clue how important this is. Try cold reboot: power off, wait half a minute for remaining power to dissipate, then turn on. Try loading amdgpu manually instead of via kld_list.

> drmn0: Fatal error during GPU init
> [drm] amdgpu: finishing device.
> device_attach: drmn0 attach returned 22

Probably explains broken /dev/dri/card0 i.e., DRM has initialized but cannot do anything because device-specific driver failed.
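
A rough way to look for this failure in a saved dmesg is sketched below. The helper name `amdgpu_attached` is hypothetical, and the patterns are taken only from the two messages quoted in this comment. Note that finding no failure line does not prove the device attached; it only means these particular messages are absent.

```shell
#!/bin/sh
# Hypothetical helper: read dmesg text on stdin and fail (status 1) if it
# contains either of the GPU init/attach failure messages quoted above.
amdgpu_attached() {
    if grep -E -q 'Fatal error during GPU init|device_attach: drmn[0-9]+ attach returned [0-9]+'; then
        echo "amdgpu failed to attach"
        return 1
    fi
    echo "no amdgpu attach failure found"
}
```

Typical use would be `dmesg | amdgpu_attached`, or `amdgpu_attached < saved-dmesg.txt` for an attached log.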
Comment 11 Niclas Zeising freebsd_committer 2020-06-19 19:57:48 UTC
When did you last update before this?  Which version of drm-fbsd12.0-kmod and gpu-firmware-kmod were installed when this was last working?
Comment 12 Adam Jimerson 2020-06-19 20:12:51 UTC
Created attachment 215798
New Xorg.0.log with just xf86-video-amdgpu installed

I tried a cold boot as suggested after removing xf86-video-vesa, and X11 couldn't even start, throwing a "no screens found(EE)" error (I updated the Xorg.0.log attachment with the new log output). I even tried commenting out the kld_list line in /etc/rc.conf and manually loading amdgpu before starting X11, with the same result. This may explain why it was using the vesa driver, which I had to reinstall. I can also upload updated `dmesg` output if need be.
Comment 13 Adam Jimerson 2020-06-19 20:25:13 UTC
(In reply to Niclas Zeising from comment #11)

> When did you last update before this?

Before yesterday looks like the last time anything was updated on my system was June 9th.

> Which version of drm-fbsd12.0-kmod and gpu-firmware-kmod were installed when this was last working?

Those haven't seen any updates for a while, although I did try rebuilding them out of the ports tree as well as xf86-video-amdgpu today to see if that wouldn't resolve the issue so I don't know exactly when they were last upgraded.
Comment 14 Niclas Zeising freebsd_committer 2020-06-19 20:30:06 UTC
(In reply to Adam Jimerson from comment #13)

It would be interesting to know whether the last working setup was using the latest drivers or not.  There was an update to drm-fbsd12.0-kmod in February, and to gpu-firmware-kmod in May, nothing since.

It looks like loading the driver is failing, which could indicate an update of either the driver or the firmware, which is why I'm asking when they were last updated and if it used to work even with the latest versions of those drivers.
Comment 15 Adam Jimerson 2020-06-19 21:39:37 UTC
(In reply to Niclas Zeising from comment #14)

To confirm the versions of drm-fbsd12.0-kmod and gpu-firmware-kmod installed before yesterday, I rolled back my system to the 17th and I have the following:

- drm-fbsd12.0-kmod-4.16.g20200221 installed on Fri Feb 28 07:40:44 2020 EST
- gpu-firmware-kmod-g20200503 installed on Thu May  7 21:22:33 2020 EDT
- xf86-video-amdgpu-19.1.0_1 installed on xf86-video-amdgpu-19.1.0_1

It's also worth noting that the amdgpu driver is working correctly again; OpenGL acceleration and picom are working as well.

If I do a `pkg upgrade` I get a long list of packages that pkg wants to upgrade (mostly KDE/plasma packages) but nothing sticks out to me as being something that would break the amdgpu driver.
Comment 16 Adam Jimerson 2020-06-19 21:42:31 UTC
(In reply to Adam Jimerson from comment #15)

I take that back, I overlooked this at first blush:

mesa-dri: 19.0.8_4 -> 19.0.8_6

So the amdgpu driver doesn't like something about mesa-dri 19.0.8_6, I'm guessing.
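
To avoid missing a graphics-stack package in a long upgrade list, the dry-run output can be filtered. This is only a sketch: the helper name `gfx_upgrades` and the list of package-name prefixes are assumptions, and the input is assumed to use the `name: old -> new` line format shown above.

```shell
#!/bin/sh
# Hypothetical helper: read a package upgrade listing on stdin and print only
# the "name: old -> new" lines whose package name starts with one of the
# graphics-related prefixes below (the prefix list is an assumption).
gfx_upgrades() {
    awk '$3 == "->" &&
         $1 ~ /^(mesa-|libdrm|drm-|gpu-firmware|xf86-video-|xorg-server)/ {
             print $1, $2, $3, $4
         }'
}
```

For example, `pkg upgrade -n | gfx_upgrades` would print `mesa-dri: 19.0.8_4 -> 19.0.8_6` for the upgrade described here.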
Comment 17 Adam Jimerson 2020-06-20 02:18:32 UTC
This is getting stranger by the minute, I updated everything but the mesa-dri package and rebooted to test if everything would come back up as expected (same "No screens found" error).

Strangely, without the xf86-video-vesa package installed X11 fails to start, but with it installed X11 works and picom seems happy. Yet running something like glxgears with the LIBGL_DEBUG flag still gives the following.

> libGL: screen 0 does not appear to be DRI2 capable
> libGL: Can't open configuration file /usr/local/etc/drirc: No such file or directory.
> libGL: Can't open configuration file /home/vendion/.drirc: No such file or directory.
> libGL: Can't open configuration file /usr/local/etc/drirc: No such file or directory.
> libGL: Can't open configuration file /home/vendion/.drirc: No such file or directory.
> 511 frames in 5.0 seconds = 102.029 FPS
> 439 frames in 5.0 seconds = 87.668 FPS

I also still see the following in my Xorg.0.log file

> [   139.052] (EE) open /dev/dri/card0: Operation not supported by device
> [   139.052] (WW) Falling back to old probe method for modesetting
> [   139.052] (EE) open /dev/dri/card0: Operation not supported by device
> [   139.052] (WW) VGA arbiter: cannot open kernel arbiter, no multi-card support
> [   139.052] (EE) Screen 0 deleted because of no matching config section.


So I guess the amdgpu driver may have been failing all this time and vesa has just been playing a trick on me.

I also don't know if this is related to anything but it seems that X11 is looking for a module that does not exist for some reason:

> [   139.051] (II) LoadModule: "ati"
> [   139.051] (WW) Warning, couldn't open module ati
> [   139.051] (EE) Failed to load module "ati" (module does not exist, 0)

rather than looking for amdgpu
Comment 18 Niclas Zeising freebsd_committer 2020-06-20 06:01:52 UTC
(In reply to Adam Jimerson from comment #17)

Can you verify that amdgpu.ko loaded properly when things were working?
There has been some churn in mesa-dri related to the switch to meson, especially surrounding swrast, which is a software rendering GL driver.  While mesa-dri 19.0.8_6 should have fixed that, it might be worth updating to 19.0.8_7 which had some further fixes.

xf86-video-vesa is being used when no other xf86-video* driver can be used.  Without having amdgpu.ko (or any other drm kernel driver) load properly, Xorg falls back to that driver, or a similar driver called xf86-video-scfb, if you are using an EFI framebuffer console.
Comment 19 Adam Jimerson 2020-06-20 10:43:33 UTC
(In reply to Niclas Zeising from comment #18)

> Can you verify that amdgpu.ko loaded properly when things were working?
> There has been some churn in mesa-dri related to the switch to meson, especially surrounding swrast, which is a software rendering GL driver.

I do know that amdgpu and the related Polaris 10 kernel modules are getting loaded at boot, as in comment 8 (https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=247416#c8). At this point, though, my question is whether X11 was ever loading /usr/local/lib/xorg/modules/drivers/amdgpu_drv.so. Looking at the logs, it tries to load the ati driver and fails (due to no ati_drv.so in /usr/local/lib/xorg/modules/drivers, I'm guessing) and eventually settles for loading the vesa driver.

> While mesa-dri 19.0.8_6 should have fixed that, it might be worth updating to 19.0.8_7 which had some further fixes.

Sure, I'll try that and report back my results, although it looks like 19.0.8_7 has not fully propagated through the pkg servers, so I'm building it from the ports tree.
Comment 20 Jan Beich freebsd_committer 2020-06-20 10:58:23 UTC
(In reply to Adam Jimerson from comment #17)
> I also don't know if this is related to anything but it seems that
> X11 is looking for a module that does not exist for some reason:

"ati" module is installed by xf86-video-ati and used together with radeonkms from drm-kmod. Xorg only tries to use it if xorg.conf is missing.
https://gitlab.freedesktop.org/xorg/xserver/-/blob/xorg-server-1.20.8/hw/xfree86/common/xf86pciBus.c#L1108-1110


(In reply to Adam Jimerson from comment #19)
> At this point though I question is if X11 was ever loading /usr/local/lib/xorg/modules/drivers/amdgpu_drv.so

Before this bug Xorg probably used "modesetting". Can you attach "dmesg" output and /var/log/Xorg.0.log from a working configuration?
Comment 21 Adam Jimerson 2020-06-20 11:04:16 UTC
It looks like things are still failing with mesa-dri 19.0.8_7, although it seems that the core problem is X11 is not using the correct driver.

I even tried creating a /usr/local/etc/X11/xorg.conf.d/ config file to try and tell X11 to load the amdgpu driver and it still failed with the "No screens found" error.

Section "Device"
    Identifier "Card0"
    Driver     "amdgpu"
EndSection

I was hoping that would cause X11 to look for /usr/local/lib/xorg/modules/drivers/amdgpu_drv.so instead of /usr/local/lib/xorg/modules/drivers/ati_drv.so like it seems to want to do.
Comment 22 Jan Beich freebsd_committer 2020-06-20 11:08:38 UTC
(In reply to Adam Jimerson from comment #21)
> tell X11 to load the amdgpu driver and it still failed with the "No screens found" error.

Both "modesetting" and "amdgpu" require working /dev/dri/card0. UMS (userland mode setting) isn't supported nowadays.
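
A quick check of that requirement is sketched below. The helper name `dri_node_usable` is hypothetical. It assumes that being able to open the node read-write is a reasonable stand-in for what the X drivers need; a permissions problem (for example, not being in the right group) would also show up as "cannot open".

```shell
#!/bin/sh
# Hypothetical helper: check that a DRM device node exists as a character
# device and can be opened read-write (default node: /dev/dri/card0).
dri_node_usable() {
    node=${1:-/dev/dri/card0}
    if [ ! -c "$node" ]; then
        echo "missing: $node"
        return 1
    fi
    # Attempt the open in a subshell so a failed redirection cannot
    # terminate the calling shell.
    if ( : <>"$node" ) 2>/dev/null; then
        echo "ok: $node"
    else
        echo "cannot open: $node"
        return 1
    fi
}
```

Running `dri_node_usable` with no argument tests /dev/dri/card0. In the failure described in this report the node exists but opening it fails, which this sketch would report as "cannot open".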
Comment 23 Adam Jimerson 2020-06-20 11:16:10 UTC
(In reply to Jan Beich from comment #20)

> "ati" module is installed by xf86-video-ati and used together with radeonkms from drm-kmod. Xorg only tries to use it if xorg.conf is missing.

I don't have xf86-video-ati installed, but I do have xf86-video-amdgpu, and going by the Xorg logs it is not even being looked for.

> Before this bug Xorg probably used "modesetting". Can you attach "dmesg" output and /var/log/Xorg.0.log from a working configuration?

At this point my system is running as it always has been since I installed this card which is why I'm starting to think my system was never actually running the correct driver for my card. I can upload my dmesg output and Xorg.0.log from this boot but I'm sure it would be similar to what is already attached here.
Comment 24 Adam Jimerson 2020-06-20 11:17:36 UTC
Created attachment 215811
Xorg.0.log
Comment 25 Adam Jimerson 2020-06-20 11:18:09 UTC
Created attachment 215812
dmesg output
Comment 26 Niclas Zeising freebsd_committer 2020-06-20 11:18:15 UTC
I do not believe you had working graphics using amdgpu.ko even before all this.

The dmesg you have posted, from the broken system, indicates that amdgpu.ko does not attach properly to the device.  There might be an issue with swrast in recent versions of mesa-dri, that used to work before, that was masking this.

Comment 8 only proves that the module (and firmware modules) load, not that they attach or work.  Can you, as also requested by Jan, provide an Xorg.log and a dmesg from a system when this was working?

Xorg will probe a multitude of drivers (xf86-video-*) to figure out what's going on, and which is best.  Both xf86-video-ati and xf86-video-amdgpu require working kernel drivers though.

Without having proper logs, from a working system, to compare with the ones from when it is not working, it is very hard to tell what's going on.
Comment 27 Niclas Zeising freebsd_committer 2020-06-20 11:22:06 UTC
In the dmesg you just posted, amdgpu.ko is not loading correctly:

> [drm:gfx_v8_0_ring_test_ring] amdgpu: ring 0 test failed (scratch(0xC040)=0xCAFEDEAD)
> [drm:amdgpu_device_ip_init] hw_init of IP block <gfx_v8_0> failed -22
> drmn0: amdgpu_device_ip_init failed
> drmn0: 0xfffff800037c8800 unpin not necessary
> drmn0: 0xfffff8000476f800 unpin not necessary
> [TTM] Finalizing pool allocator
> [TTM] Zone  kernel: Used memory at exit: 0 kiB
> [TTM] Zone   dma32: Used memory at exit: 0 kiB
> [drm] amdgpu: ttm finalized
> drmn0: Fatal error during GPU init
> [drm] amdgpu: finishing device.
> device_attach: drmn0 attach returned 22

Looking at Xorg.log, I see that it is using the vesa driver, and also complaining about not being able to open /dev/dri/card0, further confirming that this is the case.

It might be that you were using swrast before, and that is now failing.  It is one of the components that changed between mesa-dri 19.0.8_4 and _7.
Comment 28 Adam Jimerson 2020-06-20 17:24:45 UTC
(In reply to Niclas Zeising from comment #26)

> Can you, as also requested by Jan, provide an Xorg.log and a dmesg from a system when this was working.

At this point I don't think it was ever working, and I was just being fooled by vesa and swrast. I checked my ZFS snapshots and could not find an instance when my system was actually using the amdgpu driver and kernel module.
Comment 29 Adam Jimerson 2020-06-20 17:50:46 UTC
Created attachment 215819
Xorg.0.log from Apr 28

Attached is the oldest Xorg log that I have in my backups. It dates back to Apr 28, and in it I see that I was still using the vesa driver back then and that it still complains about /dev/dri/card0.
Comment 30 Adam Jimerson 2020-06-20 17:56:04 UTC
Created attachment 215821
Dmesg log from May 2

This is the dmesg log from the same snapshot as the last Xorg.0.log file I posted.
Comment 31 Adam Jimerson 2020-06-22 12:54:58 UTC
Seeing as how there seem to be two issues at play here:

1. amdgpu driver/module: does not correctly load for my card
2. mesa-dri 19.0.8_6+ breaks swrast, which was being used with the vesa driver

what would be the preferred route to take:

a. try to work out why amdgpu is failing (and hopefully drop the need for vesa and swrast)
b. try to work out why the mesa-dri update breaks swrast, and I can create a separate issue for the amdgpu issue
c. something else (I'm open to suggestions if there seems to be an easier route to fix these issues)
Comment 32 Adam Jimerson 2020-06-23 13:00:28 UTC
Okay, so an update on progress: I managed to work out why amdgpu was failing to load. At some point late last year I did a full reinstall of my OS and restored from a backup, but my /boot/loader.conf didn't get restored, so the hw.syscons.disable=1 and kern.vty=vt settings were not set.

With amdgpu in use, everything now works as intended, but mesa-dri still breaks vesa+swrast.
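
For anyone hitting the same problem, the check below is a minimal sketch for spotting the missing loader settings. The helper name `missing_console_tunables` is hypothetical, and it assumes the second setting is the kern.vty=vt tunable that selects the vt(4) console.

```shell
#!/bin/sh
# Hypothetical helper: print each of the two console settings discussed above
# that is NOT present in a loader.conf-style file (default: /boot/loader.conf).
# Prints nothing when both are set. An unreadable or missing file is treated
# as having neither setting.
missing_console_tunables() {
    conf=${1:-/boot/loader.conf}
    for want in 'hw.syscons.disable=1' 'kern.vty=vt'; do
        key=${want%%=*}
        val=${want#*=}
        # Accept optional double quotes around the value and a trailing comment.
        grep -E -q "^[[:space:]]*${key}=\"?${val}\"?[[:space:]]*(#.*)?\$" "$conf" 2>/dev/null \
            || echo "$want"
    done
}
```

Running `missing_console_tunables` with no argument inspects /boot/loader.conf; any line it prints is a setting that would need to be added there.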
Comment 33 Niclas Zeising freebsd_committer 2020-07-20 11:52:37 UTC
(In reply to Adam Jimerson from comment #32)
The mesa-dri issue is believed to be fixed.

I'm closing this.  The issue seems related to configuration, and has been resolved.  If there are any lingering issues, please re-open this or submit a new PR.