Bug 250933 - AMD Radeon, FreeBSD 12.2, startx fails: Caught signal 6 (Abort trap). Server aborting
Status: Closed FIXED
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: freebsd-x11 (Nobody)
 
Reported: 2020-11-07 20:44 UTC by Kurt Jaeger
Modified: 2020-11-08 08:24 UTC

Description Kurt Jaeger freebsd_committer freebsd_triage 2020-11-07 20:44:26 UTC
After upgrading from 12.1 to 12.2 with freshly built ports, startx fails at the end with a crash message:

Caught signal 6 (Abort trap). Server aborting

The full Xorg.0.log can be found at:
https://people.freebsd.org/~pi/logs/Xorg.0.log

The xorg.conf file:
https://people.freebsd.org/~pi/logs/xorg.conf

The card:

vgapci0@pci0:65:0:0:    class=0x030000 card=0x05211043 chip=0x67df1002 rev=0xe7 hdr=0x00
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'Ellesmere [Radeon RX 470/480/570/570X/580/580X/590]'
    class      = display
    subclass   = VGA
    bar   [10] = type Prefetchable Memory, range 64, base rx80000000, size 268435456, enabled
    bar   [18] = type Prefetchable Memory, range 64, base rx90000000, size 2097152, enabled
    bar   [20] = type I/O Port, range 32, base rx2000, size 256, enabled
    bar   [24] = type Memory, range 32, base rx9ed00000, size 262144, enabled

Any ideas what can be the next step to debug this? Rolling back to 12.1 etc. would be a huge time sink 8-(
Comment 1 Kurt Jaeger freebsd_committer freebsd_triage 2020-11-07 20:47:42 UTC
All ports were built in a 12.2 poudriere jail, so it should not be the
'kernel out of sync with world/ports' problem.
Comment 2 Jan Beich freebsd_committer freebsd_triage 2020-11-07 22:22:34 UTC
(In reply to Kurt Jaeger from comment #0)
> (II) AMDGPU(0): Setting screen physical size to 1016 x 571

Can you try the modesetting DDX driver (it's part of the xorg-server package)? Either set Driver to "modesetting" in xorg.conf, or deinstall xf86-video-amdgpu and remove xorg.conf.
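
For illustration, the first option could look roughly like the Device section below. This is a minimal sketch assuming a single GPU; the Identifier value is an arbitrary placeholder and is not taken from the reporter's xorg.conf.

```
# Hypothetical Device section; "Card0" is a placeholder name.
Section "Device"
    Identifier "Card0"
    Driver     "modesetting"
EndSection
```

The second option needs no Device section at all. As comment #3 later confirms, Xorg came up once the xf86-video-amdgpu package was removed and no xorg.conf was present.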

> (EE) 0: /usr/local/bin/Xorg (?+0x0) [0x41c400]
> (EE) unw_get_proc_name failed: no unwind info found [-10]
> (EE) 1: /lib/libthr.so.3 (?+0x0) [0x800928aa0]
> (EE) unw_get_proc_name failed: no unwind info found [-10]

Symbol addresses are useless without the exact binary and a core dump. Rebuild xorg-server with WITH_DEBUG=1 (unoptimized, debug symbols) or STRIP="" (optimized, non-debug symbols), then re-upload Xorg.0.log.

> Any ideas what can be the next step to debug this?

Try graphics/kmscube on the console, mainly to check that the /dev/dri/* devices work fine. However, OpenGL may fall back to software rendering. Also try vkcube-display from devel/vulkan-tools.

If the issue is in the kernel driver, running under truss(1) and raising compat.linuxkpi.drm_debug may help determine which DRM ioctl fails and why. However, connecting the "why" to what introduced the regression may still be hard without bisecting the kernel, or at least the $SYSDIR/compat/linuxkpi bits.

> Rolling back to 12.1 etc. would be a huge time sink 8-(

Even with bectl(8)? Or is this a new machine? Maybe try 13.0-CURRENT with drm-current-kmod, but keep userland and other ports as is.
Comment 3 Kurt Jaeger freebsd_committer freebsd_triage 2020-11-08 08:24:25 UTC
(In reply to Jan Beich from comment #2)
Deinstalling xf86-video-amdgpu and removing xorg.conf fixed the problem!

I was not aware that the xf86-video-amdgpu port and the modesetting driver were two different ways to get this going.

I still have issues with one cursor key in text fields like this one (cursor-left no longer works?), and in some contexts the middle mouse button fails, but I can cope with that for now.

About using bectl: I did not use it, because I assumed 12.1 -> 12.2 would be no real trouble 8-(

Thank you very much!