Bug 286311 - x11-wm/sway: fails to start on amdgpu after ports bb43067a6928
Status: Closed FIXED
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: Any Any
Importance: --- Affects Some People
Assignee: Jan Beich
URL: https://cgit.freebsd.org/ports/commit...
Keywords: regression
Depends on:
Blocks: 282659
 
Reported: 2025-04-23 15:23 UTC by Evgenii Khramtsov
Modified: 2025-05-17 00:58 UTC (History)
CC List: 5 users

See Also:
jbeich: maintainer-feedback+


Attachments
output from tinywl -s alacritty (133.93 KB, text/plain)
2025-04-26 08:41 UTC, Thibault Payet
patch (1.43 KB, patch)
2025-05-08 22:08 UTC, Ivan Rozhuk

Description Evgenii Khramtsov 2025-04-23 15:23:00 UTC
$ dbus-run-session seatd-launch sway
[...]
drm_gem_plane_helper_prepare_fb: explicit fence not handled

$ pciconf -lv | grep 580 -B2 -A2
vgapci0@pci0:1:0:0:	class=0x030000 rev=0xe7 hdr=0x00 vendor=0x1002 device=0x67df subvendor=0x1458 subdevice=0x2302
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'Ellesmere [Radeon RX 470/480/570/570X/580/580X/590]'
    class      = display
    subclass   = VGA
hdac0@pci0:1:0:1:	class=0x040300 rev=0x00 hdr=0x00 vendor=0x1002 device=0xaaf0 subvendor=0x1458 subdevice=0xaaf0
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'Ellesmere HDMI Audio [Radeon RX 470/480 / 570/580/590]'
    class      = multimedia
    subclass   = HDA
(src) $ git rev-parse origin/main
base d14036ea424d
(ports) $ git rev-parse origin/main
ports 4e2974ebfe54
$ pkg info graphics/drm-66-kmod
[...]
6.6.25.1500038_1
$ pkg info mesa-devel
[...]
Version        : 25.1.b.228
(standalone, + local ports patches to pull in mesa-devel rather than mesa-dri)

Reverting ports bb43067a6928 and downgrading helps, so that suggests a regression in wlroots/sway or a newly exposed drm-kmod or base bug.

I can't investigate (e.g. try git bisect on wlroots or sway) until this weekend at least; I also can't submit any bug to either of {wlroots,sway,drm-kmod,src} yet.

I have found https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=282659#c7 but I can't confirm anything yet.
Comment 1 Jan Beich freebsd_committer freebsd_triage 2025-04-23 18:31:27 UTC
I can't reproduce on i915kms. Does adjusting environ(7) with either WLR_DRM_NO_ATOMIC=1 or WLR_DRM_FORCE_LIBLIFTOFF=1 help? Can you reproduce with tinywl (part of wlroots but not installed)?

Anyway, properly fixing this would require bisecting wlroots and likely help with testing upstream patches.
Comment 2 Thibault Payet 2025-04-24 14:55:56 UTC
By experimenting with virglrenderer, I know that on amdgpu explicit fence is not working.

Looking at what changed in backend/drm/atomic.c, maybe this commit introduced the issue?

https://gitlab.freedesktop.org/wlroots/wlroots/-/commit/d7223eae021f86eddcaaacb808598a393de9d114
Comment 3 Jan Beich freebsd_committer freebsd_triage 2025-04-25 00:25:11 UTC
DRM_IOCTL_SYNCOBJ_HANDLE_TO_FD was used even before that wlroots commit. Maybe x11@ folks can help reproduce and debug on drm-kmod side.

Note, explicit sync required by linux-drm-syncobj-v1 protocol isn't really supported unlike DRM syncobj itself.
https://github.com/freebsd/drm-kmod/issues/278
https://github.com/freebsd/drm-kmod/blob/drm_v6.6.25_2/drivers/gpu/drm/drm_ioctl.c#L718-L722
Comment 4 Thibault Payet 2025-04-25 05:17:01 UTC
To add more information, in addition to the explicit fence not handled message, I got from sway (when using wlroots) an error in:

backend/drm/atomic.c at line 85 with the following (the formatting may be wrong):

Atomic commit failed Invalid argument

Also in my case I have two gpus: one amd igpu and another amd dgpu.
Comment 5 Jan Beich freebsd_committer freebsd_triage 2025-04-25 18:52:43 UTC
Don't ignore comment 1. I also need "sway -d" output if you're *not* going to report upstream.

(In reply to Thibault Payet from comment #4)
> explicit fence not handled message

Likely existed before 1.11 update per https://forums.freebsd.org/threads/wayland-kernel-drm_gem_plane_helper_prepare_fb-explicit-fence-not-handled.97571/ (predates bb43067a6928 by 2 days) 

> Atomic commit failed Invalid argument

Could be unrelated. I've seen it often on i915kms a year ago.
Comment 6 commit-hook freebsd_committer freebsd_triage 2025-04-25 18:54:05 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=125700dbc37ca7ce66411fe679818f2adaafaeb9

commit 125700dbc37ca7ce66411fe679818f2adaafaeb9
Author:     Jan Beich <jbeich@FreeBSD.org>
AuthorDate: 2025-04-25 18:11:24 +0000
Commit:     Jan Beich <jbeich@FreeBSD.org>
CommitDate: 2025-04-25 18:53:42 +0000

    x11-wm/sway: document amdgpu regression after bb43067a6928

    Updating to RC1 on /latest was done precisely to discover regressions
    with plenty of time to debug before the next /quarterly branches.
    Other wlroots019 consumers are likely affected as well, so there's no
    point in backing out just because of amdgpu.

    PR:             286311

 x11-wm/sway/pkg-message | 3 +++
 1 file changed, 3 insertions(+)
Comment 7 Thibault Payet 2025-04-25 19:22:28 UTC
(In reply to Jan Beich from comment #1)
Setting WLR_DRM_NO_ATOMIC=1 before running sway makes it work again, thanks.
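
For reference, the workaround confirmed here can be wrapped in a small launcher. This is only a sketch: the helper names set_no_atomic_workaround and launch_sway are made up for illustration, and it assumes sway is found in PATH. WLR_DRM_NO_ATOMIC=1 is the variable suggested in comment 1; it makes wlroots use the legacy DRM interface instead of atomic commits.

```c
#define _POSIX_C_SOURCE 200112L
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: export the workaround variable from comment 1.
 * Returns 0 on success and -1 on failure, like setenv(3).
 */
int
set_no_atomic_workaround(void)
{
	return setenv("WLR_DRM_NO_ATOMIC", "1", 1);
}

/*
 * Hypothetical helper: apply the workaround, then replace this process
 * with sway. Returns (with -1) only if something failed.
 */
int
launch_sway(void)
{
	char *const argv[] = { "sway", NULL };

	if (set_no_atomic_workaround() != 0) {
		perror("setenv");
		return -1;
	}
	execvp("sway", argv);
	perror("execvp");	/* reached only when exec fails */
	return -1;
}
```

Calling launch_sway() from main() is enough for a test run. Because this disables atomic modesetting altogether, it is a workaround and not a fix.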
Comment 8 Jan Beich freebsd_committer freebsd_triage 2025-04-26 07:02:25 UTC
(In reply to Thibault Payet from comment #7)
What happened to tinywl testing? This is a bug tracker, not user forum to run away after finding a workaround or speculate with few details.
Comment 9 Thibault Payet 2025-04-26 08:41:02 UTC
Created attachment 259881
output from tinywl -s alacritty

Here is the output for tinywl -s alacritty
Which does not display the terminal window, and does not seem to like the atomic flag:

PAGE_FLIP_EVENT | ATOMIC_NONBLOCK
Comment 10 Evgenii Khramtsov 2025-04-26 10:37:06 UTC
git bisect with subproject/wlroots is a PITA due to upstream churn (API chasing that lands days later, the meson wlroots project rename) and other temporary dev-branch regressions in the way, so with a manual date-based bisect of the two projects I found that:

https://github.com/swaywm/sway/commit/ae7c1b139a3c

is the first commit where the "drm_gem_plane_helper_prepare_fb: explicit fence not handled" message appears; before ae7c1b139a3c there is only a black screen.

Thankfully sway can be killed with kill -9 via ssh and vt(4) doesn't get garbled. Afterwards "Requested backend configuration failed, searching for valid fallbacks" can be seen.

> Does adjusting environ(7) with either WLR_DRM_NO_ATOMIC=1
> or WLR_DRM_FORCE_LIBLIFTOFF=1 help? Can you reproduce with tinywl
> (part of wlroots but not installed)?

I didn't try because WLR_RENDER_NO_EXPLICIT_SYNC works.

> Anyway, properly fixing this would require bisecting wlroots
> and likely help with testing upstream patches.

I know, really. I can report something to both wlroots and drm-kmod (not today).

(In reply to Jan Beich from comment #8)
> What happened to tinywl testing? This is a bug tracker, not user forum to run away
> after finding a workaround or speculate with few details.

I feel like this comment is too authoritarian, so I would rather keep my scarce free volunteer time today for something else, and would report upstream a day later or so.
Comment 11 Evgenii Khramtsov 2025-04-28 13:13:35 UTC
It is:

https://gitlab.freedesktop.org/wlroots/wlroots/-/commit/5f886351181e

From 5f886351181ead1f862056ce622cb1ab9e75f915 Mon Sep 17 00:00:00 2001
From: Simon Ser <totallynotforscapers>
Date: Wed, 3 Apr 2024 15:08:44 +0200
Subject: [PATCH] scene: add explicit synchronization for rendered buffers

---
 include/wlr/types/wlr_scene.h |  3 +++
 types/scene/wlr_scene.c       | 18 +++++++++++++++++-
 2 files changed, 20 insertions(+), 1 deletion(-)

I tested with sway-9a1c411abd82 ("Add support for tearing-control-v1")
(predates by one commit / is parent of 05e895c46382 "Add support for linux-drm-syncobj-v1").

I'll report upstream later (+ sway -d std{err,out} pre/post wlroots-5f886351181e on sway-9a1c411abd82) sometime today.
Comment 12 Evgenii Khramtsov 2025-04-28 13:46:52 UTC
(In reply to Evgenii Khramtsov from comment #10)

> report to wlroots

https://gitlab.freedesktop.org/wlroots/wlroots/-/issues/3971

> report to drm-kmod

GitHub issue template seems like too much effort (yawn), and drm-kmod contributors likely read the FreeBSD-X11@ mailing list that is in CC of this PR. Maybe later when wlroots contributors provide suggestions and lead to report to drm-kmod I'll summarize in drm-kmod Issues list.
Comment 13 Evgenii Khramtsov 2025-04-30 17:23:35 UTC
(In reply to Thibault Payet from comment #9)

Can you test patched drm-kmod? Works for me on RX 580 as a workaround.

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index fe6529f39e..72a3274f7e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2835,8 +2835,7 @@ static const struct drm_driver amdgpu_kms_driver = {
 	.driver_features =
 	    DRIVER_ATOMIC |
 	    DRIVER_GEM |
-	    DRIVER_RENDER | DRIVER_MODESET | DRIVER_SYNCOBJ |
-	    DRIVER_SYNCOBJ_TIMELINE,
+	    DRIVER_RENDER | DRIVER_MODESET | DRIVER_SYNCOBJ,
 	.open = amdgpu_driver_open_kms,
 	.postclose = amdgpu_driver_postclose_kms,
 	.lastclose = amdgpu_driver_lastclose_kms,
@@ -2862,8 +2861,7 @@ static const struct drm_driver amdgpu_kms_driver = {
 
 const struct drm_driver amdgpu_partition_driver = {
 	.driver_features =
-	    DRIVER_GEM | DRIVER_RENDER | DRIVER_SYNCOBJ |
-	    DRIVER_SYNCOBJ_TIMELINE,
+	    DRIVER_GEM | DRIVER_RENDER | DRIVER_SYNCOBJ,
 	.open = amdgpu_driver_open_kms,
 	.postclose = amdgpu_driver_postclose_kms,
 	.lastclose = amdgpu_driver_lastclose_kms,
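
Why dropping the flag acts as an opt-out can be sketched in user space. On Linux, the DRM_CAP_SYNCOBJ_TIMELINE capability reported to userspace reflects whether the driver sets DRIVER_SYNCOBJ_TIMELINE; the sketch assumes drm-kmod behaves the same way. All mock_ names and the bit values are illustrative and are not taken from drm-kmod.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative feature bits; the real values live in the DRM headers. */
enum mock_driver_feature {
	MOCK_DRIVER_GEM              = 1 << 0,
	MOCK_DRIVER_MODESET          = 1 << 1,
	MOCK_DRIVER_RENDER           = 1 << 3,
	MOCK_DRIVER_ATOMIC           = 1 << 4,
	MOCK_DRIVER_SYNCOBJ          = 1 << 5,
	MOCK_DRIVER_SYNCOBJ_TIMELINE = 1 << 6
};

/* Feature set of the KMS driver with or without the patch applied. */
uint32_t
mock_amdgpu_features(bool patched)
{
	uint32_t f = MOCK_DRIVER_ATOMIC | MOCK_DRIVER_GEM |
	    MOCK_DRIVER_RENDER | MOCK_DRIVER_MODESET | MOCK_DRIVER_SYNCOBJ;

	if (!patched)
		f |= MOCK_DRIVER_SYNCOBJ_TIMELINE;
	return f;
}

/* Mock of a feature check: is the given bit set? */
bool
mock_check_feature(uint32_t features, uint32_t feature)
{
	return (features & feature) != 0;
}

/* Value a compositor would read back for the timeline capability. */
uint64_t
mock_cap_syncobj_timeline(uint32_t features)
{
	return mock_check_feature(features, MOCK_DRIVER_SYNCOBJ_TIMELINE) ? 1 : 0;
}
```

With the patch the capability reads back as 0 while plain DRIVER_SYNCOBJ stays advertised, so a compositor that gates explicit sync on the timeline capability would presumably fall back to implicit sync.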
Comment 14 Evgenii Khramtsov 2025-05-01 05:32:41 UTC
(In reply to Evgenii Khramtsov from comment #13)

Opting out from explicit sync seemed fine for one day (no panics, software works etc), one can also test (seems to work too):

diff --git a/drivers/gpu/drm/drm_gem_atomic_helper.c b/drivers/gpu/drm/drm_gem_atomic_helper.c
index 4fb8acd280..f1aa85f5b7 100644
--- a/drivers/gpu/drm/drm_gem_atomic_helper.c
+++ b/drivers/gpu/drm/drm_gem_atomic_helper.c
@@ -20,14 +20,13 @@ drm_gem_plane_helper_prepare_fb(struct drm_plane *dp,
 		obj = dps->fb->obj[0];
 		if (obj == NULL)
 			return -EINVAL;
-		if (dps->fence) {
-			printf("%s: explicit fence not handled\n", __func__);
-			return -EINVAL;
-		}
 		r = dma_resv_get_singleton(obj->resv, DMA_RESV_USAGE_WRITE, &f);
 		if (r)
 			return r;
-		dps->fence = f;
+		if (dps->fence)
+			dma_fence_put(f);
+		else
+			dps->fence = f;
 	}
 
 	return 0;

Obtained from: https://github.com/openbsd/src/commit/4ae46dfd37dc
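
The behavioural difference of this patch can be modelled in user space. This is a minimal sketch, not drm-kmod code: mock_fence, mock_plane_state and both prepare_fb_* functions are hypothetical stand-ins, and the implicit fence that dma_resv_get_singleton() would return is passed in as an argument.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct dma_fence: only a reference count. */
struct mock_fence {
	int refcount;
};

/* Hypothetical stand-in for the plane state; fence is the explicit fence. */
struct mock_plane_state {
	struct mock_fence *fence;
};

/* Drop one reference; a NULL fence is ignored. */
void
mock_fence_put(struct mock_fence *f)
{
	if (f != NULL)
		f->refcount--;
}

/* Before the patch: an explicit fence from userspace fails the commit. */
int
prepare_fb_old(struct mock_plane_state *dps, struct mock_fence *implicit)
{
	assert(dps != NULL);
	if (dps->fence != NULL)
		return -EINVAL;	/* "explicit fence not handled" */
	dps->fence = implicit;
	return 0;
}

/* After the patch: keep the explicit fence and release the implicit one. */
int
prepare_fb_new(struct mock_plane_state *dps, struct mock_fence *implicit)
{
	assert(dps != NULL);
	if (dps->fence != NULL)
		mock_fence_put(implicit);
	else
		dps->fence = implicit;
	return 0;
}
```

In this model the old path is what turns into the "Atomic commit failed Invalid argument" seen by the compositor, while the new path accepts the commit and lets the explicit fence take precedence.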
Comment 15 Evgenii Khramtsov 2025-05-01 20:28:52 UTC
(In reply to Evgenii Khramtsov from comment #14)

I have no idea if this works on anything else than -CURRENT and my GPU, and I won't test stable/14 and -RELEASE on my desktop. If no one reports by

Sat, 03 May 2025 10:00:00 +0000

I'll close the wlroots issue and toss the patch from comment #14 as MR to drm-kmod with __FreeBSD_version limited to -CURRENT, then close this bug as unrelated to sway.
Comment 16 Thibault Payet 2025-05-01 22:03:13 UTC
(In reply to Evgenii Khramtsov from comment #15)
I can confirm that the patch from comment #14 works on 14.2-RELEASE on my gpu
Comment 17 shamaz.mazum 2025-05-03 03:58:41 UTC
(In reply to Evgenii Khramtsov from comment #15)

Patch in comment #14 works for the default (OpenGL? OpenGL ES?) backend. The Vulkan backend which I used before the upgrade still fails with

[main.c:571] wl_display_roundtrip failed

and a segmentation fault. This backend can be enabled by setting the WLR_RENDERER environment variable to "vulkan".
Comment 18 Jan Beich freebsd_committer freebsd_triage 2025-05-03 09:03:36 UTC
(In reply to shamaz.mazum from comment #17)
Reproduced on i915kms and reported upstream[1]. Sorry, I only test nested WLR_RENDERER=vulkan (as Sway on Sway).

Without explicit sync there's no reason to use it unless you have an NVIDIA GPU (glFlush vs. glFinish workaround but doesn't help Xwayland). The upcoming Vulkan-by-default[2] won't affect FreeBSD[3] even if drm-kmod suddenly implements the missing ioctls.

[1] https://gitlab.freedesktop.org/wlroots/wlroots/-/issues/3973
[2] https://gitlab.freedesktop.org/wlroots/wlroots/-/merge_requests/4103
[3] https://gitlab.freedesktop.org/wlroots/wlroots/-/issues/3843
Comment 19 Jan Beich freebsd_committer freebsd_triage 2025-05-03 09:41:25 UTC
Hmm, Vulkan regression seems related. On i915kms comment 14 didn't help but comment 13 did.

Bugged timeline reminds me of https://gitlab.freedesktop.org/mstoeckl/waypipe/-/issues/132
Comment 20 Evgenii Khramtsov 2025-05-03 10:55:21 UTC
https://github.com/freebsd/drm-kmod/pull/350
Comment 21 shamaz.mazum 2025-05-03 11:33:01 UTC
(In reply to Jan Beich from comment #19)

Yes, I confirm that the patch from comment #13 helps with both backends.
Comment 22 Evgenii Khramtsov 2025-05-06 22:30:16 UTC
(In reply to Jan Beich from comment #19)

Do you still need comment 13 for WLR_RENDERER=vulkan after https://gitlab.freedesktop.org/wlroots/wlroots/-/merge_requests/5059 ?
Comment 23 Jan Beich freebsd_committer freebsd_triage 2025-05-07 03:55:28 UTC
(In reply to Evgenii Khramtsov from comment #22)
No. I was waiting for 0.19.0-rc4, so others could easily test.

DRM_CAP_SYNCOBJ_TIMELINE in Hyprland, KWin, Mutter seems to be limited to linux-drm-syncobj-v1. In Xwayland limited to DRI3 1.4 / Present 1.4 *but* runtime-disabled on FreeBSD due to missing ioctls.
Comment 24 Ivan Rozhuk 2025-05-08 22:08:21 UTC
Created attachment 260269
patch

This fixes kodi in gbm mode for me, thanks Evgenii Khramtsov.

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=282659
Comment 25 Evgenii Khramtsov 2025-05-09 11:47:38 UTC
(In reply to Ivan Rozhuk from comment #24)

comment 14 is not sustainable, see https://gitlab.freedesktop.org/wlroots/wlroots/-/issues/3971#note_2895333

comment 13 is a safe workaround: it opts out of explicit sync, which avoids the impacted code path until explicit sync is implemented properly by drm-kmod contributors.

I plan to try wlroots with https://gitlab.freedesktop.org/wlroots/wlroots/-/merge_requests/5059 then update https://github.com/freebsd/drm-kmod/pull/350
(e.g. remove i915 opt out) and remove draft status sometime later. I can't now.

Thanks for testing.
Comment 26 commit-hook freebsd_committer freebsd_triage 2025-05-12 07:00:31 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=3adb702829143411225480d4aa30bbc51bad4803

commit 3adb702829143411225480d4aa30bbc51bad4803
Author:     Emmanuel Vadot <manu@FreeBSD.org>
AuthorDate: 2025-05-12 06:55:45 +0000
Commit:     Emmanuel Vadot <manu@FreeBSD.org>
CommitDate: 2025-05-12 06:55:45 +0000

    graphics/drm-61-kmod: Update to drm_v6.1.128_3

    Fixes issues with explicit fence and linux jiffies.

    PR:             286311
    Sponsored by:   Beckhoff Automation GmbH & Co. KG

 graphics/drm-61-kmod/Makefile         | 2 +-
 graphics/drm-61-kmod/Makefile.version | 2 +-
 graphics/drm-61-kmod/distinfo         | 6 +++---
 graphics/nvidia-drm-61-kmod/Makefile  | 2 +-
 graphics/nvidia-drm-61-kmod/distinfo  | 6 +++---
 5 files changed, 9 insertions(+), 9 deletions(-)
Comment 27 commit-hook freebsd_committer freebsd_triage 2025-05-12 07:00:33 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=8495963ac01c02db3f0d11253732c0c925d52fb6

commit 8495963ac01c02db3f0d11253732c0c925d52fb6
Author:     Emmanuel Vadot <manu@FreeBSD.org>
AuthorDate: 2025-05-12 06:59:04 +0000
Commit:     Emmanuel Vadot <manu@FreeBSD.org>
CommitDate: 2025-05-12 06:59:04 +0000

    graphics/drm-66-kmod: Update to drm_v6.6.25_4

    Fixes issues with explicit fence and linux jiffies.

    PR:             286311
    Sponsored by:   Beckhoff Automation GmbH & Co. KG

 graphics/drm-66-kmod/Makefile         | 2 +-
 graphics/drm-66-kmod/Makefile.version | 2 +-
 graphics/drm-66-kmod/distinfo         | 6 +++---
 graphics/nvidia-drm-66-kmod/Makefile  | 2 +-
 graphics/nvidia-drm-66-kmod/distinfo  | 6 +++---
 5 files changed, 9 insertions(+), 9 deletions(-)
Comment 28 Ivan Rozhuk 2025-05-12 10:00:30 UTC
(In reply to Evgenii Khramtsov from comment #25)

I tried https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=286311#c13 but it does not work.
Probably because 6.1 has only one DRIVER_SYNCOBJ_TIMELINE in drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c and some other file must be patched.
Comment 29 Evgenii Khramtsov 2025-05-12 14:51:11 UTC
(In reply to commit-hook from comment #26)

See https://github.com/freebsd/drm-kmod/pull/350#issuecomment-2872825604
and https://gitlab.freedesktop.org/wlroots/wlroots/-/issues/3971#note_2895333

// This also lacked "Obtained from: OpenBSD"
Comment 30 shamaz.mazum 2025-05-12 18:26:46 UTC
May I ask how it is that the patch from comment #14 is called "not sustainable" in comment #25 but graphics/drm-61-kmod is now built with that patch?
Comment 31 commit-hook freebsd_committer freebsd_triage 2025-05-15 13:11:06 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=194af85ca4ae733687ca3e4487283c53c95e17d1

commit 194af85ca4ae733687ca3e4487283c53c95e17d1
Author:     Jan Beich <jbeich@FreeBSD.org>
AuthorDate: 2025-05-15 08:47:57 +0000
Commit:     Jan Beich <jbeich@FreeBSD.org>
CommitDate: 2025-05-15 13:09:50 +0000

    x11-toolkits/wlroots019: update to 0.19.0

    Fixes WLR_RENDERER=vulkan on KMS console.

    Changes:        https://gitlab.freedesktop.org/wlroots/wlroots/-/compare/0.19.0-rc3...0.19.0
    Changes:        https://gitlab.freedesktop.org/wlroots/wlroots/-/releases/0.19.0
    Reported by:    GitLab (notify releases)
    PR:             286311

 x11-toolkits/wlroots019/Makefile | 2 +-
 x11-toolkits/wlroots019/distinfo | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)
Comment 32 Jan Beich freebsd_committer freebsd_triage 2025-05-17 00:13:19 UTC
Assuming fixed but no clue if drm-kmod < 6.1 works fine (mainly affects FreeBSD 13.* series until EOL on 2026-04-30).
Comment 33 commit-hook freebsd_committer freebsd_triage 2025-05-17 00:58:40 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=8b1d2b70fb2705edad1abd3b5fb0fad5eb37ffe2

commit 8b1d2b70fb2705edad1abd3b5fb0fad5eb37ffe2
Author:     Jan Beich <jbeich@FreeBSD.org>
AuthorDate: 2025-05-17 00:31:47 +0000
Commit:     Jan Beich <jbeich@FreeBSD.org>
CommitDate: 2025-05-17 00:57:07 +0000

    x11-wm/sway: drop amdgpu note after 3adb70282914

    This reverts commit 125700dbc37ca7ce66411fe679818f2adaafaeb9.

    PR:             286311

 x11-wm/sway/pkg-message | 3 ---
 1 file changed, 3 deletions(-)