Bug 253277 - x11/xtrans: Don't unlink existing UNIX sockets => allows multiple X sessions from sddm
Summary: x11/xtrans: Don't unlink existing UNIX sockets => allows multiple X sessions ...
Status: Closed FIXED
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s) (show other bugs)
Version: Latest
Hardware: Any Any
: --- Affects Some People
Assignee: freebsd-x11 (Nobody)
URL: https://reviews.freebsd.org/D30557
Keywords:
Depends on:
Blocks:
 
Reported: 2021-02-05 18:13 UTC by Olivier Certner
Modified: 2023-09-04 14:34 UTC (History)
2 users (show)

See Also:
bugzilla: maintainer-feedback? (x11)


Attachments
Patch against the ports tree (1.60 KB, patch)
2021-02-05 18:13 UTC, Olivier Certner
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Olivier Certner freebsd_committer freebsd_triage 2021-02-05 18:13:27 UTC
Created attachment 222189 [details]
Patch against the ports tree

Patch that removes the `unlink` call before binding some UNIX socket.

The problem with the existing approach is that, when Xorg is launched as `root`, it can always unlink existing sockets, which it does when no display is explicitly specified and `-displayfd` is used.

Moreover, the X server removes the UNIX socket when exiting (even in most abnormal cases).

Typically, sddm exactly triggers the problem. So when you try to open another session through it, the first session's sockets are crushed and it becomes stale.

I'm proposing the patch here since it is not practically useful on platforms that launch Xorg with an unprivileged user and because upstream seems inactive.
Comment 1 Niclas Zeising freebsd_committer freebsd_triage 2021-02-05 23:15:41 UTC
I am quite skeptical of pulling this in without it first being submitted and accepted upstream.
Can you explain exactly what problem you are trying to solve?
Comment 2 Olivier Certner freebsd_committer freebsd_triage 2021-02-08 11:24:30 UTC
Hi,

As for the high-level problem I'm fixing, I wrote this description:

> Typically, sddm exactly triggers the problem. So when you try to open
> another session through it, the first session's sockets are crushed
> and it becomes stale.

What do you find unclear in it exactly?

As for the low-level description, let me recapitulate with more context. As you most probably know, there are two ways that Xorg selects its display at startup:
- Either you supply an explicit display on the command-line (e.g., ":1").
- Or you do not, and either "-displayfd" has been specified, in which case Xorg will try to assign some automatically, or it has not, in which case :0 is used by default.
("-displayfd" has other effects, but this is a separate discussion, relevant to the other bug #253278.)

The problem is with the automatic assignment of display. It works by trying to grab the first display available (from :0 when using UNIX sockets), which it does by simply unlinking an existing socket file and then binding a new socket to the same file. When the server is run as root, this operation always succeeds, and any existing socket for :0 ("/tmp/.X11-UNIX/X0") that would exist is removed. This leaves any preexisting X session on :0 stale (e.g., launching new apps make them go to the new session).

And SDDM triggers that: It passes "-displayfd", and no display directive, relying on the server to do display assignmnent, which it doesn't do correctly.

Now, I agree that this problem is probably not FreeBSD specific (it affects platforms/distributions running Xorg as root). If you prefer, I'll report it upstream and we'll see how it goes.
Comment 3 Olivier Certner freebsd_committer freebsd_triage 2021-02-08 16:43:46 UTC
Amusingly, I found this code at start of 'startx':

# Automatically determine an unused $DISPLAY
d=0
while true ; do
    [ -e "/tmp/.X$d-lock" -o -S "/tmp/.X11-unix/X$d" ] || break
    d=$(($d + 1))
done
defaultdisplay=":$d"
unset d

This is exactly what the proposed change makes X (launched with '-displayfd') do.
Comment 4 Olivier Certner freebsd_committer freebsd_triage 2021-02-09 09:40:00 UTC
Reported upstream here:
https://gitlab.freedesktop.org/xorg/lib/libxtrans/-/issues/4
Comment 5 Jan Beich freebsd_committer freebsd_triage 2021-05-02 14:25:22 UTC
+1. Prevents Plasma/Wayland (debug session) destroying X11 socket used by Sway (dogfood session) at least until https://invent.kde.org/plasma/kwin/-/commit/9f0f4527029a. Both run Xwayland with different flags e.g.,

[kernel]
- /sbin/init
|-- login [pam] (login)
| `-- sway
|-- Xwayland :0 -rootless -terminate -core -listenfd 35 -listenfd 37 -wm 40
...
|-- login [pam] (login)
| `-- ck-launch-session dbus-run-session startplasma-wayland
|   `-- dbus-run-session startplasma-wayland
|     |-- dbus-daemon --nofork --print-address 4 --session
|     `-- startplasma-wayland
|       `-- kwin_wayland_wrapper --xwayland ../libexec/startplasma-waylandsession (kwin_wayland_wrappe)
|         `-- kwin_wayland --wayland_fd 4 --xwayland ../libexec/startplasma-waylandsession
|           `-- Xwayland -displayfd 49 -rootless -wm 52 -auth /var/run/user/1234/xauth_XOBvXm

Notice, Sway doesn't use -displayfd unlike Plasma but simply checks locks.
https://github.com/swaywm/wlroots/blob/0.13.0/xwayland/sockets.c#L130-L186
Comment 6 Niclas Zeising freebsd_committer freebsd_triage 2021-05-31 06:30:34 UTC
I would like upstream to comment on the MR that is in the patch in Phab that Jan linked here before we commit this.
Comment 7 Jan Beich freebsd_committer freebsd_triage 2023-05-31 05:24:19 UTC
Upstream suggested https://gitlab.freedesktop.org/xorg/xserver/-/merge_requests/844 which helps Xwayland at least after https://gitlab.freedesktop.org/wlroots/wlroots/-/commit/4741e9d8410f

Note, Linux is likely unaffected due to using abstract unix(4) sockets i.e., no paths to unlink(2) and leading NUL byte in sun_path.
Comment 8 Olivier Certner freebsd_committer freebsd_triage 2023-05-31 16:44:42 UTC
Hi Jan,

I provided a lot of background and explanations (and asked questions) in this mail almost 2 years ago:
https://lists.x.org/archives/xorg-devel/2021-July/058735.html

Every line of it is still valid.  Unfortunately, I never received any response, and after some months I kinda lost hope in upstream doing anything on this matter.

Thanks for pointing me to Jon Turnery's MR, which I was not aware of.  Upstream doesn't seem to have made any progress since then (almost 1 year and a half ago).  Glancing at the MR, I have doubts about their approach, in particular the fact that this should be fixed in the server.  I think it is likely it should fixed in xtrans instead.  After all, this is where sockets are created or unlinked, which should not happen without a locking mechanism.  Moreover, as pointed out in the mail, the current lock file mechanism in the X server is logically incorrect and could be defeated in arguably rare or provoked situations.

I may try to give it a new shot soon (next week perhaps), but not making hard promises.

Thanks and regards.
Comment 9 Gleb Popov freebsd_committer freebsd_triage 2023-09-01 08:21:41 UTC
May I push this in? This change seems safe for us as `/etc/rc.d/cleartmp` removes stale X sockets on system boot.
Comment 10 Olivier Certner freebsd_committer freebsd_triage 2023-09-01 09:09:25 UTC
(In reply to Gleb Popov from comment #9)

I concur.  The patch is still up-to-date.

zeising@: I don't think it's reasonable to hold this as there is no sign of upstream really pondering the topic and given that:
- This problem has been preventing the launch of multiple concurrent sessions, an important use case, for years already.
- I've been using it without the slightest side effect for years.

Thanks.
Comment 11 commit-hook freebsd_committer freebsd_triage 2023-09-01 11:49:05 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=3312d027696ff994d07c434a2baaba1f5b18d998

commit 3312d027696ff994d07c434a2baaba1f5b18d998
Author:     Gleb Popov <arrowd@FreeBSD.org>
AuthorDate: 2023-09-01 11:43:07 +0000
Commit:     Gleb Popov <arrowd@FreeBSD.org>
CommitDate: 2023-09-01 11:47:43 +0000

    x11/xtrans: Don't unlink X11 Unix sockets.

    Without this change calling `Xorg -displayfd ..` would always make X to use
    :0 as the DISPLAY. This was a showstopper for running parallel desktop sessions.

    The leftover UDS' are cleaned up during system's boot by the /etc/rc.d/cleartmp
    service.

    Based on work of jbeich and Olivier Certner <olivier.freebsd@free.fr>.

    PR:             253277
    Approved by:    manu
    Sponsored by:   Serenity Cybersecurity, LLC
    Differential Revision:  https://reviews.freebsd.org/D30557

 net/tigervnc-server/Makefile        | 2 +-
 x11-fonts/libFS/Makefile            | 1 +
 x11-fonts/libXfont/Makefile         | 2 +-
 x11-fonts/libXfont2/Makefile        | 1 +
 x11-fonts/xfs/Makefile              | 1 +
 x11-servers/xarcan/Makefile         | 2 +-
 x11-servers/xorg-server/Makefile    | 2 +-
 x11-servers/xwayland-devel/Makefile | 1 +
 x11/cinnamon-session/Makefile       | 2 +-
 x11/gnome-session/Makefile          | 2 +-
 x11/libICE/Makefile                 | 2 +-
 x11/libSM/Makefile                  | 1 +
 x11/libX11/Makefile                 | 1 +
 x11/mate-session-manager/Makefile   | 2 +-
 x11/xtrans/Makefile                 | 4 ++++
 x11/xtrans/distinfo                 | 2 ++
 16 files changed, 20 insertions(+), 8 deletions(-)
Comment 12 Gleb Popov freebsd_committer freebsd_triage 2023-09-01 11:50:59 UTC
This is finally landed. Thanks Olivier for the fix!
Comment 13 Olivier Certner freebsd_committer freebsd_triage 2023-09-01 20:51:20 UTC
Great! Thanks Gleb!
Comment 14 Olivier Certner freebsd_committer freebsd_triage 2023-09-04 14:34:44 UTC
For the record, on "stale" socket files:
- Xorg has a signal handler to clean up sockets on abnormal termination.  So, unless the process is killed with SIGKILL or something else is very wrong, abnormal termination doesn't leave stale sockets.
- The automatic display selection ('-displayfd' and no explicit display ID passed) will automatically skip the stale socket and associated display ID and assign another ID.
- Reboot leads to socket cleanup, as Gleb pointed out.

That's why I've never had the slightest problem with this patch and why you most probably won't either (unless you love to kill your Xorg and insist on explicitly reusing such a killed display without an intervening reboot).