Bug 286692 - unix(4): regressed after base d15792780760
Summary: unix(4): regressed after base d15792780760
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 15.0-CURRENT
Hardware: Any Any
: --- Affects Some People
Assignee: Gleb Smirnoff
URL:
Keywords: regression
Depends on:
Blocks:
 
Reported: 2025-05-09 11:36 UTC by Evgenii Khramtsov
Modified: 2025-05-13 13:37 UTC (History)
3 users (show)

See Also:


Attachments
kdump.stdout.txt (23.85 KB, text/plain)
2025-05-12 15:46 UTC, Evgenii Khramtsov
no flags Details
kdump2.stdout.txt.zst (10.94 KB, application/octet-stream)
2025-05-12 16:25 UTC, Evgenii Khramtsov
no flags Details
kdump3.stdout.txt (294.66 KB, text/plain)
2025-05-12 18:21 UTC, Evgenii Khramtsov
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Evgenii Khramtsov 2025-05-09 11:36:57 UTC
x11-wm/sway starts normally, though executing a command through it, e.g. dmenu-wl

(I have this in ~/.config/sway/config):
set $menu dmenu-wl_path | dmenu-wl -h 30 -nb '#FFFFFF' -sb '#BFBFBF' -nf '#000000' -sf '#000000' | xargs swaymsg exec --

$ dbus-run-session seatd-launch sway
($mod+d pressed, dmenu-wl appears)
(sway starts consuming 100% WCPU in top(1) output)

Another config hotkey exec that causes this:
$mod+End exec 'swaymsg "output * power off"'

Using a hotkey that spawns x11/foot terminal through exec does not cause this issue,
so maybe swaymsg reproduces this?

git bisect told me this is base d15792780760

I reverted the commits fixing d15792780760 (to avoid revert d15792780760 conflicts),
then reverted d15792780760, rebuilt kernel, then the issue is gone.

Sorry, I can't provide a more minimal reproducer because ENOTIME.
Comment 1 Maxim Konovalov freebsd_committer freebsd_triage 2025-05-09 21:07:16 UTC
Gleb probably wants to take a look.
Comment 2 Gleb Smirnoff freebsd_committer freebsd_triage 2025-05-12 14:18:50 UTC
Eugene, what was the most recent revision you have tried? There were
a few fixups after d15792780760.  Have you tried 1df3097d5525 or something
newer?
Comment 3 Evgenii Khramtsov 2025-05-12 14:47:01 UTC
(In reply to Gleb Smirnoff from comment #2)

I am on base b0dd1a604810, and I reboot every half-a-week or so. I fired another base+ports build now to confirm and will report back the state of unix(4) once I reboot this machine.
Comment 4 Gleb Smirnoff freebsd_committer freebsd_triage 2025-05-12 14:56:15 UTC
b0dd1a604810 is newer than 1df3097d5525.  If the bug is present on
b0dd1a604810, then no need to recheck.
Comment 5 Gleb Smirnoff freebsd_committer freebsd_triage 2025-05-12 15:22:01 UTC
Can you please capture ktrace from swaymsg? Let's use second case as it looks
simplier:

$mod+End exec 'ktrace -f /tmp/ktrace.out swaymsg "output * power off"'

Then, cd /tmp && kdump.
Comment 6 Evgenii Khramtsov 2025-05-12 15:44:38 UTC
base eed3be47967f ports 4e97ab08bdfa still affected, will post kdump shortly
Comment 7 Evgenii Khramtsov 2025-05-12 15:46:04 UTC
Created attachment 260351 [details]
kdump.stdout.txt

started sway, did $mod+End, got 100% WCPU, killed sway
Comment 8 Gleb Smirnoff freebsd_committer freebsd_triage 2025-05-12 16:02:52 UTC
Pardon my misunderstanding, this was sway eating CPU, not swaymsg.  Can you
please do this instead:

ktrace -f /tmp/ktrace.out $sway_pid

, then reproduce the problem, kill sway, look at kdump in /tmp.

Anyway, your captured ktrace from swaymsg can also be useful, too, as we
learned what exactly the other side sends to the socket to get sway into
bad state.
Comment 9 Evgenii Khramtsov 2025-05-12 16:25:10 UTC
Created attachment 260352 [details]
kdump2.stdout.txt.zst

$ du -A /tmp/ktrace.out
4174032	/tmp/ktrace.out

Oh no-no-no :D

That was swaymsg, sleep 10, then killall sway

Then I did

$ kdump | head -c 16384000 | zstd -9 -o kdump2.stdout.txt.zst

which should give more looped entries than enough I guess.
Comment 10 Gleb Smirnoff freebsd_committer freebsd_triage 2025-05-12 17:37:43 UTC
Can you please also record ktrace of sway on kernel with reverted d15792780760?
That will help a lot to compare the trace with the old kernel versus the
regressed one.
Comment 11 Evgenii Khramtsov 2025-05-12 18:21:47 UTC
Created attachment 260356 [details]
kdump3.stdout.txt

> Can you please also record ktrace of sway on kernel with reverted d15792780760?

Yessir!