240393 – x11/xterm : uxterm puts UTF-8 encoded strings to WM_NAME property

Summary: x11/xterm : uxterm puts UTF-8 encoded strings to WM_NAME property

Status:	Closed FIXED

Alias:	None

Product:	Ports & Packages
Classification:	Unclassified
Component:	Individual Port(s)
Version:	Latest
Hardware:	Any Any

Importance:	--- Affects Some People
Assignee:	Emanuel Haupt

URL:
Keywords:

Depends on:
Blocks:

Reported:	2019-09-07 17:30 UTC by shamaz.mazum
Modified:	2019-10-04 08:19 UTC
CC List:	1 user

See Also:

Flags:	bugzilla: maintainer-feedback? (ehaupt)

Attachments
Dirty fix (391 bytes, text/plain) 2019-09-07 17:30 UTC, shamaz.mazum	no flags	Details
screenshot showing hex-encoded UTF-8 (52.04 KB, image/png) 2019-09-18 00:37 UTC, Thomas E. Dickey	no flags	Details

Description shamaz.mazum 2019-09-07 17:30:00 UTC

Created attachment 207263 [details]
Dirty fix

Hello! uxterm puts a UTF-8 encoded string into WM_NAME instead of _NET_WM_NAME. If you have non-ASCII characters in the window title, the window manager will not show it correctly.

If you define the IsSetUtf8Title(xw) macro to 1, window titles are shown correctly.

So how can it be fixed correctly? Is it related to bug #229682?

Comment 1 Emanuel Haupt freebsd_committer

2019-09-08 09:18:37 UTC

Just to make sure I understand the issue, could you please provide me with a reproducible use case?

Comment 2 shamaz.mazum 2019-09-08 09:30:39 UTC

For example, you can run ncmpcpp in uxterm, play any song with Russian letters in its title (or just letters like ä or ü), and look at the title of the uxterm window. Most likely it will contain unreadable garbage (as in e16) or will be empty (as in StumpWM).

Comment 3 shamaz.mazum 2019-09-08 17:12:17 UTC

In reply to comment #1.

Another example (you do not have to use ncmpcpp for this one).

1) Check that your locale ends with ".UTF-8".
2) Open uxterm and run python.
3) Run the command print("\033]0;Übermensch\a")
4) Start another xterm and run xprop | grep WM_NAME in it.
5) Click on the xterm that is running python.
6) Get garbage (WM_NAME(STRING) = "Ã?bermensch") in the xprop output.

7) Apply the dirty fix and repeat the test. This time you will get:
vasily@vonbraun:~ % xprop | grep WM_NAME
_NET_WM_NAME(UTF8_STRING) = "Übermensch"
WM_NAME(STRING) = "Übermensch"
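
The garbage in step 6 can be reproduced without X at all. The following is only a minimal sketch, assuming the title bytes are stored unchanged and then read back as ISO-8859-1 (which is what the STRING type holds); it does not talk to xterm or the X server.

```python
# Sketch: why a UTF-8 title shows up as garbage in WM_NAME(STRING).
# Assumption: the raw UTF-8 bytes are stored unchanged and later read
# as ISO-8859-1 (Latin-1).
title = "Übermensch"

# The bytes that the OSC 0 sequence from step 3 carries as the title text.
raw = title.encode("utf-8")

# A reader that trusts the STRING type decodes them as ISO-8859-1.
misread = raw.decode("latin-1")

# "Ü" (U+00DC) is C3 9C in UTF-8: 0xC3 reads as "Ã", and 0x9C is a C1
# control character, which is not printable (hence the "?" in xprop).
print(repr(misread))
```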

Comment 4 Thomas E. Dickey 2019-09-08 18:18:06 UTC

It's been a while (and perhaps some problem to fix),
but the place to start is with the utf8Title and titleModes resources:

https://invisible-island.net/xterm/manpage/xterm.html#VT100-Widget-Resources:utf8Title

and

https://invisible-island.net/xterm/manpage/xterm.html#VT100-Widget-Resources:titleModes

(that is, it's configurable, defaulting to the standard behaviour).

Comment 5 Thomas E. Dickey 2019-09-08 18:20:05 UTC

That's from patch #210, by the way:

https://invisible-island.net/xterm/xterm.log.html#xterm_210

Comment 6 Thomas E. Dickey 2019-09-08 18:21:44 UTC

So... if your script writes a UTF-8 string,
what do you suppose should be in WM_NAME?

Comment 7 shamaz.mazum 2019-09-08 19:01:58 UTC

If you write a UTF-8 encoded string to WM_NAME, then this property must have its type set to "UTF8_STRING", not "STRING", no matter how xterm is configured. Moreover, EWMH suggests that an application should set _NET_WM_NAME to the window title in UTF-8 encoding, and _NET_WM_NAME then has priority over WM_NAME.

Setting WM_NAME of type STRING to UTF-8 encoded string is clearly an application error.
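
To illustrate the priority rule, here is a rough sketch of how a window manager might pick the title. The `props` mapping and the `window_title` function are hypothetical names made up for this example (no real window manager or X library API is implied), and COMPOUND_TEXT is ignored for brevity.

```python
# Sketch of an EWMH-style title lookup.
# `props` is a hypothetical mapping: property name -> (type name, raw bytes).
def window_title(props):
    net = props.get("_NET_WM_NAME")
    if net is not None and net[0] == "UTF8_STRING":
        # _NET_WM_NAME takes priority and is expected to be UTF-8.
        return net[1].decode("utf-8", errors="replace")

    legacy = props.get("WM_NAME")
    if legacy is None:
        return ""
    type_name, data = legacy
    if type_name == "UTF8_STRING":
        return data.decode("utf-8", errors="replace")
    # STRING is ISO-8859-1: every byte sequence decodes, so UTF-8 data
    # stored under this type is silently misread instead of rejected.
    return data.decode("latin-1")


utf8_bytes = "Übermensch".encode("utf-8")
print(repr(window_title({"WM_NAME": ("STRING", utf8_bytes)})))
print(repr(window_title({"_NET_WM_NAME": ("UTF8_STRING", utf8_bytes),
                         "WM_NAME": ("STRING", utf8_bytes)})))
```

With only WM_NAME(STRING) present, the function returns the misread text; once _NET_WM_NAME is set as well, the correct title wins.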

I checked this on Debian and got the same result, so I was mistaken in assuming that this is FreeBSD-specific. Maybe I should report a bug to the xterm developers.

Comment 8 shamaz.mazum 2019-09-08 19:11:17 UTC

See here, for example: https://github.com/stumpwm/stumpwm/issues/641

I had StumpWM crashing over and over until I figured out that uxterm was setting a wrongly encoded WM_NAME property (and I think a user cannot be expected to know about the utf8Title resource). Imagine that there is another window manager that will crash because of this behavior. I do not think that uxterm behaves correctly in this situation.

The question for me is the following: should this behavior be fixed in FreeBSD ports (as a patch) or should I report it to xterm developers?

Comment 9 shamaz.mazum 2019-09-08 19:27:58 UTC

I am rereading the xterm manual.

>           However, some users may wish to write a title string encoded in
>           UTF-8.  The window manager is responsible for drawing window
>           titles.  Some window managers (not all) support UTF-8 encoding
>           of window titles.  Set this resource to "true" to allow UTF-8
>           encoded title strings.  That cancels the translation to UTF-8,
>           allowing UTF-8 strings to be displayed as is.

If it is really true that not all window managers support UTF-8 window titles nowadays, then maybe everything should be left as is; I really don't know.

Comment 10 Thomas E. Dickey 2019-09-08 20:35:54 UTC

xterm doesn't set WM_NAME directly: it sets _NET_WM_NAME directly.

It uses the XtNtitle resource to tell libXt to set the title;
in turn libXt calls XSetWMName, which sets WM_NAME.

Interestingly, libXt has a titleEncoding resource which predates
the UTF8_STRING format.  If xterm were to set WM_NAME using
UTF8_STRING, that would have to be by setting the X property directly,
and would probably confuse any application which uses the value.

It would be useful to have a table showing what different
window managers do with UTF-8 strings in WM_NAME and/or _NET_WM_NAME.

Comment 11 Thomas E. Dickey 2019-09-08 20:50:15 UTC

By the way, a UTF-8 string is (probably) a valid ISO-8859-1 string,
so (unless xterm is told differently), it just passes the data off
via libXt / XSetWMName to handle it.

Comment 12 Thomas E. Dickey 2019-09-08 21:37:42 UTC

Looking to see where it's actually setting the UTF8_STRING type,
I don't see it directly, but it's based on an XUTF8StringStyle value
handled in src/xlibi18n/lcTxtPr.c

This q/a appears relevant:

https://stackoverflow.com/questions/7296302/x11-xm-name-type-is-utf-8-rather-than-string-utf8

saying that your locale settings trigger the behavior, telling
libX11 + libXt to create the UTF8_STRING value.

Comment 13 Thomas E. Dickey 2019-09-15 21:28:02 UTC

In the past week, I wrote a test-driver for the titleModes resource
and the title-stacking feature.  At first glance, it might seem to
"work" to just use the UTF-8 string as suggested.  But there's a
problem with that: some of the UTF-8 bytes lie inside the C1 controls,
and an earlier stage suppresses those.  If you happened to find one
of those, you'd see a "?" in the title. There's a command-line option
which disables most of the C1 control behavior, but that particular
part of the code is not affected.

Past that, this report deals with a special case: the UTF-8 can be
translated into ISO-8859-1 (what STRING holds) without loss.  Most
of the UTF-8 possibilities can't do that.  That's the reason for the
titleModes, to allow for those possibilities (by hex-encoding).

You don't see this with other terminals of course, since they
emulate VT100 (or some subset of that), while xterm emulates
VT220/VT420.  The former are 7-bit controls only, while the latter
are 7-bit or 8-bit controls.
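
Both points can be checked for a given title with a small sketch. The helper names `fits_latin1` and `c1_bytes` are invented for this example, and the sketch only inspects the string; it does not model what xterm itself does with the bytes.

```python
def fits_latin1(title):
    """True if every character has an ISO-8859-1 code point (U+0000..U+00FF)."""
    return all(ord(ch) <= 0xFF for ch in title)


def c1_bytes(title):
    """UTF-8 bytes of the title that fall in the C1 control range 0x80..0x9F."""
    return [b for b in title.encode("utf-8") if 0x80 <= b <= 0x9F]


for title in ("Übermensch", "ä", "Беломорс"):
    print(title, fits_latin1(title), [hex(b) for b in c1_bytes(title)])
```

"Übermensch" fits into ISO-8859-1 but its UTF-8 form contains the C1 byte 0x9C; a Cyrillic title does not fit into ISO-8859-1 at all.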

Comment 14 shamaz.mazum 2019-09-17 13:52:49 UTC

As I see it, my issue is not a bug but just a misconfigured xterm. Maybe you can help me. As I understand it, I need to set the X resource XTerm.vt100.titleModes to 2 to make xterm store window titles as UTF8_STRING. From the manual:

> titleModes (class TitleModes)
>               Tells xterm whether to accept or return window- and icon-labels
>               in ISO-8859-1 (the default) or UTF-8.  Either can be encoded in
>               hexadecimal.  The default for this resource is “0”.
>  2    Set window/icon labels using UTF-8 (overrides utf8Title resource).

I am unfamiliar with X resources. I created a file ~/.Xresources with the following content:

> XTerm*vt100.titleModes: 2

and ran xrdb -merge ~/.Xresources. But I still get the following with xprop:

> WM_NAME(STRING) = "Ð?Ð¾Ñ?Ð¾Ð»Ñ? Ð¸ Ð¨Ñ?Ñ? - Ð¡Ð¼ÐµÐ»Ñ?Ñ?Ð°Ðº Ð¸ Ð²ÐµÑ?ÐµÑ?"

I want something like this:

> _NET_WM_NAME(UTF8_STRING) = "Беломорс - Излучатель СВЧ"

For now I use my dirty patch, but how can I do this with X resources? Can you give me instructions? Thanks)

Comment 15 Thomas E. Dickey 2019-09-18 00:35:35 UTC

It's a little more complicated than that.  The titleModes bit #2 for UTF-8
is only useful if you first hex-encode the string (bit #0 of titleModes).
I'll add a screenshot from my test-driver to illustrate.
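
As a rough illustration of what "hex-encode the string" could look like on the sending side, here is a sketch. It assumes that the hex form is simply the UTF-8 bytes of the title written as hexadecimal digits and that xterm has already been put into the hex + UTF-8 title mode; both are assumptions for this example, not something confirmed in this thread. The function names are made up.

```python
def hex_title(title):
    # Assumption: the hex form is the UTF-8 bytes as lowercase hex digits.
    return title.encode("utf-8").hex()


def osc_set_title(payload):
    # OSC 0 ; text BEL, the same sequence used in comment 3.
    return "\033]0;" + payload + "\a"


payload = hex_title("Übermensch")
print(payload)                       # pure ASCII, so no C1 bytes are involved
print(repr(osc_set_title(payload)))
```

The point of the hex form is that the payload is plain ASCII, so it passes the stage that suppresses C1 controls untouched.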

Comment 16 Thomas E. Dickey 2019-09-18 00:37:25 UTC

Created attachment 207592 [details]
screenshot showing hex-encoded UTF-8

The title-string uses double-width codes from the FF00 code block.

Comment 17 Thomas E. Dickey 2019-09-18 00:40:32 UTC

I'm starting to look into whether I can (cleanly) provide a way using the "+k8"
option to accept UTF-8 strings in a way that would let you more easily script this stuff.

Comment 18 shamaz.mazum 2019-09-18 06:08:21 UTC

The following .Xresources works for me:

> xterm*vt100.utf8Title: true

I tried different capitalization (like XTerm* or True), but nothing else seems to work. I was completely unfamiliar with X resources, and the experience with them has been very unpleasant. At least everything now works as expected.

IMHO, all this X resources stuff is completely redundant and adds more complexity, because we have good old command line arguments ;)

Many thanks for pointing me to the solution.

Comment 19 Thomas E. Dickey 2019-09-18 07:44:11 UTC

no problem... 

Actually that particular choice is a special case (which I'll try to explain in the manual page).  Once #349 is ready, the preferred solution would use the allowC1Printable resource - see

https://invisible-island.net/xterm/manpage/xterm.html#VT100-Widget-Resources:allowC1Printable

(but as I commented earlier, that route doesn't work yet).

Comment 20 commit-hook freebsd_committer

2019-09-23 07:14:47 UTC

A commit references this bug:

Author: ehaupt
Date: Mon Sep 23 07:14:26 UTC 2019
New revision: 512613
URL: https://svnweb.freebsd.org/changeset/ports/512613

Log:
  - Update to 349
  - Pacify portlint

  PR:		240393 (ref. in changelog)

Changes:
  head/x11/xterm/Makefile
  head/x11/xterm/distinfo