Bug 234822 - [PATCH] sysutils/tmux: Add utf8proc option to Makefile
Status: Closed FIXED
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: Any Any
Importance: --- Affects Some People
Assignee: Mathieu Arnold
Depends on:
Reported: 2019-01-10 14:30 UTC by David O'Rourke
Modified: 2020-11-21 18:41 UTC
CC List: 3 users

See Also:
bugzilla: maintainer-feedback? (mat)

sysutils/tmux: Add utf8proc option to Makefile (767 bytes, patch)
2019-01-10 14:30 UTC, David O'Rourke
newtest.txt (62 bytes, text/plain)
2019-04-03 20:51 UTC, David O'Rourke

Description David O'Rourke 2019-01-10 14:30:54 UTC
Created attachment 200987
sysutils/tmux: Add utf8proc option to Makefile

This patch adds an option to compile tmux with utf8proc support, and defaults it to on.

This patch comes after I spent a while tracking down why I was having issues with various characters in the terminal and reading through https://github.com/tmux/tmux/issues/1057. Within that issue, there is a newtest.txt file which I was using to test tmux in a reliable way.

The characters within the file are a "hugging face" and a "glowing star". They are definitely printable characters. They're listed at emojipedia as:
  - https://emojipedia.org/glowing-star/
  - https://emojipedia.org/hugging-face/

Tmux compiled without utf8proc will show the following when displaying the above file:
Unicode 1f917, wcwidth() -1
input_top_bit_set 4 '\360\237\244\227' (width 1)
Unicode 1f31f, wcwidth() -1
input_top_bit_set 4 '\360\237\214\237' (width 1)

The wcwidth() lines show that wcwidth(3) returned -1 for both characters, meaning it did not recognise them as printable. They end up with a width of 1, which leads to display issues later on.

Once compiled against utf8proc, the above becomes:
input_top_bit_set 4 '\360\237\244\227' (width 2)
input_top_bit_set 4 '\360\237\214\237' (width 2)

The character width is correctly detected and display issues no longer happen.
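
As a cross-check on the log lines above, the octal-escaped byte sequences can be decoded back to code points and compared with the Unicode East Asian Width property. This is only a minimal Python sketch, not part of the patch or of tmux. The helper name `decode_octal_escapes` is made up for illustration, and `unicodedata` reflects whatever Unicode version the local Python ships, which may differ from the tables used by libc or utf8proc.

```python
import unicodedata


def decode_octal_escapes(escaped: str) -> str:
    """Turn a string like r'\360\237\244\227' into the text it encodes."""
    # Hypothetical helper: split on backslashes and read each part as octal.
    raw = bytes(int(part, 8) for part in escaped.split("\\") if part)
    return raw.decode("utf-8")


# Byte sequences copied from the input_top_bit_set log lines.
samples = [r"\360\237\244\227", r"\360\237\214\237"]

for escaped in samples:
    char = decode_octal_escapes(escaped)
    # 'W' (wide) or 'F' (fullwidth) means two terminal cells are expected.
    width_class = unicodedata.east_asian_width(char)
    print(f"U+{ord(char):04X} east_asian_width={width_class}")
```

A width class of W or F for both characters would be consistent with the "width 2" that tmux reports once it is built against utf8proc.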

I made this an option rather than unconditionally enabling it in CONFIGURE_ARGS, in case someone, somewhere depends on the odd wcwidth() behaviour.

Comment 1 David O'Rourke 2019-04-03 20:51:29 UTC
Created attachment 203355

This is the test file from GitHub, attached here to avoid having to go hunting for it. I should have included it from the beginning.
Comment 2 commit-hook freebsd_committer 2019-04-10 16:01:54 UTC
A commit references this bug:

Author: mat
Date: Wed Apr 10 16:01:38 UTC 2019
New revision: 498577
URL: https://svnweb.freebsd.org/changeset/ports/498577

  Add a default option to use utf8proc for Unicode normalization,
  case-folding, and other operations.

  It is substantially better and more up-to-date than the libc functions
  providing the same features.

  PR:		234822
  Submitted by:	David O'Rourke

Comment 3 Ulrich Spörlein freebsd_committer 2019-09-12 19:25:17 UTC
Hi, I'd like to reopen this and get the default changed to OFF, as it breaks tmux+vim for various Unicode characters (most notably, the ones that I use as listchars!)

adamw@ and I both were able to reproduce this.

This file reproduces it:
% cat test
123456
»˙»˙»˙
% xxd test
00000000: 3132 3334 3536 0ac2 bbcb 99c2 bbcb 99c2  123456..........
00000010: bbcb 990a                                ....

With tmux+utf8proc+vim, the second line is rendered 12 columns wide instead of 6, and each ˙ is always followed by a trailing space.

With tmux+vim and no utf8proc, it renders fine, just like with cat(1).
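
One possible starting point for debugging is to decode the bytes from the xxd dump and look at what the second line actually contains. The Python sketch below is an illustration only: it does not diagnose the tmux/vim interaction, and the width class it prints comes from Python's own `unicodedata` tables, which are not necessarily what libc, utf8proc, or vim use.

```python
import unicodedata

# Bytes copied from the xxd dump of the test file.
raw = bytes.fromhex("313233343536 0a c2bb cb99 c2bb cb99 c2bb cb99 0a")

lines = raw.decode("utf-8").splitlines()
first, second = lines[0], lines[1]

# Both lines contain six code points, so six columns would be expected.
print(len(first), len(second))

# List each distinct character of the second line with its width class.
for char in sorted(set(second)):
    name = unicodedata.name(char)
    width_class = unicodedata.east_asian_width(char)
    print(f"U+{ord(char):04X} {name} east_asian_width={width_class}")
```

If any of the characters is reported as ambiguous width (class A), a disagreement between tmux and vim over how wide to draw it would be one thing to check.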

Please note that *outside* of tmux, vim also renders this fine. The bug is likely in vim (because neovim is fine both ways), but until we've figured that out, I would suggest not shipping a broken tmux, on the assumption that many, many people use vim inside tmux.

And I would also rather break the frivolous emojis than basic punctuation characters that have been around for decades.

Any pointers on how to debug this further are greatly appreciated.

Unless I hear strong objections, I'll flip the option to OFF in a week or two. Thanks!
Comment 4 Adam Weinberger freebsd_committer 2019-09-12 21:38:24 UTC
Is there any knob in tmux that will disable using utf8proc? I know it's a long shot, but I'd like to avoid us having to turn it off for everybody if possible.
Comment 5 Mathieu Arnold freebsd_committer 2019-09-14 21:46:11 UTC
Yeah, I've been having problems with wide chars, not just in vim, but also when using those kinds of characters in a zsh prompt, for instance.
Fun fact: it's broken in some ways with it, and broken in other ways without it...
Comment 6 Adam Weinberger freebsd_committer 2019-09-14 23:12:27 UTC
Come to think of it, that's true. I'd forgotten all about this: I had to disable a section of my zsh prompt because it mishandled a wide char under tmux.

Mat, what's your take on this? Do the benefits of utf8proc on by default outweigh the downsides? I don't know enough to give an opinion, so I'm dumping this on your doorstep.
Comment 7 Mathieu Arnold freebsd_committer 2019-09-17 07:05:24 UTC
Well, I don't think I ever encountered any problem before this, but it seems someone did. So, I don't know.
Comment 8 Ulrich Spörlein freebsd_committer 2019-10-16 16:01:45 UTC
David, are you ok with flipping the default to OFF for utf8proc?

Do you also have any more insight into how we could debug this?
Comment 9 Mathieu Arnold freebsd_committer 2020-02-11 13:47:11 UTC
Disabling it by default for now.
Comment 10 W.J. van der Laan 2020-11-21 18:41:47 UTC
FWIW, compiling tmux with this option solved the issues I was having with weechat IRC in tmux. Thanks!