Bug 228539

Summary: [PATCH] for /usr/bin/man when using multibyte characters (utf-8).
Product: Base System
Reporter: Michihiro Satoh <satoumc>
Component: bin
Assignee: Yuri Pankov <yuripv>
Status: In Progress
Severity: Affects Many People
CC: bapt, cem, emaste, yuripv
Priority: ---
Keywords: patch
Version: CURRENT   
Hardware: Any   
OS: Any   
Attachments:
Description                              Flags
patch itself.                            none
patch for manualpage of man(1)           none
screenshot of correct output of groff.   none
screenshot of broken output of mandoc.   none

Description Michihiro Satoh 2018-05-27 10:42:38 UTC
Created attachment 193757 [details]
patch itself.

The /usr/bin/man command has the following problems
when the manual source files are UTF-8 encoded
and written in a language that uses multibyte characters.

A: When man uses the 'mandoc' command
  A1: Line folding is of quite low quality,
      and the text layout is often corrupted.
B: When man uses the 'groff' command
  (it is used when a problem occurs in mandoc)
  B1: The command cannot recognize that the source is UTF-8, so the output
      is garbled and users cannot read it.
  B2: Line folding is of quite low quality,
      and the text layout is often corrupted.
  B3: The new mdoc(7) format is not supported.

I made a patch to solve these problems. It is very short.
 - The user can select whether to use 'mandoc' or 'groff'
   with the environment variable MANPROC.
 - The new groff is used with appropriate options (-D$nroff_dev, -mandoc).
 - The user can pass any necessary options to groff
   with the environment variable MANROFFOPT.
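
The selection described in the list above can be sketched roughly as follows.
This is not the attached patch, only an illustration of the idea: the function
name build_formatter_cmd is hypothetical, the default of 'mandoc' when MANPROC
is unset is an assumption, and the real command lines carry further options
that are omitted here. The -D$nroff_dev and -mandoc options are taken as
written in this report.

```shell
#!/bin/sh
# Hypothetical helper: print the formatter command that would be chosen.
# $1 stands in for man's nroff_dev value (assumed to be "utf8" by default).
build_formatter_cmd() {
    dev="${1:-utf8}"
    # MANPROC is the variable proposed in this report; assume mandoc if unset.
    case "${MANPROC:-mandoc}" in
    groff)
        # Append MANROFFOPT only when it is set and non-empty.
        printf '%s\n' "groff -D${dev} -mandoc${MANROFFOPT:+ ${MANROFFOPT}}"
        ;;
    *)
        # Other mandoc options are omitted in this sketch.
        printf '%s\n' "mandoc"
        ;;
    esac
}

# Example: a Japanese-speaking user who prefers groff.
( MANPROC=groff; MANROFFOPT=-mja; build_formatter_cmd utf8 )
```

The helper only prints the command instead of running it, so the choice can
be inspected without groff or mandoc being installed.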

Users can avoid the problems as follows:
 A1: Set MANPROC="groff" and let groff handle the formatting.
     Once the 'mandoc' command is improved in the future, the user can
     undo this setting.
 B1: Passing -D$nroff_dev to the new groff solves this automatically.
 B2: In Japanese-speaking regions, the problem can be solved by setting
     MANROFFOPT="-mja".
     Users in French-speaking countries will probably want "-mfr", and
     users in German-speaking countries "-mde".
 B3: Because -mandoc is passed instead of -man to the newer groff,
     it automatically distinguishes man(7) from mdoc(7) and applies the
     appropriate formatting.
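
From the user's side, the workarounds above amount to a shell profile fragment
like the following. This is a minimal sketch that assumes the patch from this
report is applied, since MANPROC is the variable the patch proposes. The
profile file and the example page name are only illustrative.

```shell
# For example in ~/.profile (sh-compatible shells).
# MANPROC is proposed by the patch in this report.
export MANPROC="groff"
# Ask groff to load its Japanese localization macros.
export MANROFFOPT="-mja"

# Afterwards, view a page as usual, for example:
#   man ls
```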

This modification changes nothing for English-speaking users
who currently have no problems, and no new problems will occur for them.
Comment 1 Michihiro Satoh 2018-05-27 10:45:55 UTC
I tested this patch with 'groff-1.22.3'.
Comment 2 Michihiro Satoh 2018-06-02 02:30:23 UTC
Created attachment 193908 [details]
patch for manualpage of man(1)
Comment 3 Yuri Pankov freebsd_committer freebsd_triage 2018-11-26 18:04:58 UTC
A bit unrelated: could you please describe the issues in A1 in a bit more detail?
Comment 4 Michihiro Satoh 2018-11-27 08:00:18 UTC
The result of text formatting with the 'mandoc' command seems to be almost as bad as that of groff without the "-mja" option.
I can only evaluate Japanese accurately. In Japanese, apart from a few characters with placement constraints, a line should be folded wherever it reaches the end of the line. However, 'mandoc' does not break lines at all except at punctuation marks or spaces, so lines end up either too short or so long that they overflow.
Comment 5 Yuri Pankov freebsd_committer freebsd_triage 2018-11-27 19:32:47 UTC
I'll take this.

Meanwhile, could you please file a separate issue for mandoc describing the incorrect rendering, with mdoc source and maybe a screenshot of the rendered output?  There's little hope it will be fixed if we aren't aware of it :-).
Comment 6 Yuri Pankov freebsd_committer freebsd_triage 2018-11-27 19:34:49 UTC
Or, it would be much better if you could report the mandoc issue directly to the developers -- take a look at http://mandoc.bsd.lv/contact.html (you want the discuss list).  TIA!
Comment 7 Michihiro Satoh 2018-12-17 08:36:02 UTC
Created attachment 200185 [details]
screenshot of correct output of groff.
Comment 8 Michihiro Satoh 2018-12-17 08:37:14 UTC
Created attachment 200186 [details]
screenshot of broken output of mandoc.
Comment 9 Michihiro Satoh 2018-12-17 08:38:47 UTC
(In reply to Yuri Pankov from comment #5)
These images show correct output produced by groff with the -mja option and erroneous output produced by mandoc.