Bug 274132 - groff 1.23.0: some manual pages show an UNTITLED() topic
Status: Open
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: Any Any
Importance: --- Affects Some People
Assignee: Wolfram Schneider
Blocks: 273245 273903
Reported: 2023-09-27 15:22 UTC by Wolfram Schneider
Modified: 2023-12-28 15:42 UTC

Description Wolfram Schneider freebsd_committer freebsd_triage 2023-09-27 15:22:54 UTC
Some FreeBSD base manual pages in /usr/share/man do not display the correct document title with the latest groff (1.23.0). For example:

zcat /usr/share/man/man3/krb5_afslog.3.gz | nroff -mandoc | head -1
UNTITLED()                           LOCAL                          UNTITLED()

groff 1.22.4 works fine.

This affects the following manual pages:

/usr/share/man/man3/k_afs_cell_of_file.3.gz
/usr/share/man/man3/k_pioctl.3.gz
/usr/share/man/man3/k_hasafs.3.gz
/usr/share/man/man3/k_unlog.3.gz
/usr/share/man/man3/kafs5.3.gz
/usr/share/man/man3/k_setpag.3.gz
/usr/share/man/man3/kafs.3.gz
/usr/share/man/man3/kafs_set_verbose.3.gz
/usr/share/man/man3/kafs_settoken.3.gz
/usr/share/man/man3/kafs_settoken_rxkad.3.gz
/usr/share/man/man3/kafs_settoken5.3.gz
/usr/share/man/man3/krb5_afslog_uid.3.gz
/usr/share/man/man3/krb5_afslog.3.gz
/usr/share/man/man3/krb_afslog.3.gz
/usr/share/man/man3/krb_afslog_uid.3.gz

and 371 ports manual pages (out of 135k), for example:

https://man.freebsd.org/cgi/man.cgi?obabel
https://man.freebsd.org/cgi/man.cgi?teco
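
A list like the one above could be gathered with a loop around the same `nroff -mandoc | head` check. This is only a minimal sketch, not the method actually used for the report: the function names `is_untitled_header` and `find_untitled` are hypothetical, `gzip -dc` stands in for `zcat`, and it assumes the rendered header is the first line of output, as in the example above.

```shell
#!/bin/sh
# Succeed if the first line on standard input starts with the
# UNTITLED() placeholder that groff 1.23.0 prints in the header.
is_untitled_header() {
    head -n 1 | grep -q '^UNTITLED()'
}

# Print every compressed page below the given directory whose rendered
# header is untitled.  Assumes gzip and nroff (groff) are installed.
find_untitled() {
    find "$1" -type f -name '*.gz' | while IFS= read -r page; do
        if gzip -dc "$page" | nroff -mandoc 2>/dev/null | is_untitled_header; then
            printf '%s\n' "$page"
        fi
    done
}
```

Usage would be along the lines of `find_untitled /usr/share/man/man3`.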
Comment 1 Mina Galić freebsd_triage 2023-09-27 20:57:46 UTC
any idea why?
Comment 2 G. Branden Robinson 2023-09-28 03:29:48 UTC
(In reply to Mina Galić from comment #1)
I recommend checking the first few macro calls in the pages to see if they are in the required order: Dd, Dt, Os.

Judging by <https://opensource.apple.com/source/Heimdal/Heimdal-498/lib/kafs/kafs.3.auto.html>, that is the problem.  It uses the ordering Dd, Os, Dt.

Example:

$ printf '.Dd 2023-09-27\n.Os WackyOS\n.Dt foobar 1\n.Sh Name\n.Nm foobar\n.Nd wacky tobacky\n' | groff -mdoc -Tascii
UNTITLED()                           LOCAL                          UNTITLED()

Name
       foobar -- wacky tobacky

WackyOS                           2023-09-27                         foobar(1)

groff 1.23.0 is strict about this because it had to be to improve formatting of multiple man/mdoc documents in sequence.  <https://git.savannah.gnu.org/cgit/groff.git/commit/?id=f911d0075cdae4a9f940ef2cad27e53a7af01b61>

There were also changes around `Dd`, `Dt`, and `Os` to handle degeneracy, that is, outright failure of the mdoc(7) document to use one of these macros at all.  <https://savannah.gnu.org/bugs/?62774>
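
The ordering check recommended above can be sketched as a small shell function. This is an illustration under stated assumptions, not part of groff: `prologue_order` and `has_canonical_prologue` are hypothetical names, only the first call of each macro is considered, and only `.` is recognized as the control character.

```shell
#!/bin/sh
# Print the order in which Dd, Dt and Os are first called in the
# input, for example "Dd Dt Os".  Reads the named files, or standard
# input when no file is given.
prologue_order() {
    awk '
        /^\.[ \t]*(Dd|Dt|Os)([ \t]|$)/ {
            name = $0
            sub(/^\.[ \t]*/, "", name)   # drop the control character
            name = substr(name, 1, 2)    # keep the two-letter macro name
            if (!(name in seen)) {
                seen[name] = 1
                order = order (order == "" ? "" : " ") name
            }
        }
        END { print order }
    ' "$@"
}

# Succeed only if the prologue uses the order Dd, Dt, Os.
has_canonical_prologue() {
    [ "$(prologue_order "$@")" = "Dd Dt Os" ]
}
```

For a compressed page, something like `gzip -dc kafs.3.gz | prologue_order` should show which order the page uses.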
Comment 3 G. Branden Robinson 2023-09-28 03:37:14 UTC
Here is some further motivating detail from a commit message.

      (Dd): Interpret this macro call strictly as starting a new mdoc(7)
      document.  (andoc.tmac already makes this assumption, and has for over
      20 years.  groff_mdoc(7) and mandoc_mdoc(7) also prescribe the
      sequence `Dd`, `Dt`, `Os`.)  We require this invariant even more
      rigidly now because it's the only way we can be sure that we can
      process multiple documents while rendering headers and footers with
      information corresponding to the appropriate document.  (man(7)'s `TH`
      has an advantage here in that calling it is "atomic": from its
      arguments alone you can obtain everything you need to know to format
      the header and footer.  In mdoc(7), permuting the initialization macro
      order reliably produces chaos.)  Break the page (if necessary)
      _before_ processing any arguments (instead of after), to flush the
      previous page's footer.  Stop calling `doc-set-up-titles` here; we
      don't have enough information to do that yet.  Also stop writing the
      PDF bookmark here, because `doc-document-title` and `doc-section` will
      not reflect the new page content yet.

      (Os): Once the `doc-operating-system` string content has been
      determined, call `doc-set-up-titles`, write the PDF bookmark for the
      page, and call `doc-header`, causing the page header to be formatted.
      These changes further imply a stronger requirement on initialization
      macro ordering being canonical.

https://git.savannah.gnu.org/cgit/groff.git/commit/?id=50a2d4165f7b82cb78d7ee96484f776c11a47def
Comment 4 Wolfram Schneider freebsd_committer freebsd_triage 2023-10-01 12:36:47 UTC
(In reply to G. Branden Robinson from comment #2)

what do you mean with "multiple man/mdoc documents in sequence"? Our manual pages are single files, one file per manual page.


I noticed that some ports manual pages have multiple .Os calls in a file, but I guess this is a mistake.
Comment 5 Wolfram Schneider freebsd_committer freebsd_triage 2023-10-01 12:52:02 UTC
Looking at the ports manual pages, we have 377 manual pages with an UNTITLED title. These come from 77 packages. In total we have 32415 packages in the ports collection.

see
https://people.freebsd.org/~wosch/tmp/groff/UNTITLED/
Comment 6 G. Branden Robinson 2023-10-01 13:24:49 UTC
(In reply to Wolfram Schneider from comment #4)
Hi Wolfram,

> what do you mean with "multiple man/mdoc documents in sequence"? Our manual pages are single files, one file per manual page.

Yes, one file per man(7) (or mdoc(7)) document is the idiomatic way to maintain man pages, and man librarian tooling doesn't support any other approach well.

But maintenance is not the same thing as rendering.  groff has documented its man(7) package implementation with the following synopsis basically forever.

Synopsis
       groff -man [option ...] [file ...]
       groff -m man [option ...] [file ...]

The ellipsis after the "file" operand indicates that the previous argument can be repeated.  man(1) programs generally don't bother with this, but a user can assume that the synopsis means what it says and take it at face value.  Given a pile of uncompressed man pages, they might type something like

$ groff -t -man captoinfo.1m clear.1 infocmp.1m infotocap.1m tabs.1 tic.1m toe.1m tput.1 tset.1 | less -R

In groff 1.22.4 and earlier, there were many problems with the resulting output, especially with mdoc(7) documents or when formatting for a typesetter (such as PostScript or PDF output).  The problems fell into 3 categories:

1.  inadequate restoration of formatter state to reasonable defaults between documents;
2.  errors in managing the traps responsible for breaking pages between documents and formatting page headers and footers; and
3.  cosmetic differences between documents written in man(7) versus mdoc(7).

All of these problems undermined the promise of "andoc.tmac", that little piece of _roff_ wizardry that came to your attention recently for automatically loading "an.tmac" or "doc.tmac" depending on which macro package a document required, determined in turn by using a call of the `TH` or `Dd` macros, respectively.
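
The selection rule just described (a `TH` call means man(7), a `Dd` call means mdoc(7)) can be imitated outside of roff for illustration. The sketch below is not how "andoc.tmac" is implemented; `detect_macro_package` is a hypothetical name, and it simply reports whichever of the two macros is called first.

```shell
#!/bin/sh
# Print "man" if the first TH/Dd call in the input is TH, "mdoc" if it
# is Dd, and "unknown" if neither macro is called.  Reads the named
# files, or standard input when no file is given.
detect_macro_package() {
    awk '
        /^\.[ \t]*TH([ \t]|$)/ { print "man";  found = 1; exit }
        /^\.[ \t]*Dd([ \t]|$)/ { print "mdoc"; found = 1; exit }
        END { if (!found) print "unknown" }
    ' "$@"
}
```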

In groff 1.23.0 this all works much better, to the extent that it is now feasible to produce a PDF compilation of a large set of man pages, as groff now does (see <https://www.gnu.org/software/groff/manual/index.html>) and as the Linux man-pages project does as well (piloting some PDF support changes currently being worked on by gropdf(1) author Deri James; <https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/tree/scripts/LinuxManBook>).

Admittedly the problems with mdoc(7) headers and footers in groff 1.22.4 were more fundamental; when formatting for terminals, they didn't render _at all_.

$ /usr/bin/nroff --version
GNU nroff (groff) version 1.22.4
$ /usr/bin/nroff -mdoc -Tascii ./EXPERIMENTS/multiple.mdoc | cat -s

Name
     ls -- list files

Name
     da -- defeat AI

$ nroff --version | head -n 1
GNU nroff (groff) version 1.23.0
$ nroff -mdoc -Tascii ./EXPERIMENTS/multiple.mdoc
ls(1)                       General Commands Manual                      ls(1)

Name
       ls -- list files

First Edition                     1971-11-01                             ls(1)
-------------------------------------------------------------------------------
da(1)                       General Commands Manual                      da(1)

Name
       da -- defeat AI

Debian 12                         2023-09-23                             da(1)

(The "cat -s" with groff 1.22.4 is to elide many useless blank lines; this is one of the page trap management bugs to which I referred.)
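
The contents of "multiple.mdoc" are not shown in this comment. As a hedged reconstruction for anyone who wants to try the comparison, the sketch below emits two minimal mdoc(7) documents back to back, with dates, titles, and operating system strings guessed from the groff 1.23.0 output above. The function name `make_multiple_mdoc` and the exact macro arguments are assumptions; the real test file may differ.

```shell
#!/bin/sh
# Emit two minimal mdoc(7) documents in sequence, each with the
# prologue in the order Dd, Dt, Os.  Hypothetical stand-in for
# ./EXPERIMENTS/multiple.mdoc.
make_multiple_mdoc() {
    cat <<'EOF'
.Dd 1971-11-01
.Dt ls 1
.Os "First Edition"
.Sh Name
.Nm ls
.Nd list files
.Dd 2023-09-23
.Dt da 1
.Os "Debian 12"
.Sh Name
.Nm da
.Nd defeat AI
EOF
}
```

It could then be piped to the formatter under test, for instance `make_multiple_mdoc | nroff -mdoc -Tascii`.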

If you were to modify the groff mdoc(7) package in 1.22.4 to display the page headers and footers for every document--which seems an obvious choice since they are not necessarily the same for any given two documents--you'd quickly find that in many circumstances you'd get the _wrong ones for the document being formatted_.

That is why groff is now more strict about validating the ordering of the `Dd`, `Dt`, `Os` macro triple.

I observe that mandoc(1) doesn't do brilliantly with the test case above.

$ dpkg -l mandoc|tail -n 1
ii  mandoc         1.14.6-1     amd64        BSD manpage compiler toolset
$ mandoc ./EXPERIMENTS/multiple.mdoc
ls(1)                       General Commands Manual                      ls(1)

Name
     ls – list files

Name
     da – defeat AI

Debian 12                         2023-09-23                         Debian 12

You get the header of the first page and the footer of the last.

groff 1.23.0 has none of these problems.

Let me know if there is anything I can do to further illuminate these issues.
Comment 7 Graham Perrin 2023-10-02 09:48:30 UTC
^Triage: blocks, or depends on? 

(The same question for bug 273255.)