Bug 20028

Summary: ASCII docs should reflect <emphasis> tags in the source
Product: Documentation
Reporter: brooks <brooks>
Component: Books & Articles
Assignee: freebsd-doc (Nobody) <doc>
Status: Closed FIXED    
Severity: Affects Only Me    
Priority: Normal    
Version: Latest   
Hardware: Any   
OS: Any   

Description brooks 2000-07-19 07:00:01 UTC
When the <emphasis> tag is used in DocBook, this is translated to a
variant on the <i> tag in HTML via the style sheets.  The HTML docs are
then processed by w3m to produce ASCII versions.  w3m appears to
completely ignore <i> tags even in interactive mode and it ignores both
<i> and <b> tags in -dump mode.  This means that no indication of
emphasis is transmitted to the ASCII form, which potentially distorts the
text's meaning.

Fix: 

The fix is going to be something like patching w3m to have a mode where
it emphasises things like *this* or something.

How-To-Repeat:

Create ASCII docs.
Comment 1 brooks 2000-08-24 02:28:35 UTC
On Tue, Jul 18, 2000 at 10:54:04PM -0700, brooks@one-eyed-alien.net wrote:
> >Description:
> 
> When the <emphasis> tag is used in DocBook, this is translated to a
> variant on the <i> tag in HTML via the style sheets.  The HTML docs are
> then processed by w3m to produce ASCII versions.  w3m appears to
> completely ignore <i> tags even in interactive mode and it ignores both
> <i> and <b> tags in -dump mode.  This means that no indication of
> emphasis is transmitted to the ASCII form, which potentially distorts the
> text's meaning.
> 
> >Fix:
> 
> The fix is going to be something like patching w3m to have a mode where
> it emphasises things like *this* or something.

I've looked into this a little today.  It looks like creating a patch
which accomplishes this is pretty easy, but there are a few hoops to
jump through.  First, w3m deliberately doesn't support <i> tags at all.
It parses them, but throws them out.  This could be corrected if we
wanted to do so.  What is supported is <strong>, which maps to <em>, which
in turn maps to <b>.  I've generated a patch so <b>blah</b> becomes
*blah* when -dump is specified.  There's a good chance this is the wrong
way to do this, but it works for me.  How would people suggest I
proceed?  Should I implement Nik's suggestion of <b>bold</b> -> *bold*
and <i>italics</i> -> /italics/ or just what?  My concern about Nik's
suggestion is that <B> is used in a number of places, including FAQ
queries, where I think it will look silly.  I'm kinda thinking the right
thing to do may be to change the style sheets to translate <emphasis> to
<em> and only deal with <em> in w3m.
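
As a rough illustration of the behaviour described above (not w3m's actual
code), the following Python sketch follows the <strong> -> <em> -> <b> chain
and turns anything that ends up as <b> into '*' in dump mode.  The names
ALIASES, resolve and render are made up for this example, and the regular
expression is a simplification of real HTML parsing.  It also shows the
concern raised above: a literal <B> gets starred as well.

```python
import re

# Alias chain as described in the comment: <strong> -> <em> -> <b>.
# This table is an illustration, not w3m's internal data structure.
ALIASES = {"strong": "em", "em": "b"}

# Simplified tag matcher: optional '/', a tag name, then anything up to '>'.
TAG = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>")


def resolve(tag):
    """Follow the alias chain until a tag with no alias is reached."""
    tag = tag.lower()
    while tag in ALIASES:
        tag = ALIASES[tag]
    return tag


def render(html, dump=True):
    """In dump mode, replace tags that resolve to <b> with '*'.

    All other tags (including <i>) are dropped.  With dump=False every tag
    is dropped; a real browser would use terminal attributes instead.
    """
    def replace(match):
        if dump and resolve(match.group(2)) == "b":
            return "*"
        return ""

    return TAG.sub(replace, html)


print(render("<strong>Warning</strong>: <i>do not</i> <B>reboot</B>"))
```

Because <strong>, <em> and <b> all collapse to the same marker here, purely
typographic uses of <b> cannot be told apart from real emphasis.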

-- Brooks

This patch will add *bold* support to w3m in dump mode.  It should be
applied after the other patches in the port.

--- file.c.freebsd	Wed Aug 23 17:58:07 2000
+++ file.c	Wed Aug 23 18:13:22 2000
@@ -2507,6 +2507,7 @@
 #ifdef ID_EXT
     Str id = NULL;
 #endif				/* ID_EXT */
+    extern int w3m_dump;
 
     if (obuf->flag & RB_PRE) {
 	switch (cmd) {
@@ -2520,16 +2521,25 @@
 
     switch (cmd) {
     case HTML_B:
-	obuf->in_bold++;
-	if (obuf->in_bold > 1)
-	    return 1;
+	if(!w3m_dump) {
+	    obuf->in_bold++;
+	    if (obuf->in_bold > 1)
+	        return 1;
+	} else {
+	    HTMLlineproc1("*", h_env);
+	}
 	return 0;
     case HTML_N_B:
-	if (obuf->in_bold == 1 && close_effect0(obuf, HTML_B))
-	    obuf->in_bold = 0;
-	if (obuf->in_bold > 0) {
-	    obuf->in_bold--;
-	    if (obuf->in_bold == 0)
+	if(!w3m_dump) {
+	    if (obuf->in_bold == 1 && close_effect0(obuf, HTML_B))
+		obuf->in_bold = 0;
+	    if (obuf->in_bold > 0) {
+		obuf->in_bold--;
+		if (obuf->in_bold == 0)
+		    return 0;
+	    }
+	} else {
+	    HTMLlineproc1("*", h_env);
 		return 0;
 	}
 	return 1;

-- 
Any statement of the form "X is the one, true Y" is FALSE.
Comment 2 Rasmus Kaj 2000-08-24 11:53:02 UTC
>>>>> "BD" == Brooks Davis <brooks@one-eyed-alien.net> writes:

 [ About making <emphasis> some kind of visible emphasis in text-only
   output ]

 BD> I've looked into this a little today.  It looks like creating a patch
 BD> which accomplishes this is pretty easy, but there are a few hoops to
 BD> jump through.  First, w3m deliberately doesn't support <i> tags at all.
 BD> It parses them, but throws them out.  This could be corrected if we
 BD> wanted to do so.  What is supported is <strong>, which maps to <em>, which
 BD> in turn maps to <b>.  I've generated a patch so <b>blah</b> becomes
 BD> *blah* when -dump is specified.  There's a good chance this is the wrong
 BD> way to do this, but it works for me.  How would people suggest I
 BD> proceed?  Should I implement Nik's suggestion of <b>bold</b> -> *bold*
 BD> and <i>italics</i> -> /italics/ or just what?  My concern about Nik's
 BD> suggestion is that <B> is used in a number of places, including FAQ
 BD> queries, where I think it will look silly.  I'm kinda thinking the right
 BD> thing to do may be to change the style sheets to translate <emphasis> to
 BD> <em> and only deal with <em> in w3m.

Well, *foo* looks like bold to some, but isn't, really. Same goes for
/bar/ ...   So, while I'm in favor of <strong>foo</strong> -> *foo*
and <em>bar</em> -> _bar_ or /bar/, I think <b> and <i> really should
be ignored when font control isn't available.

Also, you may want to make it possible to disable this stuff in
certain tags, for example, if you have an example command line that
looks like:

  % *rm* /junk/

... then there are bound to be some questions about that ... :-)

That said, I agree with the basic suggestion that it would be nice to
have e.g. <emphasis> render visibly in plain text.
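
A minimal sketch of the mapping suggested in this comment, using Python's
standard html.parser module: <strong> and <em> get visible markers, <b> and
<i> are ignored, and markers are switched off inside certain tags.  The
choice of '_' for <em>, the VERBATIM set and the names PlainTextRenderer and
to_text are assumptions made for the example, not anything w3m provides.

```python
from html.parser import HTMLParser

# Semantic tags get a marker; <b> and <i> are deliberately absent.
MARKERS = {"strong": "*", "em": "_"}
# Hypothetical set of tags inside which markers are suppressed.
VERBATIM = {"pre", "kbd", "code"}


class PlainTextRenderer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []          # collected output fragments
        self.verbatim_depth = 0  # > 0 while inside a VERBATIM tag

    def _mark(self, tag):
        # Emit a marker only for semantic tags outside verbatim regions.
        if tag in MARKERS and self.verbatim_depth == 0:
            self.parts.append(MARKERS[tag])

    def handle_starttag(self, tag, attrs):
        if tag in VERBATIM:
            self.verbatim_depth += 1
        self._mark(tag)

    def handle_endtag(self, tag):
        if tag in VERBATIM and self.verbatim_depth > 0:
            self.verbatim_depth -= 1
        self._mark(tag)

    def handle_data(self, data):
        self.parts.append(data)


def to_text(html):
    """Render an HTML fragment to plain text with emphasis markers."""
    renderer = PlainTextRenderer()
    renderer.feed(html)
    renderer.close()
    return "".join(renderer.parts)


print(to_text("<strong>foo</strong> <em>bar</em> <b>baz</b>"))
print(to_text("<pre>% <strong>rm</strong> <em>junk</em></pre>"))
```

With the suppression in place, the command line example above comes out
without stray markers.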

-- 
Rasmus Kaj ------------------------ rasmus@kaj.se - http://Raditex.se/~kaj/
 \                                       If you're happy, you're successful
  \----------------------------------------------------- http://Raditex.se/
Comment 3 brooks 2000-08-24 21:49:36 UTC
On Thu, Aug 24, 2000 at 12:53:02PM +0200, Rasmus Kaj wrote:
> 
> Well, *foo* looks like bold to some, but isn't, really. Same goes for
> /bar/ ...   So, while I'm in favor of <strong>foo</strong> -> *foo*
> and <em>bar</em> -> _bar_ or /bar/, I think <b> and <i> really should
> be ignored when font control isn't available.

I think I agree here.  If we changed the style sheet to output either
<em> or <strong> when it sees the DocBook <emphasis> tag and then hacked
w3m to produce some variation on *foo* when it sees that tag, then
we'd accomplish the task of translating <emphasis> to something visible
in ASCII docs and avoid screwing up things that actually do use <b> or
<i> for typographic reasons.

> Also, you may want to make it possible to disable this stuff in
> certain tags, for example, if you have an example command line that
> looks like:
> 
>   % *rm* /junk/
> 
> ... then there are bound to be some questions about that ... :-)

That's a style sheet issue.  In this case we probably shouldn't be
writing <strong>rm</strong> <em>junk</em> as HTML output because that's
not what we mean.  In this case we really do mean <b>rm</b> <i>junk</i>
because this is purely a typographical convention at this point, not
semantic markup.

> That said, I agree with the basic suggestion that it would be nice to
> have e.g. <emphasis> render visibly in plain text.

My preference is for a result where <emphasis> renders in the *baz* style.
I think what I'll do is hack up some patches so that <emphasis> translates
to <em> and w3m translates <em> and <strong> to *.
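
The two-step plan can be sketched in Python as below.  Both functions are
stand-ins invented for illustration: stylesheet() is not the real DocBook
style sheet and dump() is not w3m; they only mimic the one transformation
each step is meant to perform, using simplified regular expressions.

```python
import re


def stylesheet(docbook):
    """Stand-in for the DocBook-to-HTML step: <emphasis> becomes <em>."""
    return re.sub(r"<(/?)emphasis>", r"<\1em>", docbook)


def dump(html):
    """Stand-in for 'w3m -dump': <em>/<strong> become '*', other tags vanish."""
    html = re.sub(r"</?(?:em|strong)>", "*", html, flags=re.IGNORECASE)
    return re.sub(r"<[^>]+>", "", html)


print(dump(stylesheet("This is <emphasis>not</emphasis> optional.")))
print(dump("% <b>rm</b> <i>junk</i>"))
```

Typographic <b> and <i>, as in the command line example, pass through
without markers because only the semantic tags are mapped.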

-- Brooks

-- 
Any statement of the form "X is the one, true Y" is FALSE.
Comment 4 nik freebsd_committer freebsd_triage 2000-08-25 09:59:15 UTC
On Wed, Aug 23, 2000 at 06:28:35PM -0700, Brooks Davis wrote:
> I've looked into this a little today.  It looks like creating a patch
> which accomplishes this is pretty easy, but there are a few hoops to
> jump through.  First, w3m deliberately doesn't support <i> tags at all.
> It parses them, but throws them out.  This could be corrected if we
> wanted to do so.  What is supported is <strong>, which maps to <em>, which
> in turn maps to <b>.  I've generated a patch so <b>blah</b> becomes
> *blah* when -dump is specified.  There's a good chance this is the wrong
> way to do this, but it works for me.  How would people suggest I
> proceed?  Should I implement Nik's suggestion of <b>bold</b> -> *bold*
> and <i>italics</i> -> /italics/ or just what?  My concern about Nik's
> suggestion is that <B> is used in a number of places, including FAQ
> queries, where I think it will look silly.  I'm kinda thinking the right
> thing to do may be to change the style sheets to translate <emphasis> to
> <em> and only deal with <em> in w3m.

The stylesheets add a CLASS attribute with the name of the original DocBook
element in some cases.  For example;

    <i class="emphasis">This was originally marked up with 'emphasis'</i>

See if w3m can look for that instead.
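
A sketch of what keying on the CLASS attribute could look like, written with
Python's standard html.parser rather than w3m's own parser.  Only elements
whose class list contains "emphasis" get '*' markers; a plain <i> is left
alone.  The names EmphasisClassRenderer and render_by_class are hypothetical,
and the class comparison is made case-insensitive as an assumption.

```python
from html.parser import HTMLParser


class EmphasisClassRenderer(HTMLParser):
    """Emit '*' only around elements whose class list contains 'emphasis'."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []      # collected output fragments
        self.open_tags = []  # stack of (tag, is_emphasis) for open elements

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").lower().split()
        is_emphasis = "emphasis" in classes
        self.open_tags.append((tag, is_emphasis))
        if is_emphasis:
            self.parts.append("*")

    def handle_endtag(self, tag):
        # Find the nearest open element with this name; ignore stray end tags.
        for i in range(len(self.open_tags) - 1, -1, -1):
            open_tag, is_emphasis = self.open_tags[i]
            if open_tag == tag:
                if is_emphasis:
                    self.parts.append("*")
                del self.open_tags[i:]
                break

    def handle_data(self, data):
        self.parts.append(data)


def render_by_class(html):
    """Render an HTML fragment, marking only class="emphasis" elements."""
    renderer = EmphasisClassRenderer()
    renderer.feed(html)
    renderer.close()
    return "".join(renderer.parts)


print(render_by_class('<i class="emphasis">really</i> and <i>foo</i>'))
```

The stack is needed because the end tag carries no attributes, so the
decision made at the start tag has to be remembered until the element closes.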

N
-- 
Internet connection, $19.95 a month.  Computer, $799.95.  Modem, $149.95.
Telephone line, $24.95 a month.  Software, free.  USENET transmission,
hundreds if not thousands of dollars.  Thinking before posting, priceless.
Somethings in life you can't buy.  For everything else, there's MasterCard.
  -- Graham Reed, in the Scary Devil Monastery
Comment 5 Murray Stokely freebsd_committer freebsd_triage 2001-09-02 22:33:13 UTC
State Changed
From-To: open->suspended

Nothing has happened on this PR for a year.  This is a prime candidate 
for someone to pick up and work on.
Comment 6 Remko Lodder freebsd_committer freebsd_triage 2014-03-24 11:29:50 UTC
State Changed
From-To: suspended->closed

Bite the bullet: this has not been touched for almost 13
years.  It is unlikely ever to be resolved, so close the
ticket.