Bug 33963 - Messages at the serial IO port device probe are misleading
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 4.4-RELEASE
Hardware: Any Any
Importance: Normal Affects Only Me
Assignee: Bruce Evans
 
Reported: 2002-01-16 22:40 UTC by Kevin Oberman
Modified: 2009-12-17 17:56 UTC

Attachments
file.diff (626 bytes, patch)
2002-01-16 22:40 UTC, Kevin Oberman

Description Kevin Oberman 2002-01-16 22:40:01 UTC
If the serial device driver probes at load time and gets no response,
it simply labels the device as an 8250 when it may be absent or
broken. If the port is not configured in the BIOS, the driver prints a
cryptic message that the irq is not in the bitmap of probed irqs,
which again does not make clear to many users where the problem is.

How-To-Repeat: De-configure a serial device in BIOS and reboot.
Comment 1 Bruce A. Mah freebsd_committer freebsd_triage 2002-01-16 23:01:27 UTC
Responsible Changed
From-To: freebsd-bugs->bmah

I promised Kevin that I'd do this.  If anyone else has a compulsion to
review and commit this, that's fine by me too.
Comment 2 Bruce A. Mah freebsd_committer freebsd_triage 2002-01-17 06:22:36 UTC
State Changed
From-To: open->analyzed

Committed to -CURRENT, awaiting MFC to 4-STABLE (after 4.5-RELEASE).
Comment 3 Ceri Davies freebsd_committer freebsd_triage 2003-03-01 00:19:50 UTC
Adding to the audit trail from pending/48674:

Message-Id: <20030225170511.GA53886@intruder.bmah.org>
Date: Tue, 25 Feb 2003 09:05:11 -0800
From: "Bruce A. Mah" <bmah@freebsd.org>
To: oberman@es.net, bde@freebsd.org
Cc: freebsd-gnats-submit@freebsd.org
Subject: pending/48674: PR bin/33963

 I'm trying to figure out what to do with this PR, which has been
 languishing for awhile and gives me momentary angst every week when it
 shows up in my PR reminders list.
 
 Some discussion with bde (shortly after the last PR followup) has made
 me realize that the root cause of these misleading sio(4) probe
 messages wasn't what I thought it was, and that I don't have the
 qualification or time to deal with this.
 
 The alternatives seem to be:
 
 1.  Toss the PR back to freebsd-bugs.
 
 2.  Give it to bde, who's the de facto sio(4) maintainer.
 
 3.  Close it.
 
 Any thoughts?  Thanks.
 
 Bruce.
Comment 4 Ceri Davies freebsd_committer freebsd_triage 2003-03-01 00:25:03 UTC
Adding to audit trail, from pending/48677:

Message-Id: <20030225172014.67F7D5D06@ptavv.es.net>
Date: Tue, 25 Feb 2003 09:20:14 -0800
From: "Kevin Oberman" <oberman@es.net>
To: "Bruce A. Mah" <bmah@freebsd.org>
Cc: bde@freebsd.org, freebsd-gnats-submit@freebsd.org
Subject: pending/48677: Re: PR bin/33963 

 > Date: Tue, 25 Feb 2003 09:05:11 -0800
 > From: "Bruce A. Mah" <bmah@freebsd.org>
 > 
 > I'm trying to figure out what to do with this PR, which has been
 > languishing for awhile and gives me momentary angst every week when it
 > shows up in my PR reminders list.
 > 
 > Some discussion with bde (shortly after the last PR followup) has made
 > me realize that the root cause of these misleading sio(4) probe
 > messages wasn't what I thought it was, and that I don't have the
 > qualification or time to deal with this.
 > 
 > The alternatives seem to be:
 > 
 > 1.  Toss the PR back to freebsd-bugs.
 > 
 > 2.  Give it to bde, who's the de facto sio(4) maintainer.
 > 
 > 3.  Close it.
 
 imp vetoed the first section of the patch and I agree with his point
 in doing so. But I think the second half (8250 or not responding) is a
 legitimate bug fix as the driver currently implies that it actually
 could tell the device is an 8250 when the message is really only an
 indication that the driver did not receive a response to its query. It
 assumes that this means an 8250 as any flavor of 16550 or any decent
 clone will respond in some way.
 
 I can re-submit with only the single line change, but if bde "owns"
 the sio driver these days, I'll leave it up to him. 
 
 R. Kevin Oberman, Network Engineer
 Energy Sciences Network (ESnet)
 Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
 E-mail: oberman@es.net			Phone: +1 510 486-8634
Comment 5 Ceri Davies freebsd_committer freebsd_triage 2003-03-01 00:27:40 UTC
Adding to audit trail, from pending/48680:

Message-Id: <20030225184409.GA54634@intruder.bmah.org>
Date: Tue, 25 Feb 2003 10:44:09 -0800
From: "Bruce A. Mah" <bmah@freebsd.org>
To: Kevin Oberman <oberman@es.net>
Cc: "Bruce A. Mah" <bmah@freebsd.org>, bde@freebsd.org,
	freebsd-gnats-submit@freebsd.org
Subject: pending/48680: Re: PR bin/33963

 If memory serves me right, Kevin Oberman wrote:
 
 > imp vetoed the first section of the patch and I agree with his point
 > in doing so.
 
 Heh.  Actually that was the part that got committed to HEAD.  I made
 an offer to bde to back this out but he didn't take me up on it
 (yet)...the offer still stands.
 
 > But I think the second half (8250 or not responding) is a
 > legitimate bug fix as the driver currently implies that it actually
 > could tell the device is an 8250 when the message is really only an
 > indication that the driver did not receive a response to its query. It
 > assumes that this means an 8250 as any flavor of 16550 or any decent
 > clone will respond in some way.
 
 What I recall being told is that the underlying cause was that the
 driver shouldn't have been trying to probe the device anyways.  (A
 separate, but related problem.)
 
 > I can re-submit with only the single line change, but if bde "owns"
 > the sio driver these days, I'll leave it up to him.
 
 Yeah.  I should have done a better job handling this PR.
 
 Bruce.
Comment 6 Ceri Davies freebsd_committer freebsd_triage 2003-03-01 00:28:27 UTC
Adding to audit trail, from pending/48774:

Message-Id: <20030228220111.Y22326-100000@gamplex.bde.org>
Date: Fri, 28 Feb 2003 22:56:27 +1100 (EST)
From: Bruce Evans <bde@zeta.org.au>
To: "Bruce A. Mah" <bmah@freebsd.org>
Cc: Kevin Oberman <oberman@es.net>, <bde@freebsd.org>,
	<freebsd-gnats-submit@freebsd.org>
Subject: pending/48774: Re: PR bin/33963

 On Tue, 25 Feb 2003, Bruce A. Mah wrote:
 
 > If memory serves me right, Kevin Oberman wrote:
 >
 > > imp vetoed the first section of the patch and I agree with his point
 > > in doing so.
 >
 > Heh.  Actually that was the part that got committed to HEAD.  I made
 > an offer to bde to back this out but he didn't take me up on it
 > (yet)...the offer still stands.
 
 I think both parts got committed to HEAD.  I mostly agreed with the
 first part but not with the second part.  All I've done is fix the
 formatting and improve the English of the first part in my version:
 
 %%%
 Index: sio.c
 ===================================================================
 RCS file: /home/ncvs/src/sys/dev/sio/sio.c,v
 retrieving revision 1.382
 diff -u -2 -r1.382 sio.c
 --- sio.c	11 Oct 2002 20:22:20 -0000	1.382
 +++ sio.c	23 Feb 2003 13:59:40 -0000
 @@ -766,6 +783,5 @@
  		"sio%d: configured irq %ld not in bitmap of probed irqs %#x\n",
  		    device_get_unit(dev), xirq, irqs);
 -		printf(
 -		"sio%d: port may not be enabled\n",
 +		printf("sio%d: port might not be enabled\n",
  		    device_get_unit(dev));
  	}
 %%%
 
 I think it's worth saying something here to decrypt/disalarm the "not in
 bitmap of probed irqs" message.  The "port might not be enabled" message
 is just to clarify the previous message, but it's hard to write paragraphs
 in boot messages so it looks more like a separate message and thus may
 further obscure things.  Some more rewording might help, and it wouldn't
 hurt to put this in the man page.   I rather hoped that you (bmah) would
 handle the doc aspects of this.
 
 This code is only reached in the !noprobe case, which should be only in
 legacy cases where it should not be surprising to get misconfigured irqs,
 but I think pnp and/or acpi are now too successful at finding devices
 (even when you have disabled them in static hints?) so we now get more
 half-working probes.  So the main problems here are:
 - the driver doesn't understand that some hints are better than others
   so it shouldn't attempt to check them.
 - various problems if multiple sio devices share an interrupt (and
   interrupts are ISAish (edge sensitive, etc.) so they can't be shared
   at runtime).  The message is the first hint of such problems.
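
The check behind the "not in bitmap of probed irqs" message can be sketched as follows. This is a minimal illustration, not the code in sio.c: the helper names `irq_in_probed_bitmap` and `format_irq_diag` are hypothetical, and it assumes that bit n of the bitmap stands for IRQ n and that ISA IRQs run from 0 to 15.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper (not taken from sio.c): report whether the
 * configured IRQ appears in the bitmap of IRQs seen during the probe.
 * Assumption: bit n of probed_irqs set means IRQ n was observed.
 */
static bool
irq_in_probed_bitmap(long xirq, unsigned int probed_irqs)
{
    if (xirq < 0 || xirq > 15)      /* assumed ISA range 0..15 */
        return (false);
    return ((probed_irqs & (1u << xirq)) != 0);
}

/*
 * Hypothetical helper: format the two-line diagnostic discussed in this
 * PR into buf.  Returns 0 when the IRQ matches and nothing needs to be
 * printed, otherwise the value returned by snprintf().
 */
static int
format_irq_diag(char *buf, size_t len, int unit, long xirq,
    unsigned int irqs)
{
    if (irq_in_probed_bitmap(xirq, irqs))
        return (0);
    return (snprintf(buf, len,
        "sio%d: configured irq %ld not in bitmap of probed irqs %#x\n"
        "sio%d: port might not be enabled\n",
        unit, xirq, irqs, unit));
}
```

With a configured irq of 4 and a bitmap of 0x10 the helper stays silent; with an empty bitmap, as one might see for a port disabled in the BIOS, both lines are produced.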
 
 > > But I think the second half (8250 or not responding) is a
 > > legitimate bug fix as the driver currently implies that it actually
 > > could tell the device is an 8250 when the message is really only an
 > > indication that the driver did not receive a response to its query. It
 > > assumes that this means an 8250 as any flavor of 16550 or any decent
 > > clone will respond in some way.
 >
 > What I recall being told is that the underlying cause was that the
 > driver shouldn't have been trying to probe the device anyways.  (A
 > separate, but related problem.)
 
 It is in the attach code actually.  I don't understand how the attach
 routine can be called on completely "not responding" devices.  The
 noprobe case makes the probe sloppy but should only be used for pccards
 and certain broken cases where another layer hopefully knows what it's
 doing.  Technically, I think we can do the fifo test first to distinguish
 the >= 16550 UARTs.  Then the scratch register test would only be needed
 to distinguish between ancient UARTs.  NetBSD's com.c doesn't bother
 with it.
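
The ordering suggested above (FIFO test first, scratch register test only for the older parts) can be sketched against a simulated UART. This is an illustration under stated assumptions, not the sio.c attach code: `struct sim_uart`, `sim_read`, `sim_write` and `classify_uart` are hypothetical names, and the simulation stands in for real port I/O. It relies on the usual 8250-family behaviour: after FIFO enable, IIR bits 7:6 read as 11 on a 16550A and 10 on an original 16550, and the original 8250 has no scratch register.

```c
#include <stdint.h>

/* Register offsets from the UART base (standard 8250-family layout). */
enum { REG_IIR_FCR = 2, REG_SCR = 7 };

/* Simulated chip types; a stand-in for real hardware. */
enum uart_kind { UART_8250, UART_16450, UART_16550, UART_16550A };

struct sim_uart {
    enum uart_kind kind;
    uint8_t fcr;    /* last value written to the FIFO control register */
    uint8_t scr;    /* scratch register contents, if the chip has one */
};

static void
sim_write(struct sim_uart *u, int reg, uint8_t val)
{
    if (reg == REG_IIR_FCR)
        u->fcr = val;
    else if (reg == REG_SCR && u->kind != UART_8250)
        u->scr = val;   /* writes are lost on a chip without a scratch register */
}

static uint8_t
sim_read(const struct sim_uart *u, int reg)
{
    if (reg == REG_IIR_FCR) {
        /* Bit 0 set: no interrupt pending.  Bits 7:6: FIFO status. */
        if ((u->fcr & 0x01) == 0)
            return (0x01);
        if (u->kind == UART_16550A)
            return (0xC1);
        if (u->kind == UART_16550)
            return (0x81);
        return (0x01);
    }
    if (reg == REG_SCR)
        return (u->kind == UART_8250 ? 0xFF : u->scr);
    return (0xFF);
}

/* FIFO test first; the scratch test only separates the FIFO-less parts. */
static enum uart_kind
classify_uart(struct sim_uart *u)
{
    sim_write(u, REG_IIR_FCR, 0x01);        /* try to enable the FIFO */
    switch (sim_read(u, REG_IIR_FCR) & 0xC0) {
    case 0xC0:
        return (UART_16550A);
    case 0x80:
        return (UART_16550);
    }
    sim_write(u, REG_SCR, 0xA5);
    if (sim_read(u, REG_SCR) != 0xA5)
        return (UART_8250);
    sim_write(u, REG_SCR, 0x5A);
    if (sim_read(u, REG_SCR) != 0x5A)
        return (UART_8250);
    return (UART_16450);
}
```

The sketch deliberately leaves out hardware that does not respond at all; that case is what the rest of this PR is about.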
 
 > > I can re-submit with only the single line change, but if bde "owns"
 > > the sio driver these days, I'll leave it up to him.
 >
 > Yeah.  I should have done a better job handling this PR.
 
 Me too, sigh.  Now I mostly don't run anything near a current
 -current, so I find it hard to test things for it, but I have more interest
 in making -stable work right.  This doesn't extend to large configuration
 changes though.  (I've grown to detest large init/config code.)
 
 Bruce
Comment 7 Ceri Davies freebsd_committer freebsd_triage 2003-03-01 00:28:57 UTC
Adding to audit trail, from pending/48792:

Message-Id: <20030228211250.84FCA5D04@ptavv.es.net>
Date: Fri, 28 Feb 2003 13:12:50 -0800
From: "Kevin Oberman" <oberman@es.net>
To: Bruce Evans <bde@zeta.org.au>
Cc: "Bruce A. Mah" <bmah@freebsd.org>, bde@freebsd.org,
	freebsd-gnats-submit@freebsd.org
Subject: pending/48792: Re: PR bin/33963 

 > Date: Fri, 28 Feb 2003 22:56:27 +1100 (EST)
 > From: Bruce Evans <bde@zeta.org.au>
 > 
 > On Tue, 25 Feb 2003, Bruce A. Mah wrote:
 > 
 > > If memory serves me right, Kevin Oberman wrote:
 > >
 > > > imp vetoed the first section of the patch and I agree with his point
 > > > in doing so.
 > >
 > > Heh.  Actually that was the part that got committed to HEAD.  I made
 > > an offer to bde to back this out but he didn't take me up on it
 > > (yet)...the offer still stands.
 > 
 > I think both parts got committed to HEAD.  I mostly agreed with the
 > first part but not with the second part.  All I've done is fix the
 > formatting and improve the English of the first part in my version:
 > 
 > %%%
 > Index: sio.c
 > ===================================================================
 > RCS file: /home/ncvs/src/sys/dev/sio/sio.c,v
 > retrieving revision 1.382
 > diff -u -2 -r1.382 sio.c
 > --- sio.c	11 Oct 2002 20:22:20 -0000	1.382
 > +++ sio.c	23 Feb 2003 13:59:40 -0000
 > @@ -766,6 +783,5 @@
 >  		"sio%d: configured irq %ld not in bitmap of probed irqs %#x\n",
 >  		    device_get_unit(dev), xirq, irqs);
 > -		printf(
 > -		"sio%d: port may not be enabled\n",
 > +		printf("sio%d: port might not be enabled\n",
 >  		    device_get_unit(dev));
 >  	}
 > %%%
 
 I like this a LOT better than the current verbiage and better than
 mine. Another way to make it clearer that the second line is an
 extension of the first would be:
 printf("      port might not be enabled\n");
 
 In the end the appearance of dmesg will always be a bit ugly and oft
 times a bit confusing. Extra lines are evil as they leave less data on
 the 24 (or whatever) lines available on the console. (Perhaps it's
 time to think about increasing the default value of SC_HISTORY_SIZE.)
 
 > This code is only reached in the !noprobe case, which should be only in
 > legacy cases where it should not be surprising to get misconfigured irqs,
 > but I think pnp and/or acpi are now too successful at finding devices
 > (even when you have disabled them in static hints?) so we now get more
 > half-working probes.  So the main problems here are:
 > - the driver doesn't understand that some hints are better than others
 >   so it shouldn't attempt to check them.
 > - various problems if multiple sio devices share an interrupt (and
 >   interrupts are ISAish (edge sensitive, etc.) so they can't be shared
 >   at runtime).  The message is the first hint of such problems.
 
 ACPI does not run properly on a great many laptops (both of my IBMs
 included) and even PNP can be problematic, although now that we have
 IRQ sharing for PCCARD, it's generally not evil. But there are regular
 messages on questions asking about this message, so people are still
 seeing it a lot in V4.
 
 > > > But I think the second half (8250 or not responding) is a
 > > > legitimate bug fix as the driver currently implies that it actually
 > > > could tell the device is an 8250 when the message is really only an
 > > > indication that the driver did not receive a response to its query. It
 > > > assumes that this means an 8250 as any flavor of 16550 or any decent
 > > > clone will respond in some way.
 > >
 > > What I recall being told is that the underlying cause was that the
 > > driver shouldn't have been trying to probe the device anyways.  (A
 > > separate, but related problem.)
 > 
 > It is in the attach code actually.  I don't understand how the attach
 > routine can be called on completely "not responding" devices.  The
 > noprobe case makes the probe sloppy but should only be used for pccards
 > and certain broken cases where another layer hopefully knows what it's
 > doing.  Technically, I think we can do the fifo test first to distinguish
 > the >= 16550 UARTs.  Then the scratch register test would only be needed
 > to distinguish between ancient UARTs.  NetBSD's com.c doesn't bother
 > with it.
 
 I have a specific case. I maintain (or try to, as it won't work in V5)
 the package to support the mWave modem on some older IBM ThinkPads.
 Since the 16550A is emulated in software, there are cases where
 initialization problems result in the FIFO probe failing and the
 device being identified as an 8250. I know just what this means, but
 it took me a while to figure it out and others have just given up,
 deciding that the port does not work.
 
 R. Kevin Oberman, Network Engineer
 Energy Sciences Network (ESnet)
 Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
 E-mail: oberman@es.net			Phone: +1 510 486-8634
Comment 8 Bruce A. Mah freebsd_committer freebsd_triage 2003-11-01 00:45:11 UTC
Responsible Changed
From-To: bmah->bde

I give up on this PR.  It's been sitting on my pile for over a year 
and a half, my reasoning about the solution to this PR was totally 
wrong, and I'm not likely to come up with the right solution anytime 
in this lifetime.  Giving this PR to the sio(4) maintainer to figure 
out what should be done about this. 

My apologies.
Comment 9 Kevin Oberman 2004-04-12 22:11:49 UTC
This was fixed many months ago after consultation with Bruce Mah, Warner
Losh and Bruce. Please close it.
-- 
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: oberman@es.net			Phone: +1 510 486-8634
Comment 10 Bruce Evans freebsd_committer freebsd_triage 2004-05-10 14:26:04 UTC
State Changed
From-To: analyzed->analyzed some more

Misprobing as an 8250 may be caused by misconfiguring the serial console 
flag 0x10.  This flag is not just a hint.  Bad things happen if it is 
used for a device whose hardware doesn't exist.  If the device with 
nonexistent hardware gets used as a serial console, then the system 
usually just hangs early.  Otherwise, sioprobe() trusts the configuration 
too much and forces the probe to succeed after printing some diagnostics. 
Probing for the UART type is delayed until sioattach().  sioattach() 
doesn't probe the hardware except for this and there is no failure 
case for this, so nonexistent hardware is usually considered to be an 
8250 since it has the same number of special features (none).  If the 
device with nonexistent hardware is opened from userland, the system 
often hangs then.
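
Why nonexistent hardware ends up labelled as an 8250 can be sketched as below. This is a minimal model, not the sioattach() code: `struct port_io`, the `absent_*` and `chip_*` backends, `scratch_register_works` and `attach_type_label` are hypothetical names, and the sketch assumes that reads from an unpopulated port return 0xFF and that writes to it are lost. The "8250 or not responding" wording is the one proposed in this PR.

```c
#include <stdbool.h>
#include <stdint.h>

#define REG_SCR 7   /* scratch register offset in the 8250-family layout */

/* Hypothetical I/O abstraction so the sketch runs without real ports. */
struct port_io {
    uint8_t (*read)(void *ctx, int reg);
    void (*write)(void *ctx, int reg, uint8_t val);
    void *ctx;
};

/* Backend for a port with no hardware behind it (assumed behaviour). */
static uint8_t
absent_read(void *ctx, int reg)
{
    (void)ctx;
    (void)reg;
    return (0xFF);
}

static void
absent_write(void *ctx, int reg, uint8_t val)
{
    (void)ctx;
    (void)reg;
    (void)val;
}

/* Backend for a chip with a working scratch register stored in *ctx. */
static uint8_t
chip_read(void *ctx, int reg)
{
    return (reg == REG_SCR ? *(uint8_t *)ctx : 0);
}

static void
chip_write(void *ctx, int reg, uint8_t val)
{
    if (reg == REG_SCR)
        *(uint8_t *)ctx = val;
}

/* Write two distinct patterns and check that each one reads back. */
static bool
scratch_register_works(const struct port_io *io)
{
    io->write(io->ctx, REG_SCR, 0xA5);
    if (io->read(io->ctx, REG_SCR) != 0xA5)
        return (false);
    io->write(io->ctx, REG_SCR, 0x5A);
    return (io->read(io->ctx, REG_SCR) == 0x5A);
}

/*
 * A failed scratch test has no separate failure case here, so absent
 * hardware and a real 8250 get the same label.
 */
static const char *
attach_type_label(const struct port_io *io)
{
    if (!scratch_register_works(io))
        return ("8250 or not responding");
    return ("16450 or later");
}
```

In this model an absent port and a genuine 8250 are indistinguishable from the scratch test alone, which is why the label hedges rather than claiming a definite chip type.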
Comment 11 Jaakko Heinonen freebsd_committer freebsd_triage 2009-12-17 17:29:33 UTC
State Changed
From-To: analyzed->closed

The submitted patch has been committed and the submitter has requested to 
close this PR.