Bug 26309

Summary: PPPoE client panics in kernel - fxp problem
Product: Base System
Reporter: brett <brett>
Component: kern
Assignee: Gleb Smirnoff <glebius>
Status: Closed FIXED    
Severity: Affects Only Me    
Priority: Normal    
Version: Unspecified   
Hardware: Any   
OS: Any   

Description brett 2001-04-03 09:20:01 UTC
Have been trying to set up a PPPoE client on 4.3-RC2 as instructed in the
FreeBSD Handbook, but the machine regularly page faults in the kernel just 
as the link is preparing to come up (that is, right after ppp says "Using 
interface: tun0.") The panic display said that the "current process" was 
ppp (not surprisingly) and that the CPU had encountered a page fault "while 
in kernel mode." The Netgraph, Netgraph sockets, Netgraph PPPoE, and 
Netgraph Ethernet modules were all compiled statically into the kernel 
and so didn't have to be loaded. Given that tun is a pretty simple
device and ppp is well tested, this tester suspects that the bug is 
in the Netgraph system somewhere (but cannot tell for sure).

Fix: 

At the time of submission, I could not find a workaround.

How-To-Repeat: Configure PPPoE as instructed in the FreeBSD Handbook. My ppp.conf
looked like this:

default:
  set device PPPoE:fxp0:pppserv # The PPPoE host will list itself as "pppserv"
  set mru 1492
  set mtu 1492
  set authname username
  set authkey passwordstring
  set log Phase tun command # you can add more detailed logging if you wish
  set dial
  set login
# Let the server dictate address based on its
# ppp.secret file, which contains a static
# address for each user
  set ifaddr 0.0.0.0/0 0.0.0.0/0
  add default HISADDR
  nat enable yes

papchap:
  set authname username
  set authkey passwordstring

Then, just start PPP with "ppp -ddial". The actual presence of a server 
did not seem to be required.
Comment 1 Brian Somers freebsd_committer freebsd_triage 2001-04-03 10:12:03 UTC
Responsible Changed
From-To: freebsd-bugs->brian

I guess ppp is mine
Comment 2 Brian Somers 2001-04-03 10:17:39 UTC
> 
> >Number:         26309
> >Category:       kern
> >Synopsis:       PPPoE client panics in kernel
[.....]
> Have been trying to set up a PPPoE client on 4.3-RC2 as instructed in the
> FreeBSD Handbook, but the machine regularly page faults in the kernel just 
> as the link is preparing to come up (that is, right after ppp says "Using 
> interface: tun0.") The panic display said that the "current process" was 
[.....]
> default:
>   set device PPPoE:fxp0:pppserv # The PPPoE host will list itself as "pppserv"
[.....]

Do you have src/usr.sbin/ppp/ether.c 1.9.2.6 ?  fxp misbehaves at the 
start of fxp_start() if it hasn't yet been brought up (IFF_UP).  The 
latest version of ether.c brings the ethernet interface up if it's 
not already.
-- 
Brian <brian@Awfulhak.org>                        <brian@[uk.]FreeBSD.org>
      <http://www.Awfulhak.org>                   <brian@[uk.]OpenBSD.org>
Don't _EVER_ lose your sense of humour !
Comment 3 brett 2001-04-03 18:32:07 UTC
At 03:17 AM 4/3/2001, Brian Somers wrote:

>Do you have src/usr.sbin/ppp/ether.c 1.9.2.6 ?  fxp misbehaves at the 
>start of fxp_start() if it hasn't yet been brought up (IFF_UP).  The 
>latest version of ether.c brings the ethernet interface up if it's 
>not already.

I have what's in 4.3-RC2, which is 1.9.2.6 (dated 2001/03/29).

Is this the version that misbehaves?

--Brett
Comment 4 Brian Somers 2001-04-03 19:40:41 UTC
> At 03:17 AM 4/3/2001, Brian Somers wrote:
> 
> >Do you have src/usr.sbin/ppp/ether.c 1.9.2.6 ?  fxp misbehaves at the 
> >start of fxp_start() if it hasn't yet been brought up (IFF_UP).  The 
> >latest version of ether.c brings the ethernet interface up if it's 
> >not already.
> 
> I have what's in 4.3-RC2, which is 1.9.2.6 (dated 2001/03/29).
> 
> Is this the version that misbehaves?

Nope - it's the version that should have avoided the fxp problem :-/

I'll try to reproduce this locally, but I have no fxp cards so I may 
be irritating^Wasking you for more details soon :-)

> --Brett

Comment 5 brett 2001-04-03 19:56:09 UTC
At 12:40 PM 4/3/2001, Brian Somers wrote:

>I'll try to reproduce this locally, but I have no fxp cards so I may 
>be irritating^Wasking you for more details soon :-)

It may not be the fxp card that's causing the problem (though I am
using one). Before I linked ng_ether statically into the kernel, I 
got an error when the kernel loaded it. After I linked it in, I got
the panic. So, I think the problem may be in ng_ether or in some
argument that gets passed to it. But this is just a guess, and I
could be wrong here.

If you'd like, I'll send the usual dmesg output, etc.

--Brett
Comment 6 brett 2001-04-05 07:17:23 UTC
Brian:

Here's another clue as to what's wrong. When I configured the machine a little differently, I saw the following two messages just before the kernel panic:

module_register: module netgraph already exists!

and then...

linker_file_sysinit: "netgraph.ko" failed to register!  17

followed by the panic. I've tried not linking netgraph into the kernel. When I do this, the machine does not crash; however, PPPoE doesn't work either.
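
The trailing 17 in the second message is most likely an errno value. On FreeBSD, errno 17 is EEXIST, which would agree with the "already exists" message and suggests netgraph was being registered twice (once from the static kernel, once from netgraph.ko). A minimal sketch for decoding such a number follows; describe_errno() is a hypothetical helper, not part of any FreeBSD tool.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write "<number> (<system description>)" into buf,
 * using strerror() to look up the text for the error number.
 */
static void
describe_errno(int error, char *buf, size_t len)
{
    snprintf(buf, len, "%d (%s)", error, strerror(error));
}
```

Calling describe_errno(17, buf, sizeof(buf)) and printing buf shows the system's description of error 17.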

--Brett
Comment 7 Brian Somers 2001-04-06 00:05:47 UTC
> At 12:40 PM 4/3/2001, Brian Somers wrote:
> 
> >I'll try to reproduce this locally, but I have no fxp cards so I may 
> >be irritating^Wasking you for more details soon :-)
> 
> It may not be the fxp card that's causing the problem (though I am
> using one). Before I linked ng_ether statically into the kernel, I 
> got an error when the kernel loaded it. After I linked it in, I got
> the panic. So, I think the problem may be in ng_ether or in some
> argument that gets passed to it. But this is just a guess, and I
> could be wrong here.

Well, I got to try it here on a -stable machine built 2 days ago and 
everything worked ok.

The kernel was built with no netgraph support - I had to ``kldload 
netgraph'' to make the link come up (ppp kept trying and succeeded 
once this was done).  I've fixed this now and hopefully it can be 
MFC'd.  I'm using a ``sis'' card.

# kldstat
Id Refs Address    Size     Name
 1    8 0xc0100000 1953bc   kernel
 2    1 0xc08fd000 6000     procfs.ko
 3    1 0xc091f000 4000     mfs.ko
 4    1 0xc0944000 4000     if_tun.ko
 5    4 0xc0989000 8000     netgraph.ko
 6    1 0xc0994000 3000     ng_ether.ko
 7    1 0xc0998000 3000     ng_socket.ko
 8    1 0xc099b000 4000     ng_pppoe.ko

> If you'd like, I'll send the usual dmesg output, etc.

Is it possible for you to add an ``ifconfig_fxp0'' line to your 
rc.conf to see if configuring the interface up front helps ?  
Otherwise I think more info is required.

> --Brett

Cheers.

Comment 8 brett 2001-04-06 00:12:05 UTC
At 05:05 PM 4/5/2001, Brian Somers wrote:

>Is it possible for you to add an ``ifconfig_fxp0'' line to your
>rc.conf to see if configuring the interface up front helps ?

It doesn't help. I tried giving fxp0 a private (unroutable) address
in the 172.16 range. Still got the panic.

--Brett
Comment 9 Brian Somers 2001-04-06 00:13:17 UTC
> The kernel was built with no netgraph support - I had to ``kldload 
> netgraph'' to make the link come up (ppp kept trying and succeeded 
> once this was done).  I've fixed this now and hopefully it can be 
> MFC'd.  I'm using a ``sis'' card.

On second thoughts, as the install kernel is built with options 
NETGRAPH (it's added to the GENERIC template in dokern.sh) I'll wait 
for the code freeze to end.

Comment 10 Brian Somers freebsd_committer freebsd_triage 2001-04-14 23:05:32 UTC
Responsible Changed
From-To: brian->freebsd-bugs

The panic went away when a different NIC (and driver) was used. 
As I don't have any fxp cards to test with, I can't fix this.
Comment 11 drek 2001-05-05 07:10:47 UTC
FYI: I am experiencing the same problem with 4.1-RELEASE with the vr driver.
I have not tried bringing the interface up first but will do so asap. I
thought it was just me until I found this PR tonight.

cheers,

Derek Marshall
Comment 12 brett 2001-05-05 17:05:42 UTC
No, it seems to be a general problem. With some types of adapters, 
such as fxp, there is a kernel panic. With others, such as ed,
PPPoE seems to work at first but fails to re-establish a connection
after the link is temporarily broken. Brian Somers, maintainer
of userland PPP, has asked me to try using the version of PPP that
he has on his own Web site rather than the one in -STABLE. But
I suspect that the problem is lower down -- in netgraph, the
NIC drivers, or both.

--Brett

Comment 13 sdrew 2001-11-25 05:16:21 UTC
I also experience this problem and had to change network cards to get
around it. Any fix in sight?
 
Comment 14 brett 2001-11-25 05:25:41 UTC
Contact Brian Somers. This is a longstanding problem with userland
PPP. But Brian was so busy planning BSDCon Europe (among other things)
that he didn't have time to fix it. Now that the convention is over 
he may be able to reproduce and correct it.

--Brett

Comment 15 Gleb Smirnoff freebsd_committer freebsd_triage 2004-07-16 01:34:09 UTC
Responsible Changed
From-To: freebsd-bugs->glebius

I'd like to track down all PPPoE-related PRs.
Comment 16 Gleb Smirnoff freebsd_committer freebsd_triage 2004-07-16 01:43:03 UTC
  Brett,

  I decided to revive this old PR. Can you reproduce the problem on
recent STABLE or one of the latest 4.x releases?

-- 
Totus tuus, Glebius.
GLEBIUS-RIPN GLEB-RIPE
Comment 17 brett 2004-07-16 02:09:29 UTC
Sure. Just send it a packet with the old 3Com non-standard Ethertype
and it switches over (you can see it in the code) to using only
that Ethertype and rejecting the standard one. From that point on,
all clients that obey the standard are cut off. This is explicit
in the code.
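
The latching behaviour described above can be modelled with a small C sketch. It is a toy model of the description only, not the ng_pppoe source: struct toy_pppoe_node and toy_latching_accept() are hypothetical names, 0x8863/0x8864 are the RFC 2516 Ethertypes, and the 3Com values 0x3c12/0x3c13 are assumed to match those in ng_pppoe.h.

```c
#include <stdint.h>

/* Ethertype values; the 3Com pair is assumed to match ng_pppoe.h. */
#define ETHERTYPE_PPPOE_DISC      0x8863 /* RFC 2516 discovery stage */
#define ETHERTYPE_PPPOE_SESS      0x8864 /* RFC 2516 session stage */
#define ETHERTYPE_PPPOE_3COM_DISC 0x3c12 /* old 3Com discovery stage */
#define ETHERTYPE_PPPOE_3COM_SESS 0x3c13 /* old 3Com session stage */

/* Hypothetical node state: 0 = standard mode, 1 = latched to 3Com mode. */
struct toy_pppoe_node {
    int nonstandard;
};

static int
is_standard(uint16_t ethertype)
{
    return ethertype == ETHERTYPE_PPPOE_DISC ||
        ethertype == ETHERTYPE_PPPOE_SESS;
}

static int
is_3com(uint16_t ethertype)
{
    return ethertype == ETHERTYPE_PPPOE_3COM_DISC ||
        ethertype == ETHERTYPE_PPPOE_3COM_SESS;
}

/*
 * Model of the reported behaviour: one 3Com-style frame flips the whole
 * node into nonstandard mode, and nothing ever flips it back.
 * Returns 1 if the frame would be accepted, 0 if it would be dropped.
 */
static int
toy_latching_accept(struct toy_pppoe_node *node, uint16_t ethertype)
{
    if (is_3com(ethertype))
        node->nonstandard = 1;
    return node->nonstandard ? is_3com(ethertype) : is_standard(ethertype);
}
```

In this model a node accepts standard frames until the first 3Com frame arrives, after which every standard frame is dropped, which is the cut-off that standards-compliant clients would see.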

--Brett

Comment 18 Gleb Smirnoff freebsd_committer freebsd_triage 2004-07-16 07:44:46 UTC
  Brett,

On Thu, Jul 15, 2004 at 07:09:29PM -0600, Brett Glass wrote:
B> Sure. Just send it a packet with the old 3Com non-standard Ethertype
B> and it switches over (you can see it in the code) to using only
B> that Ethertype and rejecting the standard one. From that point on,
B> all clients that obey the standard are cut off. This is explicit
B> in the code.

The issue you describe was fixed quite some time ago in kern/47920. However,
we are looking to improve standard/nonstandard support.

What I'm asking is whether you can reproduce the panic described in kern/26309
on recent FreeBSD versions.

Comment 19 brett 2004-07-16 21:32:08 UTC
I'll have to try it again. I gave up on PPPoE in FreeBSD long
ago.

--Brett

Comment 20 Gleb Smirnoff freebsd_committer freebsd_triage 2004-07-16 21:42:44 UTC
  Brett,

On Fri, Jul 16, 2004 at 08:40:19PM +0000, Brett Glass wrote:
B>  I'll have to try it again. I gave up on PPPoE in FreeBSD long
B>  ago.

Thanks. I'm patiently awaiting your feedback.

Comment 21 Gleb Smirnoff freebsd_committer freebsd_triage 2004-07-16 21:48:25 UTC
State Changed
From-To: open->feedback

Awaiting feedback from originator.
Comment 22 Gleb Smirnoff freebsd_committer freebsd_triage 2004-10-05 13:59:09 UTC
State Changed
From-To: feedback->closed

Feedback timeout > 2 months. 
And I'm pretty sure it is fixed in recent RELENG_4.