Bug 86261

Summary: 'out of buffer space' after many PPPoE re-dial attempts, connectivity lost
Product: Base System
Reporter: Bob Frazier <bobf>
Component: kern
Assignee: Brian Somers <brian>
Status: Closed FIXED    
Severity: Affects Only Me    
Priority: Normal    
Version: 5.4-STABLE   
Hardware: Any   
OS: Any   

Description Bob Frazier 2005-09-17 18:00:24 UTC
      At various times my DSL service provider loses connectivity, and ppp must re-dial the PPPoE connection to maintain it.  The system had been running for quite some time when the problem occurred; it had most likely been up since the last kernel build (around 7/22/2005).  During that period, ppp performed countless re-dials.

At approximately midnight, I had a series of redials take place, which ultimately resulted in the following errors in /var/log/messages, and a complete loss of internet connectivity.

Sep 16 23:59:13 BSDServer routed[289]: ignore RTM_CHANGE without mask
Sep 16 23:59:13 BSDServer routed[289]: ignore RTM_CHANGE without mask
(these two errors happen all of the time whenever the ppp connection is established)
Sep 16 23:59:25 BSDServer ppp[230]: tun0: Warning: deflink: Reducing configured MRU from 1500 to 1492
(this is also normal)
Sep 17 00:01:00 BSDServer routed[289]: Send sendto(tun0, 216.175.112.1.520): No buffer space available
Sep 17 00:01:30 BSDServer routed[289]: Send sendto(tun0, 216.175.112.1.520): No buffer space available
Sep 17 00:03:30 BSDServer last message repeated 4 times

The previous sequence then repeated itself every few minutes as ppp kept trying, indefinitely, to re-dial the PPPoE connection.  I am using a ZyXEL DSL modem through EarthLink.

Fix: 

Rebooting the system restored connectivity in both cases.  This problem never occurred prior to FreeBSD 5.3-RELEASE, possibly not even until FreeBSD 5.4-RELEASE.

How-To-Repeat: I have only seen this problem once before, and it was caused by the same thing: the ISP drops the connection often enough that ppp eventually re-dials enough times to cause the 'no buffer space available' error.  The conditions are an extremely unreliable DSL connection (constantly dropping) that stays connected to the internet for several weeks using PPPoE, with 'ppp' providing the PPPoE login, and that carries a lot of traffic: connections to services like IRC that stay up for days at a time, gigabytes of data transferred, and services like http, ftp and ssh providing content.  At some point a resource limit is exceeded, and then all internet connectivity is lost.
Comment 1 Gleb Smirnoff freebsd_committer freebsd_triage 2005-09-21 14:45:24 UTC
  Bob,

  next time this happens, please check the following things:

1) Does restarting ppp(8) help? To check this you will need to:

   /etc/rc.d/ppp-user stop
   [check that no ppp process is present]
   /etc/rc.d/ppp-user start

If 1) does not help, move on to 2):

2) Does restarting ppp(8) + renewing the tun(4) interface help?

   /etc/rc.d/ppp-user stop
   [check that no ppp process is present]
   ifconfig tun0 destroy
   [check that tun0 interfaces disappeared]
   /etc/rc.d/ppp-user start

3) Does restarting ppp(8) + renewing tun(4) + renewing the netgraph
   PPPoE node help?

   /etc/rc.d/ppp-user stop
   [check that no ppp process is present]
   ifconfig tun0 destroy
   [check that tun0 interfaces disappeared]
   ngctl shutdown fxp0:orphans   [assuming you run PPPoE on fxp0]
   [check that no PPPoE nodes remained, 'ngctl types | grep pppoe' must
    display 0]
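
The escalation steps above can be sketched as one script. This is only a sketch under stated assumptions: the commands, the rc script path and the interface names (tun0, fxp0) are taken from the steps above, while the DRY_RUN, TUN_IF and ETH_IF variables and the run/recover helpers are hypothetical names used for illustration. It also assumes ppp should be started again after the third step, which the steps above do not spell out, and it does not automate the manual "check that ..." steps. By default it only prints the commands it would run.

```sh
#!/bin/sh
# Sketch of the escalation steps; prints commands unless DRY_RUN=0.
# TUN_IF, ETH_IF, DRY_RUN, run() and recover() are illustrative names.
TUN_IF="${TUN_IF:-tun0}"
ETH_IF="${ETH_IF:-fxp0}"      # assumes PPPoE runs on fxp0
DRY_RUN="${DRY_RUN:-1}"

# run CMD...: print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# recover LEVEL
#   1 = restart ppp only
#   2 = also destroy the tun interface
#   3 = also shut down the netgraph PPPoE node
recover() {
    level="$1"
    run /etc/rc.d/ppp-user stop
    # The manual checks (no ppp process left, tun0 gone) are not automated.
    if [ "$level" -ge 2 ]; then
        run ifconfig "$TUN_IF" destroy
    fi
    if [ "$level" -ge 3 ]; then
        run ngctl shutdown "$ETH_IF:orphans"
    fi
    # Assumption: ppp is started again at every level.
    run /etc/rc.d/ppp-user start
}
```

Calling "recover 1" in dry-run mode prints only the stop and start commands; setting DRY_RUN=0 would execute them for real, so the printed sequence is worth reviewing first.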

-- 
Totus tuus, Glebius.
GLEBIUS-RIPN GLEB-RIPE
Comment 2 Gleb Smirnoff freebsd_committer freebsd_triage 2005-10-05 08:32:46 UTC
  Attach this to Audit-Trail.

----- Forwarded message from Bob Frazier <bobf@mrp3.com> -----

>   next time this happens, please check the following things:

It happened again today

> 1) Does restarting ppp(8) help? To check this you will need to:
>
>    /etc/rc.d/ppp-user stop
>    [check that no ppp process is present]

In my case it was still running.  I did a 'kill ###' on it, which ended it.

>    /etc/rc.d/ppp-user start

This made the problem go away!  Only /dev/tun0 was present, so the tun
device was properly deleted when I killed the process, FYI.

So it looks like it's client-side ppp doing it.

----- End forwarded message -----
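
Since restarting ppp was enough to restore connectivity, the condition could in principle be detected from the log before deciding to restart. The following is a minimal sketch, assuming the routed message format quoted in the description; the enobufs_count and needs_restart helpers and the default threshold of 2 are hypothetical and not part of any FreeBSD tool.

```sh
#!/bin/sh
# enobufs_count FILE: number of log lines in which routed reports
# "No buffer space available" while sending on tun0.
# Collapsed "last message repeated N times" lines are not counted.
enobufs_count() {
    grep -c 'routed\[[0-9]*\]: Send sendto(tun0, .*No buffer space available' "$1"
}

# needs_restart FILE [THRESHOLD]: succeed when the count reaches the
# threshold (default 2, an arbitrary illustrative value).
needs_restart() {
    # grep -c exits non-zero on zero matches but still prints 0.
    count="$(enobufs_count "$1" || true)"
    [ "$count" -ge "${2:-2}" ]
}
```

A caller might run needs_restart against /var/log/messages and, when it succeeds, follow the stop/kill/start sequence described in the forwarded message. Because the log accumulates old entries, a real watchdog would need to look only at recent lines; that is left out of this sketch.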
Comment 3 Gleb Smirnoff freebsd_committer freebsd_triage 2005-10-05 11:19:58 UTC
Responsible Changed
From-To: freebsd-bugs->brian

This looks related to ppp(8). Let Brian look at it.
Comment 4 Brian Somers freebsd_committer freebsd_triage 2009-06-14 08:10:03 UTC
State Changed
From-To: open->feedback

Sorry I'm a little late to respond here... 

I'd be interested (if the originator still has such a setup) to 
know if routed is the only thing with a problem or if ppp ends up 
in a state where it's never able to re-establish the connection. 
If ppp is stuck, can the originator try something like "set log 
+phase lcp ipcp physical" and send the resulting log? 

I'm guessing that routed is sending traffic to tun0, but as ppp 
isn't connected, it won't read the data... after some time the 
interface queue just gets too big. 

If ppp is getting stuck, maybe I can reproduce the problem by having 
some other process repeatedly write something to tun0.  If such 
traffic is hurting ppp's ability to connect, then it can probably 
be taught to detect this and purge the offending data. 

Thanks.
Comment 5 Brian Somers freebsd_committer freebsd_triage 2009-07-18 22:37:26 UTC
State Changed
From-To: feedback->closed

No response from the submitter in over three weeks.  I concede that I didn't 
originally respond for over three years, but I can't help without additional 
details (as per the feedback request).