Bug 16950

Summary: arpintr() incorrectly checks mbuf chain size
Product: Base System Reporter: csg <csg>
Component: kernAssignee: freebsd-bugs (Nobody) <bugs>
Status: Closed FIXED    
Severity: Affects Only Me CC: rwatson
Priority: Normal    
Version: 3.4-STABLE   
Hardware: Any   
OS: Any   

Description csg 2000-02-24 06:00:01 UTC
The NETISR_ARP handler arpintr() incorrectly checks m->m_len to
determine if we have a complete ARP packet.  It is possible to
have a packet spread across several mbufs in the chain.

While this case apparently doesn't happen with normal ethernet
interfaces, additional mbuf operations before ARP processing (for
802.1Q Tagged VLANS, Bridged Ethernet over Frame Relay, or perhaps
LANE) can cause NETISR_ARP to be presented with a fragmented packet.

Fix: 

I've not only fixed the length comparisson, I've added several
diagnostic error messages to the handler for other out-of-the-norm
ARP packets.  This makes the error conditions easier to detect
and fix, and makes the code much more readable.

I've put patches for -STABLE and -CURRENT (which are actually
identical) online:

   http://www.waterspout.com/FreeBSD/arpintr-patch.current

   http://www.waterspout.com/FreeBSD/arpintr-patch.stable

If someone could perform a sanity check, and get these committed
before 4.0-R heads out the door, that would be ideal.

 - Steve
How-To-Repeat: 
Run my yet-to-see-the-light-of-day VLAN improvements, it blows chunks
on ever inbound ARP packet.
Comment 1 Robert Watson 2000-02-27 15:41:55 UTC
Jordan -- did it end up being the case that you did approve this fix, or
did not?

  Robert N M Watson 

robert@fledge.watson.org              http://www.watson.org/~robert/
PGP key fingerprint: AF B5 5F FF A6 4A 79 37  ED 5F 55 E9 58 04 6A B1
TIS Labs at Network Associates, Safeport Network Services
Comment 2 Jordan K. Hubbard 2000-02-27 19:02:36 UTC
Yes, I approved these awhile back.

> 
> Jordan -- did it end up being the case that you did approve this fix, or
> did not?
> 
>   Robert N M Watson 
> 
> robert@fledge.watson.org              http://www.watson.org/~robert/
> PGP key fingerprint: AF B5 5F FF A6 4A 79 37  ED 5F 55 E9 58 04 6A B1
> TIS Labs at Network Associates, Safeport Network Services
>
Comment 3 Robert Watson 2000-02-28 03:46:01 UTC
Steve,

I'm having stability problems with this patch.  When an arp lookup is
attempted, I get a hang.  The kernel is still live, and apparently in
kernel/interrupt context.  I don't have much debugging turned onin the
kernel, and don't have a proper debugging setup here, but here's ddb's
stack trace (from breaking to the debugger)

Debugger(c032a989) at Debugger+0x35
scgetc(c0398de0,2,c06ae100,c03913a0,ffffffff) at scgetc+0x37e
sckbdevent(c03913a0,0,c0398de0,c06ae100,1a0f080a) at sckbdevent+0x1b9
atkbd_intr(c03913a0,0,c0339dac,c02c14d2,c03913a0) at atkbd_intr+0x22
atkbd_isa_intr(c03913a0,40020000,40020010,10,c06a0010) at
atkbd_isa_intr+0x18
Xresume1() at Xresume1+0x2b
--- interrupt, eip = 0xc01d2531, esp = 0xc0339d9c, ebp = 0xc0339dac ---
arpintr(c02c1d7b,0, c0210010,40000010,ffff0010) at arpintr+0xd9
swi_net_next() at swi_net_next

My guess is you're recusively locking or something from interrupt context,
or the like.  As the console is still live, it's not as dead as it could
be, for example :-)

Interestingly, this is (I believe) the second ARP lookup, not the first
one, if that helps any. Prior to the hang, arp -an reported one entry, it
was attempting to contact a second IP that caused the trouble.

  Robert N M Watson 

robert@fledge.watson.org              http://www.watson.org/~robert/
PGP key fingerprint: AF B5 5F FF A6 4A 79 37  ED 5F 55 E9 58 04 6A B1
TIS Labs at Network Associates, Safeport Network Services
Comment 4 Robert Watson freebsd_committer freebsd_triage 2000-03-22 02:35:45 UTC
State Changed
From-To: open->closed

Patch debugged and committed prior to 4.0-RELEASE. 

Thanks!