Bug 31586 - netgraph bridges cause connectivity probs from bridge
Summary: netgraph bridges cause connectivity probs from bridge
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 4.4-STABLE
Hardware: Any Any
: Normal Affects Only Me
Assignee: Archie Cobbs
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2001-10-29 12:50 UTC by Mike Bristow
Modified: 2002-02-12 18:22 UTC (History)
0 users

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Mike Bristow 2001-10-29 12:50:03 UTC
Using a suitibly modified copy of
/usr/share/examples/netgraph/ether.bridge, I brought up a bridge
between de0, de1, de2, and xl0 (with xl0 being the "local interface".

The topology is :

   [ singsing ] <-x de0 -> [ win2k box]
                <-x de1 -> [ solaris box ]
                <-x de2 -> [ utterly borked freebsd-current box ]
	        <-- xl0 -> [ summit 48 + rest of network, default route etc ]

where <-x interface -> indicates a crossover cat5 cable, and <--
interface -> indicates a normal bit of cat5

[ this is all done because of a lack of wall ports in the office ]

Machines attached to the de0, de1 or de2 have connectivity problems
from singsing, but they are reachable from other machines on the network.

A "snoop -o snoop.out host singsing and host solaris_box" run on
the solaris box while simultanously running "telnet singsing imap"
in another windown suggest that the TCP checksum is wrong (acording to 
ethereal).  I haven't "snooped" icmp echo requests/responses in
either direction yet; nor did I run tcpdump on "singsing".

Traffic out of the xl0 interface appears to be unaffected.

http://www.urgle.com/~mike/netgraph/ has my modified ether.bridge;
the snoop output as seen by the solaris box; and dmesg output from "singsing".

This problem has appeared between 4.4-STABLE-20011001T111411 &
4.4-STABLE-20011027T210032

(where the timestamp is the time "cvs update" was run against a local
CVS mirror that cvsups hourly)

If additional snoops/tcpdumps on assorted hosts would be useful,
or if anything else would help track this down, let me know.

Fix: 

An evil workaround is:
	for i in machines attached to de{0,1,2} ; do 
		route add $i $default_route
	done

but this is obviously less than ideal.
How-To-Repeat: 
	Set up a bridge using /usr/share/examples/netgraph/ether.bridge
and a LOCAL_IFACE; try and connect from the bridge to machines that
are attached to BRIDGE_IFACES but not LOCAL_IFACE
Comment 1 Archie Cobbs 2002-02-02 20:47:36 UTC
I think the problem is the delayed checksum support added to FreeBSD.
The delayed checksum code makes the (reasonable) assumption in
ip_output() that a packet routed to a specific interface will actually
be transmitted by the hardware associated with that interface. This
is not necessarily true when netgraph is being used.

The solution would be to modify ng_ether() to compute on-demand any
delayed checksums for packets leaving the 'lower' hook when that hook
is attached to another node. I'll try to work on this soon.

-Archie

__________________________________________________________________________
Archie Cobbs     *     Packet Design     *     http://www.packetdesign.com
Comment 2 Archie Cobbs 2002-02-03 18:46:56 UTC
Here is a candidate patch. There seem to be two strategies:

#1 Check the flags for mbufs going out the 'lower' hook and
   compute the checksums for packets that need it.

#2 Temporarily turn off the interface flags advertising hardware
   checksums whenever 'lower' is connected.

The patch below implements #2. While #1 might make more sense,
the problem is that it fails in the case of CSUM_FRAGMENT.
To handle CSUM_FRAGMENT in case #1, we would have to implement
IP fragmentation code in ng_ether, yuck... seems much better
to let ip_output() do it.

The only unknown is whether the drivers will freak out when
ifp->if_hwassist is changed to zero and back... it doesn't appear
so; it seems the worst thing that would happen is the chip
recomputing a checksum, but I'm not positive about that.

-Archie

__________________________________________________________________________
Archie Cobbs     *     Packet Design     *     http://www.packetdesign.com

Index: sys/netgraph/ng_ether.c
===================================================================
RCS file: /home/ncvs/src/sys/netgraph/ng_ether.c,v
retrieving revision 1.2.2.10
diff -u -r1.2.2.10 ng_ether.c
--- sys/netgraph/ng_ether.c	1 Sep 2001 08:22:29 -0000	1.2.2.10
+++ sys/netgraph/ng_ether.c	2 Feb 2002 21:46:39 -0000
@@ -75,6 +75,7 @@
 	u_char		lowerOrphan;	/* whether lower is lower or orphan */
 	u_char		autoSrcAddr;	/* always overwrite source address */
 	u_char		promisc;	/* promiscuous mode enabled */
+	u_long		hwassist;	/* hardware checksum capabilities */
 };
 typedef struct private *priv_p;
 
@@ -319,6 +320,7 @@
 	priv->ifp = ifp;
 	IFP2NG(ifp) = node;
 	priv->autoSrcAddr = 1;
+	priv->hwassist = ifp->if_hwassist;
 
 	/* Try to give the node the same name as the interface */
 	if (ng_name_node(node, name) != 0) {
@@ -470,6 +472,10 @@
 	if (*hookptr != NULL)
 		return (EISCONN);
 
+	/* If lower hook connected, disable hardware checksum */
+	if (hookptr == &priv->lower)
+		ifp->if_hwassist = 0;
+
 	/* OK */
 	*hookptr = hook;
 	priv->lowerOrphan = orphan;
@@ -685,6 +691,7 @@
 {
 	const priv_p priv = hook->node->private;
 
+	/* Mark hook not connected */
 	if (hook == priv->upper)
 		priv->upper = NULL;
 	else if (hook == priv->lower) {
@@ -692,8 +699,14 @@
 		priv->lowerOrphan = 0;
 	} else
 		panic("%s: weird hook", __FUNCTION__);
+
+	/* If lower hook disconnected, restore hardware checksum */
+	if (hookptr == &priv->lower)
+		ifp->if_hwassist = priv->hwassist;
+
+	/* Reset node after last disconnect */
 	if (hook->node->numhooks == 0)
-		ng_rmnode(hook->node);	/* reset node */
+		ng_rmnode(hook->node);
 	return (0);
 }
Comment 3 Archie Cobbs freebsd_committer freebsd_triage 2002-02-04 05:20:36 UTC
Responsible Changed
From-To: freebsd-bugs->archie

My bug to fix.
Comment 4 Archie Cobbs 2002-02-04 22:53:28 UTC
OK, below are the 'better' patches for -current and -stable.

The previous patch was broken because it's the 'upper' hook
we need to watch, not the 'lower' hook.

-Archie

__________________________________________________________________________
Archie Cobbs     *     Packet Design     *     http://www.packetdesign.com

============================ current patch ===============================

--- sys/netgraph/ng_ether.c	10 Dec 2001 08:09:47 -0000	1.21
+++ sys/netgraph/ng_ether.c	3 Feb 2002 19:00:48 -0000
@@ -75,6 +75,7 @@
 	u_char		lowerOrphan;	/* whether lower is lower or orphan */
 	u_char		autoSrcAddr;	/* always overwrite source address */
 	u_char		promisc;	/* promiscuous mode enabled */
+	u_long		hwassist;	/* hardware checksum capabilities */
 	u_int		flags;		/* flags e.g. really die */
 };
 typedef struct private *priv_p;
@@ -317,6 +318,7 @@
 	priv->ifp = ifp;
 	IFP2NG(ifp) = node;
 	priv->autoSrcAddr = 1;
+	priv->hwassist = ifp->if_hwassist;
 
 	/* Try to give the node the same name as the interface */
 	if (ng_name_node(node, name) != 0) {
@@ -468,6 +470,10 @@
 	if (*hookptr != NULL)
 		return (EISCONN);
 
+	/* Disable hardware checksums while 'upper' hook is connected */
+	if (hookptr == &priv->upper)
+		priv->ifp->if_hwassist = 0;
+
 	/* OK */
 	*hookptr = hook;
 	priv->lowerOrphan = orphan;
@@ -715,9 +721,10 @@
 {
 	const priv_p priv = NG_NODE_PRIVATE(NG_HOOK_NODE(hook));
 
-	if (hook == priv->upper)
+	if (hook == priv->upper) {
 		priv->upper = NULL;
-	else if (hook == priv->lower) {
+		priv->ifp->if_hwassist = priv->hwassist;  /* restore h/w csum */
+	} else if (hook == priv->lower) {
 		priv->lower = NULL;
 		priv->lowerOrphan = 0;
 	} else

============================ stable patch ===============================

--- sys/netgraph/ng_ether.c	1 Sep 2001 08:22:29 -0000	1.2.2.10
+++ sys/netgraph/ng_ether.c	3 Feb 2002 19:00:07 -0000
@@ -75,6 +75,7 @@
 	u_char		lowerOrphan;	/* whether lower is lower or orphan */
 	u_char		autoSrcAddr;	/* always overwrite source address */
 	u_char		promisc;	/* promiscuous mode enabled */
+	u_long		hwassist;	/* hardware checksum capabilities */
 };
 typedef struct private *priv_p;
 
@@ -319,6 +320,7 @@
 	priv->ifp = ifp;
 	IFP2NG(ifp) = node;
 	priv->autoSrcAddr = 1;
+	priv->hwassist = ifp->if_hwassist;
 
 	/* Try to give the node the same name as the interface */
 	if (ng_name_node(node, name) != 0) {
@@ -470,6 +472,10 @@
 	if (*hookptr != NULL)
 		return (EISCONN);
 
+	/* Disable hardware checksums while 'upper' hook is connected */
+	if (hookptr == &priv->upper)
+		priv->ifp->if_hwassist = 0;
+
 	/* OK */
 	*hookptr = hook;
 	priv->lowerOrphan = orphan;
@@ -685,9 +691,10 @@
 {
 	const priv_p priv = hook->node->private;
 
-	if (hook == priv->upper)
+	if (hook == priv->upper) {
 		priv->upper = NULL;
-	else if (hook == priv->lower) {
+		priv->ifp->if_hwassist = priv->hwassist;  /* restore h/w csum */
+	} else if (hook == priv->lower) {
 		priv->lower = NULL;
 		priv->lowerOrphan = 0;
 	} else
Comment 5 Mike Bristow 2002-02-05 12:17:10 UTC
On 3/2/02 6:46 pm, "Archie Cobbs" <archie@dellroad.org> wrote:

> Here is a candidate patch.

The problem seems to be resolved when I apply the -stable version of the
improved patch you mailed me.  I'll leave the box running with the patch
applied and let you know if there are any problems.

(This mail is being sent from my laptop which treats the bridge as the IMAP
server and SMTP smarthost)

-- 
Mike Bristow

Sent using the Entourage X Test Drive.
Comment 6 Archie Cobbs freebsd_committer freebsd_triage 2002-02-12 18:21:31 UTC
State Changed
From-To: open->closed

Patches applied to -current and -stable.