Using a suitibly modified copy of /usr/share/examples/netgraph/ether.bridge, I brought up a bridge between de0, de1, de2, and xl0 (with xl0 being the "local interface". The topology is : [ singsing ] <-x de0 -> [ win2k box] <-x de1 -> [ solaris box ] <-x de2 -> [ utterly borked freebsd-current box ] <-- xl0 -> [ summit 48 + rest of network, default route etc ] where <-x interface -> indicates a crossover cat5 cable, and <-- interface -> indicates a normal bit of cat5 [ this is all done because of a lack of wall ports in the office ] Machines attached to the de0, de1 or de2 have connectivity problems from singsing, but they are reachable from other machines on the network. A "snoop -o snoop.out host singsing and host solaris_box" run on the solaris box while simultanously running "telnet singsing imap" in another windown suggest that the TCP checksum is wrong (acording to ethereal). I haven't "snooped" icmp echo requests/responses in either direction yet; nor did I run tcpdump on "singsing". Traffic out of the xl0 interface appears to be unaffected. http://www.urgle.com/~mike/netgraph/ has my modified ether.bridge; the snoop output as seen by the solaris box; and dmesg output from "singsing". This problem has appeared between 4.4-STABLE-20011001T111411 & 4.4-STABLE-20011027T210032 (where the timestamp is the time "cvs update" was run against a local CVS mirror that cvsups hourly) If additional snoops/tcpdumps on assorted hosts would be useful, or if anything else would help track this down, let me know. Fix: An evil workaround is: for i in machines attached to de{0,1,2} ; do route add $i $default_route done but this is obviously less than ideal. How-To-Repeat: Set up a bridge using /usr/share/examples/netgraph/ether.bridge and a LOCAL_IFACE; try and connect from the bridge to machines that are attached to BRIDGE_IFACES but not LOCAL_IFACE
I think the problem is the delayed checksum support added to FreeBSD. The delayed checksum code makes the (reasonable) assumption in ip_output() that a packet routed to a specific interface will actually be transmitted by the hardware associated with that interface. This is not necessarily true when netgraph is being used. The solution would be to modify ng_ether() to compute on-demand any delayed checksums for packets leaving the 'lower' hook when that hook is attached to another node. I'll try to work on this soon. -Archie __________________________________________________________________________ Archie Cobbs * Packet Design * http://www.packetdesign.com
Here is a candidate patch. There seem to be two strategies: #1 Check the flags for mbufs going out the 'lower' hook and compute the checksums for packets that need it. #2 Temporarily turn off the interface flags advertising hardware checksums whenever 'lower' is connected. The patch below implements #2. While #1 might make more sense, the problem is that it fails in the case of CSUM_FRAGMENT. To handle CSUM_FRAGMENT in case #1, we would have to implement IP fragmentation code in ng_ether, yuck... seems much better to let ip_output() do it. The only unknown is whether the drivers will freak out when ifp->if_hwassist is changed to zero and back... it doesn't appear so; it seems the worst thing that would happen is the chip recomputing a checksum, but I'm not positive about that. -Archie __________________________________________________________________________ Archie Cobbs * Packet Design * http://www.packetdesign.com Index: sys/netgraph/ng_ether.c =================================================================== RCS file: /home/ncvs/src/sys/netgraph/ng_ether.c,v retrieving revision 1.2.2.10 diff -u -r1.2.2.10 ng_ether.c --- sys/netgraph/ng_ether.c 1 Sep 2001 08:22:29 -0000 1.2.2.10 +++ sys/netgraph/ng_ether.c 2 Feb 2002 21:46:39 -0000 @@ -75,6 +75,7 @@ u_char lowerOrphan; /* whether lower is lower or orphan */ u_char autoSrcAddr; /* always overwrite source address */ u_char promisc; /* promiscuous mode enabled */ + u_long hwassist; /* hardware checksum capabilities */ }; typedef struct private *priv_p; @@ -319,6 +320,7 @@ priv->ifp = ifp; IFP2NG(ifp) = node; priv->autoSrcAddr = 1; + priv->hwassist = ifp->if_hwassist; /* Try to give the node the same name as the interface */ if (ng_name_node(node, name) != 0) { @@ -470,6 +472,10 @@ if (*hookptr != NULL) return (EISCONN); + /* If lower hook connected, disable hardware checksum */ + if (hookptr == &priv->lower) + ifp->if_hwassist = 0; + /* OK */ *hookptr = hook; priv->lowerOrphan = orphan; @@ -685,6 +691,7 @@ { const priv_p priv = hook->node->private; + /* Mark hook not connected */ if (hook == priv->upper) priv->upper = NULL; else if (hook == priv->lower) { @@ -692,8 +699,14 @@ priv->lowerOrphan = 0; } else panic("%s: weird hook", __FUNCTION__); + + /* If lower hook disconnected, restore hardware checksum */ + if (hookptr == &priv->lower) + ifp->if_hwassist = priv->hwassist; + + /* Reset node after last disconnect */ if (hook->node->numhooks == 0) - ng_rmnode(hook->node); /* reset node */ + ng_rmnode(hook->node); return (0); }
Responsible Changed From-To: freebsd-bugs->archie My bug to fix.
OK, below are the 'better' patches for -current and -stable. The previous patch was broken because it's the 'upper' hook we need to watch, not the 'lower' hook. -Archie __________________________________________________________________________ Archie Cobbs * Packet Design * http://www.packetdesign.com ============================ current patch =============================== --- sys/netgraph/ng_ether.c 10 Dec 2001 08:09:47 -0000 1.21 +++ sys/netgraph/ng_ether.c 3 Feb 2002 19:00:48 -0000 @@ -75,6 +75,7 @@ u_char lowerOrphan; /* whether lower is lower or orphan */ u_char autoSrcAddr; /* always overwrite source address */ u_char promisc; /* promiscuous mode enabled */ + u_long hwassist; /* hardware checksum capabilities */ u_int flags; /* flags e.g. really die */ }; typedef struct private *priv_p; @@ -317,6 +318,7 @@ priv->ifp = ifp; IFP2NG(ifp) = node; priv->autoSrcAddr = 1; + priv->hwassist = ifp->if_hwassist; /* Try to give the node the same name as the interface */ if (ng_name_node(node, name) != 0) { @@ -468,6 +470,10 @@ if (*hookptr != NULL) return (EISCONN); + /* Disable hardware checksums while 'upper' hook is connected */ + if (hookptr == &priv->upper) + priv->ifp->if_hwassist = 0; + /* OK */ *hookptr = hook; priv->lowerOrphan = orphan; @@ -715,9 +721,10 @@ { const priv_p priv = NG_NODE_PRIVATE(NG_HOOK_NODE(hook)); - if (hook == priv->upper) + if (hook == priv->upper) { priv->upper = NULL; - else if (hook == priv->lower) { + priv->ifp->if_hwassist = priv->hwassist; /* restore h/w csum */ + } else if (hook == priv->lower) { priv->lower = NULL; priv->lowerOrphan = 0; } else ============================ stable patch =============================== --- sys/netgraph/ng_ether.c 1 Sep 2001 08:22:29 -0000 1.2.2.10 +++ sys/netgraph/ng_ether.c 3 Feb 2002 19:00:07 -0000 @@ -75,6 +75,7 @@ u_char lowerOrphan; /* whether lower is lower or orphan */ u_char autoSrcAddr; /* always overwrite source address */ u_char promisc; /* promiscuous mode enabled */ + u_long hwassist; /* hardware checksum capabilities */ }; typedef struct private *priv_p; @@ -319,6 +320,7 @@ priv->ifp = ifp; IFP2NG(ifp) = node; priv->autoSrcAddr = 1; + priv->hwassist = ifp->if_hwassist; /* Try to give the node the same name as the interface */ if (ng_name_node(node, name) != 0) { @@ -470,6 +472,10 @@ if (*hookptr != NULL) return (EISCONN); + /* Disable hardware checksums while 'upper' hook is connected */ + if (hookptr == &priv->upper) + priv->ifp->if_hwassist = 0; + /* OK */ *hookptr = hook; priv->lowerOrphan = orphan; @@ -685,9 +691,10 @@ { const priv_p priv = hook->node->private; - if (hook == priv->upper) + if (hook == priv->upper) { priv->upper = NULL; - else if (hook == priv->lower) { + priv->ifp->if_hwassist = priv->hwassist; /* restore h/w csum */ + } else if (hook == priv->lower) { priv->lower = NULL; priv->lowerOrphan = 0; } else
On 3/2/02 6:46 pm, "Archie Cobbs" <archie@dellroad.org> wrote: > Here is a candidate patch. The problem seems to be resolved when I apply the -stable version of the improved patch you mailed me. I'll leave the box running with the patch applied and let you know if there are any problems. (This mail is being sent from my laptop which treats the bridge as the IMAP server and SMTP smarthost) -- Mike Bristow Sent using the Entourage X Test Drive.
State Changed From-To: open->closed Patches applied to -current and -stable.