| Summary: | [patch] ARP request fails after "bad gateway value" in if_ether.c | ||
|---|---|---|---|
| Product: | Base System | Reporter: | Voradesh Yenbut <yenbut> |
| Component: | kern | Assignee: | Remko Lodder <remko> |
| Status: | Closed FIXED | ||
| Severity: | Affects Only Me | ||
| Priority: | Normal | ||
| Version: | Unspecified | ||
| Hardware: | Any | ||
| OS: | Any | ||
State Changed From-To: open->feedback

Do you have a routed(8) daemon running?

Responsible Changed From-To: freebsd-bugs->ru

I can easily reproduce this with routed(8) and route(8), and understand what's going on, but I am not sure whether this is a routed(8) problem or the kernel's.

The following patch (against 4.4-RELEASE) solves this problem. In -CURRENT it's a little different, but the same if condition should apply, as long as it appears before the rt_setgate() statement.

Voradesh, does this solve your problem?

-Paul.

    Index: sys/net/rtsock.c
    ===================================================================
    RCS file: /mnt/ncvs/src/sys/net/rtsock.c,v
    retrieving revision 1.44.2.4
    diff -u -r1.44.2.4 rtsock.c
    --- sys/net/rtsock.c 2001/07/11 09:37:37 1.44.2.4
    +++ sys/net/rtsock.c 2001/11/27 01:33:03
    @@ -399,6 +399,14 @@
                 break;

             case RTM_CHANGE:
    +            /* Don't let the user specify non-link information
    +             * for a gateway if the RTF_LLINFO flag is set.
    +             * We'll just leave the gateway alone.
    +             */
    +            if (gate && (rt->rt_flags & RTF_LLINFO) &&
    +                gate->sa_family != AF_LINK)
    +                gate = rt->rt_gateway;
    +
                 if (gate && (error = rt_setgate(rt, rt_key(rt), gate)))
                     senderr(error);

Thanks for the patch. Unfortunately, it did not solve my problem.

The kernel was changed from 4.2 to 4.4 with the patch. After a while the usual error messages were printed, and no communication to the IP addresses listed in the messages was possible afterward.

Below is an example of the messages (192.168.8.85 is an HP LaserJet 4M and 128.95.8.25 is a win2k machine):

    Nov 29 14:58:08 bs8 /kernel: arp_rtrequest: bad gateway value
    Nov 29 14:58:08 bs8 /kernel: arplookup 192.168.8.85 failed: could not allocate llinfo
    Nov 29 14:58:08 bs8 /kernel: arpresolve: can't allocate llinfo for 192.168.8.85rt

    Nov 29 14:58:31 bs8 /kernel: arplookup 192.168.8.85 failed: could not allocate llinfo
    Nov 29 14:58:31 bs8 /kernel: arpresolve: can't allocate llinfo for 192.168.8.85rt
    Nov 29 14:58:31 bs8 /kernel: arplookup 192.168.8.85 failed: could not allocate llinfo
    Nov 29 14:58:31 bs8 /kernel: arpresolve: can't allocate llinfo for 192.168.8.85rt

    Nov 29 15:10:22 bs8 /kernel: arp_rtrequest: bad gateway value
    Nov 29 15:10:22 bs8 /kernel: arplookup 128.95.8.25 failed: could not allocate llinfo
    Nov 29 15:10:22 bs8 /kernel: arpresolve: can't allocate llinfo for 128.95.8.25rt
    Nov 29 15:10:29 bs8 /kernel: arplookup 128.95.8.25 failed: could not allocate llinfo
    Nov 29 15:10:29 bs8 /kernel: arplookup 128.95.8.25 failed: could not allocate llinfo
    Nov 29 15:10:29 bs8 /kernel: arpresolve: can't allocate llinfo for 128.95.8.25rt
    Nov 29 15:10:29 bs8 /kernel: arplookup 128.95.8.25 failed: could not allocate llinfo
    Nov 29 15:10:29 bs8 /kernel: arpresolve: can't allocate llinfo for 128.95.8.25rt
    Nov 29 15:10:33 bs8 /kernel: arplookup 128.95.8.25 failed: could not allocate llinfo
    Nov 29 15:10:33 bs8 /kernel: arpresolve: can't allocate llinfo for 128.95.8.25rt
    Nov 29 15:10:45 bs8 /kernel: arplookup 128.95.8.25 failed: could not allocate llinfo
    Nov 29 15:10:45 bs8 /kernel: arpresolve: can't allocate llinfo for 128.95.8.25rt
    Nov 29 15:10:46 bs8 /kernel: arplookup 128.95.8.25 failed: could not allocate llinfo
    Nov 29 15:10:46 bs8 /kernel: arpresolve: can't allocate llinfo for 128.95.8.25rt

On Thu, Nov 29, 2001 at 03:31:17PM -0800, Voradesh Yenbut wrote:
> Thanks for the patch. Unfortunately, it did not solve my problem.
>
> The kernel was changed from 4.2 to 4.4 with the patch. After a while
> the usual error messages were printed, and no communication to IP addresses
> listed in the message was possible afterward.
>
> Below is an example of messages (192.168.8.85 is a HP LaserJet 4M and
> 128.95.8.25 is a win2k machine.)
>
>
> Nov 29 14:58:08 bs8 /kernel: arp_rtrequest: bad gateway value
> Nov 29 14:58:08 bs8 /kernel: arplookup 192.168.8.85 failed: could not allocate llinfo
> Nov 29 14:58:08 bs8 /kernel: arpresolve: can't allocate llinfo for 192.168.8.85rt
>
> Nov 29 14:58:31 bs8 /kernel: arplookup 192.168.8.85 failed: could not allocate llinfo
> Nov 29 14:58:31 bs8 /kernel: arpresolve: can't allocate llinfo for 192.168.8.85rt
> Nov 29 14:58:31 bs8 /kernel: arplookup 192.168.8.85 failed: could not allocate llinfo
> Nov 29 14:58:31 bs8 /kernel: arpresolve: can't allocate llinfo for 192.168.8.85rt
>
> Nov 29 15:10:22 bs8 /kernel: arp_rtrequest: bad gateway value
> Nov 29 15:10:22 bs8 /kernel: arplookup 128.95.8.25 failed: could not allocate llinfo
> Nov 29 15:10:22 bs8 /kernel: arpresolve: can't allocate llinfo for 128.95.8.25rt
> Nov 29 15:10:29 bs8 /kernel: arplookup 128.95.8.25 failed: could not allocate llinfo
> Nov 29 15:10:29 bs8 /kernel: arplookup 128.95.8.25 failed: could not allocate llinfo
> Nov 29 15:10:29 bs8 /kernel: arpresolve: can't allocate llinfo for 128.95.8.25rt
> Nov 29 15:10:29 bs8 /kernel: arplookup 128.95.8.25 failed: could not allocate llinfo
> Nov 29 15:10:29 bs8 /kernel: arpresolve: can't allocate llinfo for 128.95.8.25rt
> Nov 29 15:10:33 bs8 /kernel: arplookup 128.95.8.25 failed: could not allocate llinfo
> Nov 29 15:10:33 bs8 /kernel: arpresolve: can't allocate llinfo for 128.95.8.25rt
> Nov 29 15:10:45 bs8 /kernel: arplookup 128.95.8.25 failed: could not allocate llinfo
> Nov 29 15:10:45 bs8 /kernel: arpresolve: can't allocate llinfo for 128.95.8.25rt
> Nov 29 15:10:46 bs8 /kernel: arplookup 128.95.8.25 failed: could not allocate llinfo
> Nov 29 15:10:46 bs8 /kernel: arpresolve: can't allocate llinfo for 128.95.8.25rt
>

Your routing table is screwed. These "can't allocate llinfo" say this.

Cheers,
--
Ruslan Ermilov          Oracle Developer/DBA,
ru@sunbay.com           Sunbay Software AG,
ru@FreeBSD.org          FreeBSD committer,
+380.652.512.251        Simferopol, Ukraine

http://www.FreeBSD.org  The Power To Serve
http://www.oracle.com   Enabling The Information Age

On Wed, Nov 28, 2001 at 04:22:08PM -0800, Paul Herman wrote:
>
> The following patch (against 4.4-RELEASE) solves this problem. In
> -CURRENT it's a little different, but the same if condition should
> apply, as long as it appears before the rt_setgate() statement.
>
> Voradesh, does this solve your problem?
>
> -Paul.
>
> Index: sys/net/rtsock.c
> ===================================================================
> RCS file: /mnt/ncvs/src/sys/net/rtsock.c,v
> retrieving revision 1.44.2.4
> diff -u -r1.44.2.4 rtsock.c
> --- sys/net/rtsock.c 2001/07/11 09:37:37 1.44.2.4
> +++ sys/net/rtsock.c 2001/11/27 01:33:03
> @@ -399,6 +399,14 @@
>              break;
>
>          case RTM_CHANGE:
> +            /* Don't let the user specify non-link information
> +             * for a gateway if the RTF_LLINFO flag is set.
> +             * We'll just leave the gateway alone.
> +             */
> +            if (gate && (rt->rt_flags & RTF_LLINFO) &&
> +                gate->sa_family != AF_LINK)
> +                gate = rt->rt_gateway;
> +
>              if (gate && (error = rt_setgate(rt, rt_key(rt), gate)))
>                  senderr(error);
>

Paul,

If we deny this combo for RTM_CHANGE, we should then deny it for RTM_ADD as well. For example, "route add -host 1.2.3.4 5.6.7.8 -llinfo" shouldn't create RTF_LLINFO entry with AF_INET gateway. Perhaps in this case (RTM_ADD), the code should return EINVAL.

Cheers,
--
Ruslan Ermilov          Oracle Developer/DBA,
ru@sunbay.com           Sunbay Software AG,
ru@FreeBSD.org          FreeBSD committer,
+380.652.512.251        Simferopol, Ukraine

http://www.FreeBSD.org  The Power To Serve
http://www.oracle.com   Enabling The Information Age

On Sat, 1 Dec 2001, Ruslan Ermilov wrote:
> On Wed, Nov 28, 2001 at 04:22:08PM -0800, Paul Herman wrote:
> >
> > The following patch (against 4.4-RELEASE) solves this problem. In
> > -CURRENT it's a little different, but the same if condition should
> > apply, as long as it appears before the rt_setgate() statement.
>
> If we deny this combo for RTM_CHANGE, we should then deny it for
> RTM_ADD as well. For example, "route add -host 1.2.3.4 5.6.7.8
> -llinfo" shouldn't create RTF_LLINFO entry with AF_INET gateway.
> Perhaps in this case (RTM_ADD), the code should return EINVAL.

Hi Ruslan,

Yes. In fact, it should ideally be in rt_setgate() which will catch all cases. The reason I didn't do this was because the IPV6 stack, as I found out, *does* put AF_INET information as a gateway with the LLINFO bit set. :-( This is why I went conservative and only made a small change.

Adding it to RTM_ADD I think would be a good thing, and returning EINVAL should be OK as long as it works with routed (haven't checked.)

-Paul.

State Changed From-To: feedback->open

Reset state to open, feedback had been received a while ago.

State Changed From-To: open->feedback

Steal this ticket from ru to obtain feedback about the current status of this problem (I will bring it back to ruslan with more information if possible :-)).

Responsible Changed From-To: ru->remko

Grab the ticket from ru so that I can trace the feedback.

State Changed From-To: feedback->closed

Feedback timeout (never received)
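
The rule debated in the audit trail (an entry flagged RTF_LLINFO should only carry an AF_LINK gateway, with EINVAL suggested for the RTM_ADD case) can be written as a small stand-alone predicate. This is only a sketch of that idea, not the code that was committed: `sk_check_gateway`, `struct sk_sockaddr` and the `SK_*` constants are hypothetical stand-ins for the kernel definitions, and the sketch deliberately ignores the IPv6 exception Paul mentions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's AF_* and RTF_* definitions. */
enum { SK_AF_INET = 2, SK_AF_LINK = 18 };
enum { SK_RTF_LLINFO = 0x400 };

/* Hypothetical, trimmed-down sockaddr: only the fields the check needs. */
struct sk_sockaddr {
    unsigned char sa_len;
    unsigned char sa_family;
};

/*
 * Return 0 if `gate` may be stored on a route with `rt_flags`,
 * or EINVAL if the route carries link-layer info but the gateway
 * is not a link-level address (the combination the thread wants denied).
 */
static int
sk_check_gateway(int rt_flags, const struct sk_sockaddr *gate)
{
    if (gate == NULL)
        return 0;               /* no gateway supplied, nothing to check */
    if ((rt_flags & SK_RTF_LLINFO) && gate->sa_family != SK_AF_LINK)
        return EINVAL;
    return 0;
}

/* Usage: the combinations discussed in the thread. */
void
sk_demo(void)
{
    struct sk_sockaddr inet_gate = { 16, SK_AF_INET };
    struct sk_sockaddr link_gate = { 20, SK_AF_LINK };

    /* "route add -host 1.2.3.4 5.6.7.8 -llinfo": rejected. */
    assert(sk_check_gateway(SK_RTF_LLINFO, &inet_gate) == EINVAL);
    /* A link-level gateway on an LLINFO route: accepted. */
    assert(sk_check_gateway(SK_RTF_LLINFO, &link_gate) == 0);
    /* An ordinary route with an AF_INET gateway: accepted. */
    assert(sk_check_gateway(0, &inet_gate) == 0);
}
```

Returning an error is the stricter of the two behaviours discussed; Paul's RTM_CHANGE patch instead keeps the existing gateway silently, which is the more conservative choice when callers such as routed(8) have not been checked.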
We have several FreeBSD systems running DNS servers. For some unknown reason, one of the systems, serving a subnet where most clients run Windows 2000, occasionally fails to do ARP address resolution. The kernel logged messages like the following:

    arp_rtrequest: bad gateway value
    arplookup 128.95.8.74 failed: could not allocate llinfo
    arpresolve: can't allocate llinfo for 128.95.8.74rt
    arp_rtrequest: bad gateway value
    arplookup 128.95.8.233 failed: could not allocate llinfo
    arpresolve: can't allocate llinfo for 128.95.8.233rt
    arp_rtrequest: bad gateway value
    arplookup 128.95.8.232 failed: could not allocate llinfo
    arpresolve: can't allocate llinfo for 128.95.8.232rt
    arplookup 128.95.8.233 failed: could not allocate llinfo
    arpresolve: can't allocate llinfo for 128.95.8.233rt
    arp_rtrequest: bad gateway value
    arplookup 128.95.8.230 failed: could not allocate llinfo
    arpresolve: can't allocate llinfo for 128.95.8.230rt
    arp_rtrequest: bad gateway value
    arplookup 128.95.8.160 failed: could not allocate llinfo
    arpresolve: can't allocate llinfo for 128.95.8.160rt

ARP requests to the addresses above failed afterward. A system reboot made ARP requests work again, but sooner or later the same problem came back. I searched the FreeBSD mailing lists for a solution and found several reports of similar problems, but no good solution.

Fix:

I don't completely understand the ARP code, so I may not have the insight to really correct the problem, but the following patch seems to get around it ("bad gateway value" is still seen, but there are no more messages about llinfo, and ARP works with the address that caused the message):

    --- if_ether.c 2001/07/23 16:35:07 1.1
    +++ if_ether.c 2001/07/23 19:13:24
    @@ -199,7 +199,13 @@
         case RTM_RESOLVE:
             if (gate->sa_family != AF_LINK ||
                 gate->sa_len < sizeof(null_sdl)) {
    -            log(LOG_DEBUG, "arp_rtrequest: bad gateway value\n");
    +            log(LOG_DEBUG, "arp_rtrequest: %s bad gateway value %s\n",
    +                inet_ntoa(SIN(rt_key(rt))->sin_addr),
    +                gate->sa_family != AF_LINK? "family": "");
    +            rtrequest(RTM_DELETE,
    +                (struct sockaddr *)rt_key(rt),
    +                rt->rt_gateway,
    +                rt_mask(rt), rt->rt_flags, 0);
                 break;
             }
             SDL(gate)->sdl_type = rt->rt_ifp->if_type;

How-To-Repeat:

I don't know how to repeat this, but it can be simulated by adding a condition in arp_rtrequest() in /usr/src/sys/netinet/if_ether.c that breaks out of RTM_RESOLVE. For example, the following code uses a static variable

    static int toggle = 1;      /* added */

to simulate a single fault in the bad gateway value condition:

    case RTM_RESOLVE:
        if (gate->sa_family != AF_LINK ||
            toggle ||               /* added */
            gate->sa_len < sizeof(null_sdl)) {
            log(LOG_DEBUG, "arp_rtrequest: bad gateway value\n");
            if (toggle) toggle = 0; /* added */
            break;
        }

After a system reboot, the system logs "arp_rtrequest: bad gateway value" for the first host it tries to contact, which is likely to be its default gateway. Even though toggle's value is now 0, subsequent attempts to contact the host generate the messages:

    arplookup xx.xx.x.xxx failed: could not allocate llinfo
    arpresolve: can't allocate llinfo for xx.xx.xx.xxrt

This leads me to believe that a route is not automatically cleaned up when it hits an error for some reason.
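
The behaviour described above can be illustrated with a toy user-space model. It is a sketch of the reporter's hypothesis only, not kernel code: `struct toy_entry`, `toy_resolve` and `toy_lookup` are hypothetical names, and the model assumes that a failed RTM_RESOLVE leaves behind a host route without link-layer info which later lookups keep finding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one cloned host route. */
struct toy_entry {
    bool fault_pending; /* plays the role of `toggle`: fail the next resolve once */
    bool present;       /* a host route for the address exists */
    bool has_llinfo;    /* link-layer info was attached to it */
};

/* Models the RTM_RESOLVE step: create the route, then attach llinfo. */
static void
toy_resolve(struct toy_entry *e, bool cleanup)
{
    e->present = true;
    if (e->fault_pending) {          /* the "bad gateway value" branch */
        e->fault_pending = false;
        if (cleanup)
            e->present = false;      /* models the RTM_DELETE added by the patch */
        return;
    }
    e->has_llinfo = true;
}

/* Models a lookup: resolve only if no route exists; succeed only with llinfo. */
static bool
toy_lookup(struct toy_entry *e, bool cleanup)
{
    assert(e != NULL);
    if (!e->present)
        toy_resolve(e, cleanup);
    return e->present && e->has_llinfo;
}

/* Usage: one injected fault, without and with cleanup. */
void
toy_demo(void)
{
    struct toy_entry stale = { true, false, false };
    struct toy_entry fixed = { true, false, false };

    assert(!toy_lookup(&stale, false)); /* first attempt hits the fault */
    assert(!toy_lookup(&stale, false)); /* stale route keeps failing */

    assert(!toy_lookup(&fixed, true));  /* first attempt hits the fault */
    assert(toy_lookup(&fixed, true));   /* route was deleted, retry succeeds */
}
```

In the model the only difference between the two runs is whether the half-built entry is removed on error, which matches what the patch in the Fix section does with rtrequest(RTM_DELETE, ...).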