Bug 17889

Summary: certain type of DNS queries seem to get dropped by DNS server
Product: Base System
Reporter: adrian.roy <adrian.roy>
Component: misc
Assignee: freebsd-bugs (Nobody) <bugs>
Status: Closed FIXED    
Severity: Affects Only Me    
Priority: Normal    
Version: Unspecified   
Hardware: Any   
OS: Any   

Description adrian.roy 2000-04-10 03:50:04 UTC
I seem to have a problem connecting to and from my FreeBSD 4.0 machine.

I am running an internal IP (10.3/16) network.  My DNS server is a WindowsNT 4.0 machine running WinRoute.  I have no DNS server running on the BSD machine.  My hosts files on ALL my machines are empty as well.  When I telnet from a 3.3 machine to the 4.0 machine, I get a login prompt just fine.  I enter the username and password and then it just hangs there for about 5 minutes.

I check the DNS server and it is saying that there was a DNS query from the 4.0 machine looking for info on the 3.3 machine.  It says the DNS query was of type 28, and that it was invalid, and that it will be dropped.  Likewise, when I try to connect to an external site from the 4.0 machine, the same DNS error occurs on the server.  For example, I will type 'telnet somehost.somedomain.com' and it just hangs there.  This happens with both TELNET and FTP.  However, the DNS error does not occur when I try to ping an address.  Ping will work fine.

At first I thought it was the NT DNS server - but - all my other machines work fine.  I have FreeBSD 3.x machines mostly, but I also have Solaris 2.5/2.6, WindowsNT, and MacOS9 all using the same DNS server.  They work fine.

Here is the kicker.  If I change the DNS server listed in resolv.conf to another DNS server out on the internet, things work fine.  This is what is blowing my mind.  I have over 80 machines on my local network that use my DNS server without a problem.  Yet, this one FreeBSD 4.0 machine refuses to work with it.  However, it seems to be able to use ANY other DNS server I pick.

I need to use my local DNS server because it has all the lookup info for my private IP network addresses.

I hope that's enough info.

Fix: 

The only fix that I can find, aside from choosing a different DNS server, is to remove resolv.conf completely and create entries in the /etc/hosts file.  Not very dynamic.

Once again, this was happening with both FTP and Telnet, but not ping.

How-To-Repeat:

This 4.0 machine is a fresh install.  I downloaded the ISO file, burnt it to a CD, and did a basic install.  Nothing different from the last 30 BSD machines I have set up.  I entered the DNS server into resolv.conf, and the problems started from there.

Comment 1 Bill Fenner 2000-04-12 01:12:41 UTC
>I check the DNS server and it is saying that there was a DNS query from the
>4.0 machine looking for info on the 3.3 machine.  It says the DNS query was
>of type 28, and that it was invalid, and that it will be dropped.

This is a query for an IPv6 address.  Does it really drop it, or does it
reply with an empty reply?  Here's what happens when I telnet to my 4.0
box ("emachine.attlabs.att.com") from my 3.4 box ("mango.attlabs.att.com"):

% tcpdump -s 1500 udp port domain
tcpdump: listening on dc0
17:02:28.499255 emachine.attlabs.att.com.1206 > mp-dns.attlabs.att.com.domain:  33870+ PTR? 114.2.197.135.in-addr.arpa. (44)
17:02:28.501048 mp-dns.attlabs.att.com.domain > emachine.attlabs.att.com.1206:  33870* 1/7/7 PTR mango.attlabs.att.com. (371) (DF)
17:02:28.501915 emachine.attlabs.att.com.1207 > mp-dns.attlabs.att.com.domain:  33871+ AAAA? mango.attlabs.att.com. (39)
17:02:28.502909 mp-dns.attlabs.att.com.domain > emachine.attlabs.att.com.1207:  33871* 0/1/0 (106) (DF)
17:02:28.503140 emachine.attlabs.att.com.1208 > mp-dns.attlabs.att.com.domain:  33872+ AAAA? mango.attlabs.att.com.attlabs.att.com. (55)
17:02:28.504185 mp-dns.attlabs.att.com.domain > emachine.attlabs.att.com.1208:  33872 NXDomain* 0/1/0 (122) (DF)
17:02:28.504554 emachine.attlabs.att.com.1209 > mp-dns.attlabs.att.com.domain:  33873+ AAAA? mango.attlabs.att.com. (39)
17:02:28.505574 mp-dns.attlabs.att.com.domain > emachine.attlabs.att.com.1209:  33873* 0/1/0 (106) (DF)
17:02:28.505736 emachine.attlabs.att.com.1210 > mp-dns.attlabs.att.com.domain:  33874+ AAAA? mango.attlabs.att.com.attlabs.att.com. (55)
17:02:28.506769 mp-dns.attlabs.att.com.domain > emachine.attlabs.att.com.1210:  33874 NXDomain* 0/1/0 (122) (DF)
17:02:28.507112 emachine.attlabs.att.com.1211 > mp-dns.attlabs.att.com.domain:  33875+ A? mango.attlabs.att.com. (39)
17:02:28.509445 mp-dns.attlabs.att.com.domain > emachine.attlabs.att.com.1211:  33875* 1/11/10 A 135.197.2.114 (494) (DF)

If all of those AAAA? queries had to time out because the server was
dropping them instead of replying that it had no information, that
would explain why it takes so long.
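
As a rough illustration of how dropped queries could add up to the
roughly five-minute hang described in the report, the sketch below totals
the wait for the four AAAA queries seen in the trace.  The retry policy
(four attempts per query, starting at 5 seconds and doubling each time)
is an assumption modeled on classic BIND-style resolver defaults, not
something measured on the reporter's machine, and the function and
parameter names are made up for this example.

```python
def total_wait_seconds(dropped_queries, attempts=4, first_timeout=5):
    """Total time spent waiting if every attempt of every query times out.

    attempts and first_timeout are assumed values, not taken from the report.
    """
    # The timeout doubles on each retry: 5, 10, 20, 40 seconds by default.
    per_query = sum(first_timeout << attempt for attempt in range(attempts))
    return dropped_queries * per_query


if __name__ == "__main__":
    # The trace above shows four AAAA? queries before the A? query.
    print(total_wait_seconds(4))
```

Under those assumed values each dropped query costs 75 seconds, so four
of them come to 300 seconds, which is in the same range as the delay the
reporter saw.  Different resolver settings would change the total.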

I recommend:
1) Running that same tcpdump on your 4.0 machine to see what is going on
2) Trying a dig to see how your name server handles queries for aaaa
   records; "dig @nameserver aaaa some.host.name.".  It should reply
   with an empty answer section.
3) If the name server is really dropping the queries instead of replying
   to them, report this bug to the authors.
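
If dig is not at hand, step 2 can be approximated with a short script.
The following is a minimal sketch, not a replacement for dig: it
hand-builds an AAAA (type 28) query, sends it once over UDP, and reports
whether the server answered at all.  The function names, the fixed query
ID, and the 3-second timeout are assumptions chosen for illustration.

```python
import socket
import struct

QTYPE_AAAA = 28  # the "type 28" that the DNS server log complains about
QCLASS_IN = 1


def build_query(name, qtype=QTYPE_AAAA, query_id=0x1234):
    """Build a minimal DNS query packet with the recursion-desired bit set."""
    # Header: ID, flags (RD set), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: length-prefixed labels terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, QCLASS_IN)


def summarize_reply(packet):
    """Return (rcode, answer_count) read from a DNS reply header."""
    _id, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", packet[:12])
    return flags & 0x000F, ancount


def probe(server, name, timeout=3.0):
    """Send one AAAA query and describe what came back, if anything."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, 53))
        try:
            packet, _addr = sock.recvfrom(4096)
        except socket.timeout:
            return "no reply (query dropped or lost)"
    rcode, ancount = summarize_reply(packet)
    return "reply received: rcode=%d, answers=%d" % (rcode, ancount)
```

Calling probe() with your name server's address and a host name, for
example probe("10.3.0.1", "some.host.name") (both placeholders), prints
nothing by itself; wrap it in print() to see the result.  A reply with
rcode 0 and zero answers corresponds to the empty answer section
mentioned in step 2.  A timeout is consistent with the server dropping
the query, though a single lost UDP packet would look the same, so it is
worth repeating the probe a few times.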

Meanwhile, you could work around the problem by using one of your
FreeBSD boxes as a name server instead of your NT box; FreeBSD comes
with a working name server.

  Bill
Comment 2 Mike Heffner 2001-06-26 05:10:08 UTC
State Changed
From-To: open->closed

Feedback timeout.