| Summary: | [nis] [patch] NIS host name resolving may loop forever | | |
|---|---|---|---|
| Product: | Base System | Reporter: | Pavlin Ivanov Radoslavov <pavlin> |
| Component: | kern | Assignee: | Remko Lodder <remko> |
| Status: | Closed FIXED | | |
| Severity: | Affects Only Me | CC: | pavlin |
| Priority: | Normal | | |
| Version: | 3.2-RELEASE | | |
| Hardware: | Any | | |
| OS: | Any | | |

Responsible Changed From-To: freebsd-bugs->wpaul

Over to Mr. yp.

I can confirm a similar bug in 4-STABLE. Here is my setup:

coyote - NAT-Gateway, NIS/NFS Server, sendmail with SmartHost=mail.myisp.com, -STABLE
roadrunner - Workstation, NIS/NFS Client, sendmail with SmartHost=coyote, -STABLE

I ran into this bug when trying to get mutt to send via sendmail. I recently switched to NIS and use it for passwd, group and hosts. Here is a snippet from my /etc/hosts on coyote (132.... is the external IF):

    192.168.0.146 coyote coyote.local
    192.168.0.147 roadrunner roadrunner.local
    132.187.222.7 gb-007 gb-007.galgenberg.net coyote.dnsalias.net
    ..

host.conf on roadrunner _included_ NIS. Trying to send mail with mutt results in an infinite loop. /var/log/messages on _coyote_ gets flooded with these messages:

    Jan 30 10:00:00 coyote ypserv[103]: res_mkquery failed
    Jan 30 10:00:30 coyote last message repeated 11 times
    Jan 30 10:02:30 coyote last message repeated 36 times
    Jan 30 10:12:32 coyote last message repeated 180 times
    ..

Sending mail with sendmail directly produces these errors:

    yp_match: clnt_call: RPC: Timed out
    yp_match: clnt_call: RPC: Timed out
    yp_match: clnt_call: RPC: Timed out
    ..

and ps aux shows me this line:

    send-mail: ./h0UBv88n011766 localhost.my.domain.: user open (sendmail)

where localhost.my.domain is 127.0.0.1, of course.

Removing "nis" from /etc/host.conf (roadrunner) and restarting sendmail "fixes" this problem, but I would rather use NIS for hosts too. I stumbled across bin/11666, which says it is a duplicate of bin/5444, but I'm not sure which PR matches this problem best.

Responsible Changed From-To: wpaul->freebsd-bugs

With bugmeister hat on, return to the general pool. I think it is likely the current assignee lost interest in this PR a long time ago.

State Changed From-To: open->feedback

Hello, is this problem still relevant?

Responsible Changed From-To: freebsd-bugs->remko

Grab the PR.

State Changed From-To: feedback->closed

The submitter mentions that he no longer has access to this setup, so he cannot test this. Close the ticket; please give me feedback if this is still relevant.

In some cases, resolving a host name may loop forever (inside libc). After a NIS client sends a query to the NIS server, the server may issue a DNS query of its own to resolve the name. If that DNS query does not return a result promptly (a success or a failure; that is, it is left to expire), the NIS client's query may itself expire before it gets any answer from the NIS server, which is still waiting for the DNS query to complete. The libc code does not handle this situation properly, and it enters an infinite loop trying to resolve the host name.

The situation is very unpleasant for programs like sendmail: a sendmail process blocked in this infinite loop cannot deliver the email to the rest of the recipients.

Fix:

Apply the following patch to src/lib/libc/yp/yplib.c, then recompile libc and install it. However, this solution only cuts the infinite loop down to 20 iterations. A better solution is to fix the NIS server to cache its recent results, so that after the first timeout of the client's request, the second request hits the negative answer at the server and the client immediately returns an error instead of looping 20 times.

Note that yplib.c has a number of other places that carry the same potential for an infinite loop. Search for "again:" and add a similar counter there to break the loop.

How-To-Repeat:

Compile and execute the following code. Note that the problem can be observed only if "nslookup" takes at least 30-60 seconds to time out for the chosen IP address:

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>
    #include <netdb.h>

    int
    main(void)
    {
        struct in_addr in;
        struct hostent *hp;

        /*
         * Nothing personal regarding address "200.230.88.4".
         * It is just that my DNS query fails for this address
         * with "Server failed" after approx. 30-60 seconds:
         *
         * pavlin@xanadu[11] nslookup 200.230.88.4
         * Server:  catarina.usc.edu
         * Address:  128.125.51.47
         *
         * *** catarina.usc.edu can't find 200.230.88.4: Server failed
         */
        inet_aton("200.230.88.4", &in);

        /* Reverse lookup; with NIS in the host lookup order this goes through yp_match. */
        hp = gethostbyaddr((char *)&in, sizeof(in), AF_INET);
        if (hp) {
            printf("OK\n");
        } else {
            printf("FAILURE\n");
        }

        exit(0);
    }

If the chosen IP address is appropriate, you should see the loop:

    pavlin@catarina[284] ./a.out
    yp_match: clnt_call: RPC: Timed out
    yp_match: clnt_call: RPC: Timed out
    yp_match: clnt_call: RPC: Timed out
    yp_match: clnt_call: RPC: Timed out
    yp_match: clnt_call: RPC: Timed out
    ...
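
The Fix section describes adding a counter so that a timed-out lookup is retried at most 20 times rather than jumping back to "again:" forever. The following is a minimal sketch of that idea only. It is not the patch to yplib.c, and every name in it (MAX_RETRIES, lookup_status, query_fn, bounded_lookup, always_timeout, succeed_after) is hypothetical and chosen for illustration; the real code deals with RPC client state that is omitted here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cap, mirroring the "20 loops" mentioned in the report. */
#define MAX_RETRIES 20

/* Hypothetical status codes for this sketch; not the real yplib.c values. */
enum lookup_status { LOOKUP_OK = 0, LOOKUP_TIMEOUT = 1, LOOKUP_GAVE_UP = 2 };

/* A query callback: expected to return LOOKUP_OK or LOOKUP_TIMEOUT. */
typedef enum lookup_status (*query_fn)(void *ctx);

/*
 * Retry a query that may time out, but stop after MAX_RETRIES attempts
 * instead of looping back unconditionally.
 * On return, *attempts (if non-NULL) holds the number of calls made.
 */
enum lookup_status
bounded_lookup(query_fn query, void *ctx, int *attempts)
{
    int tries = 0;
    enum lookup_status st = LOOKUP_TIMEOUT;

    assert(query != NULL);

    while (tries < MAX_RETRIES) {
        tries++;
        st = query(ctx);
        if (st != LOOKUP_TIMEOUT)
            break;      /* anything other than a timeout ends the retries */
    }
    if (attempts != NULL)
        *attempts = tries;

    /* A timeout on the last allowed attempt is reported as a hard failure. */
    return (st == LOOKUP_TIMEOUT) ? LOOKUP_GAVE_UP : st;
}

/* Stand-in for a server that never answers in time. */
enum lookup_status
always_timeout(void *ctx)
{
    (void)ctx;
    return LOOKUP_TIMEOUT;
}

/* Stand-in that times out until the counter pointed to by ctx reaches zero. */
enum lookup_status
succeed_after(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return LOOKUP_TIMEOUT;
    }
    return LOOKUP_OK;
}
```

Calling bounded_lookup(always_timeout, NULL, &n) gives up after MAX_RETRIES calls, which corresponds to the "only 20 loops" behaviour the report describes. As the report notes, a cap like this bounds the damage but still costs 20 timeouts per lookup, which is why caching negative answers on the NIS server is suggested as the better fix.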