Bug 26665

Summary: [PATCH] syslogd hangs when logging from remote hosts
Product: Base System
Reporter: ajk <ajk>
Component: bin
Assignee: jlemon
Status: Closed FIXED    
Severity: Affects Only Me    
Priority: Normal    
Version: 4.2-RELEASE   
Hardware: Any   
OS: Any   
Attachments:
  Description  Flags
  file.diff    none

Description ajk 2001-04-18 06:40:01 UTC
The syslogd program seems to hang after a few days of logging from
remote hosts.  The problem appears to be similar to one discovered
last year, before the resolver was changed to use kqueue()/kevent()
rather than poll().  Jonathan Lemon posted output from ktrace and a
proposed solution to -hackers, and res_send.c was subsequently
patched.

Apparently, the fix was not carried over when kqueue()/kevent() was introduced.
My ktrace looks similar:

  2920 syslogd  987293451.580333 PSIG  SIGALRM caught handler=0x804b4c0 mask=0x1 code=0x0
  2920 syslogd  987293451.580414 RET   kevent -1 errno 4 Interrupted system call
  2920 syslogd  987293451.580470 CALL  gettimeofday(0xbfbfe3d0,0)
  2920 syslogd  987293451.580518 RET   gettimeofday 0
  2920 syslogd  987293451.580576 CALL  setitimer(0,0xbfbfe3c8,0xbfbfe3b8)
  2920 syslogd  987293451.580627 RET   setitimer 0
  2920 syslogd  987293451.580673 CALL  sigreturn(0xbfbfe424)
  2920 syslogd  987293451.580722 RET   sigreturn JUSTRETURN
  2920 syslogd  987293451.580768 CALL  kevent(0x6,0xbfbfe634,0x1,0xbfbfe634,0x1,0xbfbfe620)
  2920 syslogd  987293481.591332 PSIG  SIGALRM caught handler=0x804b4c0 mask=0x1 code=0x0
  2920 syslogd  987293481.591453 RET   kevent -1 errno 4 Interrupted system call
  2920 syslogd  987293481.591505 CALL  gettimeofday(0xbfbfe3d0,0)
  2920 syslogd  987293481.591556 RET   gettimeofday 0
  2920 syslogd  987293481.591612 CALL  setitimer(0,0xbfbfe3c8,0xbfbfe3b8)
  2920 syslogd  987293481.591664 RET   setitimer 0
  2920 syslogd  987293481.591708 CALL  sigreturn(0xbfbfe424)
  2920 syslogd  987293481.591757 RET   sigreturn JUSTRETURN
  2920 syslogd  987293481.591806 CALL  kevent(0x6,0xbfbfe634,0x1,0xbfbfe634,0x1,0xbfbfe620)
  2920 syslogd  987293511.602331 PSIG  SIGALRM caught handler=0x804b4c0 mask=0x1 code=0x0
  2920 syslogd  987293511.602456 RET   kevent -1 errno 4 Interrupted system call

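Earlier in the ktrace, kevent times out after 5, 10, and 20 seconds
on successive attempts.  Presumably, the timeout is then doubled to
40 seconds, which exceeds the 30-second alarm interval visible above,
so every kevent call is interrupted by SIGALRM and restarted before
it can time out.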
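
The arithmetic behind this hang can be modelled in a few lines of C.
This is only a toy simulation, not syslogd or resolver code:
simulate_wait(), struct sim_result, and the whole-second clock are
made up for illustration, and it assumes the wait begins right after
an alarm has fired, as in the trace.

```c
#include <assert.h>

/* Outcome of one simulated wait (hypothetical type, for illustration). */
struct sim_result {
	int  completed;   /* 1 if the timeout was allowed to expire */
	long elapsed;     /* simulated seconds spent waiting */
	int  interrupts;  /* number of simulated EINTR returns */
};

/*
 * Simulate a blocking wait of `timeout` seconds while a signal arrives
 * every `alarm_period` seconds.  If use_deadline is 0, each retry after
 * an interruption starts over with the full timeout.  If it is 1, the
 * retry only waits for the time left until a fixed deadline.
 * The simulation gives up once `give_up_after` seconds have passed.
 */
static struct sim_result
simulate_wait(long timeout, long alarm_period, int use_deadline,
    long give_up_after)
{
	struct sim_result r = { 0, 0, 0 };
	long now = 0;
	long next_alarm = alarm_period;
	long deadline = timeout;
	long remaining = timeout;

	assert(timeout > 0 && alarm_period > 0);

	while (now < give_up_after) {
		long wake = now + remaining;

		if (wake <= next_alarm) {
			/* The timeout expires before the next signal. */
			now = wake;
			r.completed = 1;
			break;
		}
		/* The signal arrives first: the wait returns EINTR. */
		now = next_alarm;
		next_alarm += alarm_period;
		r.interrupts++;
		remaining = use_deadline ? deadline - now : timeout;
	}
	r.elapsed = now;
	return (r);
}
```

With a timeout of 40 and an alarm period of 30, the mode that restarts
with the full timeout never completes, while the deadline mode
finishes after 40 seconds with one interruption.  With a timeout of
20, either mode completes without being interrupted.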

Fix: This patch is similar to the one committed in version 1.32 of
res_send.c.  It should apply cleanly against 4.2-RELEASE and
-CURRENT, but I have only tested it with the former.
How-To-Repeat: 
Run syslogd with heavy traffic from remote clients for several days
in an environment in which DNS times out occasionally.  Using -a
with domain wildcards several times may aggravate the problem.
Comment 1 Will Andrews freebsd_committer freebsd_triage 2001-06-04 19:21:48 UTC
Responsible Changed
From-To: freebsd-bugs->jlemon

Jonathan, you said in private mail that you would look at this. 
That was about a month or 6 weeks ago, and this bug still exists 
and seems to be in dire need of fixing.
Comment 2 jlemon freebsd_committer freebsd_triage 2001-07-25 19:17:36 UTC
State Changed
From-To: open->closed

Fixed in -stable as of today.