Bug 192298 - net/dhcprelya consumes cpu on FreeBSD 10
Summary: net/dhcprelya consumes cpu on FreeBSD 10
Status: Closed FIXED
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: Sergey Matveychuk
Reported: 2014-07-31 22:24 UTC by Adam McDougall
Modified: 2015-08-18 16:25 UTC

Attachments
increase timeout from 10 usec to 1ms (402 bytes, patch)
2014-12-28 19:05 UTC, lenzi.sergio

Description Adam McDougall 2014-07-31 22:24:11 UTC
I've been using dhcprelya for a while because it works well with tun0 and openvpn.  After upgrading one such server to FreeBSD 10, I noticed that the load average stays around 0.70, dhcprelya uses approx 15% CPU, and it makes the same select() call over and over (see the truss output below).  systat -vmstat 1 shows 10-30 thousand interrupts per second on the cpu timer with kern.eventtimer.timer=LAPIC (default), or 2 thousand per second with i8254 (lower CPU usage, but same load average).  The issue is easy to reproduce: install dhcprelya on 10 and enable it on a spare interface with a dummy IP (no link necessary).  I am willing to help figure this out if I can get some help.  Thanks.

rc.conf:
dhcprelya_enable="YES"
dhcprelya_servers="192.168.0.10"
dhcprelya_ifaces="em3"


truss -p shows:
select(7,{6},0x0,0x0,{0.000010 })		 = 0 (0x0)
select(7,{6},0x0,0x0,{0.000010 })		 = 0 (0x0)
select(7,{6},0x0,0x0,{0.000010 })		 = 0 (0x0)
select(7,{6},0x0,0x0,{0.000010 })		 = 0 (0x0)
... (many)
select(7,{6},0x0,0x0,{0.000010 })		 = 0 (0x0)
select(7,{6},0x0,0x0,{0.000010 })		 = 0 (0x0)
read(5,0x80185b000,524288)			 = 0 (0x0)
select(7,{6},0x0,0x0,{0.000010 })		 = 0 (0x0)
select(7,{6},0x0,0x0,{0.000010 })		 = 0 (0x0)
select(7,{6},0x0,0x0,{0.000010 })		 = 0 (0x0)
nanosleep({0.001000000 })			 = 0 (0x0)
select(7,{6},0x0,0x0,{0.000010 })		 = 0 (0x0)
...

kdump says:
  1086 dhcprelya RET   select 0
  1086 dhcprelya CALL  select(0x7,0x7fffffffe9e0,0,0,0x7fffffffea90)
  1086 dhcprelya RET   select 0
  1086 dhcprelya CALL  select(0x7,0x7fffffffe9e0,0,0,0x7fffffffea90)
  1086 dhcprelya RET   select 0
  1086 dhcprelya CALL  select(0x7,0x7fffffffe9e0,0,0,0x7fffffffea90)
...

procstat -k 1086 flips between:
  PID    TID COMM             TDNAME           KSTACK                       
 1086 100395 dhcprelya        -                <running>                    
 1086 100396 dhcprelya        -                mi_switch sleepq_catch_signals sleepq_timedwait_sig _sleep bpfread devfs_read_f dofileread kern_readv sys_read amd64_syscall Xfast_syscall 

and

  PID    TID COMM             TDNAME           KSTACK                       
 1086 100395 dhcprelya        -                mi_switch sleepq_catch_signals sleepq_timedwait_sig _cv_timedwait_sig_sbt seltdwait kern_select sys_select amd64_syscall Xfast_syscall 
 1086 100396 dhcprelya        -                mi_switch sleepq_catch_signals sleepq_timedwait_sig _sleep bpfread devfs_read_f dofileread kern_readv sys_read amd64_syscall Xfast_syscall
Comment 1 John Marino freebsd_committer freebsd_triage 2014-08-01 11:28:17 UTC
Over to maintainer.
Comment 2 Sergey Matveychuk freebsd_committer freebsd_triage 2014-08-01 13:40:04 UTC
(In reply to mcdouga9 from comment #0)

But what is the actual rate of DHCP packets?
Comment 3 Adam McDougall 2014-08-01 15:50:41 UTC
This is with zero DHCP requests: I can reproduce the issue with no active network connections (cables unplugged).  So far I can reproduce it on systems with 'em' NICs (82574L and 82546EB), but I could not reproduce it in Xen with xn0 or with re0.  I verified there is no problem on FreeBSD 9.
Comment 4 lenzi.sergio 2014-12-28 19:05:47 UTC
Created attachment 151043
increase timeout from 10 usec to 1ms

Increase the select timeout from 10 usec to 1000 usec.
This reduces the number of system calls in the select loop to 1/100 of the original,
so the program no longer consumes CPU when idle.


It worked for me...
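
To make the arithmetic behind the timeout change concrete, here is a minimal C sketch. It is not taken from the dhcprelya source: `idle_wakeups_per_sec` and `poll_once` are hypothetical names used only for illustration. The first computes an upper bound on how often an idle select() loop returns per second for a given timeout; the second shows a select() call of the same shape as the one in the truss output. The real rate is lower than the bound, since each call takes time and the kernel timer has finite granularity.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Upper bound on idle wakeups per second for a loop that calls
 * select() with a timeout of timeout_usec microseconds.
 * Hypothetical helper, not part of dhcprelya. */
long idle_wakeups_per_sec(long timeout_usec)
{
    assert(timeout_usec > 0);
    return 1000000L / timeout_usec;
}

/* Wait for fd to become readable for at most timeout_usec microseconds.
 * Returns select()'s result: 0 on timeout, 1 if readable, -1 on error. */
int poll_once(int fd, long timeout_usec)
{
    fd_set rfds;
    struct timeval tv;

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = timeout_usec / 1000000L;
    tv.tv_usec = timeout_usec % 1000000L;
    return select(fd + 1, &rfds, NULL, NULL, &tv);
}
```

With a 10 usec timeout the bound is 100000 wakeups per second; with 1000 usec it is 1000, which is the 1/100 reduction described above.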
Comment 5 lenzi.sergio 2014-12-28 19:07:40 UTC
Put the attachment in the files/ directory of the port, with a name like patch-xxxx,
then rebuild and reinstall the port.
Comment 6 Adam McDougall 2015-01-05 15:48:31 UTC
Thank you for the patch. It greatly reduces the CPU load but does not completely eliminate it on an idle network:

  PID USERNAME    THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
 1295 root          2  20    0 23212K  2704K bpf     0   0:01   0.29% dhcprelya

I'll use the patch for now but I think the bug report should stay open for a fuller fix.
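
One possible direction for a fuller fix, offered only as a sketch and not as what the port eventually adopted: pass a NULL timeout to select() so the thread sleeps until the descriptor is actually readable, which removes idle wakeups entirely. `wait_readable` is a hypothetical name. Whether dhcprelya's loop can block like this depends on what else that loop has to do between calls, which this report does not show.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Block until fd is readable. A NULL timeout means select() does not
 * return on its own while idle. Returns 1 when fd is readable and -1 on
 * error (for example EINTR if a signal arrives).
 * Hypothetical helper, not part of dhcprelya. */
int wait_readable(int fd)
{
    fd_set rfds;

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    return select(fd + 1, &rfds, NULL, NULL, NULL);
}
```

A caller would typically loop on `wait_readable(fd)` and retry when it returns -1 with errno set to EINTR.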