Bug 100519 - [netisr] suggestion to fix suboptimal network polling
Summary: [netisr] suggestion to fix suboptimal network polling
Status: Closed Overcome By Events
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: Unspecified
Hardware: Any Any
: Normal Affects Only Me
Assignee: freebsd-net (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-07-19 09:00 UTC by Arthur Hartwig
Modified: 2017-07-11 13:58 UTC (History)
1 user (show)

See Also:


Attachments
schednetisr1.diff (1.07 KB, patch)
2007-02-03 04:29 UTC, Bruce M Simpson
no flags Details | Diff
netisr.diff (2.32 KB, patch)
2007-02-05 01:06 UTC, Bruce M Simpson
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Arthur Hartwig 2006-07-19 09:00:34 UTC
Network polling has unnecessary calls to the scheduler. These require
acquiring the sched lock which imposes a variable delay depending on
contention for this lock.

in netisr_pollmore() in kern/kern_poll.c there are two calls to
schednetisrbits().  schednetisrbits is defined in net/netisr.h to set
some bits in netisr and call legacy_setsoftnet(). legacy_setsoftnet() in
net/netisr.c calls swi_sched(). swi_sched() in kern/kern_intr.c which
calls ithread_schedule() in the same file. ithread_schedule() acquires
and releases the sched_lock.

Fix: 

Since the netisr is running when netisr_pollmore() is executing and
swi_net() the main netisr despatcher loops until netisr is zero, it is
sufficient in netisr_pollmore() to just set the bits in netisr and not
also call legacy_setsoftnet():

replace the two instances of:
    schednetisrbits(1 << NETISR_POLL | 1 << NETISR_POLLMORE);

in netisr_pollmore() by:
    atomic_set_rel_int(&netisr, (1 << NETISR_POLL | 1 << NETISR_POLLMORE));
Comment 1 Gleb Smirnoff freebsd_committer freebsd_triage 2006-08-11 14:35:10 UTC
  Arthur,

On Wed, Jul 19, 2006 at 07:49:59AM +0000, Arthur Hartwig wrote:
A> >Fix:
A> Since the netisr is running when netisr_pollmore() is executing and swi_net() the main netisr despatcher loops until netisr is zero, it is sufficient in netisr_pollmore() to just set the bits in netisr and not also call legacy_setsoftnet():
A> 
A> replace the two instances of:
A>     schednetisrbits(1 << NETISR_POLL | 1 << NETISR_POLLMORE);
A> 
A> in netisr_pollmore() by:
A>     atomic_set_rel_int(&netisr, (1 << NETISR_POLL | 1 << NETISR_POLLMORE));

Hmm, interesting. Have you done any profiling?

-- 
Totus tuus, Glebius.
GLEBIUS-RIPN GLEB-RIPE
Comment 2 Gleb Smirnoff freebsd_committer freebsd_triage 2006-09-28 18:37:43 UTC
  attach this to PR, so that it won't be lost in
my mailbox.

----- Forwarded message from Arthur Hartwig <Arthur.Hartwig@nokia.com> -----

>Hmm, interesting. Have you done any profiling?

G'day Gleb,
   My work was then largely concerned with fast forwarding performance 
between two em interfaces on FreeBSD 6.0 and in particular with trying 
to get some performance improvement at high rates by using additional 
CPUs. I did some timing using the tsc and found that the em driver in 
polling mode was using roughly roughly half the time spent in 
ether_input() to the point of putting a frame on the output queue. So I 
thought if I ran the polling code on one CPU and ether_input() on 
another CPU I would get an observable improvement. This turned out to be 
the case.

   At some stage I noticed that vmstat showed lots of interrupts even 
though I was using polling mode.  I noticed v_intr was incremented in 
swi_sched() and intr_execute_handlers(). Another system not using 
polling mode didn't show the same high interrupt rate. After making the 
described change  I found the interrupt rate reported by vmstat dropped 
significantly.  The change didn't result in a significant performance 
improvement but clearly reduce the overhead in the polling thread, thus 
making more CPU time available to the system.

Arthur



   I found interrupt driven fast forwarding performed significantly 
worse on a dual CPU system than on a single CPU system. Some profiling 
showed a lot of contention for the routing table locks, sometimes with 
significant delays between a thread attempting to acquire a lock and its 
actual acquisition. Changing to polling mode seemed as if would help 
reduce lock contention and so improve performance. A number of tweaks 
combined to get noloss fast forwarding performance between two GigE 
interfaces on a dual CPU system up to about what it was on a single CPU 
interrupt driven system

Arthur

----- End forwarded message -----
Comment 3 Bruce M Simpson freebsd_committer freebsd_triage 2006-09-28 18:49:11 UTC
Responsible Changed
From-To: freebsd-bugs->bms

I'll take this for the sake of convenience...
Comment 4 Bruce M Simpson 2007-02-03 04:29:37 UTC
For convenience, I've attached a diff against bleeding edge HEAD.

When I put fxp0 into polling mode with this patch, network operations 
appear to wedge.

anglepoise# sysctl debug.mpsafenet
debug.mpsafenet: 1
anglepoise# uname -a
FreeBSD anglepoise.lon.incunabulum.net 7.0-CURRENT FreeBSD 7.0-CURRENT 
#19: Sat Feb  3 04:20:32 GMT 2007     
bms@anglepoise.lon.incunabulum.net:/usr/obj/usr/home/bms/head/src/sys/ANGLEPOISE7  
amd64
anglepoise# ifconfig fxp0
fxp0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        options=48<VLAN_MTU,POLLING>
        inet 192.168.123.17 netmask 0xffffff00 broadcast 192.168.123.255
        ether 00:90:27:59:40:2c
        media: Ethernet autoselect (10baseT/UTP)
        status: active
anglepoise# ifconfig fxp0 -polling

 -> network operations resume
Comment 5 Bruce M Simpson 2007-02-05 01:06:06 UTC
I tried the attached patch. With SWI_DELAY, nothing happens and
traffic stalls. Without it, it's business as usual.

Of course, with !SWI_DELAY. this will force a call to
intr_event_schedule_thread(), making no real difference.

I should point out all my testing was done on a uniprocessor
amd64 system with an fxp card, though debug.mpsafenet is 1.
Comment 6 Luigi Rizzo 2007-02-05 07:49:52 UTC
On Mon, Feb 05, 2007 at 01:06:06AM +0000, Bruce M Simpson wrote:
> I tried the attached patch. With SWI_DELAY, nothing happens and

[for the benefit of the archives]

hmmm... so either the analysis in the PR is not correct and
the call is done while the isr is not under service, or
there is something that is done in swi_sched which we are
not taking into account properly.

	cheers
	luigi

> traffic stalls. Without it, it's business as usual.
> 
> Of course, with !SWI_DELAY. this will force a call to
> intr_event_schedule_thread(), making no real difference.
> 
> I should point out all my testing was done on a uniprocessor
> amd64 system with an fxp card, though debug.mpsafenet is 1.
> 

> Index: src/sys/kern/kern_poll.c
> ===================================================================
> RCS file: /home/ncvs/src/sys/kern/kern_poll.c,v
> retrieving revision 1.28
> diff -u -p -r1.28 kern_poll.c
> --- src/sys/kern/kern_poll.c	6 Dec 2006 06:34:55 -0000	1.28
> +++ src/sys/kern/kern_poll.c	5 Feb 2007 00:50:30 -0000
> @@ -32,6 +32,8 @@ __FBSDID("$FreeBSD: src/sys/kern/kern_po
>  
>  #include <sys/param.h>
>  #include <sys/systm.h>
> +#include <sys/bus.h>
> +#include <sys/interrupt.h>
>  #include <sys/kernel.h>
>  #include <sys/socket.h>			/* needed by net/if.h		*/
>  #include <sys/sockio.h>
> @@ -314,7 +316,10 @@ hardclock_device_poll(void)
>  		if (phase != 0)
>  			suspect++;
>  		phase = 1;
> -		schednetisrbits(1 << NETISR_POLL | 1 << NETISR_POLLMORE);
> +		/* optimize out unnecessary calls to the scheduler. */
> +		atomic_set_rel_int(&netisr,
> +		    (1 << NETISR_POLL | 1 << NETISR_POLLMORE));
> +		swi_sched(net_ih, SWI_DELAY);
>  		phase = 2;
>  	}
>  	if (pending_polls++ > 0)
> @@ -371,7 +376,10 @@ netisr_pollmore()
>  	mtx_lock(&poll_mtx);
>  	phase = 5;
>  	if (residual_burst > 0) {
> -		schednetisrbits(1 << NETISR_POLL | 1 << NETISR_POLLMORE);
> +		/* optimize out unnecessary calls to the scheduler. */
> +		atomic_set_rel_int(&netisr,
> +		    (1 << NETISR_POLL | 1 << NETISR_POLLMORE));
> +		swi_sched(net_ih, SWI_DELAY);
>  		mtx_unlock(&poll_mtx);
>  		/* will run immediately on return, followed by netisrs */
>  		return;
> Index: src/sys/net/netisr.h
> ===================================================================
> RCS file: /home/ncvs/src/sys/net/netisr.h,v
> retrieving revision 1.33
> diff -u -p -r1.33 netisr.h
> --- src/sys/net/netisr.h	7 Jan 2005 01:45:35 -0000	1.33
> +++ src/sys/net/netisr.h	5 Feb 2007 00:50:30 -0000
> @@ -69,6 +69,7 @@
>  #ifdef _KERNEL
>  
>  void legacy_setsoftnet(void);
> +extern void *net_ih;
>  
>  extern volatile unsigned int	netisr;	/* scheduling bits for network */
>  #define	schednetisr(anisr) do {						\
> Index: src/sys/net/netisr.c
> ===================================================================
> RCS file: /home/ncvs/src/sys/net/netisr.c,v
> retrieving revision 1.18
> diff -u -p -r1.18 netisr.c
> --- src/sys/net/netisr.c	28 Nov 2006 11:19:36 -0000	1.18
> +++ src/sys/net/netisr.c	5 Feb 2007 00:50:30 -0000
> @@ -82,7 +82,7 @@ struct netisr {
>  	int		ni_flags;
>  } netisrs[32];
>  
> -static void *net_ih;
> +void *net_ih;
>  
>  /*
>   * Not all network code is currently capable of running MPSAFE; however,
Comment 7 Bruce M Simpson freebsd_committer freebsd_triage 2007-02-05 11:33:28 UTC
State Changed
From-To: open->feedback

Think we're back to the drawing board here, at least for the current 
code base...
Comment 8 Arthur Hartwig 2007-02-05 21:31:32 UTC
I haven't yet looked at the 7.0 code, but in the 6.0 code the call to 
schednetisrbits() in hardclock_device_poll() was necessary to get the 
netisr thread running.  But in netisr_pollmore() the call to the 
scheduler was unnecessary because the netisr was already running. The 
patch by bms did rather more than I suggested (/replace two instance . . 
. in netisrpollmore()/) in that it it removed the call to 
schednetisrbits() in hardclock_device_poll()

I'm about to go to work. I try to take a look at the 7.0 code later today.

Arthur
Comment 9 Arthur Hartwig 2007-02-06 01:35:17 UTC
I've looked at the HEAD version of kern_poll.c and it seems similar 
enough to 6.0 version.
The call to schednetisrbits() in hardclock device_poll() needs to remain 
to ensure the netisr is scheduled to run. However, in netisr_pollmore() 
the two calls to schednetisrbits() can be replaced as I originally 
described.

My reading of the diffs is that you replaced the call to 
schednetisrbits() in hardclock_device_poll() (which I didn't suggest) 
and replaced only one of the two calls to schednetisrbits() in 
netisr_pollmore() (which is only part of what I suggested.)

Arthur
Comment 10 Bruce M Simpson freebsd_committer freebsd_triage 2007-02-25 16:18:13 UTC
State Changed
From-To: feedback->open

Back to the net pool 


Comment 11 Bruce M Simpson freebsd_committer freebsd_triage 2007-02-25 16:18:13 UTC
Responsible Changed
From-To: bms->net

Back to the net pool
Comment 12 Robert Watson freebsd_committer freebsd_triage 2007-02-26 11:55:47 UTC
Responsible Changed
From-To: net->freebsd-net

Assign to freebsd-net instead of net, since that's the more usual name 
for net@ assignments.