Bug 256375 - iflib/if_em: unplugging network cable causes huge KTorrent slowdown
Summary: iflib/if_em: unplugging network cable causes huge KTorrent slowdown
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: CURRENT
Hardware: Any Any
: --- Affects Only Me
Assignee: freebsd-net (Nobody)
Keywords: IntelNetworking, iflib, performance
Depends on:
Reported: 2021-06-02 11:57 UTC by Alexey Dokuchaev
Modified: 2021-09-15 18:22 UTC (History)
5 users (show)

See Also:

procstat -kk output (3.95 KB, text/plain)
2021-06-02 11:57 UTC, Alexey Dokuchaev
no flags Details
dtrace output, cable plugged (1.97 KB, text/plain)
2021-06-03 04:10 UTC, Alexey Dokuchaev
no flags Details
dtrace output, cable unplugged (1.46 KB, text/plain)
2021-06-03 04:25 UTC, Alexey Dokuchaev
no flags Details
dtrace without iflib_admin_intr_deferred() call (1.32 KB, text/plain)
2021-06-03 11:56 UTC, Alexey Dokuchaev
no flags Details
ifmedia status test program (1.46 KB, text/plain)
2021-06-04 10:49 UTC, Aleksandr Fedorov
no flags Details
RELEASE-13 stock iflib driver log (925.71 KB, text/plain)
2021-06-04 12:02 UTC, Aleksandr Fedorov
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Alexey Dokuchaev freebsd_committer 2021-06-02 11:57:27 UTC
Created attachment 225494 [details]
procstat -kk output

I'm experiencing strange and very prominent performance loss with KTorrent when unplugging Ethernet cable from em0, that is, switch to WiFi: it now takes several tens seconds for it to update the window contents, react to mouse clicks, and transfer speeds drop down to single bytes per second.  At this time, the ktorrent process, as displayed by top(1), is in "iflib" state which occasionally briefly changes to "e1000_".

But once I plug the network cable back, everything goes back to normal, and the process returns to "select" state like it should.  This happens on Lenovo L470 laptop with the following NIC:

em0@pci0:0:31:6:        class=0x020000 rev=0x21 hdr=0x00 vendor=0x8086 device=0x15d8 subvendor=0x17aa subdevice=0x505f
    vendor     = 'Intel Corporation'
    device     = 'Ethernet Connection (4) I219-V'
    class      = network
    subclass   = ethernet

Both today's kernel and my normal Jan/Feb'ish kernel behave identically.  The driver from `net/intel-em-kmod' port does not exhibit this problem.  I'm attaching procstat -kk $ktorrent_pid output, in case it could be useful.  This line looks suspicious as if something is holding the lock:

mi_switch+0xc1 _sx_xlock_hard+0x3d1 iflib_media_status+0xf2 ifmedia_ioctl+0x16a ifhwioctl+0x2bd ifioctl+0x48d kern_ioctl+0x23b sys_ioctl+0xf6 amd64_syscall+0x100 fast_syscall_common+0xf8

Please advise on any other debug information I could provide which may help to track the bug down and fix it.
Comment 1 Eric Joyner freebsd_committer 2021-06-02 20:15:52 UTC
It would be nice to figure out what exactly that "e1000_" refers to. Without looking, I have a hunch that it refers to "e1000_delay".

Is ktorrent trying to query the link status of connections? I don't know what syscall it's making that would lead to iflib_media_status() being called.
Comment 2 Alexey Dokuchaev freebsd_committer 2021-06-03 04:10:50 UTC
Created attachment 225515 [details]
dtrace output, cable plugged
Comment 3 Alexey Dokuchaev freebsd_committer 2021-06-03 04:25:37 UTC
Created attachment 225516 [details]
dtrace output, cable unplugged

Alexander suggested I run the following DTrace script, with and without the cable, to see if the problem has something to do with the lock or rather em_if_update_admin_status():

dtrace -n 'fbt::iflib_media_status:entry { self->ts = timestamp; } fbt::iflib_media_status:return /self->ts/ {@[execname, "ns"] = quantize(timestamp - self->ts); self->ts = 0; }'

So, two things: 1) KTorrent is not required to reproduce the bug (good), ifconfig(8) is also slowed down, these is actually a visible delay of its console output when I run without the cable; and 2) apparently em_if_update_admin_status() is bogusly and repeatedly checking hw link status despite, as Alexander had mentioned, that there actually is a link status field which is being checked in other functions.
Comment 4 Alexey Dokuchaev freebsd_committer 2021-06-03 05:27:49 UTC
(In reply to Eric Joyner from comment #1)
> It would be nice to figure out what exactly that "e1000_" refers to.
I've patched those strings to make them pinpoint exact source code location:

"e1000_" refers to sys/dev/e1000/e1000_osdep.h:97:pause("e1000_delay", ms_scale(x));

"iflib" refers to sys/net/iflib.c:531:#define CTX_LOCK_INIT(_sc)  sx_init(&(_sc)->ifc_ctx_sx, "iflib ctx lock")
Comment 5 Alexey Dokuchaev freebsd_committer 2021-06-03 06:47:23 UTC
(In reply to Alexey Dokuchaev from comment #4)
Sorry, I replied too soon.  Both safe_pause_ms() and safe_pause_us() are called.
Comment 6 Alexey Dokuchaev freebsd_committer 2021-06-03 11:56:29 UTC
Created attachment 225522 [details]
dtrace without iflib_admin_intr_deferred() call

Alexander suggested commenting out iflib_admin_intr_deferred() call in sys/dev/e1000/if_em.c like this:

> @@ -1564,7 +1564,7 @@ em_if_media_status(if_ctx_t ctx, struct ifmediareq *ifmr)
>  	INIT_DEBUGOUT("em_if_media_status: begin");
> -	iflib_admin_intr_deferred(ctx);
> +	//iflib_admin_intr_deferred(ctx);
>  	ifmr->ifm_status = IFM_AVALID;
>  	ifmr->ifm_active = IFM_ETHER;
This had caused minor, yet noticeable improvement in responsiveness: KTorrent now stays mostly in "select" state with occasional "e1000_" and rare "iflib" during which it's still stuck for several (vs several tens before) seconds, but can nonetheless perform transfers again at reasonable speeds.  However, it's still not as good as once I "kldunload if_em".
Comment 7 Eric Joyner freebsd_committer 2021-06-03 15:14:03 UTC
(In reply to Alexey Dokuchaev from comment #6)

That's something I would've suggested, too. The iflib_media_status() function already calls the driver's admin status function before calling the driver's media status function, so I don't think it's strictly necessary to call the admin task again (and thereby call the driver's admin status function more-or-less immediately afterwards!).

I need to actually check, but the admin status function is probably being called every half second, and maybe something in there that checks link status is taking too long, especially when there is no media?
Comment 8 Aleksandr Fedorov freebsd_committer 2021-06-04 10:49:44 UTC
Created attachment 225542 [details]
ifmedia status test program
Comment 9 Aleksandr Fedorov freebsd_committer 2021-06-04 12:01:20 UTC
I wrote a quick and dirty program to test getting link status in a loop. See attachments.

To start:
c++ ifmediatst.cpp -o ifmediatst
./ifmediatst em0

Also I found Notebook (DELL Latitude 5280) with I219-LM network card.

Some tests output. RELASE-13.0 with iflib based driver:
./ifmediatst em0

iteration: 16631 state: active, duration: 0.146374 ms
iteration: 16632 state: active, duration: 0.148981 ms
iteration: 16633 state: active, duration: 0.149550 ms
iteration: 16634 state: active, duration: 0.147517 ms
iteration: 16635 state: no carrier, duration: 111.931014 ms
iteration: 16636 state: no carrier, duration: 224.454816 ms
iteration: 16637 state: no carrier, duration: 111.525601 ms
iteration: 16638 state: no carrier, duration: 111.994844 ms
iteration: 16639 state: no carrier, duration: 111.995998 ms
iteration: 16640 state: no carrier, duration: 111.996868 ms
iteration: 16641 state: no carrier, duration: 111.992766 ms
iteration: 16642 state: no carrier, duration: 111.994449 ms
iteration: 16643 state: no carrier, duration: 111.996805 ms
iteration: 16644 state: no carrier, duration: 111.979432 ms
iteration: 16645 state: no carrier, duration: 111.992068 ms

RELEASE-13.0 with removed iflib_admin_intr_deferred(ctx) from em_if_media_status():
./ifmediatst em0
iteration: 7412 state: active, duration: 0.048852 ms
iteration: 7413 state: active, duration: 0.047400 ms
iteration: 7414 state: active, duration: 0.046978 ms
iteration: 7415 state: active, duration: 0.048232 ms
iteration: 7416 state: active, duration: 0.047024 ms
iteration: 7417 state: active, duration: 0.046711 ms
iteration: 7418 state: active, duration: 0.048584 ms
iteration: 7419 state: active, duration: 0.048262 ms
iteration: 7420 state: no carrier, duration: 55.209966 ms
iteration: 7421 state: no carrier, duration: 55.980619 ms
iteration: 7422 state: no carrier, duration: 55.990572 ms
iteration: 7423 state: no carrier, duration: 55.990879 ms
iteration: 7424 state: no carrier, duration: 55.991856 ms
iteration: 7425 state: no carrier, duration: 55.995035 ms
iteration: 7426 state: no carrier, duration: 111.989782 ms
iteration: 7427 state: no carrier, duration: 55.992155 ms
iteration: 7428 state: no carrier, duration: 55.982886 ms
iteration: 7429 state: no carrier, duration: 55.988871 ms
iteration: 7430 state: no carrier, duration: 55.990932 ms
iteration: 7431 state: no carrier, duration: 55.992620 ms
iteration: 7432 state: no carrier, duration: 55.990883 ms
iteration: 7433 state: no carrier, duration: 55.992068 ms
iteration: 7434 state: no carrier, duration: 55.992049 ms
iteration: 7435 state: no carrier, duration: 55.994219 ms
iteration: 7436 state: no carrier, duration: 111.993054 ms
iteration: 7437 state: no carrier, duration: 55.988053 ms
iteration: 7438 state: no carrier, duration: 55.992884 ms
iteration: 7439 state: no carrier, duration: 55.990556 ms
iteration: 7440 state: no carrier, duration: 55.991261 ms
iteration: 7441 state: no carrier, duration: 55.993390 ms
iteration: 7442 state: no carrier, duration: 55.991350 ms
iteration: 7443 state: no carrier, duration: 55.990535 ms
iteration: 7444 state: no carrier, duration: 52.996427 ms
iteration: 7445 state: no carrier, duration: 55.971019 ms
iteration: 7446 state: no carrier, duration: 111.987514 ms
iteration: 7447 state: no carrier, duration: 55.990414 ms

As you can see, active state: 0.1 ms -> 0.04 ms, no carrier: 111 ms -> 55 ms. Periodically, in the absence of a carrier, the value reaches 111 ms. I think this is the influence of the timer that calls em_if_update_admin_status().

RELEASE-13.0 with driver from net/intel-em-kmod:
./ifmediatst em0
iteration: 276057 state: active, duration: 0.000437 ms
iteration: 276058 state: active, duration: 0.000438 ms
iteration: 276059 state: active, duration: 0.000440 ms
iteration: 276060 state: active, duration: 0.000435 ms
iteration: 276061 state: no carrier, duration: 50.117849 ms
iteration: 276062 state: no carrier, duration: 50.113035 ms
iteration: 276063 state: no carrier, duration: 50.109604 ms
iteration: 276064 state: no carrier, duration: 50.110543 ms
iteration: 276065 state: no carrier, duration: 50.109672 ms
iteration: 276066 state: no carrier, duration: 50.109586 ms
iteration: 276067 state: no carrier, duration: 50.109930 ms
iteration: 276068 state: no carrier, duration: 50.113659 ms
iteration: 276069 state: no carrier, duration: 100.288162 ms
iteration: 276070 state: no carrier, duration: 50.109680 ms
iteration: 276071 state: no carrier, duration: 50.110047 ms
iteration: 276072 state: no carrier, duration: 50.110437 ms
iteration: 276122 state: no carrier, duration: 50.111246 ms
iteration: 276123 state: no carrier, duration: 50.110519 ms
iteration: 276124 state: no carrier, duration: 50.110134 ms
iteration: 276125 state: no carrier, duration: 100.281411 ms
iteration: 276126 state: no carrier, duration: 50.111783 ms
iteration: 276127 state: no carrier, duration: 50.110226 ms
iteration: 276128 state: no carrier, duration: 50.112179 ms
iteration: 276153 state: no carrier, duration: 50.112264 ms
iteration: 276154 state: no carrier, duration: 50.111985 ms
iteration: 276155 state: no carrier, duration: 50.113115 ms
iteration: 276156 state: no carrier, duration: 50.112568 ms
iteration: 276157 state: no carrier, duration: 100.300982 ms
iteration: 276158 state: no carrier, duration: 50.109478 ms
iteration: 276159 state: no carrier, duration: 50.109434 ms
iteration: 276160 state: no carrier, duration: 50.111644 ms
iteration: 276161 state: no carrier, duration: 50.109683 ms

A port tree driver is significantly faster in active state. Without a carrier, it's about the same: 55 vs 50 ms.
Also, the port tree driver has a longer timer period.

Alexey, may you perform similar tests with ifmediatst utility?
Comment 10 Aleksandr Fedorov freebsd_committer 2021-06-04 12:02:34 UTC
Created attachment 225543 [details]
RELEASE-13 stock iflib driver log
Comment 11 Alexey Dokuchaev freebsd_committer 2021-06-04 14:22:50 UTC
(In reply to Aleksandr Fedorov from comment #9)
> Alexey, may you perform similar tests with ifmediatst utility?
My results mostly agree with yours, however, they are slightly higher and there are more outliers:

Active, ports driver
15688292 samples, 0.000440~52.404189, mean 0.000695, stdev 0.035022

Unplugged, ports driver
699 samples, 50.136990~351.355143, mean 67.735449, stdev 39.253407

Active, removed iflib_admin_intr_deferred()
117707 samples, 0.046730~160.303206, mean 0.098550, stdev 1.004555

Unplugged, removed iflib_admin_intr_deferred()
169 samples, 52.865603~497.009039, mean 75.350125, stdev 52.074932

Active, unpatched
40666 samples, 0.046684~185.143686, mean 0.170622, stdev 1.997476

Unplugged, unpatched
61 samples, 54.975691~439.951595, mean 135.355143, stdev 66.179309
Comment 12 Aleksandr Fedorov freebsd_committer 2021-06-04 15:01:07 UTC
Yes, the result is similar.

May you run the following DTrace script with plugged and unplugged cable?

dtrace -qn 'fbt::iflib_media_status:entry /execname == "ktorrent"/ {@a=count()} tick-1s {printa(@a); trunc(@a,0);}'

I'm wondering how many calls a ktorrent makes per second.
Comment 13 Alexey Dokuchaev freebsd_committer 2021-06-04 19:53:36 UTC
(In reply to Aleksandr Fedorov from comment #12)
> I'm wondering how many calls a ktorrent makes per second.
Unplugged: 661 samples, 1 ~ 11, mean 7.562784, stdev 2.144989

Plugged: 263 samples, 1 ~ 116, mean 11.958175, stdev 28.016662

These are with stock (unpatched) base driver.
Comment 14 Kevin Bowling freebsd_committer 2021-06-17 01:02:07 UTC
(In reply to Alexey Dokuchaev from comment #11)
I think it is probably ok to remove the 'iflib_admin_intr_deferred' line in 'em_if_media_status' but by your numbers it doesn't seem to be causal.

There are some tricky/unpleasant things about this hardware, if you go dig down in the e1000 API there is a significant loop with a significant delay in e1000_phy_has_link_generic.  It may make sense to do some memoization on low quality hw like this, and be more careful about enqueuing the admin task if it's already queued.