Created attachment 225494 [details] procstat -kk output I'm experiencing strange and very prominent performance loss with KTorrent when unplugging Ethernet cable from em0, that is, switch to WiFi: it now takes several tens seconds for it to update the window contents, react to mouse clicks, and transfer speeds drop down to single bytes per second. At this time, the ktorrent process, as displayed by top(1), is in "iflib" state which occasionally briefly changes to "e1000_". But once I plug the network cable back, everything goes back to normal, and the process returns to "select" state like it should. This happens on Lenovo L470 laptop with the following NIC: em0@pci0:0:31:6: class=0x020000 rev=0x21 hdr=0x00 vendor=0x8086 device=0x15d8 subvendor=0x17aa subdevice=0x505f vendor = 'Intel Corporation' device = 'Ethernet Connection (4) I219-V' class = network subclass = ethernet Both today's kernel and my normal Jan/Feb'ish kernel behave identically. The driver from `net/intel-em-kmod' port does not exhibit this problem. I'm attaching procstat -kk $ktorrent_pid output, in case it could be useful. This line looks suspicious as if something is holding the lock: mi_switch+0xc1 _sx_xlock_hard+0x3d1 iflib_media_status+0xf2 ifmedia_ioctl+0x16a ifhwioctl+0x2bd ifioctl+0x48d kern_ioctl+0x23b sys_ioctl+0xf6 amd64_syscall+0x100 fast_syscall_common+0xf8 Please advise on any other debug information I could provide which may help to track the bug down and fix it.
It would be nice to figure out what exactly that "e1000_" refers to. Without looking, I have a hunch that it refers to "e1000_delay". Is ktorrent trying to query the link status of connections? I don't know what syscall it's making that would lead to iflib_media_status() being called.
Created attachment 225515 [details] dtrace output, cable plugged
Created attachment 225516 [details] dtrace output, cable unplugged Alexander suggested I run the following DTrace script, with and without the cable, to see if the problem has something to do with the lock or rather em_if_update_admin_status(): dtrace -n 'fbt::iflib_media_status:entry { self->ts = timestamp; } fbt::iflib_media_status:return /self->ts/ {@[execname, "ns"] = quantize(timestamp - self->ts); self->ts = 0; }' So, two things: 1) KTorrent is not required to reproduce the bug (good), ifconfig(8) is also slowed down, these is actually a visible delay of its console output when I run without the cable; and 2) apparently em_if_update_admin_status() is bogusly and repeatedly checking hw link status despite, as Alexander had mentioned, that there actually is a link status field which is being checked in other functions.
(In reply to Eric Joyner from comment #1) > It would be nice to figure out what exactly that "e1000_" refers to. I've patched those strings to make them pinpoint exact source code location: "e1000_" refers to sys/dev/e1000/e1000_osdep.h:97:pause("e1000_delay", ms_scale(x)); "iflib" refers to sys/net/iflib.c:531:#define CTX_LOCK_INIT(_sc) sx_init(&(_sc)->ifc_ctx_sx, "iflib ctx lock")
(In reply to Alexey Dokuchaev from comment #4) Sorry, I replied too soon. Both safe_pause_ms() and safe_pause_us() are called.
Created attachment 225522 [details] dtrace without iflib_admin_intr_deferred() call Alexander suggested commenting out iflib_admin_intr_deferred() call in sys/dev/e1000/if_em.c like this: > @@ -1564,7 +1564,7 @@ em_if_media_status(if_ctx_t ctx, struct ifmediareq *ifmr) > > INIT_DEBUGOUT("em_if_media_status: begin"); > > - iflib_admin_intr_deferred(ctx); > + //iflib_admin_intr_deferred(ctx); > > ifmr->ifm_status = IFM_AVALID; > ifmr->ifm_active = IFM_ETHER; This had caused minor, yet noticeable improvement in responsiveness: KTorrent now stays mostly in "select" state with occasional "e1000_" and rare "iflib" during which it's still stuck for several (vs several tens before) seconds, but can nonetheless perform transfers again at reasonable speeds. However, it's still not as good as once I "kldunload if_em".
(In reply to Alexey Dokuchaev from comment #6) That's something I would've suggested, too. The iflib_media_status() function already calls the driver's admin status function before calling the driver's media status function, so I don't think it's strictly necessary to call the admin task again (and thereby call the driver's admin status function more-or-less immediately afterwards!). I need to actually check, but the admin status function is probably being called every half second, and maybe something in there that checks link status is taking too long, especially when there is no media?
Created attachment 225542 [details] ifmedia status test program
I wrote a quick and dirty program to test getting link status in a loop. See attachments. To start: c++ ifmediatst.cpp -o ifmediatst ./ifmediatst em0 Also I found Notebook (DELL Latitude 5280) with I219-LM network card. Some tests output. RELASE-13.0 with iflib based driver: ./ifmediatst em0 ... iteration: 16631 state: active, duration: 0.146374 ms iteration: 16632 state: active, duration: 0.148981 ms iteration: 16633 state: active, duration: 0.149550 ms iteration: 16634 state: active, duration: 0.147517 ms iteration: 16635 state: no carrier, duration: 111.931014 ms iteration: 16636 state: no carrier, duration: 224.454816 ms iteration: 16637 state: no carrier, duration: 111.525601 ms iteration: 16638 state: no carrier, duration: 111.994844 ms iteration: 16639 state: no carrier, duration: 111.995998 ms iteration: 16640 state: no carrier, duration: 111.996868 ms iteration: 16641 state: no carrier, duration: 111.992766 ms iteration: 16642 state: no carrier, duration: 111.994449 ms iteration: 16643 state: no carrier, duration: 111.996805 ms iteration: 16644 state: no carrier, duration: 111.979432 ms iteration: 16645 state: no carrier, duration: 111.992068 ms ... RELEASE-13.0 with removed iflib_admin_intr_deferred(ctx) from em_if_media_status(): ./ifmediatst em0 ... iteration: 7412 state: active, duration: 0.048852 ms iteration: 7413 state: active, duration: 0.047400 ms iteration: 7414 state: active, duration: 0.046978 ms iteration: 7415 state: active, duration: 0.048232 ms iteration: 7416 state: active, duration: 0.047024 ms iteration: 7417 state: active, duration: 0.046711 ms iteration: 7418 state: active, duration: 0.048584 ms iteration: 7419 state: active, duration: 0.048262 ms iteration: 7420 state: no carrier, duration: 55.209966 ms iteration: 7421 state: no carrier, duration: 55.980619 ms iteration: 7422 state: no carrier, duration: 55.990572 ms iteration: 7423 state: no carrier, duration: 55.990879 ms iteration: 7424 state: no carrier, duration: 55.991856 ms iteration: 7425 state: no carrier, duration: 55.995035 ms iteration: 7426 state: no carrier, duration: 111.989782 ms iteration: 7427 state: no carrier, duration: 55.992155 ms iteration: 7428 state: no carrier, duration: 55.982886 ms iteration: 7429 state: no carrier, duration: 55.988871 ms iteration: 7430 state: no carrier, duration: 55.990932 ms iteration: 7431 state: no carrier, duration: 55.992620 ms iteration: 7432 state: no carrier, duration: 55.990883 ms iteration: 7433 state: no carrier, duration: 55.992068 ms iteration: 7434 state: no carrier, duration: 55.992049 ms iteration: 7435 state: no carrier, duration: 55.994219 ms iteration: 7436 state: no carrier, duration: 111.993054 ms iteration: 7437 state: no carrier, duration: 55.988053 ms iteration: 7438 state: no carrier, duration: 55.992884 ms iteration: 7439 state: no carrier, duration: 55.990556 ms iteration: 7440 state: no carrier, duration: 55.991261 ms iteration: 7441 state: no carrier, duration: 55.993390 ms iteration: 7442 state: no carrier, duration: 55.991350 ms iteration: 7443 state: no carrier, duration: 55.990535 ms iteration: 7444 state: no carrier, duration: 52.996427 ms iteration: 7445 state: no carrier, duration: 55.971019 ms iteration: 7446 state: no carrier, duration: 111.987514 ms iteration: 7447 state: no carrier, duration: 55.990414 ms ... As you can see, active state: 0.1 ms -> 0.04 ms, no carrier: 111 ms -> 55 ms. Periodically, in the absence of a carrier, the value reaches 111 ms. I think this is the influence of the timer that calls em_if_update_admin_status(). RELEASE-13.0 with driver from net/intel-em-kmod: ./ifmediatst em0 ... iteration: 276057 state: active, duration: 0.000437 ms iteration: 276058 state: active, duration: 0.000438 ms iteration: 276059 state: active, duration: 0.000440 ms iteration: 276060 state: active, duration: 0.000435 ms iteration: 276061 state: no carrier, duration: 50.117849 ms iteration: 276062 state: no carrier, duration: 50.113035 ms iteration: 276063 state: no carrier, duration: 50.109604 ms iteration: 276064 state: no carrier, duration: 50.110543 ms iteration: 276065 state: no carrier, duration: 50.109672 ms iteration: 276066 state: no carrier, duration: 50.109586 ms iteration: 276067 state: no carrier, duration: 50.109930 ms iteration: 276068 state: no carrier, duration: 50.113659 ms iteration: 276069 state: no carrier, duration: 100.288162 ms iteration: 276070 state: no carrier, duration: 50.109680 ms iteration: 276071 state: no carrier, duration: 50.110047 ms iteration: 276072 state: no carrier, duration: 50.110437 ms ... iteration: 276122 state: no carrier, duration: 50.111246 ms iteration: 276123 state: no carrier, duration: 50.110519 ms iteration: 276124 state: no carrier, duration: 50.110134 ms iteration: 276125 state: no carrier, duration: 100.281411 ms iteration: 276126 state: no carrier, duration: 50.111783 ms iteration: 276127 state: no carrier, duration: 50.110226 ms iteration: 276128 state: no carrier, duration: 50.112179 ms ... iteration: 276153 state: no carrier, duration: 50.112264 ms iteration: 276154 state: no carrier, duration: 50.111985 ms iteration: 276155 state: no carrier, duration: 50.113115 ms iteration: 276156 state: no carrier, duration: 50.112568 ms iteration: 276157 state: no carrier, duration: 100.300982 ms iteration: 276158 state: no carrier, duration: 50.109478 ms iteration: 276159 state: no carrier, duration: 50.109434 ms iteration: 276160 state: no carrier, duration: 50.111644 ms iteration: 276161 state: no carrier, duration: 50.109683 ms ... A port tree driver is significantly faster in active state. Without a carrier, it's about the same: 55 vs 50 ms. Also, the port tree driver has a longer timer period. Alexey, may you perform similar tests with ifmediatst utility?
Created attachment 225543 [details] RELEASE-13 stock iflib driver log
(In reply to Aleksandr Fedorov from comment #9) > Alexey, may you perform similar tests with ifmediatst utility? My results mostly agree with yours, however, they are slightly higher and there are more outliers: Active, ports driver 15688292 samples, 0.000440~52.404189, mean 0.000695, stdev 0.035022 Unplugged, ports driver 699 samples, 50.136990~351.355143, mean 67.735449, stdev 39.253407 Active, removed iflib_admin_intr_deferred() 117707 samples, 0.046730~160.303206, mean 0.098550, stdev 1.004555 Unplugged, removed iflib_admin_intr_deferred() 169 samples, 52.865603~497.009039, mean 75.350125, stdev 52.074932 Active, unpatched 40666 samples, 0.046684~185.143686, mean 0.170622, stdev 1.997476 Unplugged, unpatched 61 samples, 54.975691~439.951595, mean 135.355143, stdev 66.179309
Yes, the result is similar. May you run the following DTrace script with plugged and unplugged cable? dtrace -qn 'fbt::iflib_media_status:entry /execname == "ktorrent"/ {@a=count()} tick-1s {printa(@a); trunc(@a,0);}' I'm wondering how many calls a ktorrent makes per second.
(In reply to Aleksandr Fedorov from comment #12) > I'm wondering how many calls a ktorrent makes per second. Unplugged: 661 samples, 1 ~ 11, mean 7.562784, stdev 2.144989 Plugged: 263 samples, 1 ~ 116, mean 11.958175, stdev 28.016662 These are with stock (unpatched) base driver.
(In reply to Alexey Dokuchaev from comment #11) I think it is probably ok to remove the 'iflib_admin_intr_deferred' line in 'em_if_media_status' but by your numbers it doesn't seem to be causal. There are some tricky/unpleasant things about this hardware, if you go dig down in the e1000 API there is a significant loop with a significant delay in e1000_phy_has_link_generic. It may make sense to do some memoization on low quality hw like this, and be more careful about enqueuing the admin task if it's already queued.