Bug 252165 - usb network and mii bus media status race condition
Summary: usb network and mii bus media status race condition
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: CURRENT
Hardware: Any Any
: --- Affects Many People
Assignee: freebsd-net (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-12-26 16:14 UTC by Ali Abdallah
Modified: 2022-09-28 10:13 UTC (History)
6 users

See Also:


Attachments
Workaround race in link status change (1.50 KB, patch)
2021-01-04 19:37 UTC, Ali Abdallah
no flags Details | Diff
dtrace ure usb traces (90.43 KB, text/plain)
2022-09-28 10:13 UTC, Ali Abdallah
no flags Details

Description Ali Abdallah 2020-12-26 16:14:32 UTC
I'm using if_ure with a USB-C docking station, and I'm hitting the following random issue:

Dec 25 12:20:10 Fryzen495 kernel: ue0: link state changed to UP
Dec 25 12:20:10 Fryzen495 kernel: ue0: link state changed to DOWN
Dec 25 12:20:10 Fryzen495 kernel: ue0: link state changed to UP
Dec 25 12:20:58 Fryzen495 kernel: ue0: link state changed to DOWN
Dec 25 12:21:01 Fryzen495 kernel: ue0: link state changed to UP
Dec 25 12:21:07 Fryzen495 dhclient[20388]: New IP Address (ue0): 192.168.1.145
Dec 25 12:21:07 Fryzen495 dhclient[21191]: New Subnet Mask (ue0): 255.255.255.0
Dec 25 12:21:07 Fryzen495 dhclient[22099]: New Broadcast Address (ue0): 192.168.1.255
Dec 25 12:21:07 Fryzen495 dhclient[22651]: New Routers (ue0): 192.168.1.254

After digging into the code, I've discovered that the issue is caused by the following race condition:

In mii/mii.c, the function miibus_linkchg(device_t dev) reads mii->mii_media_status and sets the link state accordingly. 

static void
miibus_linkchg(device_t dev)
{
...

   if (mii->mii_media_status & IFM_AVALID) {
       if (mii->mii_media_status & IFM_ACTIVE)
           link_state = LINK_STATE_UP;
       else
           link_state = LINK_STATE_DOWN;

...
}

On the other hand, the function mii_pollstat, running on another thread, polls the media status, which in my case goes through the rgephy.c code (via a PHY_SERVICE call).

static void
rgephy_status(struct mii_softc *sc)
{
   mii->mii_media_status = IFM_AVALID;
   mii->mii_media_active = IFM_ETHER;

   /* HERE IS THE ISSUE:
      The read in miibus_linkchg can randomly see mii->mii_media_status with
      IFM_AVALID set but IFM_ACTIVE not yet set, which makes it report a
      spurious link state change...
    */

   if (rgephy_linkup(sc) != 0)
      mii->mii_media_status |= IFM_ACTIVE;
   ...

This is an issue in all miibus modules, rgephy.c, truephy.c, mlphy.c, etc...

Note also that mii_pollstat initializes mii->mii_media_status to 0 (not sure why), which also races with miibus_linkchg.

There are many ways to fix this; locally, I'm not resetting mii->mii_media_status until rgephy_linkup(sc) reports link down.
Comment 1 Rodney W. Grimes freebsd_committer 2020-12-27 10:14:11 UTC
I believe I have seen this several times in the past, mostly when I am doing diskless boots via pxe/ipxe/nfs.
Comment 2 Mark Linimon freebsd_committer freebsd_triage 2020-12-27 12:20:28 UTC
^Triage: set to net@ as this affects miibus.
Comment 3 Ali Abdallah 2021-01-04 19:37:43 UTC
Created attachment 221274 [details]
Workaround race in link status change

(In reply to Rodney W. Grimes from comment #1)
The various mii modules are "terribly" SMP-broken: they all reset the media status to

mii->mii_media_status = IFM_AVALID; 

then call the hardware specific function to check the link and eventually set

mii->mii_media_status |= IFM_ACTIVE;

But during this window, a call to miibus_linkchg may trigger a link change, as it can see only the IFM_AVALID flag before IFM_ACTIVE has been set.

Attached is the workaround I'm currently applying to avoid the issue on my T495 with the ThinkPad USB-C Gen 2 docking station (my hardware uses rgephy.c). If you think this is the way to go, I can provide a full patch for the rest of the mii bus modules.
Comment 4 Hans Petter Selasky freebsd_committer 2021-01-07 14:47:42 UTC
Could you extend your patch to fix the other cases you see as well, and create a patch at https://reviews.freebsd.org/differential/ with full context?
Comment 5 Ali Abdallah 2021-01-08 07:19:18 UTC
(In reply to Hans Petter Selasky from comment #4)

Yes I'll do that.
Comment 6 Ali Abdallah 2021-01-11 12:30:07 UTC
I had a deeper look into the issue; applying the workaround described in comment 3 is definitely not the way to go, as the problem lies somewhere else.

Actually, the code paths that end up calling miibus_linkchg are mii_tick and mii_pollstat (through mii_phy_update). At first glance they appear to run without a lock, but in the USB-Ethernet-specific code (if_ure.c in my case) a common lock is acquired before those functions are called.

In ure_tick, the lock has already been acquired by the USB Ethernet code:

static void
ure_tick(struct usb_ether *ue)
....
URE_LOCK_ASSERT(sc, MA_OWNED); 
mii_tick(mii);
....

ure_ifmedia_sts is what polls for media status, and it acquires the same lock before calling mii_pollstat.

static void
ure_ifmedia_sts(struct ifnet *ifp, struct ifmediareq *ifmr)
...
URE_LOCK(sc);
mii_pollstat(mii);
...
URE_UNLOCK(sc);

What I'm seeing is that, despite the lock, they sometimes run in parallel, which causes the race on the media status... I'm not sure why, and it is kind of driving me crazy.
Comment 7 Hans Petter Selasky freebsd_committer 2021-01-11 12:39:13 UTC
Did you try using dtrace, or adding simple prints:

printf("%d %s:%d\n", ticks, __FILE__, __LINE__);

--HPS
Comment 8 Ali Abdallah 2021-01-11 12:43:57 UTC
Yes, I added simple printfs showing the thread ID, and this is what I see:

Jan  8 16:24:47 frix230 kernel: tick begin  thread 0xfffff800a2343000
Jan  8 16:24:47 frix230 kernel: tick finish thread 0xfffff800a2343000
Jan  8 16:24:48 frix230 kernel: 
Jan  8 16:24:48 frix230 kernel: poll begin  thread 0xfffff8001e67c000
Jan  8 16:24:48 frix230 kernel: poll finish thread 0xfffff8001e67c000
Jan  8 16:24:48 frix230 kernel: 
Jan  8 16:24:48 frix230 kernel: 
Jan  8 16:24:48 frix230 kernel: poll begin  thread 0xfffff8001e67c000
Jan  8 16:24:48 frix230 kernel: poll finish thread 0xfffff8001e67c000
Jan  8 16:24:48 frix230 kernel: 
Jan  8 16:24:48 frix230 kernel: tick begin  thread 0xfffff800a2343000
Jan  8 16:24:48 frix230 kernel: 
Jan  8 16:24:48 frix230 kernel: poll begin  thread 0xfffff8001e67c000 
Jan  8 16:24:48 frix230 kernel: Link down detected from thread 0xfffff800a2343000 
Jan  8 16:24:48 frix230 kernel: tick finish thread 0xfffff800a2343000
Jan  8 16:24:48 frix230 kernel: ue0: link state changed to DOWN
Jan  8 16:24:48 frix230 kernel:
Jan  8 16:24:48 frix230 kernel: Link up detected from thread 0xfffff8001e67c000
Jan  8 16:24:48 frix230 kernel: poll finish thread 0xfffff8001e67c000
Jan  8 16:24:48 frix230 kernel: ue0: link state changed to UP
Jan  8 16:24:48 frix230 kernel: 

All of those prints happen after the lock is acquired and before it is released. As you can see, the poll thread begins while the tick thread has not yet finished.
Comment 9 Ali Abdallah 2021-01-11 12:59:03 UTC
How can that even be possible? I've also added the following static variable in ure_tick:

URE_LOCK_ASSERT(sc, MA_OWNED);                                                          
printf("\ntick begin thread %p\n", curthread);
ticking = 1;
mii_tick(mii);                                                                                                                
ticking = 0;
printf("\ntick end thread %p\n", curthread);

And in ure_ifmedia_sts

URE_LOCK(sc);
printf("poll begin %p\n", curthread);
if (!ticking)                                                                                                                 
  mii_pollstat(mii);
printf("poll end %p\n", curthread);
...
URE_UNLOCK(sc);

And it has been running fine since last Friday, which further proves that the poll thread is executed while the tick one is running, despite the fact that they use the same lock!

The only difference I see is that, unlike mii_tick, ure_ifmedia_sts is called at the end of a syscall; I'm not sure whether that is relevant here.
Comment 10 Hans Petter Selasky freebsd_committer 2021-01-11 13:24:25 UTC
Hi,

Can you print "%s" curthread->td_proc->p_comm

and

curthread->td_name

Probably two threads are racing there!

--HPS
Comment 11 Ali Abdallah 2021-01-11 13:44:09 UTC
The additional strings changed the timing a bit; it is no longer racing on the media status but on the media type (it went from 1000baseT <full-duplex> to nothing during the race below).

Jan 11 14:40:42 frix230 kernel: tick begin thread 0xfffff8027121a740 ure0 usb
Jan 11 14:40:42 frix230 kernel: 
Jan 11 14:40:42 frix230 kernel: poll begin 0xfffff801ca32f000 ifconfig ifconfig
Jan 11 14:40:42 frix230 kernel: poll finish 0xfffff801ca32f000 ifconfig ifconfig
Jan 11 14:40:42 frix230 kernel: 
Jan 11 14:40:42 frix230 kernel: poll begin 0xfffff801ca32f000 ifconfig ifconfig
Jan 11 14:40:42 frix230 kernel: poll finish 0xfffff801ca32f000 ifconfig ifconfig
Jan 11 14:40:42 frix230 kernel: 
Jan 11 14:40:42 frix230 kernel: poll begin 0xfffff801ca32f000 ifconfig ifconfig
Jan 11 14:40:42 frix230 kernel: poll finish 0xfffff801ca32f000 ifconfig ifconfig
Jan 11 14:40:42 frix230 kernel: tick end thread 0xfffff8027121a740 ure0 usb

For the test, I'm calling ifconfig continuously from a script, but any network monitoring applet will trigger the issue.
Comment 12 Ali Abdallah 2021-01-20 13:55:24 UTC
Any possible explanation of what is happening here? Thanks.
Comment 13 Ali Abdallah 2021-03-01 17:25:22 UTC
I had a better look at the code, and I think I have an explanation of the root cause of the issue.

The function uether_rxflush releases the lock mid-operation, causing a mix-up in the lock/unlock ordering.

void
uether_rxflush(struct usb_ether *ue)
{
        ...       
        UE_LOCK_ASSERT(ue, MA_OWNED);
                
        n = mbufq_flush(&ue->ue_rxq);
        UE_UNLOCK(ue);
        NET_EPOCH_ENTER(et);
        ...
        NET_EPOCH_EXIT(et);
        UE_LOCK(ue);
}   

I've dtraced the calls

2  64153           ure_ifmedia_sts:return      155646206056
0  64153           ure_ifmedia_sts:return      155647214955
0  64152            ure_ifmedia_sts:entry      155647476515
2  64153           ure_ifmedia_sts:return      155648459628
2  64152            ure_ifmedia_sts:entry      155648490676
1  64220             uether_rxflush:entry      155649463422  <-
0  64153           ure_ifmedia_sts:return      155649466344  <-
1  64221            uether_rxflush:return      155649488201


When uether_rxflush:entry is reached it holds the lock, but it then releases it and re-acquires it later, which IMHO causes the mess with ure_ifmedia_sts.
Comment 14 Ali Abdallah 2021-03-02 07:39:28 UTC
Even with the UE_UNLOCK/UE_LOCK pair from comment 13 removed, the issue still occurs...

The following dtrace output is observed when the link state spuriously changes:

2  64192               ue_tick_task:entry      242528044179
2  64175                   ure_tick:entry      242528050925
1  64152            ure_ifmedia_sts:entry      242529137165
2  64176                  ure_tick:return      242529369904
1  64152            ure_ifmedia_sts:entry      242529474008
2  64153           ure_ifmedia_sts:return      242530486029

For whatever reason... Normally the sequence should be 

2  64152            ure_ifmedia_sts:entry      243552084077
2  64153           ure_ifmedia_sts:return      243553071554
0  64192               ue_tick_task:entry      243553441743
0  64175                   ure_tick:entry      243553444178
2  64176                  ure_tick:return      243554572011
3  64152            ure_ifmedia_sts:entry      243663084286
3  64153           ure_ifmedia_sts:return      243664074467
Comment 15 Hans Petter Selasky freebsd_committer 2021-03-02 10:43:15 UTC
Probably all of the chip configuration and status reads need to be serialized on a USB process callback thread.

--HPS
Comment 16 Ali Abdallah 2021-03-02 16:47:22 UTC
(In reply to Hans Petter Selasky from comment #15)

Probably, but I'm really not sure how to deal in a usb callback thread with the SIOCSIFMEDIA ioctl.

Also, I'm still not able to identify the root cause of the issue. The fact that the USB thread and the ifconfig ioctl thread use the same lock should prevent any race, but for some reason it does not. I would still like to understand what is going wrong.
Comment 17 Ali Abdallah 2022-04-08 13:56:42 UTC
After digging further into the code, I'm still unable to understand how the inversion described in comment 14 is even possible...

On a spare system T430 with a USB ethernet, I've compiled the kernel with 

options WITNESS
options INVARIANTS
options INVARIANT_SUPPORT
options DEBUG_LOCKS
options DIAGNOSTIC

The up/down link issue is even more frequent with the debug kernel, but all I see is 

if_delmulti_locked: detaching ifnet instance 0xfffff8001880b800
ue0: link state changed to DOWN
ue0: link state changed to UP
ue0: link state changed to DOWN
ue0: link state changed to UP
ue0: link state changed to DOWN
ue0: link state changed to UP

This issue is driving me crazy, any fbsd kernel locking expert to help here?
Comment 18 Hans Petter Selasky freebsd_committer 2022-04-08 14:47:30 UTC
dtrace does not work if a function was inlined in the same C file where it was declared!

Try to compile a kernel w/o optimizations, -O0, first.

--HPS
Comment 19 Ali Abdallah 2022-04-08 16:58:43 UTC
(In reply to Hans Petter Selasky from comment #18)
Compiling the kernel with COPTFLAGS=-O0 and CFLAGS=-O0 resulted in a non-working dtrace ("dtrace: invalid probe specifier... \"/usr/lib/dtrace/mbuf.d\", line 1: failed to copy type of 'm_data': Type information is in parent and unavailable."). The problem still occurs anyway, with no particular dmesg output from WITNESS or DEBUG_LOCKS.

On the other hand, on the standard generic kernel, when the issue is not occurring, the following dtrace is shown

# dtrace -n "fbt:kernel:mii_pollstat:*,fbt:uether:ue_tick_task:*,fbt:if_ure:ure_tick:*,fbt:if_ure:ure_ifmedia_sts:*,fbt:kernel:mii_tick:*"

  0  92491            ure_ifmedia_sts:entry 
  0  35254               mii_pollstat:entry 
  2  35255              mii_pollstat:return 
  2  92492           ure_ifmedia_sts:return 

  3  92425               ue_tick_task:entry 
  3  92489                   ure_tick:entry 
  3  35276                   mii_tick:entry 
  3  35277                  mii_tick:return 
  3  92490                  ure_tick:return 

These are the two code paths checking the link status. Now, with the same dtrace probes, when the link up/down problem occurs, I see:

  3  92425               ue_tick_task:entry 
  3  92489                   ure_tick:entry 
  3  35276                   mii_tick:entry 
  0  92491            ure_ifmedia_sts:entry  <----
  0  35254               mii_pollstat:entry  <----
  3  35277                  mii_tick:return 
  3  92490                  ure_tick:return

How is that possible? ure_ifmedia_sts should acquire the same mutex as ue_tick_task before calling mii_pollstat, and apparently that is not happening.
Comment 20 Hans Petter Selasky freebsd_committer 2022-04-08 20:20:09 UTC
Can you make dtrace print the backtrace? Maybe it is another ethernet device?

--HPS
Comment 21 Ali Abdallah 2022-04-09 06:24:47 UTC
(In reply to Hans Petter Selasky from comment #20)
Not sure how to do that, but in any case it is the only USB Ethernet device I have that uses if_ure and the mii bus; when it is not loaded, nothing is shown by the mentioned dtrace command.
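(For reference, dtrace can print a kernel backtrace from a probe with its stack() action. A sketch against the probe names already used in this report; untested here, since it needs root on the affected machine:)

```shell
# Print a kernel stack trace on every entry to ure_ifmedia_sts
dtrace -n 'fbt:if_ure:ure_ifmedia_sts:entry { stack(); }'
```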
Comment 22 vxasxfepbikrfqdruz 2022-09-19 19:15:57 UTC
I believe I am having the same issue with if_axe, using a USB-to-Ethernet adapter based on the ASIX AX88178 chipset. It works (sort of) and I can get the full speed I'm expecting, but dmesg shows the interface constantly going UP and DOWN, which seems to manifest as dropped packets. If there's any information I can provide, please ask.

Has there been any progress on this? (In reply to Ali Abdallah from comment #6) I know it's not a good fix but would you be willing to apply that workaround to if_axe so I could build it myself?
Comment 23 Ali Abdallah 2022-09-23 09:59:24 UTC
(In reply to vxasxfepbikrfqdruz from comment #22)

Unfortunately I have had no time recently to continue debugging the issue; I'm still using the ugly workaround in my local branch.
Comment 24 Ali Abdallah 2022-09-28 10:13:21 UTC
Created attachment 236901 [details]
dtrace ure usb traces

I have another machine, running 13.1, on which I didn't apply the workaround, with a USB LAN adapter that uses if_ure. I used the following dtrace probes:

dtrace -n 'fbt:miibus:mii_pollstat:*,fbt:uether:ue_tick_task:*,fbt:if_ure:ure_tick:*,fbt:if_ure:ure_ifmedia_sts:*,fbt:miibus:mii_tick:*,fbt:miibus:miibus_linkchg:*, fbt:usb:* { printf("%p", curthread); }'

You can see the following in the attached debug log (numbered lines):

1  48573            ure_ifmedia_sts:entry fffffe00dc75c720
1  48475               mii_pollstat:entry fffffe00dc75c720

But before the ure_ifmedia_sts:return of thread fffffe00dc75c720, you can see:

1052   0  48573            ure_ifmedia_sts:entry fffffe00e00271e0
1053   2  27791        usb_command_wrapper:entry fffffe0082a17ac0
1054   2  27738      usbd_callback_wrapper:entry fffffe0082a17ac0
1055   0  48475               mii_pollstat:entry fffffe00e00271e0

And that is exactly when the link up/down happened.

So thread fffffe00e00271e0 managed to go ahead and call mii_pollstat, acquiring the sc mutex, even though thread fffffe00dc75c720 had not yet released it. You can see that the return from "polling the media state" on thread fffffe00dc75c720 comes well after the ure_ifmedia_sts:entry of thread fffffe00e00271e0:

1517   3  48476              mii_pollstat:return fffffe00dc75c720
1518   3  48574           ure_ifmedia_sts:return fffffe00dc75c720

__snip if_ure__ 

static void
ure_ifmedia_sts(struct ifnet *ifp, struct ifmediareq *ifmr)
{
  ...
  URE_LOCK(sc);                                                                                                                
  mii_pollstat(mii);
  ifmr->ifm_active = mii->mii_media_active;
  ifmr->ifm_status = mii->mii_media_status;
  URE_UNLOCK(sc);
}
__snip__

The mutex locked/released by URE_LOCK is the same mutex used by if_ure and the USB transfer code, so I guess something is not playing well in the release/acquire sequence...

If time permits, I will dig more into the USB transfer/process code, but I have to say I'm extremely surprised this bug hasn't attracted more attention; it effectively renders any USB Ethernet device unusable on FreeBSD...