Bug 234846 - [lagg] race condition when adding port
Summary: [lagg] race condition when adding port
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 12.0-STABLE
Hardware: Any Any
: --- Affects Only Me
Assignee: Alexander Motin
Depends on:
Reported: 2019-01-11 08:26 UTC by Alexandre martins
Modified: 2019-01-31 18:19 UTC (History)
8 users (show)

See Also:

use LAGG_PORT_DISABLED for a port until it fully added (585 bytes, patch)
2019-01-11 18:29 UTC, Eugene Grosbein
no flags Details | Diff
Naive initial cut at a patch (1.27 KB, patch)
2019-01-16 21:12 UTC, Stephen Hurd
no flags Details | Diff
Naive initial cut at a patch (1.76 KB, patch)
2019-01-16 22:48 UTC, Stephen Hurd
no flags Details | Diff
use LAGG_PORT_DISABLED while adding/removing a port (761 bytes, patch)
2019-01-22 19:52 UTC, Eugene Grosbein
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Alexandre martins 2019-01-11 08:26:42 UTC

I'm facing some random crash when I'm adding a port into an LACP aggregate.

In the coredump, I found that the crash is due to a race condition when the interface is added to the aggregate.

The condition is the following:

The thread 1 (ifconfig lagg0 laggport em0) is adding an interface to an LACP aggregate. The code enter the function lagg_port_create in the file if_lagg.c.
That function "publish" an update of the interface with the following code:

/* Change the interface type */
lp->lp_iftype = ifp->if_type;
ifp->if_type = IFT_IEEE8023ADLAG;
ifp->if_lagg = lp;
lp->lp_ioctl = ifp->if_ioctl;
ifp->if_ioctl = lagg_port_ioctl;
lp->lp_output = ifp->if_output;
ifp->if_output = lagg_port_output;

In the thread 2, thereafter that publication, a packet come on the interface em0, use the new state of the interface, so lagg_input_p is called. Because the setup of the LACP is not finished (the call to lagg_proto_addport is after the publication), the LACP code uses a NULL pointer and the kernel crashes.

Can I change the code of the lagg_port_create function and put the "publication" of the new state of the interface at the end of the function ?

Best regards,

Comment 1 Eugene Grosbein freebsd_committer 2019-01-11 18:29:26 UTC
Created attachment 201036 [details]
use LAGG_PORT_DISABLED for a port until it fully added

lagg_input() drops incoming mbuf until LAGG_PORT_DISABLED is cleared
Comment 2 Eugene Grosbein freebsd_committer 2019-01-11 18:29:56 UTC
Please try attached patch. It should replace a panic with packed drop if an interface being added to an aggregate is up and receives a packet in process of inclusion to the aggregate.

Apply it with command:

cd /usr/src && patch < /path/to/patch

Then rebuild if_lagg.ko kernel module if you use it:

cd /usr/src/sys/modules/if_lagg
make obj depend && make all install

Or rebuild the kernel if your kernel configuration includes device lagg.
Comment 3 Alexandre martins 2019-01-14 09:24:04 UTC
The race was introduced by the commit 333612. That commit is only in version 12 and up.

I updated the ticket, sorry for that mistake.

I'm testing your patch (with minor changes to be applyed on 12.0 version).

Thank you
Comment 4 Eugene Grosbein freebsd_committer 2019-01-14 10:27:31 UTC
Assign to author of r333612.

Stephen, please take a look at the PR 234846 that describes a panic after your change to lagg(4).
Comment 5 Alexandre martins 2019-01-14 11:01:29 UTC
After some test, the crash seems disapear when adding port.

But there is the same race in the function lagg_port_destroy, that delete the LACP port then restore the interface type.
Comment 6 Stephen Hurd freebsd_committer 2019-01-14 20:01:37 UTC
Which part of LACP uses a NULL pointer?  Do you have a backtrace?
Comment 7 Alexandre martins 2019-01-15 08:03:14 UTC
I have only the network thread, the ifconfig thread have its backtrace truncated, but the td->td_proc show me that the command was "ifconfig lagg0 aggport em0"

The crash is due to the "lp_psc" field that is NULL into the "struct lagg_port"

There it is:

(kgdb) bt
#0  __curthread () at ./machine/pcpu.h:219
#1  doadump (textdump=<optimized out>) at ../../../kern/kern_shutdown.c:298
#2  0xffffffff80459680 in kern_reboot (howto=260) at ../../../kern/kern_shutdown.c:486
#3  0xffffffff80459a25 in vpanic (fmt=<optimized out>, ap=<optimized out>) at ../../../kern/kern_shutdown.c:889

#4  0xffffffff804598f3 in panic (fmt=<unavailable>) at ../../../kern/kern_shutdown.c:818
#5  0xffffffff8067e848 in trap_fatal (frame=<optimized out>, eva=<optimized out>) at ../../../amd64/amd64/trap.c:858

#6  0xffffffff8067ec27 in trap_pfault (frame=<unavailable>, usermode=<optimized out>) at 
#7  0xffffffff8067e24a in trap (frame=0xfffffe0d900b4630) at ../../../amd64/amd64/trap.c:447
#8  <signal handler called>
#9  lacp_iscollecting (lgp=0xfffff80572adbe00) at ../../../net/ieee8023ad_lacp.h:324
#10 lagg_lacp_input (sc=<optimized out>, lp=0xfffff80572adbe00, m=<optimized out>) at ../../../net/if_lagg.c:2176

#11 0xffffffff80521155 in lagg_proto_input (sc=<optimized out>, m=<optimized out>, sc=<optimized 
out>, lp=<optimized out>, m=<optimized out>) at ../../../net/if_lagg.c:362
#12 lagg_input (ifp=0xfffff8034bfc3800, m=0xfffff8065cb60a00) at ../../../net/if_lagg.c:1667
#13 0xffffffff8051d555 in ether_input (ifp=0xfffff8034bfc3800, m=0xfffff8065cb60a00) at ../../../net/if_ethersubr.c:794

#14 0xffffffff8052d0fa in pollif_recv () at ../../../net/netisr_pollif.c:529
#15 0xffffffff8052c7a5 in swi_net (arg=0xfffffe10ca797880) at ../../../net/netisr.c:824
#16 0xffffffff8042fe5b in intr_event_execute_handlers (p=<optimized out>, ie=0xfffff8034bf12a00) 
at ../../../kern/kern_intr.c:1297
#17 0xffffffff80430456 in ithread_execute_handlers (ie=<optimized out>, p=<optimized out>) 
at ../../../kern/kern_intr.c:1310
#18 ithread_loop (arg=0xfffff8034befd7c0) at ../../../kern/kern_intr.c:1394
#19 0xffffffff8042d595 in fork_exit (callout=0xffffffff804303c0 <ithread_loop>, arg=0xfffff8034befd7c0, 
frame=0xfffffe0d900b49c0) at ../../../kern/kern_fork.c:1027
#20 <signal handler called>
Comment 8 Stephen Hurd freebsd_committer 2019-01-16 19:22:49 UTC
Thanks, I'm having problems reproducing this, do you have a similar panic in lagg_port_destroy()?
Comment 9 Stephen Hurd freebsd_committer 2019-01-16 21:12:48 UTC
Created attachment 201200 [details]
Naive initial cut at a patch

Initial cut at a patch using LAGG_RLOCK(), which requires moving the lacp_linkstate() call out of lacp_port_create().
Comment 10 Stephen Hurd freebsd_committer 2019-01-16 22:48:45 UTC
Created attachment 201203 [details]
Naive initial cut at a patch

Updated patch.
Comment 11 Alexandre martins 2019-01-17 08:37:33 UTC
(In reply to Stephen Hurd from comment #8)

Yes, we also have the race when removing the port, but more unlikely. In that case, the backtrace looks the same.
Comment 12 Stephen Hurd freebsd_committer 2019-01-17 18:03:50 UTC
(In reply to Alexandre martins from comment #11)

What revision or release is that backtrace from?
Comment 13 Alexandre martins 2019-01-18 08:33:47 UTC
I'm at the r321477 :

Comment 14 Stephen Hurd freebsd_committer 2019-01-18 18:22:31 UTC
(In reply to Alexandre martins from comment #13)

Ok, I'm a bit confused here, I must be missing something.  Here's my (mis)understanding... please correct me where I'm wrong.

- This bug was initially filed against 11.2.
- Comment #3 says that the issue is caused by r333612.
- Bug was assigned to me as a result.
- Version was changed to 12.0-STABLE as a result.
- Panic occurs with r321477.
- r321477 < r333612, so the r333612 change is not present when the panic occurs.
- r321477 is 12.0-CURRENT (should show in uname -r)

If my understanding is correct, can you test with 13.0-CURRENT, 12.0-RELEASE or 12-STABLE?
Comment 15 Alexandre martins 2019-01-21 09:43:36 UTC
Hello Stephen,

I'm sorry, I have made a mistake when I give the revision of the bug, It's the 317696 that introduce the bug, when the lock was split in two (SX plus RM).

I'm resume all of my story:

- We are working on a base of FreeBSD 10.3, and faced crashes with LACP
- We updated the if_lagg code to the revision 321477, which solved our initial problem
- However, another crash is occurring (the one that I put the backtrace here)
- When I searched the cause of the race, I take the wrong lock change (333612 instead of 321477)

The root cause of the problem remain the same: ether_input call lagg_input with the lacp port not fully initialized.
The function lagg_input locks the (RM / EPOCK) in read mode but that lock is not taken in lagg_port_create/lagg_port_destroy, who is locked by the SX.

The first patch with LAGG_PORT_DISABLED works for us.

My apologies for the confusion.

Best regards,

Comment 16 Stephen Hurd freebsd_committer 2019-01-21 20:08:54 UTC
Reassigning and re-opening.
Comment 17 Eugene Grosbein freebsd_committer 2019-01-21 20:46:34 UTC
Alexander, please take a look at the PR 234846 reporting panic in lagg(4) after your commit https://svnweb.freebsd.org/base?view=revision&revision=317696
Comment 18 Eugene Grosbein freebsd_committer 2019-01-22 19:52:15 UTC
Created attachment 201340 [details]
use LAGG_PORT_DISABLED while adding/removing a port

Pleas try updated patch. This revision disables port before performing removal, too.
Comment 19 Alexandre martins 2019-01-23 09:22:02 UTC
Hi Eugene,

The patch seems to work for the creation part.

However, the race into the destroy part is still here:

Thread 1 (network):
ether_input checks the type of the interface and will call lagg_input

Thread 2 (ifconfig -laggport):
delete the port, and resets the type of the iface and set ifp->if_lagg = NULL;

Thread 1 (network):
is entering lagg_input function and crash because the pointer if_lagg is now NULL
Comment 20 Alexandre martins 2019-01-24 09:42:26 UTC

Another crash has appear this night. It is the same race as the destroy part, but in the create:

Thread 1 (ifconfig laggport):
Creates the port (with the DISABLE flag), and sets the type of the iface. This thread will set ifp->if_lagg in few instruction ...

Thread 2 (network):
Checks the type and enter lagg_input function and crashes because the pointer if_lagg is still NULL