Bug 189003

Summary: [lagg] Page fault in lacp_req() while the lagg is being destroyed
Product: Base System Reporter: Alan Somers <asomers>
Component: kernAssignee: Alan Somers <asomers>
Status: Closed FIXED    
Severity: Affects Some People    
Priority: Normal    
Version: Unspecified   
Hardware: Any   
OS: Any   
Attachments:
Description Flags
file.diff none

Description Alan Somers freebsd_committer freebsd_triage 2014-04-25 23:40:01 UTC
If you do an "ifconfig -am" in one thread while doing an "ifconfig lagg0 destroy" in another thread, at least two panics may result.  One is in lacp_req(), caused by NULL == lsc.

What happens is that the "ifconfig lagg0 destroy" thread does this:
1) lagg_clone_destroy() acquires LAGG_WLOCK(sc)
2) lagg_clone_destroy() calls lagg_lacp_detach, which calls lacp_detach, which sets sc->sc_psc = NULL
3) lagg_clone_destroy() calls LAGG_WUNLOCK(sc)

then the "ifconfig status" thread does this:
1) calls lagg_ioctl(SIOCGLAGG)
2) lagg_ioctl() acquires LAGG_RLOCK(sc, &tracker)
3) lagg_ioctl() calls sc->sc_req, which dereferences to lacp_req
4) lacp_req does *lsc = LACP_SOFTC(sc), which returns NULL
5) lacp_req dereferences lsc, and panics


db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe009781d380
kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe009781d430
witness_warn() at witness_warn+0x4b5/frame 0xfffffe009781d4f0
trap_pfault() at trap_pfault+0x59/frame 0xfffffe009781d590
trap() at trap+0x4d5/frame 0xfffffe009781d7a0
calltrap() at calltrap+0x8/frame 0xfffffe009781d7a0
--- trap 0xc, rip = 0xffffffff81eb9b44, rsp = 0xfffffe009781d860, rbp = 0xfffffe009781d890 ---
lacp_req() at lacp_req+0x14/frame 0xfffffe009781d890
lagg_ioctl() at lagg_ioctl+0x270/frame 0xfffffe009781d970
ifioctl() at ifioctl+0xbf7/frame 0xfffffe009781da30
kern_ioctl() at kern_ioctl+0x22b/frame 0xfffffe009781da90
sys_ioctl() at sys_ioctl+0x13c/frame 0xfffffe009781dae0
amd64_syscall() at amd64_syscall+0x25a/frame 0xfffffe009781dbf0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe009781dbf0
--- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x800fa045a, rsp = 0x7fffffffd808, rbp = 0x7fffffffe290 ---

Fix: The purpose of lacp_req is to return LACP property information to userland when you do "ifconfig lagg0".  So I think that it would be ok if it returned a block full of zeros.  This would only happen while the interface is being destroyed, and userland should be able to deal with that.  So my proposed fix (attached), is to simply check for NULL == lsc and return early.

Patch attached with submission follows:
How-To-Repeat: First, backout change 253687.  That will increase the likelihood of hitting this panic.

Run this script:

#! /usr/local/bin/bash

ifconfig tap0 create
sleep .2
ifconfig tap1 create
sleep .2
ifconfig tap2 create
sleep .2
ifconfig tap0 up
sleep .2
ifconfig tap1 up
sleep .2
ifconfig tap2 up
sleep .2

while true; do
        echo "About to create"
        ifconfig lagg0 create
        #sleep 0.2

        echo "About to up"
        ifconfig lagg0 up laggproto lacp laggport tap0 laggport tap1 laggport tap2 192.0.0.2/24
        sleep 0.2

        echo "About to destroy"
        ifconfig lagg0 destroy
        sleep 0.2
done &

while true; do
        ifconfig -am > /dev/null
done
Comment 1 Alan Somers freebsd_committer freebsd_triage 2014-04-25 23:41:09 UTC
Responsible Changed
From-To: freebsd-bugs->asomers

I'll take it
Comment 2 dfilter service freebsd_committer freebsd_triage 2014-05-02 17:24:13 UTC
Author: asomers
Date: Fri May  2 16:24:09 2014
New Revision: 265232
URL: http://svnweb.freebsd.org/changeset/base/265232

Log:
  Fix a panic caused by doing "ifconfig -am" while a lagg is being destroyed.
  The thread that is destroying the lagg has already set sc->sc_psc=NULL when
  the "ifconfig -am" thread gets to lacp_req().  It tries to dereference
  sc->sc_psc and panics.  The solution is for lacp_req() to check the value of
  sc->sc_psc.  If NULL, harmlessly return an lacp_opreq structure full of
  zeros.  Full details in GNATS.
  
  PR:		kern/189003
  Reviewed by:	timeout on freebsd-net@
  MFC after:	3 weeks
  Sponsored by:	Spectra Logic Corporation

Modified:
  head/sys/net/ieee8023ad_lacp.c

Modified: head/sys/net/ieee8023ad_lacp.c
==============================================================================
--- head/sys/net/ieee8023ad_lacp.c	Fri May  2 16:15:34 2014	(r265231)
+++ head/sys/net/ieee8023ad_lacp.c	Fri May  2 16:24:09 2014	(r265232)
@@ -590,10 +590,20 @@ lacp_req(struct lagg_softc *sc, caddr_t 
 {
 	struct lacp_opreq *req = (struct lacp_opreq *)data;
 	struct lacp_softc *lsc = LACP_SOFTC(sc);
-	struct lacp_aggregator *la = lsc->lsc_active_aggregator;
+	struct lacp_aggregator *la;
 
-	LACP_LOCK(lsc);
 	bzero(req, sizeof(struct lacp_opreq));
+	
+	/* 
+	 * If the LACP softc is NULL, return with the opreq structure full of
+	 * zeros.  It is normal for the softc to be NULL while the lagg is
+	 * being destroyed.
+	 */
+	if (NULL == lsc)
+		return;
+
+	la = lsc->lsc_active_aggregator;
+	LACP_LOCK(lsc);
 	if (la != NULL) {
 		req->actor_prio = ntohs(la->la_actor.lip_systemid.lsi_prio);
 		memcpy(&req->actor_mac, &la->la_actor.lip_systemid.lsi_mac,
_______________________________________________
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
Comment 3 Alan Somers freebsd_committer freebsd_triage 2014-05-02 17:25:04 UTC
State Changed
From-To: open->patched

Patched by revision 265232
Comment 4 commit-hook freebsd_committer freebsd_triage 2014-10-06 23:18:03 UTC
A commit references this bug:

Author: asomers
Date: Mon Oct  6 23:17:02 UTC 2014
New revision: 272672
URL: https://svnweb.freebsd.org/changeset/base/272672

Log:
  MFC r265232

  Fix a panic caused by doing "ifconfig -am" while a lagg is being destroyed.
  The thread that is destroying the lagg has already set sc->sc_psc=NULL when
  the "ifconfig -am" thread gets to lacp_req().  It tries to dereference
  sc->sc_psc and panics.  The solution is for lacp_req() to check the value of
  sc->sc_psc.  If NULL, harmlessly return an lacp_opreq structure full of
  zeros.  Full details in GNATS.

  PR:	189003

Changes:
_U  stable/10/
  stable/10/sys/net/ieee8023ad_lacp.c
Comment 5 commit-hook freebsd_committer freebsd_triage 2014-10-07 15:22:20 UTC
A commit references this bug:

Author: asomers
Date: Tue Oct  7 15:21:20 UTC 2014
New revision: 272705
URL: https://svnweb.freebsd.org/changeset/base/272705

Log:
  MFC r265232

  Fix a panic caused by doing "ifconfig -am" while a lagg is being destroyed.
  The thread that is destroying the lagg has already set sc->sc_psc=NULL when
  the "ifconfig -am" thread gets to lacp_req().  It tries to dereference
  sc->sc_psc and panics.  The solution is for lacp_req() to check the value of
  sc->sc_psc.  If NULL, harmlessly return an lacp_opreq structure full of
  zeros.  Full details in GNATS.

  PR:   189003

Changes:
_U  stable/9/
_U  stable/9/sys/
_U  stable/9/sys/net/
  stable/9/sys/net/ieee8023ad_lacp.c
Comment 6 Alan Somers freebsd_committer freebsd_triage 2014-10-07 15:25:06 UTC
MFCed to stable/10 by r272672 and to stable/9 by r272705.