My wireless network card is an Intel PRO/Wireless 5100, and I'm using the iwn driver.

/etc/rc.conf contains the following:

    wlans_iwn0="wlan0"
    ifconfig_wlan0="WPA SYNCDHCP"

/etc/wpa_supplicant.conf has the appropriate settings for some access points.

When the system boots, the network is established correctly. However, whenever I restart it via '/etc/rc.d/netif restart' and then ping my access point, about 10 packets are sent before the network goes down, and 'ifconfig wlan0' shows it scanning for different APs (or even for the same AP on different channels). When the connection to the AP is established again, it goes down again after a few seconds. If I run '/etc/rc.d/netif restart' a second time, the connection stops dropping.

How-To-Repeat: /etc/rc.d/netif restart
Responsible Changed From-To: freebsd-bugs->freebsd-net Over to maintainer(s).
State Changed From-To: open->suspended

This is a known issue. There is a race between devd and our rc subsystem when wpa_supplicant is involved, which effectively results in wpa_supplicant being started twice. Both instances try to take over the wlan device, which results in what you are seeing. I have no idea how to fix this right now, so it has to wait until I'm able to think of a proper fix.

As a workaround, don't use netif restart; use 'kldunload if_iwn; kldload if_iwn' instead.
Responsible Changed From-To: freebsd-net->bschmidt over to me
> There is race in devd and our rc-subsystem if wpa_supplicant is involved
> effectively resulting in starting wpa_supplicant twice. Both instances try
> to take over the wlan device which results in what you are seeing.
> I have no idea how to fix this right now, so this has to wait until I'm able
> to think of proper fix.

Perhaps wrapping the wpa_supplicant invocation in "lockf -t0" would help to eliminate the race?

Eugene Grosbein
On Tuesday, January 04, 2011 09:08:24 Eugene Grosbein wrote:
> > There is race in devd and our rc-subsystem if wpa_supplicant is involved
> > effectively resulting in starting wpa_supplicant twice. Both instances try
> > to take over the wlan device which results in what you are seeing.
> > I have no idea how to fix this right now, so this has to wait until I'm
> > able to think of proper fix.
>
> Perhaps, wrapping wpa_supplicant invocation into "lockf -t0" would help
> to eliminate race?

Possibly, but I don't think this is the way to go.

Currently wpa_supplicant has this code:

    /*
     * Mark the interface as down to ensure wpa_supplicant has exclusive
     * access to the net80211 state machine, do this before opening the
     * route socket to avoid a false event that the interface disappeared.
     */
    if (getifflags(drv, &flags) == 0)
        (void) setifflags(drv, flags &~ IFF_UP);

This code works by sending an event to any already running wpa_supplicant instances, which then terminate. That does work if there is enough delay between invocations. If there is only a small delay (~100ms or so), however, the event probably doesn't get passed. I think this is where we should start looking for a solution: figuring out why the event doesn't get passed (probably because the interface is not yet up at that point) will get us closer to a proper fix.

--
Bernhard
On 04.01.2011 15:06, Bernhard Schmidt wrote:
>> Perhaps, wrapping wpa_supplicant invocation into "lockf -t0" would help
>> to eliminate race?
>
> Possibly, but I don't think this is the way to go.
>
> Currently wpa_supplicant has this code:
>     /*
>      * Mark the interface as down to ensure wpa_supplicant has exclusive
>      * access to the net80211 state machine, do this before opening the
>      * route socket to avoid a false event that the interface disappeared.
>      */
>     if (getifflags(drv, &flags) == 0)
>         (void) setifflags(drv, flags &~ IFF_UP);
>
> This code works such that it will send an event to already running
> wpa_supplicant instances which will then terminate. This does indeed work if
> there's enough delay between invocations, though, if there is just a small
> delay (~100ms or something), that event doesn't get passed probably. I think
> we should start looking into possible solution at that point, trying to figure
> out why the event doesn't get passed (probably because the interface is
> not yet up at that point) will get us closer to proper solution.

Proper fine-grained locking has always been a good solution for race problems :-)
How about using flock(2) in the wpa_supplicant source code?

Eugene Grosbein
On Tuesday, January 04, 2011 10:09:15 Eugene Grosbein wrote:
> On 04.01.2011 15:06, Bernhard Schmidt wrote:
> >> Perhaps, wrapping wpa_supplicant invocation into "lockf -t0" would help
> >> to eliminate race?
> >
> > Possibly, but I don't think this is the way to go.
> >
> > Currently wpa_supplicant has this code:
> >     /*
> >      * Mark the interface as down to ensure wpa_supplicant has exclusive
> >      * access to the net80211 state machine, do this before opening the
> >      * route socket to avoid a false event that the interface disappeared.
> >      */
> >     if (getifflags(drv, &flags) == 0)
> >         (void) setifflags(drv, flags &~ IFF_UP);
> >
> > This code works such that it will send an event to already running
> > wpa_supplicant instances which will then terminate. This does indeed work
> > if there's enough delay between invocations, though, if there is just a
> > small delay (~100ms or something), that event doesn't get passed
> > probably. I think we should start looking into possible solution at that
> > point, trying to figure out why the event doesn't get passed
> > (probably because the interface is not yet up at that point) will get us
> > closer to proper solution.
>
> Proper fine-grained locking was always good solution for race problem :-)
> How about using flock(2) in wpa_supplicant source code?

I don't see any flock'able resource shared between instances, do you?

--
Bernhard
On 04.01.2011 15:39, Bernhard Schmidt wrote:
>> Proper fine-grained locking was always good solution for race problem :-)
>> How about using flock(2) in wpa_supplicant source code?
>
> I don't see any flock'able resource shared between instances, do you?

Just use pidfile(3) :-)
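
For readers following the locking suggestion, here is a minimal C sketch of the idea. It is not taken from wpa_supplicant or from any patch in this thread. It assumes a hypothetical per-interface lock file (the path in the comment is invented for illustration) and uses open(2) and flock(2) directly; pidfile(3) offers a comparable exclusive-lock guarantee through pidfile_open(). The function names acquire_instance_lock and release_instance_lock are made up for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Try to become the only supplicant instance for one interface.
 * lock_path is a hypothetical per-interface file, for example
 * "/var/run/wpa_supplicant.wlan0.lock" (an assumption, not a path
 * that FreeBSD or wpa_supplicant is known to use).
 *
 * Returns a descriptor that holds the lock, or -1 with errno set;
 * errno is EWOULDBLOCK when another holder already has the lock.
 */
static int
acquire_instance_lock(const char *lock_path)
{
    int fd, saved_errno;

    fd = open(lock_path, O_RDWR | O_CREAT, 0600);
    if (fd == -1)
        return (-1);

    /* LOCK_NB: fail immediately instead of waiting for the holder. */
    if (flock(fd, LOCK_EX | LOCK_NB) == -1) {
        saved_errno = errno;
        close(fd);
        errno = saved_errno;
        return (-1);
    }
    return (fd);
}

/* Closing the last descriptor for this open file drops the lock. */
static void
release_instance_lock(int fd)
{
    close(fd);
}
```

With something like this, a second instance started for the same interface would get -1 with errno set to EWOULDBLOCK and could exit instead of competing for the device. It would only guard against the double start; it does not explain why the interface-down event is missed when the two invocations are close together.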
Hi,

can you give the attached patch a shot? Just apply it to /etc/devd.conf and restart devd. This should fix the issue with netif restart.

Thanks.

--
Bernhard
State Changed From-To: suspended->feedback feedback requested
On 01/17/2011 18:27, Bernhard Schmidt wrote:
> Hi,
>
> can you give attached patch a shot? Just apply it to /etc/devd.conf and
> restart devd. This should fix the issue with netif restart.
>
> Thanks.

Hi,

I applied the patch, then stopped devd and netif (in this order). After that, I started devd and netif (in this order).

I did not lose packets when pinging a remote host, nor did I lose any after ~2 netif restarts. On the third restart, I started losing more packets than before, and the problem persisted after another restart.

I then stopped devd again, stopped netif again, and started both again, and the problem disappeared. So the issue does not seem to have vanished completely.

Should I revert the patch?
On Wednesday, January 19, 2011 01:41:32 Raphael Kubo da Costa wrote:
> On 01/17/2011 18:27, Bernhard Schmidt wrote:
> > Hi,
> >
> > can you give attached patch a shot? Just apply it to /etc/devd.conf
> > and restart devd. This should fix the issue with netif restart.
> >
> > Thanks.
>
> Hi,
>
> I applied the patch, then stopped devd and netif (in this order).
> After that, I started devd and netif (in this order).
>
> I did not lose packets when pinging a remote host, nor did I lose any
> after ~2 netif restarts. In the third time, I started losing more
> packets than before, and the problem persisted after another restart.
>
> I then stopped devd again, then stopped netif again, started both
> again and the problem disappeared. So it seems not to have
> completely vanished.
>
> Should I revert the patch?

While the 'packet loss' occurs, can you do a 'ps xauw | grep wpa'? If there aren't 2 instances of wpa_supplicant running, that's a new issue.

--
Bernhard
On 01/19/2011 05:14, Bernhard Schmidt wrote:
> On Wednesday, January 19, 2011 01:41:32 Raphael Kubo da Costa wrote:
>> On 01/17/2011 18:27, Bernhard Schmidt wrote:
>>> Hi,
>>>
>>> can you give attached patch a shot? Just apply it to /etc/devd.conf
>>> and restart devd. This should fix the issue with netif restart.
>>>
>>> Thanks.
>>
>> Hi,
>>
>> I applied the patch, then stopped devd and netif (in this order).
>> After that, I started devd and netif (in this order).
>>
>> I did not lose packets when pinging a remote host, nor did I lose any
>> after ~2 netif restarts. In the third time, I started losing more
>> packets than before, and the problem persisted after another restart.
>>
>> I then stopped devd again, then stopped netif again, started both
>> again and the problem disappeared. So it seems not to have
>> completely vanished.
>>
>> Should I revert the patch?
>
> While the 'packet loss' occurs, can you do a 'ps xauw | grep wpa'? if
> there aren't 2 instances of wpa_supplicant running, that's a new issue.

Indeed, there are 2 wpa_supplicant instances running when the packet losses occur.

If I stop both devd and netif and start netif, I get one single wpa_supplicant instance and no packet loss.
State Changed From-To: feedback->open feedback received
Responsible Changed From-To: bschmidt->freebsd-wireless back to pool
batch change:

For bugs that match the following
- Status Is In progress
AND
- Untouched since 2018-01-01.
AND
- Affects Base System OR Documentation

DO:

Reset to open status.

Note: I did a quick pass, but if you are getting this email it might be worthwhile to double check to see if this bug ought to be closed.
Fixed in base r343249.