Bug 221845

Summary: bridge0 causes kernel panic on BBB using usb wifi in hostap mode (RT3071)
Product: Base System
Reporter: Russell Haley <russ.haley>
Component: kern
Assignee: freebsd-net (Nobody) <net>
Status: New
Severity: Affects Only Me
Priority: ---
Version: CURRENT
Hardware: arm
OS: Any

Attachments:
Description: Kernel panic caused by client request for web page.
Flags: none

Description Russell Haley 2017-08-27 06:46:27 UTC
Created attachment 185799
Kernel panic caused by client request for web page.

Running the BBB over an FTDI serial cable.
Asus WiFi Adapter, RT3071 chipset
https://wikidevi.com/files/Ralink/RT307x%20product%20brief.pdf

root@bbb:~ # uname -a
FreeBSD bbb.highfell.local 12.0-CURRENT FreeBSD 12.0-CURRENT #7 r321601M: Thu Aug 17 22:13:21 PDT 2017     russellh@prescott.highfell.local:/usr/home/russellh/FreeBSD/rh-armv6/obj/arm.armv6/usr/home/russellh/FreeBSD/rh-armv6/src/sys/BEAGLEBONE-MMCCAM  arm

root@bbb:~ # cat /boot/loader.conf 
if_run0_load="YES"
wlan_mac_load="YES"

root@bbb:~ # cat /etc/rc.conf
hostname="bbb.highfell.local"
ifconfig_cpsw0="inet 192.168.2.101 netmask 255.255.255.0"
defaultrouter="192.168.2.1"
hostapd_enable="YES"
wlans_run0="wlan0"
create_args_wlan0="wlanmode hostap"
ifconfig_wlan0="up"
#gateway_enable="YES" 
cloned_interfaces="bridge0"
ifconfig_bridge0="addm cpsw0 addm wlan0 up"

sshd_enable="YES"
sendmail_enable="NONE"
sendmail_submit_enable="NO"
sendmail_outbound_enable="NO"
sendmail_msp_queue_enable="NO"
growfs_enable="YES"

root@bbb:~ # cat /etc/hostapd.conf 
interface=wlan0
debug=1
ctrl_interface=/var/run/hostapd
ctrl_interface_group=wheel
ssid=freebsd
wpa=2
wpa_passphrase=testing
wpa_key_mgmt=WPA-PSK
wpa_pairwise=CCMP

root@bbb:~ # cat /etc/resolv.conf 
# Generated by resolvconf
nameserver 192.168.2.1

Hi!

So far I've only partially succeeded in repeating your test, but I can cause a kernel panic! The following are my observations:

1) Before the kernel loads, the loader gives the following errors:

can't find 'if_run'
can't find 'wlan_mac'

2) It seems the run0 USB Wi-Fi interface only comes up after bridge0 is already enabled. dmesg does NOT capture the output from the failed attempt to add the not-yet-existing wlan0 interface. However, I grabbed it from the boot output on the serial console:

#From dmesg:

ugen1.2: <Ralink 802.11 n WLAN> at usbus1
random: unblocking device.
bridge0: Ethernet address: 02:94:dd:d7:a3:00
cpsw0: link state changed to DOWN
cpsw0: promiscuous mode enabled
bridge0: link state changed to DOWN
cpsw0: link state changed to UP
bridge0: link state changed to UP
run0 on uhub1
run0: <1.0> on usbus1
run0: MAC/BBP RT3572 (rev 0x0223), RF RT3052 (MIMO 2T2R), address 60:a4:4c:ec:c9:a5
ieee80211_load_module: load the wlan_amrr module by hand for now.
wlan0: Ethernet address: 60:a4:4c:ec:c9:a5
run0: firmware RT3071 ver. 0.33 loaded

#From console grab:

Feeding entropy: .
ifconfig: SIOCIFCREATE2: Invalid argument
bridge0: Ethernet address: 02:94:dd:d7:a3:00
Created clone interfaces: bridge0.
cpsw0: link state changed to DOWN
cpsw0: promiscuous mode enabled
bridge0: link state changed to DOWN
ifconfig: BRDGADD wlan0: No such file or directory
cpsw0: link state changed to UP
bridge0: link state changed to UP
Starting Network: lo0 cpsw0 bridge0.
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
        inet6 ::1 prefixlen 128 
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x2 
        inet 127.0.0.1 netmask 0xff000000 
        groups: lo 
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
cpsw0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8000b<RXCSUM,TXCSUM,VLAN_MTU,LINKSTATE>
        ether a0:f6:fd:8a:c5:be
        hwaddr a0:f6:fd:8a:c5:be
        inet 192.168.2.101 netmask 0xffffff00 broadcast 192.168.2.255 
        media: Ethernet autoselect (100baseTX <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether 02:94:dd:d7:a3:00
        id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
        maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
        root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
        member: cpsw0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 1 priority 128 path cost 55
        groups: bridge 
        nd6 options=9<PERFORMNUD,IFDISABLED>
Starting devd.
run0 on uhub1
run0: <1.0> on usbus1
run0: MAC/BBP RT3572 (rev 0x0223), RF RT3052 (MIMO 2T2R), address 60:a4:4c:ec:c9:a5
ieee80211_load_module: load the wlan_amrr module by hand for now.
wlan0: Ethernet address: 60:a4:4c:ec:c9:a5
Created wlan(4) interfaces: wlan0.
run0: firmware RT3071 ver. 0.33 loaded
Starting Network: wlan0.
wlan0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether 60:a4:4c:ec:c9:a5
        hwaddr 60:a4:4c:ec:c9:a5
        groups: wlan 
        ssid "" channel 11 (2462 MHz 11g)
        regdomain FCC country US authmode OPEN privacy OFF txpower 30
        scanvalid 60 protmode CTS wme dtimperiod 1 -dfs bintval 0
        media: IEEE 802.11 Wireless Ethernet autoselect <hostap> (autoselect <hostap>)
        status: no carrier
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
add host 127.0.0.1: gateway lo0 fib 0: route already in table
add net default: gateway 192.168.2.1
add host ::1: gateway lo0 fib 0: route already in table


*Something else to note about this output: wlan0 did NOT get the SSID or the security settings from /etc/hostapd.conf.

After boot I manually added wlan0 to the bridge and then set the SSID:

root@bbb:~ # ifconfig bridge0 addm wlan0
root@bbb:~ # ifconfig wlan0 ssid freebsd
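
The ordering problem above (bridge0 is configured before wlan0 exists) could in principle be handled by waiting for the interface to appear before adding it to the bridge. The following is only a minimal C sketch of that idea, assuming POSIX if_nametoindex() is an acceptable existence check. iface_exists and wait_for_iface are hypothetical helper names, not part of FreeBSD's rc system or of this bug's fix.

```c
#include <assert.h>
#include <net/if.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* True if the kernel currently has an interface with this name. */
static bool iface_exists(const char *name)
{
    assert(name != NULL);
    /* if_nametoindex() returns 0 when no such interface exists. */
    return if_nametoindex(name) != 0;
}

/*
 * Poll for the interface up to `attempts` times, sleeping `delay_ms`
 * between checks. Returns 0 once it exists, -1 if it never showed up.
 */
static int wait_for_iface(const char *name, int attempts, long delay_ms)
{
    struct timespec delay;

    delay.tv_sec = delay_ms / 1000;
    delay.tv_nsec = (delay_ms % 1000) * 1000000L;

    for (int i = 0; i < attempts; i++) {
        if (iface_exists(name))
            return 0;
        nanosleep(&delay, NULL);
    }
    return -1;
}
```

Once wait_for_iface("wlan0", ...) returns 0, the manual `ifconfig bridge0 addm wlan0` step shown above would no longer fail with "No such file or directory".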

I brought the interface down and back up again, which made the AP available to clients. I then got an iPod to associate with the AP and entered the following settings:

static IP

address: 192.168.2.102
subnet: 255.255.255.0
router: 192.168.2.1
dns : 192.168.1

After numerous failed attempts at configuring the client, I managed to get exactly ONE request through: the freebsd.org page came up. I then tried to search for the Ookla page and my BBB kernel panicked! (yay!)
https://pastebin.com/zB9AnWTv

The next time I booted, the entire board hung right after the USB Wi-Fi adapter loaded (excerpt of the hung board's output below; full output at https://pastebin.com/M09C5NEP):

cpsw0: link state changed to UP
bridge0: link state changed to UP
Starting Network: lo0 cpsw0 bridge0.
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
        inet6 ::1 prefixlen 128 
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x2 
        inet 127.0.0.1 netmask 0xff000000 
        groups: lo 
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
cpsw0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8000b<RXCSUM,TXCSUM,VLAN_MTU,LINKSTATE>
        ether a0:f6:fd:8a:c5:be
        hwaddr a0:f6:fd:8a:c5:be
        inet 192.168.2.101 netmask 0xffffff00 broadcast 192.168.2.255 
        media: Ethernet autoselect (100baseTX <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether 02:94:dd:d7:a3:00
        id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
        maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
        root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
        member: cpsw0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 1 priority 128 path cost 55
        groups: bridge 
        nd6 options=9<PERFORMNUD,IFDISABLED>
Starting devd.
run0 on uhub1
run0: <1.0> on usbus1

U-Boot SPL 2015.10-00001-g143c9ee (Nov 06 2015 - 15:27:19)
bad magic


It seems I can make the entire OS hang: sometimes it boots, sometimes it hangs. The lights on the cpsw0 interface still blink but the serial console is dead. I'm trying to *avoid* triggering that, so I don't know the sequence that causes it. However, I can make the kernel panic relatively quickly, after a handful of pages; no more than three full page requests so far. It looks like a bad memory access is happening in bridge_broadcast(), at bridge_broadcast+0x1c4?

https://www.freebsd.org/cgi/man.cgi?apropos=0&sektion=9&query=m_dup

Anyway, that's all the time I have for this weekend.
Comment 1 Russell Haley 2017-09-06 17:33:35 UTC
Original post on freebsd-arm@ that spurred my investigation. Complete conversation here: http://freebsd.1045724.x6.nabble.com/Beaglebone-Black-FreeBSD-USB-WiFi-WAP-td6201439.html

Is anyone using a BeagleBone Black and USB Wifi as a WAP?  If so, what kind of throughput do you get on your Wifi?  I’m getting very bad network performance.

Hardware:

- BeagleBone Black rev C.
- Tried with two different USB Wifi Adapters:
        - Edimax N150 (small nub type adapter)
        - LB Link from Adafruit (https://www.adafruit.com/product/1030)
        - Both show as RTL8188CUS chips (via dmesg)
- Wired connection connected for access to my network and internet

OS:

-  Mostly with the most recent Beaglebone Snapshot of 11.0
-  A few quick tests with the 11. snapshot image

Connection Speeds:

- Using nuttcp (over 100Mbps ethernet) between the BB and another machine, I can get ~90+ Mbps in each direction, so I don’t think there is an issue with that part of the connection.
- Using the BB as a WAP and connecting my iPad, I see 1 to 5 Mbps down and < 1 Mbps up, tested via the Ookla Speedtest app.
- Connecting to my ancient Netgear WAP using the same app and test, I get roughly 10 Mbps in either direction with the same test.


Configs (pretty much taken right from the handbook):

/boot/loader.conf:

if_urtwn_load="YES"
wlan_mac_load="YES"

/etc/rc.conf (relevant parts)

ifconfig_cpsw0="inet XX.XX.XX.XX netmask 0xffffff00"
defaultrouter="YY.YY.YY.YY"
hostapd_enable="YES"
wlans_urtwn0="wlan0"
create_args_wlan0="wlanmode hostap"
ifconfig_wlan0="up"
cloned_interfaces="bridge0"
ifconfig_bridge0="addm cpsw0 addm wlan0 up"

/etc/hostapd.conf

interface=wlan0
debug=1
ctrl_interface=/var/run/hostapd
ctrl_interface_group=wheel
ssid=REDACTED
wpa=2
wpa_passphrase=REDACTED
wpa_key_mgmt=WPA-PSK
wpa_pairwise=CCMP



Any ideas?  Is this just the limit of USB wifi on this board?
Comment 2 Russell Haley 2017-09-06 17:36:01 UTC
I've been digging into the code for if_bridge.c, which is found under
sys/net. bridge_broadcast only has one call to m_dup on line 2553.

            mc = m_dup(m, M_NOWAIT);
            if (mc == NULL) {
                if_inc_counter(sc->sc_ifp, IFCOUNTER_OERRORS, 1);
                continue;
            }

This is just a guess: I'm wondering if the M_NOWAIT is causing the
panic because... er... "someone has a sleep lock they shouldn't"?  I
don't really know what I'm talking about though (a little bit of
knowledge...).

I guess I'd have to try and correlate that to some sort of lock being held
in the adapter-specific code?
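
For reference, the snippet above follows a "duplicate without sleeping, skip on failure" pattern: as far as I understand mbuf(9), M_NOWAIT asks the allocator not to sleep and to return NULL if it cannot satisfy the request. A userspace sketch of that pattern, assuming nothing about the real bridge code: struct pkt, pkt_dup_nowait, alloc_budget and broadcast are all hypothetical stand-ins, not kernel APIs.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an mbuf: just a small byte buffer. */
struct pkt {
    size_t len;
    unsigned char data[64];
};

/* Hypothetical allocation budget so that failures can be simulated. */
static int alloc_budget = 0;

/* Non-sleeping duplicate: returns NULL instead of waiting for memory. */
static struct pkt *pkt_dup_nowait(const struct pkt *p)
{
    struct pkt *copy;

    if (alloc_budget <= 0)
        return NULL;
    alloc_budget--;
    copy = malloc(sizeof(*copy));
    if (copy != NULL)
        memcpy(copy, p, sizeof(*copy));
    return copy;
}

/*
 * Send one copy per member. A member whose copy could not be allocated
 * is counted in *oerrors and skipped; the loop keeps going.
 * Returns the number of copies actually handed off.
 */
static int broadcast(const struct pkt *p, int members, int *oerrors)
{
    int sent = 0;

    assert(p != NULL && oerrors != NULL);
    for (int i = 0; i < members; i++) {
        struct pkt *copy = pkt_dup_nowait(p);
        if (copy == NULL) {
            (*oerrors)++;
            continue;
        }
        free(copy); /* stand-in for handing the copy to the member */
        sent++;
    }
    return sent;
}
```

In this model an allocation failure only bumps an error counter, which is why a failed non-sleeping duplicate by itself would not obviously explain a panic.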
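Comment 3 Russell Haley 2017-09-07 07:01:16 UTC
I've been following the code through and wound up at sys/arm/ti/cpsw/if_cpsw.c. cpsw_intr_rx is defined on line 1554. The function uses a macro called CPSW_RX_LOCK, which is defined on line 349. The macro contains an assert on a transmit lock (tx.lock). I theorise the statement on line 350 is causing my exception? I also wonder if the lock being held between lines 1561 and 1570 is causing the delay in the bridge interface that is behind the original poster's slow throughput. Is it necessary to hold the lock until after the cpsw_write_4 on line 1569, or could that write be performed outside the lock?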
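
To make the kind of check being discussed concrete, here is a single-threaded C model of a receive-lock helper that refuses to proceed while the transmit lock is held. It is only a sketch under assumptions: the text above says the macro asserts on tx.lock, and this model assumes the assertion is "TX must not be held". struct model_lock, struct model_softc, model_rx_lock and model_rx_unlock are hypothetical names, and the real macro would assert rather than return false.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a lock: it only tracks whether it is held. */
struct model_lock {
    bool held;
};

/* Hypothetical driver state with separate TX and RX locks. */
struct model_softc {
    struct model_lock tx;
    struct model_lock rx;
};

/*
 * Take the RX lock. Returns false, without taking it, when the TX lock
 * is currently held (assumed rule: never take RX while holding TX).
 */
static bool model_rx_lock(struct model_softc *sc)
{
    if (sc->tx.held)
        return false;
    assert(!sc->rx.held);
    sc->rx.held = true;
    return true;
}

/* Release the RX lock; it must currently be held. */
static void model_rx_unlock(struct model_softc *sc)
{
    assert(sc->rx.held);
    sc->rx.held = false;
}
```

If the real assertion works this way, a panic at that statement would point to some path entering the receive handler with the transmit lock still held, which is a different question from whether the register write needs to stay inside the locked region.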
I've been following the code through and wound up at sys/arm/ti/cpsw/if_cpsw.c. cpsw_intr_rx is defined on line 1554. The function uses a macro called CPSW_RX_LOCK which is defined on line 349. The macro contains an assert on a transmit lock (tx.lock). I theorise the statement on line 350 is causing my exception? I also wonder if the lock being held between lines 1561 and 1570 is causing the delay in the bridge interface that is causing the original posters' slow throughput. Is it necessary to hold the lock until after the cpsw_write_4 on line 1569 or could it be performed outside the lock?