Bug 144898

Summary: [wpi] [panic] wpi panics system
Product: Base System
Reporter: kamikaze
Component: kern
Assignee: Bernhard Schmidt <bschmidt>
Status: Closed FIXED    
Severity: Affects Only Me    
Priority: Normal    
Version: Unspecified   
Hardware: Any   
OS: Any   
Attachments:
Description      Flags
patch.txt        none
wpi_rfkill.diff  none

Description kamikaze 2010-03-20 08:50:03 UTC
A couple of days ago I upgraded my notebook from 2 GB to 8 GB of RAM. Ever since, my wpi wireless has been very unreliable. An established connection often disappears suddenly after only a couple of minutes (the wireless LED turns itself off). Afterwards it's no longer possible to find wireless networks; scans always return nothing.

This is not the worst, though. It also panics the system quite often (I'm using a really slow GSM connection to avoid the panics), i.e. the system freezes without creating a dump.

Once I happened to be looking at the error console and pressed the radio button. This is what I recorded (using pen and paper):

wpi0: Hardware Switch Enabled
wpi0: could not set power mode
wpi0: device config failed
kernel trap 12 with interrupts disabled


Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address	= 0x20
fault code		= supervisor read data, page not present
instruction pointer	= 0x20:0xffffffff802501ab
stack pointer		= 0x28:0xffffff80e7c01b70
frame pointer		= 0x28:0xffffff80e7c01ba0
code segment		= base rx0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags	= resume, IOPL = 0
current process		= 0 (wpi0 taskq)
trap number		= 12
panic: page fault
cpuid = 1



As I said, the system doesn't dump; I have no idea why. This is an HP Compaq 6510b, Intel Core 2 Duo, 8 GB of memory.
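
A note on the trap above: a fault virtual address as small as 0x20 is commonly what a read through a NULL structure pointer looks like, because the faulting address is then just the offset of the member being read. The sketch below illustrates that arithmetic with a made-up structure. `struct example_node` and `member_address()` are hypothetical and are not taken from wpi(4) or net80211; the trap output itself does not say which structure was involved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, NOT a real wpi(4) or net80211 structure. */
struct example_node {
	uint64_t pad[4];	/* 4 * 8 bytes = 32 bytes = 0x20 */
	uint32_t flags;		/* therefore lives at offset 0x20 */
};

/* Compile-time check of the offset claimed in the comment above. */
static_assert(offsetof(struct example_node, flags) == 0x20,
    "flags is expected at offset 0x20");

/*
 * Address the CPU would touch when reading base->flags.
 * With base == 0 (a NULL pointer) this is simply the member offset.
 */
static uintptr_t
member_address(uintptr_t base)
{
	return (base + offsetof(struct example_node, flags));
}
```

Under this assumption, `member_address(0)` is 0x20, which has the same shape as the "fault virtual address = 0x20" line in the trap.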

pciconf -lv:
hostb0@pci0:0:0:0:	class=0x060000 card=0x30c0103c chip=0x2a008086 rev=0x0c hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Mobile PM965/GM965/GL960 Express Processor to DRAM Controller'
    class      = bridge
    subclass   = HOST-PCI
vgapci0@pci0:0:2:0:	class=0x030000 card=0x30c0103c chip=0x2a028086 rev=0x0c hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Mobile 965 Express Integrated Graphics Controller'
    class      = display
    subclass   = VGA
vgapci1@pci0:0:2:1:	class=0x038000 card=0x30c0103c chip=0x2a038086 rev=0x0c hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Mobile 965 Express Integrated Graphics Controller'
    class      = display
uhci0@pci0:0:26:0:	class=0x0c0300 card=0x30c0103c chip=0x28348086 rev=0x03 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82801H (ICH8 Family) USB UHCI *4'
    class      = serial bus
    subclass   = USB
uhci1@pci0:0:26:1:	class=0x0c0300 card=0x30c0103c chip=0x28358086 rev=0x03 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82801H (ICH8 Family) USB UHCI *5'
    class      = serial bus
    subclass   = USB
ehci0@pci0:0:26:7:	class=0x0c0320 card=0x30c0103c chip=0x283a8086 rev=0x03 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'ICH8 Enhanced USB2 Enhanced Host Controller (81EC1043 (?))'
    class      = serial bus
    subclass   = USB
hdac0@pci0:0:27:0:	class=0x040300 card=0x30c0103c chip=0x284b8086 rev=0x03 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Intel audio controller embedded with the 82801H chipset ( ICH8 chipset ) (82801H)'
    class      = multimedia
    subclass   = HDA
pcib1@pci0:0:28:0:	class=0x060400 card=0x30c0103c chip=0x283f8086 rev=0x03 hdr=0x01
    vendor     = 'Intel Corporation'
    device     = '82801H (ICH8 Family) PCIe Port 1'
    class      = bridge
    subclass   = PCI-PCI
pcib2@pci0:0:28:1:	class=0x060400 card=0x30c0103c chip=0x28418086 rev=0x03 hdr=0x01
    vendor     = 'Intel Corporation'
    device     = '82801H (ICH8 Family) PCIe Port 2'
    class      = bridge
    subclass   = PCI-PCI
pcib3@pci0:0:28:2:	class=0x060400 card=0x30c0103c chip=0x28438086 rev=0x03 hdr=0x01
    vendor     = 'Intel Corporation'
    device     = '82801H (ICH8 Family) PCIe Port 3'
    class      = bridge
    subclass   = PCI-PCI
pcib4@pci0:0:28:4:	class=0x060400 card=0x30c0103c chip=0x28478086 rev=0x03 hdr=0x01
    vendor     = 'Intel Corporation'
    device     = '82801H (ICH8 Family) PCIe Port 5'
    class      = bridge
    subclass   = PCI-PCI
uhci2@pci0:0:29:0:	class=0x0c0300 card=0x30c0103c chip=0x28308086 rev=0x03 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82801H (ICH8 Family) USB UHCI *1'
    class      = serial bus
    subclass   = USB
uhci3@pci0:0:29:1:	class=0x0c0300 card=0x30c0103c chip=0x28318086 rev=0x03 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82801H (ICH8 Family) USB UHCI *2'
    class      = serial bus
    subclass   = USB
uhci4@pci0:0:29:2:	class=0x0c0300 card=0x30c0103c chip=0x28328086 rev=0x03 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82801H (ICH8 Family) USB UHCI *3'
    class      = serial bus
    subclass   = USB
ehci1@pci0:0:29:7:	class=0x0c0320 card=0x30c0103c chip=0x28368086 rev=0x03 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82801H (ICH8 Family) USB2 EHCI *1'
    class      = serial bus
    subclass   = USB
pcib5@pci0:0:30:0:	class=0x060401 card=0x30c0103c chip=0x24488086 rev=0xf3 hdr=0x01
    vendor     = 'Intel Corporation'
    device     = '82801 Family (ICH2/3/4/5/6/7/8/9-M) Hub Interface to PCI Bridge'
    class      = bridge
    subclass   = PCI-PCI
isab0@pci0:0:31:0:	class=0x060100 card=0x30c0103c chip=0x28158086 rev=0x03 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82801HEM (ICH8M-E) LPC Interface Controller'
    class      = bridge
    subclass   = PCI-ISA
atapci0@pci0:0:31:1:	class=0x01018a card=0x30c0103c chip=0x28508086 rev=0x03 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82801H (ICH8 Family) Ultra ATA Storage Controllers'
    class      = mass storage
    subclass   = ATA
ahci0@pci0:0:31:2:	class=0x010601 card=0x30c0103c chip=0x28298086 rev=0x03 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Mobile SATA AHCI Controller'
    class      = mass storage
    subclass   = SATA
wpi0@pci0:16:0:0:	class=0x028000 card=0x135c103c chip=0x42228086 rev=0x02 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Intel 3945ABG Wireless LAN controller (10208086)'
    class      = network
bge0@pci0:24:0:0:	class=0x020000 card=0x30c0103c chip=0x169314e4 rev=0x02 hdr=0x00
    vendor     = 'Broadcom Corporation'
    device     = 'Ethernet Controller Broadcom Netlink Gigabit (BCM5787A)'
    class      = network
    subclass   = ethernet
none0@pci0:2:4:0:	class=0x060700 card=0x30c0103c chip=0x04761180 rev=0xb6 hdr=0x02
    vendor     = 'Ricoh Company, Ltd.'
    device     = 'Ricoh R/RL/5C476(II) (unknown)'
    class      = bridge
    subclass   = PCI-CardBus



dmesg |grep wpi:
wpi0: <Intel(R) PRO/Wireless 3945ABG> mem 0xe4100000-0xe4100fff irq 17 at device 0.0 on pci16
wpi0: Driver Revision 20071127
wpi0: Hardware Revision (0x1)
wpi0: Regulatory Domain: MoW2
wpi0: Hardware Type: B
wpi0: Hardware Revision: ?
wpi0: SKU does support 802.11a
wpi0: [ITHREAD]

How-To-Repeat: Use wpi wireless on an amd64 system with 8 GB of RAM. If you use wpa_supplicant it panics more frequently.
Comment 1 Mark Linimon freebsd_committer freebsd_triage 2010-03-20 13:53:51 UTC
Responsible Changed
From-To: freebsd-bugs->freebsd-net

Over to maintainer(s).
Comment 2 kamikaze 2010-03-30 15:43:15 UTC
Because there were so many net related commits in RELENG_8
recently I rebuilt my system today.

It appears that the system no longer panics. The connection,
if it works at all, now lasts for a maximum of 10 seconds,
though.

I have to reload the if_wpi and wpifw modules to get it
running again (for another 10 seconds).

This is what ttyv0 looks like, the beacon misses start
after a couple of seconds:


wpi0: Regulatory Domain: MoW2
wpi0: Hardware Type: B
wpi0: Hardware Revision: ?
wpi0: SKU does support 802.11a
wpi0: [ITHREAD]
wlan0: Ethernet address: 00:1c:bf:58:3a:87
wpi0: timeout resetting Tx ring 1
wpi0: timeout resetting Tx ring 3
wpi0: timeout resetting Tx ring 4
microcode alive notification version 10e02 alive 1
microcode alive notification version 10e02 alive 1
wpi_newstate: INIT -> SCAN flags 0x0
wpi0: scan timeout
wpi_newstate: SCAN -> SCAN flags 0x0
microcode alive notification version 10e02 alive 1
microcode alive notification version 10e02 alive 1
wpi_newstate: SCAN -> AUTH flags 0x0
config chan 1 flags 8005 cck f ofdm 15
wpi_newstate: AUTH -> ASSOC flags 0x0
wpi_newstate: ASSOC -> RUN flags 0x0
config chan 1 flags 8005
wpi0: need multicast update callback
wpi0: need multicast update callback
wpi0: need multicast update callback
Beacon miss: 2728567458 >= 7
Beacon miss: 2728567458 >= 7
Beacon miss: 2728567458 >= 7
Beacon miss: 2728567458 >= 7
Beacon miss: 2728567458 >= 7
Beacon miss: 2728567458 >= 7
Beacon miss: 2728567458 >= 7
Beacon miss: 2728567458 >= 7
Beacon miss: 2728567458 >= 7
Beacon miss: 2728567458 >= 7
Beacon miss: 2728567458 >= 7
Beacon miss: 2728567458 >= 7
Beacon miss: 2728567458 >= 7
Beacon miss: 2728567458 >= 7
Beacon miss: 2728567458 >= 7
Beacon miss: 2728567458 >= 7
...

I also noticed that wlan0 always lists "txpower 0",
even if I try to set it manually (txpowermax is 50.0).
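
A note on the "Beacon miss" lines above: the counter value 2728567458 is 0xA2A2A2A2 in hexadecimal, i.e. the same byte repeated four times. That pattern hints that the value was not a real miss count, though this is only a guess from the bit pattern and nothing in the thread confirms where it came from. The helper `repeated_byte()` below is a hypothetical illustration, not driver code.

```c
#include <assert.h>
#include <stdint.h>

/* The decimal value printed in the "Beacon miss" lines is 0xA2A2A2A2. */
static_assert(2728567458u == 0xA2A2A2A2u, "value from the log");

/*
 * Return the repeated byte if all four bytes of v are identical,
 * or -1 otherwise.
 */
static int
repeated_byte(uint32_t v)
{
	uint32_t b = v & 0xffu;

	/* b * 0x01010101 copies the low byte into all four byte lanes. */
	if (v == b * 0x01010101u)
		return ((int)b);
	return (-1);
}
```

A plausible beacon-miss count such as 7 fails this check, while the logged value passes it with byte 0xA2.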
Comment 3 jeff 2010-05-02 20:45:41 UTC
This is happening to me as well, 8.0-release p2, amd64, 4 GB RAM, same
motherboard chipset and graphics card ( Dell D630 ).

wpi0: device timeout
wpi0: could not set power mode
wpi0: device config failed

kldunload -> kldload panics the system often but not always. wpi0 device
timeouts happen more often with X/gnome running. wireless will not work
after this until a reboot ( when kldload doesn't cause kernel panics )
Comment 4 kamikaze 2010-05-02 20:55:02 UTC
On 02/05/2010 21:45, jeff curry wrote:
> This is happening to me as well, 8.0-release p2, amd64, 4 GB RAM, same
> motherboard chipset and graphics card ( Dell D630 ).
> 
> wpi0: device timeout
> wpi0: could not set power mode
> wpi0: device config failed
> 
> kldunload -> kldload panics the system often but not always. wpi0 device
> timeouts happen more often with X/gnome running. wireless will not work
> after this until a reboot ( when kldload doesn't cause kernel panics )

I have found a very reliable (as in 100%) way to panic the system.

- Turn radio off (via hardware switch)
- Load the driver and firmware
- Turn radio on (via hardware switch)

Voilà, instant panic.

-- 
A: Because it fouls the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
Comment 5 kamikaze 2010-06-30 08:59:36 UTC
Just a followup. The device works when I reduce kern.hz to 200Hz.

There's a 50% packet loss, but TCP works fine as long as you can
live with the increased latency. Of course you can forget about
UDP.

I can still reliably produce a panic by turning the hardware radio
switch off while the device is up.

Also the range of the device is really bad. I cannot establish a
connection in the exact same spot where an atheros device works
flawlessly and reports great reception.

Regards

Comment 6 Alex Kozlov 2010-08-05 22:52:26 UTC
Hi, Dominic

This seems to be a common issue for many wireless interface drivers.
Can you please try this patch? Thanks.


--
Adios
Comment 7 kamikaze 2010-08-22 09:25:09 UTC
Hello,

On 05/08/2010 23:52, Alex Kozlov wrote:
> This seems to be a common issue for many wireless interface drivers.
> Can you please try this patch? Thanks.

I finally got around to testing it. I'm mostly using an Atheros
card, nowadays. Though I have to hotswap it to get past the BIOS.

Your patch fixes all the reliably reproducible panics.

I can scan for wireless networks and it yields the available networks.
I can connect to a network and use it. There is no packet loss,
I'm currently stress-testing the connection, and it appears to be
reliable.

Now the downside. If I turn the network down and bring it up again
it turns blind. An "ifconfig wlan0 list scan" will yield nothing
and I cannot connect to a network. So I can use it only once.

Unloading the module in between does not help either, so I have
to reboot if I want to use it again.

What kind of data are you interested in? Dmesg? Anything else?

Regards

Comment 8 bschmidt 2010-08-22 14:38:00 UTC
Hi,

please give attached patch a shot, it should fix the issues with the
RFKill button.

Currently I'm not able to reproduce the other issues, I don't have 8
GB RAM though.

--
Bernhard
Comment 9 kamikaze 2010-08-22 18:48:46 UTC
On 22/08/2010 15:38, Bernhard Schmidt wrote:
> Hi,
> 
> please give attached patch a shot, it should fix the issues with the
> RFKill button.
> 
> Currently I'm not able to reproduce the other issues, I don't have 8
> GB RAM though.

I now have both your patch and Alex's patch applied and everything
seems to be working. I hope both patches make it into RELENG_8.

I'll get back to you once I've used it for a time. Alex's patch
already fixed the panic when turning the radio switch off, but
with your patch I can also connect to a network more than once.

Regards

Comment 10 bschmidt 2010-08-23 08:06:46 UTC
On Sun, Aug 22, 2010 at 19:48, Dominic Fandrey <kamikaze@bsdforen.de> wrote:
> On 22/08/2010 15:38, Bernhard Schmidt wrote:
>> Hi,
>>
>> please give attached patch a shot, it should fix the issues with the
>> RFKill button.
>>
>> Currently I'm not able to reproduce the other issues, I don't have 8
>> GB RAM though.
>
> I now have both your patch and Alex's patch applied and everything
> seems to be working. I hope both patches make it into RELENG_8.
>
> I'll get back to you once I've used it for a time. Alex's patch
> already fixed the panic when turning the radio switch off, but
> with your patch I can also connect to a network more than once.

Alex's patch addresses a different issue: there is a race with
wpa_supplicant (/etc/rc.d/netif restart) which might free a node while
one of the wpi functions is using it. It shouldn't make any difference
in your case, though. It's nice to fix this now and I'm probably going
to commit it anyway.

Please let me know if there are any stability issues left; it made a
pretty good impression over this weekend while I ran some tests.

Thanks.

-- 
Bernhard
Comment 11 Bernhard Schmidt freebsd_committer freebsd_triage 2010-08-23 08:11:33 UTC
Responsible Changed
From-To: freebsd-net->bschmidt

Over to me.
Comment 12 kamikaze 2010-08-23 08:20:12 UTC
On 23/08/2010 09:06, Bernhard Schmidt wrote:
> On Sun, Aug 22, 2010 at 19:48, Dominic Fandrey <kamikaze@bsdforen.de> wrote:
>> On 22/08/2010 15:38, Bernhard Schmidt wrote:
>>> Hi,
>>>
>>> please give attached patch a shot, it should fix the issues with the
>>> RFKill button.
>>>
>>> Currently I'm not able to reproduce the other issues, I don't have 8
>>> GB RAM though.
>>
>> I now have both your patch and Alex's patch applied and everything
>> seems to be working. I hope both patches make it into RELENG_8.
>>
>> I'll get back to you once I've used it for a time. Alex's patch
>> already fixed the panic when turning the radio switch off, but
>> with your patch I can also connect to a network more than once.
> 
> Alex's patch addresses a different issue: there is a race with
> wpa_supplicant (/etc/rc.d/netif restart) which might free a node while
> one of the wpi functions is using it. It shouldn't make any difference
> in your case, though. It's nice to fix this now and I'm probably going
> to commit it anyway.
> 
> Please let me know if there are any stability issues left; it made a
> pretty good impression over this weekend while I ran some tests.

Yesterday, I lost the connection to a WPA2 net. Afterwards I couldn't
reconnect any more. A "list scan" still yielded the current list of
available APs, though. Even after a down and up.

This morning I had a freeze when connecting to my university network.
The script looks like this:

ifconfig wlan0 ssid up
sleep 2
aps=$(ifconfig wlan0 list scan)
ifconfig wlan0 ssid <AP with the best connection>
dhclient wlan0
vpnc

I have no idea at which point the freeze happened. I just repeated
the procedure manually and the system didn't freeze.

Regards

Comment 13 bschmidt 2010-08-23 08:26:06 UTC
On Mon, Aug 23, 2010 at 09:20, Dominic Fandrey <kamikaze@bsdforen.de> wrote:
> On 23/08/2010 09:06, Bernhard Schmidt wrote:
>> Please let me know if there are any stability issues left; it made a
>> pretty good impression over this weekend while I ran some tests.
>
> Yesterday, I lost the connection to a WPA2 net. Afterwards I couldn't
> reconnect any more. A "list scan" still yielded the current list of
> available APs, though. Even after a down and up.
>
> This morning I had a freeze when connecting to my university network.
> The script looks like this:
>
> ifconfig wlan0 ssid up
> sleep 2
> aps=$(ifconfig wlan0 list scan)
> ifconfig wlan0 ssid <AP with the best connection>
> dhclient wlan0
> vpnc
>
> I have no idea at which point the freeze happened. I just repeated
> the procedure manually and the system didn't freeze.

Ok thanks, I'll look into that.

Can you enable debug output for wpi? (sysctl debug.wpi=1) and show me
the output? I expect there to be some messages related to beacon
misses.


--
Bernhard
Comment 14 Sulev-Madis Silber 2010-12-13 07:02:25 UTC
Hello.

I recently got new Lenovo T60 with that "infamous" card installed...


I'm running 8.1-RELEASE-p2, and I've also applied these two patches.
It's better now, indeed, but it's still, sorry, pretty f* unstable (and
unusable).

It crashed when I played with the RF kill switch (I got a weird
"RRAAMM ppaarriittyy eerroorr"), then crashed when I issued a scan while
the card was associated to an open AP. I remember doing this many times
until it crashed (or froze, to be correct).
I currently have no error message for you; the active console wasn't ttyv0.


Now I'm searching for either a USB adapter or a PCMCIA card and seriously
want to replace the internal card.
But this used (actually pretty unused :)) laptop is under a 3 month
warranty, and then I can't help you anymore with these issues (which is
important to me).

Plus I want to have two cards anyway, with at least one HostAP-capable.
Just in case some hotel decides to take such stupid amount of money for
just a one connection again...
Any suggestions on that?


And maybe we get that Intel card working too.


Thanks.
Comment 15 dfilter service freebsd_committer freebsd_triage 2010-12-18 15:25:27 UTC
Author: bschmidt
Date: Sat Dec 18 15:25:21 2010
New Revision: 216521
URL: http://svn.freebsd.org/changeset/base/216521

Log:
  Fix a panic while disabling the RF kill button, caller of the
  wpi_rfkill_resume() function will take care of the lock.
  
  PR:		kern/144898
  MFC after:	3 days

Modified:
  head/sys/dev/wpi/if_wpi.c

Modified: head/sys/dev/wpi/if_wpi.c
==============================================================================
--- head/sys/dev/wpi/if_wpi.c	Sat Dec 18 14:34:05 2010	(r216520)
+++ head/sys/dev/wpi/if_wpi.c	Sat Dec 18 15:25:21 2010	(r216521)
@@ -3004,14 +3004,12 @@ wpi_rfkill_resume(struct wpi_softc *sc)
 	if (ntries == 1000) {
 		device_printf(sc->sc_dev,
 		    "timeout waiting for thermal calibration\n");
-		WPI_UNLOCK(sc);
 		return;
 	}
 	DPRINTFN(WPI_DEBUG_TEMP,("temperature %d\n", sc->temp));
 
 	if (wpi_config(sc) != 0) {
 		device_printf(sc->sc_dev, "device config failed\n");
-		WPI_UNLOCK(sc);
 		return;
 	}
 
_______________________________________________
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
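
A note on the commit above: its log says the caller of wpi_rfkill_resume() takes care of the lock, so the two WPI_UNLOCK() calls on the error paths were releasing a lock that the caller releases again afterwards. The sketch below models that ownership rule in user space with a toy lock that merely counts unlock attempts made while it is not held. All `toy_*`, `resume_*` and `caller` names are hypothetical; this is not the driver's locking code, only an illustration of why the callee must leave the caller's lock alone.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock that only records whether it is held. */
struct toy_lock {
	bool held;
	int  bad_unlocks;	/* unlock attempts while not held */
};

static void
toy_lock_acquire(struct toy_lock *l)
{
	assert(!l->held);	/* the sketch never takes the lock twice */
	l->held = true;
}

static void
toy_lock_release(struct toy_lock *l)
{
	if (!l->held) {
		l->bad_unlocks++;	/* a real kernel mutex may panic here */
		return;
	}
	l->held = false;
}

/* Pre-fix style: the callee drops the caller's lock on the error path. */
static void
resume_buggy(struct toy_lock *l, bool config_ok)
{
	if (!config_ok) {
		toy_lock_release(l);
		return;
	}
}

/* Post-fix style: the callee never touches the caller's lock. */
static void
resume_fixed(struct toy_lock *l, bool config_ok)
{
	(void)l;
	(void)config_ok;
}

/* The caller owns the lock: it acquires before and releases after. */
static int
caller(void (*resume)(struct toy_lock *, bool), bool config_ok)
{
	struct toy_lock l = { false, 0 };

	toy_lock_acquire(&l);
	resume(&l, config_ok);
	toy_lock_release(&l);
	return (l.bad_unlocks);
}
```

With the buggy callee and a failing config step, `caller()` reports one unlock of a lock that is no longer held. With the fixed callee it reports none.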
Comment 16 Bernhard Schmidt freebsd_committer freebsd_triage 2010-12-27 10:34:24 UTC
Hi,

As you might have noticed, I committed a bunch of fixes to wpi(4) which are
now included in 8.2-RC1. Could you please test that in regard to this PR and 
let me know about the outcome?

Thanks

-- 
Bernhard
Comment 17 Bernhard Schmidt freebsd_committer freebsd_triage 2010-12-27 10:39:31 UTC
State Changed
From-To: open->feedback

feedback has been requested
Comment 18 Sulev-Madis Silber 2010-12-28 03:05:16 UTC
Monitor mode indeed works now! Good work!

Will test station connectivity later. I have no active AP here I could use.
...
Wait, I just now managed to crash it again. Like before, I issued
"ifconfig wlan0 scan" and there it went.
And again, station was associated to local public open AP, in 11g mode.

Is this something you can't replicate on your hardware?


And lately I bought new USB adapter advertised having Atheros chip. It
indeed seems to be Atheros, but it's new AR9271 chip. Sale item is
called TP-Link WN322G v3.


On 2010-12-27 12:34, Bernhard Schmidt wrote:
> Hi,
> 
> As you might have noticed, I committed a bunch of fixes to wpi(4) which are
> now included in 8.2-RC1. Could you please test that in regard to this PR and 
> let me know about the outcome?
> 
> Thanks
>
Comment 19 Bernhard Schmidt freebsd_committer freebsd_triage 2010-12-28 08:39:10 UTC
On Tuesday 28 December 2010 04:05:16 Sulev-Madis Silber wrote:
> Monitor mode indeed works now! Good work!

fine, thanks
 
> Will test station connectivity later. I have no active AP here I could use.
> ...
> Wait, I just now managed to crash it again. Like before, I issued
> "ifconfig wlan0 scan" and there it went.
> And again, station was associated to local public open AP, in 11g mode.
> 
> Is this something you can't replicate on your hardware?

You've issued 'ifconfig scan' while you were associated to an AP, right? This 
I guess is a known issue, wpi(4) does not support background scans, which is 
what you want to do at that point. This though, should just result in a 
firmware error, did you get one? I might look into adding a check to prevent a 
scan being done while there is no actual support for it..
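
The kind of check mentioned here could look roughly like the following user-space sketch: reject a scan request when the interface is already associated and the hardware has no background-scan support. The types and names (`toy_state`, `toy_vap`, `toy_scan_request`) and the choice of EOPNOTSUPP as the return value are assumptions made for illustration; they are not the net80211 or wpi(4) API, and this is not the check that was eventually committed, if any.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the 802.11 state machine states. */
enum toy_state { TOY_INIT, TOY_SCAN, TOY_AUTH, TOY_ASSOC, TOY_RUN };

struct toy_vap {
	enum toy_state state;		/* current state of the interface */
	bool bgscan_supported;		/* can the device scan while in RUN? */
};

/*
 * Decide whether a scan may start.  A scan issued in RUN state is a
 * background scan, so refuse it when the device cannot do one.
 */
static int
toy_scan_request(const struct toy_vap *vap)
{
	assert(vap != NULL);

	if (vap->state == TOY_RUN && !vap->bgscan_supported)
		return (EOPNOTSUPP);
	return (0);
}
```

In this model a scan from INIT state is still allowed, while the "scan while associated" case from this thread is turned into an error return instead of reaching the firmware.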

> And lately I bought new USB adapter advertised having Atheros chip. It
> indeed seems to be Atheros, but it's new AR9271 chip. Sale item is
> called TP-Link WN322G v3.

No support for that chip series afaik, I'm not aware of anyone working on 
porting a driver currently.

-- 
Bernhard
Comment 20 Sulev-Madis Silber 2010-12-28 09:52:13 UTC
On 2010-12-28 10:39, Bernhard Schmidt wrote:
> You've issued 'ifconfig scan' while you were associated to an AP, right? This 
> I guess is a known issue, wpi(4) does not support background scans, which is 
> what you want to do at that point. This though, should just result in a 
> firmware error, did you get one? I might look into adding a check to prevent a 
> scan being done while there is no actual support for it..

Not sure what I got ("rraamm ppaarriittyy eerrrroorr" again).
I made "screen shot" of it, because I have no serial console to get
textual information out.

But what I know is when I issue background scan, it always panics, tried
two more times.
It would be nice to have check there to disallow that.


>> And lately I bought new USB adapter advertised having Atheros chip. It
>> indeed seems to be Atheros, but it's new AR9271 chip. Sale item is
>> called TP-Link WN322G v3.
> 
> No support for that chip series afaik, I'm not aware of anyone working on 
> porting a driver currently.

I forgot to add that indeed, I tried and it's not supported. This USB ID
(0x1006) is supported in OpenBSD, however...

Sadly I currently don't know how to hack such things myself :(
It can change, of course :)
Comment 21 kamikaze 2010-12-28 13:36:05 UTC
On 27/12/2010 11:34, Bernhard Schmidt wrote:
> As you might have noticed, I committed a bunch of fixes to wpi(4) which are
> now included in 8.2-RC1. Could you please test that in regard to this PR and 
> let me know about the outcome?


I have done some tests in the unencrypted 27c3 conference network.

This looks really good to me.

Dangerous things I tried:

* Multiple network scans in sequence WORK and don't appear
  to panic the system
* Connecting to an unencrypted network works
* Turning the WLAN hardware switch off, while connected to a
  network does NOT panic the system

I'll do WPA testing when I'm back home (~4 days from now).

Something of interest, the driver prints a WLAN off message, but
the wlan0 device still claims to be associated, when I trigger
the hardware switch.

The driver also prints a WLAN on message when I turn it back on.
But I have to manually bring the wlan0 device down and up to
reconnect.

Now the bad news. If the wireless connection is bad, ALL TCP
communication on the system breaks down. Including the communication
between X clients and X server! I.e. every X program sooner or later
loses its X connection and crashes.

Regards
Comment 22 kamikaze 2010-12-28 16:15:28 UTC
To clarify some things. That wpi somehow drags down the TCP stack
is an assumption, based on what happens when a TCP connection is
on.

What happens is that X clients sporadically die, sometimes the WM is
the client, killing all currently running X clients with it.

Thunderbird (while still running) issues complaints about SSL and
links about zlib errors. All this leads me to conclude that some
data corruption is occurring.


Also I found a new problem, this is from my dmesg:
...
wpi0: wpi_rx_intr: bus_dmamap_load failed, error 12
wpi0: wpi_rx_intr: bus_dmamap_load failed, error 12
wpi0: wpi_rx_intr: bus_dmamap_load failed, error 12
wpi0: wpi_rx_intr: bus_dmamap_load failed, error 12
wpi0: wpi_rx_intr: bus_dmamap_load failed, error 12
wpi0: wpi_rx_intr: bus_dmamap_load failed, error 12
wpi0: wpi_rx_intr: bus_dmamap_load failed, error 12
wpi0: wpi_rx_intr: bus_dmamap_load failed, error 12
wpi0: wpi_rx_intr: bus_dmamap_load failed, error 12
wpi0: wpi_rx_intr: bus_dmamap_load failed, error 12
wpi0: wpi_rx_intr: bus_dmamap_load failed, error 12
wpi0: could not map mbuf (error 12)
wpi0: wpi_rx_intr: bus_dmamap_load failed, error 12
wpi0: wpi_rx_intr: bus_dmamap_load failed, error 12
wpi0: wpi_rx_intr: bus_dmamap_load failed, error 12
wpi0: wpi_rx_intr: bus_dmamap_load failed, error 12
...

The wpi_rx_intr messages appeared at breakneck speed until I turned
the interface down. The mbuf error was thrown in occasionally. It
did not appear to occur at a fixed frequency.

Regards
Comment 23 Bernhard Schmidt freebsd_committer freebsd_triage 2010-12-30 10:57:18 UTC
On Tuesday 28 December 2010 17:30:15 Dominic Fandrey wrote:
>  Also I found a new problem, this is from my dmesg:
>  wpi0: wpi_rx_intr: bus_dmamap_load failed, error 12
>  ..
>  wpi0: wpi_rx_intr: bus_dmamap_load failed, error 12
> 
>  The wpi_rx_intr messages appeared at breakneck speed until I turned
>  the interface down. The mbuf error was thrown in occasionally. It
>  did not appear to occur at a fixed frequency.

I've seen this one too.. though, no clue about the cause nor a way to reliably 
reproduce it. Do you have a scenario where this always happens?

--
Bernhard
Comment 24 kamikaze 2010-12-30 15:28:15 UTC
On 30/12/2010 11:57, Bernhard Schmidt wrote:
> On Tuesday 28 December 2010 17:30:15 Dominic Fandrey wrote:
>>  Also I found a new problem, this is from my dmesg:
>>  wpi0: wpi_rx_intr: bus_dmamap_load failed, error 12
>>  ..
>>  wpi0: wpi_rx_intr: bus_dmamap_load failed, error 12
>>
>>  The wpi_rx_intr messages appeared at breakneck speed until I turned
>>  the interface down. The mbuf error was thrown in occasionally. It
>>  did not appear to occur at a fixed frequency.
> 
> I've seen this one too.. though, no clue about the cause nor a way to reliably 
> reproduce it. Do you have a scenario where this always happens?

No, sorry to say so, I don't.

Regards
Comment 25 Bernhard Schmidt freebsd_committer freebsd_triage 2010-12-30 18:18:27 UTC
On Thursday 30 December 2010 16:30:12 Dominic Fandrey wrote:
> The following reply was made to PR kern/144898; it has been noted by GNATS.
> 
> From: Dominic Fandrey <kamikaze@bsdforen.de>
> To: Bernhard Schmidt <bschmidt@freebsd.org>
> Cc: bug-followup@freebsd.org
> Subject: Re: kern/144898: [wpi] [panic] wpi panics system
> Date: Thu, 30 Dec 2010 16:28:15 +0100
> 
>  On 30/12/2010 11:57, Bernhard Schmidt wrote:
>  > On Tuesday 28 December 2010 17:30:15 Dominic Fandrey wrote:
>  >>  Also I found a new problem, this is from my dmesg:
>  >>  wpi0: wpi_rx_intr: bus_dmamap_load failed, error 12
>  >>  ..
>  >>  wpi0: wpi_rx_intr: bus_dmamap_load failed, error 12
>  >>  
>  >>  The wpi_rx_intr messages appeared at breakneck speed until I turned
>  >>  the interface down. The mbuf error was thrown in occasionally. It
>  >>  did not appear to occur at a fixed frequency.
>  > 
>  > I've seen this one too.. though, no clue about the cause nor a way to
>  > reliably reproduce it. Do you have a scenario where this always
>  > happens?
> 
>  No, sorry to say so, I don't.

I found a way to trigger this, after a clean boot, executing
i=0
while [ $i -lt 100 ]; do
	kldload if_wpi
	ifconfig wlan0 create wlandev wpi0
	sleep 0.5
	kldunload if_wpi
	i=$(expr $i + 1)
done

kldload if_wpi
ifconfig wlan0 create wlandev wpi0
ifconfig wlan0 ssid iwn2 channel 7 10.1.1.157/16 up

and then doing lots of RX traffic triggered it 100% reliably.
Patch coming ASAP.

-- 
Bernhard
Comment 26 dfilter service freebsd_committer freebsd_triage 2010-12-30 18:29:31 UTC
Author: bschmidt
Date: Thu Dec 30 18:29:22 2010
New Revision: 216824
URL: http://svn.freebsd.org/changeset/base/216824

Log:
  The RX path is missing a few bus_dmamap_*() calls, this results in
  modification of memory which was already free'd and eventually in:
  wpi0: could not map mbuf (error 12)
  wpi0: wpi_rx_intr: bus_dmamap_load failed, error 12
  and an unusable device.
  
  PR:		kern/144898
  MFC after:	3 days

Modified:
  head/sys/dev/wpi/if_wpi.c

Modified: head/sys/dev/wpi/if_wpi.c
==============================================================================
--- head/sys/dev/wpi/if_wpi.c	Thu Dec 30 18:06:31 2010	(r216823)
+++ head/sys/dev/wpi/if_wpi.c	Thu Dec 30 18:29:22 2010	(r216824)
@@ -1052,9 +1052,18 @@ wpi_free_rx_ring(struct wpi_softc *sc, s
 
 	wpi_dma_contig_free(&ring->desc_dma);
 
-	for (i = 0; i < WPI_RX_RING_COUNT; i++)
-		if (ring->data[i].m != NULL)
-			m_freem(ring->data[i].m);
+	for (i = 0; i < WPI_RX_RING_COUNT; i++) {
+		struct wpi_rx_data *data = &ring->data[i];
+
+		if (data->m != NULL) {
+			bus_dmamap_sync(ring->data_dmat, data->map,
+			    BUS_DMASYNC_POSTREAD);
+			bus_dmamap_unload(ring->data_dmat, data->map);
+			m_freem(data->m);
+		}
+		if (data->map != NULL)
+			bus_dmamap_destroy(ring->data_dmat, data->map);
+	}
 }
 
 static int
@@ -1461,6 +1470,7 @@ wpi_rx_intr(struct wpi_softc *sc, struct
 		return;
 	}
 
+	bus_dmamap_sync(ring->data_dmat, data->map, BUS_DMASYNC_POSTREAD);
 	head = (struct wpi_rx_head *)((caddr_t)(stat + 1) + stat->len);
 	tail = (struct wpi_rx_tail *)((caddr_t)(head + 1) + le16toh(head->len));
 
@@ -1491,6 +1501,8 @@ wpi_rx_intr(struct wpi_softc *sc, struct
 		ifp->if_ierrors++;
 		return;
 	}
+	bus_dmamap_unload(ring->data_dmat, data->map);
+
 	error = bus_dmamap_load(ring->data_dmat, data->map,
 	    mtod(mnew, caddr_t), MJUMPAGESIZE,
 	    wpi_dma_map_addr, &paddr, BUS_DMA_NOWAIT);
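
A note on the commit above: it adds the bus_dmamap_unload() that was missing before the RX buffer is loaded again, plus sync, unload and destroy calls on ring teardown. As a rough illustration of why load and unload have to be paired, here is a toy user-space model in which mapping resources come from a small fixed pool. Everything named `toy_*` and `TOY_SLOTS` is hypothetical, and the pool itself is an assumption made for the sketch. It does not reproduce the bus_dma(9) internals or the freed-memory corruption the commit log describes. It only shows how repeated loads without an unload can end in ENOMEM, which is errno 12 on FreeBSD.

```c
#include <assert.h>
#include <errno.h>

#define TOY_SLOTS 4	/* arbitrary pool size chosen for the sketch */

struct toy_dma {
	int free_slots;		/* mapping resources still available */
};

struct toy_map {
	int loaded;		/* nonzero while the map holds a slot */
};

/* Take one slot from the pool, or fail like a NOWAIT load would. */
static int
toy_load(struct toy_dma *d, struct toy_map *m)
{
	assert(d->free_slots >= 0);

	if (d->free_slots == 0)
		return (ENOMEM);
	d->free_slots--;
	m->loaded = 1;
	return (0);
}

/* Give the slot back; harmless if the map is not loaded. */
static void
toy_unload(struct toy_dma *d, struct toy_map *m)
{
	if (m->loaded) {
		d->free_slots++;
		m->loaded = 0;
	}
}

/*
 * Swap the buffer behind one ring entry `rounds` times.
 * Returns 0 on success or the first load error.
 */
static int
toy_refill(int rounds, int unload_first)
{
	struct toy_dma d = { TOY_SLOTS };
	struct toy_map m = { 0 };
	int error, i;

	for (i = 0; i < rounds; i++) {
		if (unload_first)
			toy_unload(&d, &m);
		error = toy_load(&d, &m);
		if (error != 0)
			return (error);
	}
	return (0);
}
```

With the unload in place the loop can run indefinitely. Without it, the model runs out of slots on the fifth load and returns ENOMEM.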
Comment 27 kamikaze 2010-12-31 10:20:10 UTC
On 30/12/2010 19:18, Bernhard Schmidt wrote:
> On Thursday 30 December 2010 16:30:12 Dominic Fandrey wrote:
>> The following reply was made to PR kern/144898; it has been noted by GNATS.
>>
>> From: Dominic Fandrey <kamikaze@bsdforen.de>
>> To: Bernhard Schmidt <bschmidt@freebsd.org>
>> Cc: bug-followup@freebsd.org
>> Subject: Re: kern/144898: [wpi] [panic] wpi panics system
>> Date: Thu, 30 Dec 2010 16:28:15 +0100
>>
>>  On 30/12/2010 11:57, Bernhard Schmidt wrote:
>>  > On Tuesday 28 December 2010 17:30:15 Dominic Fandrey wrote:
>>  >>  Also I found a new problem, this is from my dmesg:
>>  >>  wpi0: wpi_rx_intr: bus_dmamap_load failed, error 12
>>  >>  ..
>>  >>  wpi0: wpi_rx_intr: bus_dmamap_load failed, error 12
>>  >>  
>>  >>  The wpi_rx_intr messages appeared at breakneck speed until I turned
>>  >>  the interface down. The mbuf error was thrown in occasionally. It
>>  >>  did not appear to occur at a fixed frequency.
>>  > 
>>  > I've seen this one too.. though, no clue about the cause nor a way to
>>  > reliably reproduce it. Do you have a scenario where this always
>>  > happens?
>>
>>  No, sorry to say so, I don't.
> 
> I found a way to trigger this, after a clean boot, executing
> i=0
> while [ $i -lt 100 ]; do
> 	kldload if_wpi
> 	ifconfig wlan0 create wlandev wpi0
> 	sleep 0.5
> 	kldunload if_wpi
> 	i=$(expr $i + 1)
> done
> 
> kldload if_wpi
> ifconfig wlan0 create wlandev wpi0
> ifconfig wlan0 ssid iwn2 channel 7 10.1.1.157/16 up
> 
> and then doing lots of RX traffic triggered it 100% reliably.
> Patch coming ASAP.

The fact that it makes my X applications die worries me more than this
problem, though.

Anyway, I will test your patch ASAP.

Regards
Comment 28 kamikaze 2011-01-01 23:18:04 UTC
On 27/12/2010 11:34, Bernhard Schmidt wrote:

> As you might have noticed, I committed a bunch of fixes to wpi(4) which are
> now included in 8.2-RC1. Could you please test that in regard to this PR and 
> let me know about the outcome?

I have now tested WPA2 and it works, too. The system is stable in the
sense that it does not panic and there is no packet loss if I ping
something.

However X applications still freeze and die, as soon as the wlan0
interface is up. Even if the interface is not associated and no
routes go through it.

Regards

-- 
A: Because it fouls the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
Comment 29 dfilter service freebsd_committer freebsd_triage 2011-01-02 09:04:01 UTC
Author: bschmidt
Date: Sun Jan  2 09:03:53 2011
New Revision: 216885
URL: http://svn.freebsd.org/changeset/base/216885

Log:
  MFC r216824:
  The RX path is missing a few bus_dmamap_*() calls. This results in
  modification of memory which was already freed and eventually in:
  wpi0: could not map mbuf (error 12)
  wpi0: wpi_rx_intr: bus_dmamap_load failed, error 12
  and an unusable device.
  
  PR:		kern/144898

Modified:
  stable/8/sys/dev/wpi/if_wpi.c
Directory Properties:
  stable/8/sys/   (props changed)
  stable/8/sys/amd64/include/xen/   (props changed)
  stable/8/sys/cddl/contrib/opensolaris/   (props changed)
  stable/8/sys/contrib/dev/acpica/   (props changed)
  stable/8/sys/contrib/pf/   (props changed)

Modified: stable/8/sys/dev/wpi/if_wpi.c
==============================================================================
--- stable/8/sys/dev/wpi/if_wpi.c	Sun Jan  2 03:16:47 2011	(r216884)
+++ stable/8/sys/dev/wpi/if_wpi.c	Sun Jan  2 09:03:53 2011	(r216885)
@@ -1052,9 +1052,18 @@ wpi_free_rx_ring(struct wpi_softc *sc, s
 
 	wpi_dma_contig_free(&ring->desc_dma);
 
-	for (i = 0; i < WPI_RX_RING_COUNT; i++)
-		if (ring->data[i].m != NULL)
-			m_freem(ring->data[i].m);
+	for (i = 0; i < WPI_RX_RING_COUNT; i++) {
+		struct wpi_rx_data *data = &ring->data[i];
+
+		if (data->m != NULL) {
+			bus_dmamap_sync(ring->data_dmat, data->map,
+			    BUS_DMASYNC_POSTREAD);
+			bus_dmamap_unload(ring->data_dmat, data->map);
+			m_freem(data->m);
+		}
+		if (data->map != NULL)
+			bus_dmamap_destroy(ring->data_dmat, data->map);
+	}
 }
 
 static int
@@ -1461,6 +1470,7 @@ wpi_rx_intr(struct wpi_softc *sc, struct
 		return;
 	}
 
+	bus_dmamap_sync(ring->data_dmat, data->map, BUS_DMASYNC_POSTREAD);
 	head = (struct wpi_rx_head *)((caddr_t)(stat + 1) + stat->len);
 	tail = (struct wpi_rx_tail *)((caddr_t)(head + 1) + le16toh(head->len));
 
@@ -1491,6 +1501,8 @@ wpi_rx_intr(struct wpi_softc *sc, struct
 		ifp->if_ierrors++;
 		return;
 	}
+	bus_dmamap_unload(ring->data_dmat, data->map);
+
 	error = bus_dmamap_load(ring->data_dmat, data->map,
 	    mtod(mnew, caddr_t), MJUMPAGESIZE,
 	    wpi_dma_map_addr, &paddr, BUS_DMA_NOWAIT);
Comment 30 dfilter service freebsd_committer freebsd_triage 2011-01-02 10:01:38 UTC
Author: bschmidt
Date: Sun Jan  2 10:01:29 2011
New Revision: 216886
URL: http://svn.freebsd.org/changeset/base/216886

Log:
  MFC r216824:
  The RX path is missing a few bus_dmamap_*() calls. This results in
  modification of memory which was already freed and eventually in:
  wpi0: could not map mbuf (error 12)
  wpi0: wpi_rx_intr: bus_dmamap_load failed, error 12
  and an unusable device.
  
  PR:		kern/144898
  Approved by:	re (kib)

Modified:
  releng/8.2/sys/dev/wpi/if_wpi.c
Directory Properties:
  releng/8.2/sys/   (props changed)
  releng/8.2/sys/amd64/include/xen/   (props changed)
  releng/8.2/sys/cddl/contrib/opensolaris/   (props changed)
  releng/8.2/sys/contrib/dev/acpica/   (props changed)
  releng/8.2/sys/contrib/pf/   (props changed)

Modified: releng/8.2/sys/dev/wpi/if_wpi.c
==============================================================================
--- releng/8.2/sys/dev/wpi/if_wpi.c	Sun Jan  2 09:03:53 2011	(r216885)
+++ releng/8.2/sys/dev/wpi/if_wpi.c	Sun Jan  2 10:01:29 2011	(r216886)
@@ -1052,9 +1052,18 @@ wpi_free_rx_ring(struct wpi_softc *sc, s
 
 	wpi_dma_contig_free(&ring->desc_dma);
 
-	for (i = 0; i < WPI_RX_RING_COUNT; i++)
-		if (ring->data[i].m != NULL)
-			m_freem(ring->data[i].m);
+	for (i = 0; i < WPI_RX_RING_COUNT; i++) {
+		struct wpi_rx_data *data = &ring->data[i];
+
+		if (data->m != NULL) {
+			bus_dmamap_sync(ring->data_dmat, data->map,
+			    BUS_DMASYNC_POSTREAD);
+			bus_dmamap_unload(ring->data_dmat, data->map);
+			m_freem(data->m);
+		}
+		if (data->map != NULL)
+			bus_dmamap_destroy(ring->data_dmat, data->map);
+	}
 }
 
 static int
@@ -1461,6 +1470,7 @@ wpi_rx_intr(struct wpi_softc *sc, struct
 		return;
 	}
 
+	bus_dmamap_sync(ring->data_dmat, data->map, BUS_DMASYNC_POSTREAD);
 	head = (struct wpi_rx_head *)((caddr_t)(stat + 1) + stat->len);
 	tail = (struct wpi_rx_tail *)((caddr_t)(head + 1) + le16toh(head->len));
 
@@ -1491,6 +1501,8 @@ wpi_rx_intr(struct wpi_softc *sc, struct
 		ifp->if_ierrors++;
 		return;
 	}
+	bus_dmamap_unload(ring->data_dmat, data->map);
+
 	error = bus_dmamap_load(ring->data_dmat, data->map,
 	    mtod(mnew, caddr_t), MJUMPAGESIZE,
 	    wpi_dma_map_addr, &paddr, BUS_DMA_NOWAIT);
Comment 31 Bernhard Schmidt freebsd_committer freebsd_triage 2011-01-02 10:36:00 UTC
On Sunday 02 January 2011 00:18:04 Dominic Fandrey wrote:
> On 27/12/2010 11:34, Bernhard Schmidt wrote:
> > As you might have noticed, I committed a bunch of fixes to wpi(4)
> > which are now included in 8.2-RC1. Could you please test that in
> > regard to this PR and let me know about the outcome?
> 
> I now tested WPA2 and it works, too. The system is stable in the
> sense that it does not panic and there is no packet loss if I ping
> something.

Thanks

> However X applications still freeze and die, as soon as the wlan0
> interface is up. Even if the interface is not associated and no
> routes go through it.

Before or after applying the bus_dma(9) patch? I've seen those without 
the patch but not with it. Is there anything useful in console? Error 
message or something?

-- 
Bernhard
Comment 32 kamikaze 2011-01-02 17:59:51 UTC
On 02/01/2011 11:36, Bernhard Schmidt wrote:
> On Sunday 02 January 2011 00:18:04 Dominic Fandrey wrote:
>> However X applications still freeze and die, as soon as the wlan0
>> interface is up. Even if the interface is not associated and no
>> routes go through it.
> 
> Before or after applying the bus_dma(9) patch? I've seen those without 
> the patch but not with it. Is there anything useful in console? Error 
> message or something?
> 

I just updated my sources and I've been connected for about 10 minutes
without the problem occurring. This is a record so far, so I'm
confident the problem is fixed.

Thanks a lot! This is just great!

Regards

Comment 33 kamikaze 2011-01-02 19:48:36 UTC
On 02/01/2011 11:36, Bernhard Schmidt wrote:
> On Sunday 02 January 2011 00:18:04 Dominic Fandrey wrote:
>> However X applications still freeze and die, as soon as the wlan0
>> interface is up. Even if the interface is not associated and no
>> routes go through it.
> 
> Before or after applying the bus_dma(9) patch? I've seen those without 
> the patch but not with it. Is there anything useful in console? Error 
> message or something?
> 

A little update: I've been up for almost 2 hours without issues now.

I think it's pretty safe for me to say that this works. Very well,
too!

Thank you so very much!

I sometimes get a "need multicast update callback", but I do not
see any issues.

Regards

Comment 34 Bernhard Schmidt freebsd_committer freebsd_triage 2011-01-03 09:30:37 UTC
On Sunday, January 02, 2011 20:48:36 Dominic Fandrey wrote:
> On 02/01/2011 11:36, Bernhard Schmidt wrote:
> > On Sunday 02 January 2011 00:18:04 Dominic Fandrey wrote:
> >> However X applications still freeze and die, as soon as the wlan0
> >> interface is up. Even if the interface is not associated and no
> >> routes go through it.
> > 
> > Before or after applying the bus_dma(9) patch? I've seen those without
> > the patch but not with it. Is there anything useful in console? Error
> > message or something?
> 
> A little update: I've been up for almost 2 hours without issues now.
> 
> I think it's pretty safe for me to say that this works. Very well,
> too!

Sounds good!

> Thank you so very much!

You're welcome :)

> I sometimes get a "need multicast update callback", but I do not
> see any issues.

Those are harmless and just indicate that there is a function missing in
wpi(4). Just ignore them for now.

So, after going over all the mails for this PR again, there's only one issue
left: the one with the RFKILL button, where the interface isn't coming up
automatically after triggering. This, though, is currently a general issue
within all drivers (I might even have fixed this in iwn(4), AFAIR), not just
wpi(4). Honestly, I'd ignore this for now and address it for all drivers once
I have enough clue (it's on my TODO list).

So, what do you think, is it time to close this PR?

Thanks

-- 
Bernhard
Comment 35 kamikaze 2011-01-03 11:00:00 UTC
On 03/01/2011 10:30, Bernhard Schmidt wrote:
> So, what do you think, is it time to close this PR?

As far as I am concerned, go ahead. :)
Comment 36 Bernhard Schmidt freebsd_committer freebsd_triage 2011-01-05 08:09:07 UTC
State Changed
From-To: feedback->closed

Based on the latest feedback, I think it is safe to close this PR.