Machine panics when issueing /etc/rc.d/netif restart. wlan0: ieee80211_new_state_locked: pending SCAN -> AUTH transition lost Fatal trap 12: page fault while in kernel mode cpuid = 0; apic id = 00 fault virtual address = 0xc4fe81d5 fault code = supervisor read, page not present instruction pointer = 0x20:0xc4fa9b0c stack pointer = 0x28:0xdf834b7c frame pointer = 0x28:0xdf834c34 code segment = base rx0, limit 0xfffff, type 0x1b = DPL 0, pres 1, def32 1, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 0 (iwi0 taskq) trap number = 12 panic: page fault cpuid = 0 Uptime: 30m31s Physical memory: 750 MB Dumping 78 MB: (CTRL-C to abort) (CTRL-C to abort) 63 47 31 15 at /usr/home/edwin/svn/build/sys/i386/i386/trap.c:938 #4 0xc0bc1ce0 in trap_pfault (frame=0xdf834b3c, usermode=0, eva=3305013717) at /usr/home/edwin/svn/build/sys/i386/i386/trap.c:851 #5 0xc0bc2625 in trap (frame=0xdf834b3c) at /usr/home/edwin/svn/build/sys/i386/i386/trap.c:533 #6 0xc0ba4c0b in calltrap () at /usr/home/edwin/svn/build/sys/i386/i386/exception.s:165 #7 0xc4fa9b0c in iwi_auth_and_assoc (sc=0xc4298800, vap=0xc4a61000) at /usr/home/edwin/svn/build/sys/modules/iwi/../../dev/iwi/if_iwi.c:2854 #8 0xc4faa1f9 in iwi_newstate (vap=0xc4a61000, nstate=IEEE80211_S_AUTH, arg=192) at /usr/home/edwin/svn/build/sys/modules/iwi/../../dev/iwi/if_iwi.c:1001 #9 0xc09767a1 in ieee80211_newstate_cb (xvap=0xc4a61000, npending=1) at /usr/home/edwin/svn/build/sys/net80211/ieee80211_proto.c:1654 #10 0xc08c52a2 in taskqueue_run (queue=0xc4b61cc0) at /usr/home/edwin/svn/build/sys/kern/subr_taskqueue.c:239 #11 0xc08c54ad in taskqueue_thread_loop (arg=0xc4fad074) at /usr/home/edwin/svn/build/sys/kern/subr_taskqueue.c:360 #12 0xc0862231 in fork_exit (callout=0xc08c53f0 <taskqueue_thread_loop>, arg=0xc4fad074, frame=0xdf834d38) at /usr/home/edwin/svn/build/sys/kern/kern_fork.c:843 #13 0xc0ba4c80 in fork_trampoline () at /usr/home/edwin/svn/build/sys/i386/i386/exception.s:270 (kgdb) (kgdb) frame 7 #7 0xc4fa9b0c in iwi_auth_and_assoc (sc=0xc4298800, vap=0xc4a61000) at /usr/home/edwin/svn/build/sys/modules/iwi/../../dev/iwi/if_iwi.c:2854 2854 rs.mode = mode; (kgdb) list 2849 if (error != 0) 2850 goto done; 2851 2852 /* the rate set has already been "negotiated" */ 2853 memset(&rs, 0, sizeof rs); 2854 rs.mode = mode; 2855 rs.type = IWI_RATESET_TYPE_NEGOTIATED; 2856 rs.nrates = ni->ni_rates.rs_nrates; 2857 if (rs.nrates > IWI_RATESET_SIZE) { 2858 DPRINTF(("Truncating negotiated rate set from %u\n", (kgdb) info local ic = (struct ieee80211com *) 0xc4fad000 ifp = (struct ifnet *) 0xc43f2000 ni = (struct ieee80211_node *) 0xc4fe8000 config = {bluetooth_coexistence = 0 '\0', reserved1 = 0 '\0', answer_pbreq = 0 '\0', allow_invalid_frames = 0 '\0', multicast_enabled = 1 '\001', drop_unicast_unencrypted = 0 '\0', disable_unicast_decryption = 1 '\001', drop_multicast_unencrypted = 0 '\0', disable_multicast_decryption = 1 '\001', antenna = 0 '\0', include_crc = 0 '\0', use_protection = 1 '\001', protection_ctsonly = 0 '\0', enable_multicast_filtering = 0 '\0', bluetooth_threshold = 0 '\0', silence_threshold = 0 '\0', allow_beacon_and_probe_resp = 0 '\0', allow_mgt = 0 '\0', noise_reported = 0 '\0', reserved5 = 0 '\0'} assoc = Variable "assoc" is not available. (kgdb) p ni $1 = (struct ieee80211_node *) 0xc4fe8000 (kgdb) p *ni Cannot access memory at address 0xc4fe8000 Information available: [~/svn/build] edwin@vaio>svn info [...] Revision: 205159 Last Changed Author: jilles Last Changed Rev: 205150 Last Changed Date: 2010-03-15 00:07:40 +1100 (Mon, 15 Mar 2010) In http://www.mavetju.org/~edwin/vmcore.1: -rw-r--r-- 1 1001 80 22423 Mar 15 05:43 core.txt.1.bz2 -rw-r--r-- 1 1001 80 464 Mar 15 05:43 info.1 -rw-r--r-- 1 1001 80 18104049 Mar 15 06:01 kernel.debug.bz2 -rw-r--r-- 1 1001 80 17194295 Mar 15 05:51 vmcore.1.bz2 Fix: I am able to reproduce this on demand, please let me know which steps you want me to take to troubleshoot this or test patches if needed.
Responsible Changed From-To: freebsd-bugs->freebsd-fs Over to maintainer(s).
Responsible Changed From-To: freebsd-fs->freebsd-net oops. pointed out by: brucec
I'm seeing this as well after an upgrade to 8.0-RELEASE-p2 from 7.2-RELEASE.
Please note that this issue is still happening on 8.1-STABLE #2 r210522 -- Edwin Groothuis Website: http://www.mavetju.org/ edwin@mavetju.org Weblog: http://www.mavetju.org/weblog/
On Wed, Jul 28, 2010 at 06:00, Edwin Groothuis <edwin@mavetju.org> wrote: > The following reply was made to PR kern/144755; it has been noted by GNAT= S. > > From: Edwin Groothuis <edwin@mavetju.org> > To: bug-followup@FreeBSD.org > Cc: > Subject: Re: kern/144755: [iwi] [panic] iwi panic when issuing /etc/rc.d/= netif restart on 8-STABLE r205159 > Date: Wed, 28 Jul 2010 13:33:42 +1000 > > =A0Please note that this issue is still happening on 8.1-STABLE #2 r21052= 2 Can you check if wpa_supplicant gets started twice after "/etc/rc.d/netif restart"? I have the feeling that this is a more or less known issue, a race between devd (/etc/pccard_ether) and /etc/rc.d/netif. There is a small window (a few ms) where this can happen and our net80211 isn't capable of handling two running wpa_supplicants on the same interface. Which leads to all kind of weird issues, .e.g. panics. --=20 Bernhard
Hello Bernhard! On Wed, Jul 28, 2010 at 12:11:17PM +0200, Bernhard Schmidt wrote: > Can you check if wpa_supplicant gets started twice after > "/etc/rc.d/netif restart"? According to core.txt there are two wpa_supplicant processes, but I don't know if it is caused by a fork of itself or if it is caused by something started twice: 0 7084 1 0 56 0 5192 0 select Ds ?? 3686:38.00 [wpa_suppli 0 7085 1 0 57 0 5192 0 select Ds ?? 2617:11.00 [wpa_suppli Based on the PPID of 1 for both, I would say it gets started twice. root wpa_supplicant 7085 root / 2 drwxr-xr-x 512 r root wpa_supplicant 7085 wd / 2 drwxr-xr-x 512 r root wpa_supplicant 7085 text /usr 4526084 -r-xr-xr-x 295932 r root wpa_supplicant 7085 0 /dev 7 crw-rw-rw- null rw root wpa_supplicant 7085 1 /dev 7 crw-rw-rw- null rw root wpa_supplicant 7085 2 /dev 7 crw-rw-rw- null rw root wpa_supplicant 7085 3* internet dgram udp c52b1000 root wpa_supplicant 7085 4* route raw 0 c527f4d4 root wpa_supplicant 7085 5 /dev 15 crw------- bpf rw root wpa_supplicant 7084 root / 2 drwxr-xr-x 512 r root wpa_supplicant 7084 wd / 2 drwxr-xr-x 512 r root wpa_supplicant 7084 text /usr 4526084 -r-xr-xr-x 295932 r root wpa_supplicant 7084 0 /dev 7 crw-rw-rw- null rw root wpa_supplicant 7084 1 /dev 7 crw-rw-rw- null rw root wpa_supplicant 7084 2 /dev 7 crw-rw-rw- null rw root wpa_supplicant 7084 3* internet dgram udp c4f520dc root wpa_supplicant 7084 4* route raw 0 c527f670 root wpa_supplicant 7084 5* local stream c4f508bc root wpa_supplicant 7084 6 /var 23622 -rw------- 4 w root wpa_supplicant 7084 7* local stream c4f5035c <-> c4f50408 root wpa_supplicant 7084 8 /dev 15 crw------- bpf rw > I have the feeling that this is a more or less known issue, a race > between devd (/etc/pccard_ether) and /etc/rc.d/netif. There is a small > window (a few ms) where this can happen and our net80211 isn't capable > of handling two running wpa_supplicants on the same interface. Which > leads to all kind of weird issues, .e.g. panics. Nice! Is there a workaround or patch available for testing? Edwin -- Edwin Groothuis Website: http://www.mavetju.org/ edwin@mavetju.org Weblog: http://www.mavetju.org/weblog/
On Thu, Aug 5, 2010 at 00:31, Edwin Groothuis <edwin@mavetju.org> wrote: > Hello Bernhard! > > On Wed, Jul 28, 2010 at 12:11:17PM +0200, Bernhard Schmidt wrote: >> Can you check if wpa_supplicant gets started twice after >> "/etc/rc.d/netif restart"? > > According to core.txt there are two wpa_supplicant processes, but > I don't know if it is caused by a fork of itself or if it is caused > by something started twice: > > =A0 =A00 =A07084 =A0 =A0 1 =A0 0 =A056 =A00 =A05192 =A0 =A0 0 select Ds = =A0 =A0?? =A03686:38.00 [wpa_suppli > =A0 =A00 =A07085 =A0 =A0 1 =A0 0 =A057 =A00 =A05192 =A0 =A0 0 select Ds = =A0 =A0?? =A02617:11.00 [wpa_suppli > > Based on the PPID of 1 for both, I would say it gets started twice. Indeed. > > [..] >> I have the feeling that this is a more or less known issue, a race >> between devd (/etc/pccard_ether) and /etc/rc.d/netif. There is a small >> window (a few ms) where this can happen and our net80211 isn't capable >> of handling two running wpa_supplicants on the same interface. Which >> leads to all kind of weird issues, .e.g. panics. > > Nice! Is there a workaround or patch available for testing? A temporary workaround is to add a "sleep 1" in /etc/pccard_ether, something I'd rather not commit. I haven't yet found a clean solution for this.. any pointers welcome. :) --=20 Bernhard
On Thu, Aug 5, 2010 at 00:31, Edwin Groothuis <edwin@mavetju.org> wrote: > [..] > Nice! Is there a workaround or patch available for testing? Please give the attached patch a try. It does not prevent wpa_supplicant from starting twice (for that you can define ctrl_interface= in wpa_supplicant.conf), but should no longer panic. Thanks. -- Bernhard
On Thu, Aug 05, 2010 at 08:10:08PM +0000, Bernhard Schmidt wrote: > The following reply was made to PR kern/144755; it has been noted by GNATS. > > From: Bernhard Schmidt <bschmidt@techwires.net> > To: Edwin Groothuis <edwin@mavetju.org> > Cc: bug-followup@freebsd.org > Subject: Re: kern/144755: [iwi] [panic] iwi panic when issuing /etc/rc.d/netif > restart on 8-STABLE r205159 > Date: Thu, 5 Aug 2010 22:08:38 +0200 > > --0015175cda7eab5eec048d191d67 > Content-Type: text/plain; charset=ISO-8859-1 > > On Thu, Aug 5, 2010 at 00:31, Edwin Groothuis <edwin@mavetju.org> wrote: > > [..] > > Nice! Is there a workaround or patch available for testing? > > Please give the attached patch a try. It does not prevent > wpa_supplicant from starting twice (for that you can define > ctrl_interface= in wpa_supplicant.conf), but should no longer panic. I can reproduce this panic for if_rum, similiar patch also helps. wlan0: ieee80211_new_state_locked: pending RUN -> SCAN transition lost wlan0: ieee80211_new_state_locked: pending RUN -> SCAN transition lost Fatal trap 12: page fault while in kernel mode fault virtual address = 0xffff fault code = supervisor read, page not present instruction pointer = 0x20:0xc0900d42 stack pointer = 0x28:0xc4f05bac frame pointer = 0x28:0xc4f05bb8 code segment = base rx0, limit 0xfffff, type 0x1b = DPL 0, pres 1, def32 1, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 0 (rum0 taskq) trap number = 12 panic: page fault KDB: stack backtrace: db_trace_self_wrapper(c0669547,c06cb000,c0660c6a,c4f05a5c,c4f05a5c,...) at 0xc0436706 = db_trace_self_wrapper+0x26 panic(c0660c6a,c0680c79,c4f05b6c,1,1,...) at 0xc04b898d = panic+0xed trap_fatal(c06c9740,f000,1,0,c04c0ef6,...) at 0xc06474bd = trap_fatal+0x23d trap_pfault(0,c066c5e8,2d7,0,c06c9220,...) at 0xc064787a = trap_pfault+0x27a trap(c4f05b6c) at 0xc06481ab = trap+0x39b calltrap() at 0xc062d4ac = calltrap+0x6 --- trap 0xc, eip = 0xc0900d42, esp = 0xc4f05bac, ebp = 0xc4f05bb8 --- ieee80211_getcapinfo(c5caa000,ffff,c08f415a,c5caa874,c5463d00,...) at 0xc0900d42 = ieee80211_getcapinfo+0x71 ieee80211_beacon_construct(c62a8000,18,676,c50f5c00,c54e3988,...) at 0xc090308d = ieee80211_beacon_construct+0x67 ieee80211_beacon_alloc(c62a8000,c5caa874,6,2c5,5,...) at 0xc09039a0 = ieee80211_beacon_alloc+0x93 rum_newstate(c5caa000,5,ffffffff,652,c5362014,...) at 0xc9a9b55f = rum_newstate+0x259 ieee80211_newstate_cb(c5caa000,4,0,c0695c9c,0,...) at 0xc0906eb8 = ieee80211_newstate_cb+0x7a taskqueue_run(c537db00,c537db18,0,c0661905,0,...) at 0xc04ef61a = taskqueue_run+0x8a taskqueue_thread_loop(c5362074,c4f05d38,0,0,0,...) at 0xc04efd74 = taskqueue_thread_loop+0x44 fork_exit(c04efd30,c5362074,c4f05d38) at 0xc048e868 = fork_exit+0x88 fork_trampoline() at 0xc062d524 = fork_trampoline+0x8 -- Adios
Responsible Changed From-To: freebsd-net->bschmidt Over to me.
Responsible Changed From-To: bschmidt->freebsd-wireless back to pool, see also kern/153594
batch change: For bugs that match the following - Status Is In progress AND - Untouched since 2018-01-01. AND - Affects Base System OR Documentation DO: Reset to open status. Note: I did a quick pass but if you are getting this email it might be worthwhile to double check to see if this bug ought to be closed.
(In reply to edwin from comment #0) This one was fixed in base r258612 (In reply to Alex Kozlov from comment #9) and this one should be completely fixed in base r296237.