Fatal trap 9: general protection fault while in kernel mode cpuid = 2; apic id = 02 instruction pointer = 0x20:0xffffffff80774d1d stack pointer = 0x28:0xfffffe032dfe9420 frame pointer = 0x28:0xfffffe032dfe94f0 code segment = base rx0, limit 0xfffff, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 11 (swi4: clock (0)) #9 0xffffffff80cc289a in trap_check (frame=0xfffffe032dfe9370) at /home/git/head/sys/amd64/amd64/trap.c:658 #10 <signal handler called> #11 0xffffffff80774d1d in _rw_wlock_cookie (c=0xdeadc0dedeadc2de, file=0xffffffff80e40db0 "/home/git/head/sys/netinet/if_ether.c", line=0x10a) at /home/git/head/sys/kern/kern_rwlock.c:264 #12 0xffffffff80a0c0a3 in arptimer (arg=0xfffff8000ffc3600) at /home/git/head/sys/netinet/if_ether.c:266 #13 0xffffffff807a663c in softclock_call_cc (c=0xfffff8000ffc36b8, cc=0xffffffff815f1280 <cc_cpu>, direct=0x0) at /home/git/head/sys/kern/kern_timeout.c:723 #14 0xffffffff807a6e6c in softclock (arg=0xffffffff815f1280 <cc_cpu>) at /home/git/head/sys/kern/kern_timeout.c:861 #15 0xffffffff80724c48 in intr_event_execute_handlers (p=0xfffff80003df8000, ie=0xfffff80003e17500) at /home/git/head/sys/kern/kern_intr.c:1262 #16 0xffffffff80725a37 in ithread_execute_handlers (p=0xfffff80003df8000, ie=0xfffff80003e17500) at /home/git/head/sys/kern/kern_intr.c:1275 #17 0xffffffff80725866 in ithread_loop (arg=0xfffff80003e2e040) at /home/git/head/sys/kern/kern_intr.c:1356 #18 0xffffffff807201f9 in fork_exit (callout=0xffffffff807256f0 <ithread_loop>, arg=0xfffff80003e2e040, frame=0xfffffe032dfe9ac0) at /home/git/head/sys/kern/kern_fork.c:1035 #19 <signal handler called> (kgdb) up 12 #12 0xffffffff80a0c0a3 in arptimer (arg=0xfffff8000ffc3600) at /home/git/head/sys/netinet/if_ether.c:266 266 IF_AFDATA_LOCK(ifp); (kgdb) l 261 262 callout_stop(&lle->lle_timer); 263 264 /* XXX: LOR avoidance. We still have ref on lle. */ 265 LLE_WUNLOCK(lle); 266 IF_AFDATA_LOCK(ifp); 267 LLE_WLOCK(lle); 268 269 /* Guard against race with other llentry_free(). */ 270 if (lle->la_flags & LLE_LINKED) { (kgdb) p *ifp Cannot access memory at address 0xdeadc0dedeadc0de
Fix is possibly here - waiting for review: https://reviews.freebsd.org/D4605 Known issue. --HPS
Reproducable with recent 11-STABLE doing ifconfig up/down in sequence. --HPS 270 LLE_REMREF(lle); 271 lltable_unlink_entry(lle->lle_tbl, lle); 272 } 273 IF_AFDATA_UNLOCK(ifp); 274 275 size_t pkts_dropped = llentry_free(lle); 276 277 ARPSTAT_ADD(dropped, pkts_dropped); 278 ARPSTAT_INC(timeouts); 279 (kgdb) bt #0 doadump (textdump=1) at pcpu.h:221 #1 0xffffffff80ae755e in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:366 #2 0xffffffff80ae7b2b in vpanic (fmt=<value optimized out>, ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:759 #3 0xffffffff80ae7963 in panic (fmt=0x0) at /usr/src/sys/kern/kern_shutdown.c:690 #4 0xffffffff80fc0361 in trap_fatal (frame=0xfffffe0f92146720, eva=952) at /usr/src/sys/amd64/amd64/trap.c:841 #5 0xffffffff80fc0553 in trap_pfault (frame=0xfffffe0f92146720, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:691 #6 0xffffffff80fbfafc in trap (frame=0xfffffe0f92146720) at /usr/src/sys/amd64/amd64/trap.c:442 #7 0xffffffff80fa2e41 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:236 #8 0xffffffff80ae2abb in __rw_wlock_hard (c=<value optimized out>, tid=<value optimized out>, file=<value optimized out>, line=<value optimized out>) at /usr/src/sys/kern/kern_rwlock.c:830 #9 0xffffffff80c60086 in arptimer (arg=<value optimized out>) at /usr/src/sys/netinet/if_ether.c:265 #10 0xffffffff80b01d6a in softclock_call_cc (c=<value optimized out>, cc=<value optimized out>, direct=<value optimized out>) at /usr/src/sys/kern/kern_timeout.c:729 #11 0xffffffff80b022e4 in softclock (arg=<value optimized out>) at /usr/src/sys/kern/kern_timeout.c:867 #12 0xffffffff80aa239f in intr_event_execute_handlers (p=<value optimized out>, ie=<value optimized out>) at /usr/src/sys/kern/kern_intr.c:1262 #13 0xffffffff80aa29b6 in ithread_loop (arg=<value optimized out>) at /usr/src/sys/kern/kern_intr.c:1275 #14 0xffffffff80a9efe5 in fork_exit (callout=0xffffffff80aa28f0 <ithread_loop>, arg=0xfffff8000ad70700, frame=0xfffffe0f92146ac0) at /usr/src/sys/kern/kern_fork.c:1038 #15 0xffffffff80fa337e in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:611 #16 0x0000000000000000 in ?? ()
> _rw_wlock_cookie Was seen on r309481 (only once, ifconfig(8) was not touched).
I have hit this 2x in the last 4 days on stable 11.1: I'm still looking for a reliable way to recreate. #7 0xffffffff807835de in trap (frame=0xfffffe000038a7b0) at pcpu.h:222 #8 0xffffffff80767781 in calltrap () at /usr/home/daveb/p4work/SpectraBSD_stable/sys/amd64/amd64/exception.S:236 #9 0xffffffff804ba395 in _rw_wlock_cookie (c=0xdeadc0dedeadc2de, file=0xffffffff808839e0 "/usr/home/daveb/p4work/SpectraBSD_stable/sys/netinet/if_ether.c", line=287) at /usr/home/daveb/p4work/SpectraBSD_stable/sys/kern/kern_rwlock.c:272 #10 0xffffffff805d348f in arptimer (arg=0xfffff8014c37a200) at /usr/home/daveb/p4work/SpectraBSD_stable/sys/netinet/if_ether.c:287 #11 0xffffffff804d5fb8 in softclock_call_cc (c=0xfffff8014c37a2b8, cc=<value optimized out>, direct=<value optimized out>) at /usr/home/daveb/p4work/SpectraBSD_stable/sys/kern/kern_timeout.c:729 #12 0xffffffff804d639c in softclock (arg=0xffffffff80ed1100) at /usr/home/daveb/p4work/SpectraBSD_stable/sys/kern/kern_timeout.c:867 #13 0xffffffff80483fc9 in intr_event_execute_handlers (p=<value optimized out>, ie=0xfffff8011f130000) at /usr/home/daveb/p4work/SpectraBSD_stable/sys/kern/kern_intr.c:1262 #14 0xffffffff80484666 in ithread_loop (arg=0xfffff8011f0fd5c0) at /usr/home/daveb/p4work/SpectraBSD_stable/sys/kern/kern_intr.c:1275 #15 0xffffffff804816b4 in fork_exit (callout=0xffffffff804845b0 <ithread_loop>, arg=0xfffff8011f0fd5c0, frame=0xfffffe000038aac0) at /usr/home/daveb/p4work/SpectraBSD_stable/sys/kern/kern_fork.c:1042 #16 0xffffffff80767cbe in fork_trampoline () at /usr/home/daveb/p4work/SpectraBSD_stable/sys/amd64/amd64/exception.S:611 #17 0x0000000000000000 in ?? () (kgdb) frame 10 #10 0xffffffff805d348f in arptimer (arg=0xfffff8014c37a200) at /usr/home/daveb/p4work/SpectraBSD_stable/sys/netinet/if_ether.c:287 287 IF_AFDATA_LOCK(ifp); (kgdb) set $lle = (struct llentry *)arg (kgdb) p $lle $6 = (struct llentry *) 0xfffff8014c37a200 (kgdb) p *$lle $7 = { lle_next = { le_next = 0x0, le_prev = 0xfffff80c29487a00 }, r_l3addr = { addr4 = { s_addr = 0 }, addr6 = { __u6_addr = { __u6_addr8 = 0xfffff8014c37a210 "", __u6_addr16 = 0xfffff8014c37a210, __u6_addr32 = 0xfffff8014c37a210 } } }, r_linkdata = 0xfffff8014c37a220 "$\001�#\234�", r_hdrlen = 14 '\016', spare0 = 0xfffff8014c37a239 "", r_flags = 1, r_skip_req = 1, lle_tbl = 0xfffff80c6a454d00, lle_head = 0xfffff80c29487a00, lle_free = 0xffffffff805db670 <in_lltable_destroy_lle>, la_hold = 0x0, la_numheld = 0, la_expire = 11654, la_flags = 1, la_asked = 0, la_preempt = 5, ln_state = 2, ln_router = 0, ln_ntick = 0, lle_remtime = 0, lle_hittime = 0, lle_refcnt = 1, ll_addr = 0xfffff8014c37a220 "$\001�#\234�", lle_chain = { le_next = 0x0, le_prev = 0x0 }, lle_timer = { c_links = { le = { le_next = 0x0, le_prev = 0xffffffff80ed1218 }, sle = { sle_next = 0x0 }, tqe = { tqe_next = 0x0, tqe_prev = 0xffffffff80ed1218 } }, c_time = 50057632637741, c_precision = 268435437, c_arg = 0xfffff8014c37a200, c_func = 0xffffffff805d31f0 <arptimer>, c_lock = 0x0, c_flags = 0, c_iflags = 144, c_cpu = 0 }, lle_lock = { lock_object = { lo_name = 0xffffffff8087f95e "lle", lo_flags = 90374144, lo_data = 0, lo_witness = 0xfffffe00009fe180 }, rw_lock = 1 }, req_mtx = { lock_object = { lo_name = 0xffffffff808848ce "lle req", lo_flags = 16973824, lo_data = 0, lo_witness = 0xfffffe00009fe200 }, mtx_lock = 4 } } (kgdb) p *$lle->lle_tbl $9 = { llt_link = { sle_next = 0xdeadc0dedeadc0de }, llt_af = -559038242, llt_hsize = -559038242, lle_head = 0xdeadc0dedeadc0de, llt_ifp = 0xdeadc0dedeadc0de, llt_lookup = 0xdeadc0dedeadc0de, llt_alloc_entry = 0xdeadc0dedeadc0de, llt_delete_entry = 0xdeadc0dedeadc0de, llt_prefix_free = 0xdeadc0dedeadc0de, llt_dump_entry = 0xdeadc0dedeadc0de, llt_hash = 0xdeadc0dedeadc0de, llt_match_prefix = 0xdeadc0dedeadc0de, llt_free_entry = 0xdeadc0dedeadc0de, llt_foreach_entry = 0xdeadc0dedeadc0de, llt_link_entry = 0xdeadc0dedeadc0de, llt_unlink_entry = 0xdeadc0dedeadc0de, llt_fill_sa_entry = 0xdeadc0dedeadc0de, llt_free_tbl = 0xdeadc0dedeadc0de } (kgdb)
I think you can reproduce like this: DEF_ROUTE=x.x.x.x # replace with IP of your default route vmstat -m | grep llt arp -d $DEF_ROUTE ping $DEF_ROUTE vmstat -m | grep llt arp -d $DEF_ROUTE ping $DEF_ROUTE
(In reply to Hans Petter Selasky from comment #5) I am unable to reproduce with that test.
OK, but do you see the memory leakage. Try to run some other software which causes kernel memory to be allocated and freed.
... while running the script above.
(In reply to Hans Petter Selasky from comment #8) I've tried your test and more in an infinite loop; all I get is various crashes in the IPv6 code (https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=225438). I even recompiled without INET6; no arptimer() crashes yet.
Does applying the fix at: https://reviews.freebsd.org/D4605 resolve those IPv6 based panics? It might be you're just lucky that the IPv4 arptimer is not panicing. --HPS
After 38 days of quiet, this bug has reemerged in our CI tests. I have hit this again 4 times in the past 24 hours (2x on stable 11.0 and 2x on stable 11.1). I'm going to try Hans Peter Salasky's proposed patch ...
Back in March, 2018, I added hselasky's patch to our 11.1 base. I and have not seen this issues since. I did not add the patch to our 11.0 release; this bug reared its head in our 11.0 based release stream. I do know what triggers this event; I can not reproduce at will, but its seems that Hans Petter's patch is worthy. #6 0xffffffff8081908c in trap (frame=0xfffffe07c4ff37b0) at sys/amd64/amd64/trap.c:442 #7 0xffffffff807fd341 in calltrap () at sys/amd64/amd64/exception.S:236 #8 0xffffffff80625e64 in arptimer (arg=<value optimized out>) at sys/netinet/if_ether.c:265 #9 0xffffffff8050dd0a in softclock_call_cc (c=<value optimized out>, cc=<value optimized out>, direct=<value optimized out>) Again the ifp from lle->lle_tbl->llt_ifp is invalid: p *$lle->lle_tbl $4 = { llt_link = { sle_next = 0x0 }, llt_af = 0, llt_hsize = 0, lle_head = 0xfffff800286a6000, llt_ifp = 0x2, llt_lookup = 0xc1, llt_alloc_entry = 0, llt_delete_entry = 0x9, llt_prefix_free = 0xc0, llt_dump_entry = 0x1667, llt_hash = 0, llt_match_prefix = 0x1d97, llt_free_entry = 0x1a667b, llt_foreach_entry = 0x1, llt_link_entry = 0, llt_unlink_entry = 0, llt_fill_sa_entry = 0, llt_free_tbl = 0 }
Alexander: Any news? When will the patches discussed hit the tree?
A commit references this bug: Author: ae Date: Tue Jul 17 11:33:23 UTC 2018 New revision: 336405 URL: https://svnweb.freebsd.org/changeset/base/336405 Log: Move invoking of callout_stop(&lle->lle_timer) into llentry_free(). This deduplicates the code a bit, and also implicitly adds missing callout_stop() to in[6]_lltable_delete_entry() functions. PR: 209682, 225927 Submitted by: hselasky (previous version) MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D4605 Changes: head/sys/net/if_llatbl.c head/sys/netinet/in.c head/sys/netinet6/in6.c
Assign to committer resolving
A commit references this bug: Author: ae Date: Wed Aug 8 16:09:29 UTC 2018 New revision: 337460 URL: https://svnweb.freebsd.org/changeset/base/337460 Log: MFC r336405: Move invoking of callout_stop(&lle->lle_timer) into llentry_free(). This deduplicates the code a bit, and also implicitly adds missing callout_stop() to in[6]_lltable_delete_entry() functions. PR: 209682, 225927 Changes: _U stable/11/ stable/11/sys/net/if_llatbl.c stable/11/sys/netinet/in.c stable/11/sys/netinet6/in6.c
Fixed in head/ and stable/11. Thanks!