Summary: | if_tun: use after free panic after mbuf freed upon if_tun teardown | ||
---|---|---|---|
Product: | Base System | Reporter: | WHR <msl0000023508> |
Component: | kern | Assignee: | Cy Schubert <cy> |
Status: | Closed Not A Bug | ||
Severity: | Affects Only Me | CC: | cy, markj |
Priority: | --- | Keywords: | crash |
Version: | 12.0-STABLE | ||
Hardware: | amd64 | ||
OS: | Any |
Description
WHR
2019-06-27 04:34:59 UTC
Lines from '/var/log/messages' that were added right before the panic:

Jun 27 11:31:38 x kernel: tun2: bpf attached
Jun 27 11:31:38 x kernel: tun2: link state changed to UP
Jun 27 11:31:39 x ppp[11229]: tun2: Warning: Add route failed: 10.x.x.0: errno: Value too large to be stored in data type
Jun 27 11:31:39 x ppp[11229]: tun2: Warning: Add route failed: 10.x.x.0: errno: Value too large to be stored in data type
Jun 27 11:31:39 x ppp[11229]: tun2: Warning: Add route failed: 10.x.x.0: errno: Value too large to be stored in data type
Jun 27 11:31:39 x ppp[11229]: tun2: Warning: ff02::/: Change route failed: errno: Network is unreachable
Jun 27 11:31:39 x syslogd: last message repeated 1 times
Jun 27 11:31:41 x ppp[9428]: tun0: Warning: ff02::/: Change route failed: errno: Network is unreachable
Jun 27 11:31:41 x ppp[9428]: tun0: Warning: Delete route failed: 10.x.x.x: errno: Address already in use
Jun 27 11:31:41 x kernel: in_scrubprefix: err=65, old prefix delete failed
Jun 27 11:31:41 x kernel: in_scrubprefix: err=17, new prefix add failed
Jun 27 11:31:41 x kernel: tun0: link state changed to DOWN
Jun 27 11:31:41 x ppp[9428]: tun0: Warning: Delete route failed: 10.x.x.x: errno: Address already in use

(private information is masked with x)

> The IP Filter module is custom built, with patches applied from bug #238796
> and https://sourceforge.net/p/hacking-freebsd/freebsd-patches/ci/master/tree/10.3/ipfilter-local-output-tcp-checksum.diff
Please try this without your custom patches.
Browsing through your git repo, you have a fair number of custom patches to FreeBSD. You will need to revert all your custom patches and reproduce the problem without them before I can accept this PR. After you remove all your customizations, not just the ipfilter customizations but all of them, if it still panics, there is another test I will ask you to perform. Let's do this test first.

(In reply to Cy Schubert from comment #3)
I don't have other custom patches applied on this kernel (12.0-STABLE r349024). Please don't speculate just because you have seen the other patches.

Good. Look in the dump at frame 18 and print m, please.

Is ipfilter a kld or statically linked in the kernel?
Is if_tun a kld or statically linked in the kernel?
uname -a output, please.

Next, instead of using ipfilter, create ipfw rules and try the same using ipfw. Then do the same with pf.

This appears not to be an ipfilter problem as the address of mbuf appears to be garbage, before ipfilter is entered.

In ipf the panic occurs here:
    ipf_check_wrapper(struct mbuf **mp, struct ifnet *ifp, int flags, void *ruleset __unused, struct inpcb *inp)

By testing ipfw we need to see if the panic also occurs here:
    ipfw_check_packet(struct mbuf **m0, struct ifnet *ifp, int flags, void *ruleset __unused, struct inpcb *inp)

And by testing pf we need to see if the panic also occurs here:
    pf_check_in(struct mbuf **m, struct ifnet *ifp, int flags, void *ruleset __unused, struct inpcb *inp)

Given that **mp points to a bad address, we need to ascertain that **m0 and **m also point to the same bad address. The panic is set up further up the stack in if_tun. All three packet filters will panic at their respective entry points. Performing the suggested test will prove this.

In summary: remove the ipfilter rules and create the same ipfw rules. Then create pf rules. Both should panic the same as ipfilter does. The tun interface was deleted and the mbuf was freed while the mbuf address was passed to ipfilter.
Unfortunately ipfilter was the victim. The tests will prove the other packet filters will also be victims.

Another question: How did you tear down the tun interface? Was it ppp in base or a port?

Is this the first time you have had this panic or is this panic consistent?

(In reply to Cy Schubert from comment #6)
> Look in the dump at frame 18 and print m, please.

(kgdb) frame 18
#18 0xffffffff80cd878c in tunwrite (dev=<optimized out>, uio=<optimized out>, flag=<optimized out>) at /usr/src/sys/net/if_tun.c:996
996 netisr_dispatch(isr, m);
(kgdb) print m
$1 = (struct mbuf *) 0xfffff80004370e00
(kgdb) print *m
$2 = {{m_next = 0x0, m_slist = {sle_next = 0x0}, m_stailq = {stqe_next = 0x0}}, {m_nextpkt = 0x0, m_slistpkt = {sle_next = 0x0}, m_stailqpkt = {stqe_next = 0x0}}, m_data = 0xfffff80004370e5c "E", m_len = 84, m_type = 1, m_flags = 2, {{m_pkthdr = {{ snd_tag = 0x0, rcvif = 0x0}, tags = {slh_first = 0x0}, len = 84, flowid = 0, csum_flags = 0, fibnum = 0, cosqos = 0 '\000', rsstype = 0 '\000', {rcv_tstmp = 0, { l2hlen = 0 '\000', l3hlen = 0 '\000', l4hlen = 0 '\000', l5hlen = 0 '\000', spare = 0}}, PH_per = {eight = "\000\000\000\000\000\000\000", sixteen = {0, 0, 0, 0}, thirtytwo = {0, 0}, sixtyfour = {0}, unintptr = {0}, ptr = 0x0}, PH_loc = { eight = "\000\000\000\000\000\000\000", sixteen = {0, 0, 0, 0}, thirtytwo = {0, 0}, sixtyfour = {0}, unintptr = {0}, ptr = 0x0}}, {m_ext = {{ext_count = 33554432, ext_cnt = 0x5400004502000000}, ext_buf = 0x31c014000400000 <error: Cannot access memory at address 0x31c014000400000>, ext_size = 30015754, ext_type = 10, ext_flags = 772609, ext_free = 0xe35bf8eb0000, ext_arg1 = 0x88ace5d1438e2, ext_arg2 = 0xf0e0d0c0b0a0908}, m_pktdat = 0xfffff80004370e58 ""}}, m_dat = 0xfffff80004370e20 ""}}

This structure looks ok.

ipfilter is in a KLD 'ipl.ko'. if_tun is statically linked in 'kernel'. In fact, this kernel image came from a prebuilt snapshot ISO image downloaded from a FreeBSD mirror site.

> uname -a output, please.
FreeBSD SMSF-RouterBox 12.0-STABLE FreeBSD 12.0-STABLE r349024 GENERIC amd64

> This appears not to be an ipfilter problem as the address of mbuf appears to be garbage, before ipfilter is entered.

I think so. Unfortunately I can no longer reproduce this panic, so I may not have a chance to test ipfw or pf.

> Is this the first time you have had this panic or is this panic consistent?

Only this time.

> Another question: How did you tear down the tun interface? Was it ppp in base or a port?

This tun interface was shut down by ppp(8), which closes its interface on exit.

frame 17
p *m
frame 16
p *m
frame 15
p *m
frame 14
p *m
frame 13
p *m

Was an application running at the time you shut down your ppp session? Were you pinging some address or was your computer replying to a ping at the time?

frame 12
p m
p *ifp

(In reply to Cy Schubert from comment #8)
> Was an application running at the time you shut down your ppp session? Were you pinging some address or was your computer replying to a ping at the time?

Right before the kernel panic occurred, this server had just accepted an incoming PPP over SSH connection; ppp(8) was started to handle this connection and used tun2. Two seconds later, an old PPP over SSH connection that used tun0 was destroyed.
(In reply to Cy Schubert from comment #8) (kgdb) frame 17 #17 0xffffffff80ced3df in netisr_dispatch_src (proto=1, source=<optimized out>, m=0xfffff80042563000) at /usr/src/sys/net/netisr.c:1122 1122 netisr_proto[proto].np_handler(m); (kgdb) p *m $1 = {{m_next = 0xbb66a00000480, m_slist = {sle_next = 0xbb66a00000480}, m_stailq = { stqe_next = 0xbb66a00000480}}, {m_nextpkt = 0xffffffff, m_slistpkt = { sle_next = 0xffffffff}, m_stailqpkt = {stqe_next = 0xffffffff}}, m_data = 0x0, m_len = 0, m_type = 0, m_flags = 0, {{m_pkthdr = {{snd_tag = 0x0, rcvif = 0x0}, tags = {slh_first = 0x0}, len = 0, flowid = 0, csum_flags = 0, fibnum = 0, cosqos = 0 '\000', rsstype = 0 '\000', { rcv_tstmp = 4, {l2hlen = 4 '\004', l3hlen = 0 '\000', l4hlen = 0 '\000', l5hlen = 0 '\000', spare = 0}}, PH_per = {eight = "\200\370\071\200\377\377\377\377", sixteen = {63616, 32825, 65535, 65535}, thirtytwo = {2151282816, 4294967295}, sixtyfour = {18446744071565867136}, unintptr = {18446744071565867136}, ptr = 0xffffffff8039f880 <adadone>}, PH_loc = {eight = "\030\t\000\000\001\000\000", sixteen = {2328, 0, 1, 0}, thirtytwo = {2328, 1}, sixtyfour = {4294969624}, unintptr = { 4294969624}, ptr = 0x100000918}}, {m_ext = {{ext_count = 56143392, ext_cnt = 0xfffff8000358ae20}, ext_buf = 0x1 <error: Cannot access memory at address 0x1>, ext_size = 0, ext_type = 0, ext_flags = 0, ext_free = 0x80000080, ext_arg1 = 0x3, ext_arg2 = 0xfffff8004d8758d0}, m_pktdat = 0xfffff80042563058 " \256X\003"}}, m_dat = 0xfffff80042563020 ""}} (kgdb) frame 16 #16 0xffffffff80d57f93 in ip_input (m=0x0) at /usr/src/sys/netinet/ip_input.c:828 828 (*inetsw[ip_protox[ip->ip_p]].pr_input)(&m, &hlen, ip->ip_p); (kgdb) p *m Cannot access memory at address 0x0 (kgdb) frame 15 #15 0xffffffff80d573b2 in icmp_input (mp=0xfffffe00005dd8c0, offp=0xfffffe00005dd8bc, proto=1) at /usr/src/sys/netinet/ip_icmp.c:640 640 icmp_reflect(m); (kgdb) p *m $2 = {{m_next = 0x0, m_slist = {sle_next = 0x0}, m_stailq = {stqe_next = 0x0}}, {m_nextpkt = 
0x0, m_slistpkt = {sle_next = 0x0}, m_stailqpkt = {stqe_next = 0x0}}, m_data = 0xfffff80004370e5c "E", m_len = 84, m_type = 1, m_flags = 2, {{m_pkthdr = {{ snd_tag = 0x0, rcvif = 0x0}, tags = {slh_first = 0x0}, len = 84, flowid = 0, csum_flags = 0, fibnum = 0, cosqos = 0 '\000', rsstype = 0 '\000', {rcv_tstmp = 0, { l2hlen = 0 '\000', l3hlen = 0 '\000', l4hlen = 0 '\000', l5hlen = 0 '\000', spare = 0}}, PH_per = {eight = "\000\000\000\000\000\000\000", sixteen = {0, 0, 0, 0}, thirtytwo = {0, 0}, sixtyfour = {0}, unintptr = {0}, ptr = 0x0}, PH_loc = { eight = "\000\000\000\000\000\000\000", sixteen = {0, 0, 0, 0}, thirtytwo = {0, 0}, sixtyfour = {0}, unintptr = {0}, ptr = 0x0}}, {m_ext = {{ext_count = 33554432, ext_cnt = 0x5400004502000000}, ext_buf = 0x31c014000400000 <error: Cannot access memory at address 0x31c014000400000>, ext_size = 30015754, ext_type = 10, ext_flags = 772609, ext_free = 0xe35bf8eb0000, ext_arg1 = 0x88ace5d1438e2, ext_arg2 = 0xf0e0d0c0b0a0908}, m_pktdat = 0xfffff80004370e58 ""}}, m_dat = 0xfffff80004370e20 ""}} (kgdb) frame 14 #14 icmp_reflect (m=0xfffff80004370e00) at /usr/src/sys/netinet/ip_icmp.c:911 911 icmp_send(m, opts); (kgdb) p *m $3 = {{m_next = 0x0, m_slist = {sle_next = 0x0}, m_stailq = {stqe_next = 0x0}}, {m_nextpkt = 0x0, m_slistpkt = {sle_next = 0x0}, m_stailqpkt = {stqe_next = 0x0}}, m_data = 0xfffff80004370e5c "E", m_len = 84, m_type = 1, m_flags = 2, {{m_pkthdr = {{ snd_tag = 0x0, rcvif = 0x0}, tags = {slh_first = 0x0}, len = 84, flowid = 0, csum_flags = 0, fibnum = 0, cosqos = 0 '\000', rsstype = 0 '\000', {rcv_tstmp = 0, { l2hlen = 0 '\000', l3hlen = 0 '\000', l4hlen = 0 '\000', l5hlen = 0 '\000', spare = 0}}, PH_per = {eight = "\000\000\000\000\000\000\000", sixteen = {0, 0, 0, 0}, thirtytwo = {0, 0}, sixtyfour = {0}, unintptr = {0}, ptr = 0x0}, PH_loc = { eight = "\000\000\000\000\000\000\000", sixteen = {0, 0, 0, 0}, thirtytwo = {0, 0}, sixtyfour = {0}, unintptr = {0}, ptr = 0x0}}, {m_ext = {{ext_count = 33554432, ext_cnt 
= 0x5400004502000000}, ext_buf = 0x31c014000400000 <error: Cannot access memory at address 0x31c014000400000>, ext_size = 30015754, ext_type = 10, ext_flags = 772609, ext_free = 0xe35bf8eb0000, ext_arg1 = 0x88ace5d1438e2, ext_arg2 = 0xf0e0d0c0b0a0908}, m_pktdat = 0xfffff80004370e58 ""}}, m_dat = 0xfffff80004370e20 ""}} (kgdb) frame 13 #13 0xffffffff80d569e7 in icmp_send (m=<optimized out>, opts=0x0) at /usr/src/sys/netinet/ip_icmp.c:947 947 (void) ip_output(m, opts, NULL, 0, NULL, NULL); (kgdb) p *m value has been optimized out (kgdb) p m $4 = <optimized out> (kgdb) frame 12 #12 ip_output (m=0xfffff80004370e00, opt=<optimized out>, ro=<optimized out>, flags=0, imo=0x0, inp=<optimized out>) at /usr/src/sys/netinet/ip_output.c:571 571 switch (ip_output_pfil(&m, ifp, inp, dst, &fibnum, &error)) { (kgdb) p m $5 = (struct mbuf *) 0xfffff80004370e00 (kgdb) p *ifp $6 = {if_link = {cstqe_next = 0xbb66a00000480}, if_clones = {le_next = 0xffffffff, le_prev = 0x0}, if_groups = {cstqh_first = 0x0, cstqh_last = 0x0}, if_alloctype = 0 '\000', if_softc = 0x0, if_llsoftc = 0x0, if_l2com = 0x4, if_dname = 0xffffffff8039f880 <adadone> "UH\211\345AWAVAUATSH\203\354\070I\211\365D\213vxA\203\346\017A\215F\377\203\370\t\017\207`\004", if_dunit = 2328, if_index = 1, if_index_reserved = 0, if_xname = " \256X\003\000\370\377\377\001\000\000\000\000\000\000", if_description = 0x0, if_flags = -2147483520, if_drv_flags = 0, if_capabilities = 3, if_capenable = 0, if_linkmib = 0xfffff8004d8758d0, if_linkmiblen = 0, if_refcount = 0, if_type = 0 '\000', if_addrlen = 0 '\000', if_hdrlen = 0 '\000', if_link_state = 0 '\000', if_mtu = 0, if_metric = 0, if_baudrate = 0, if_hwassist = 4575801, if_epoch = 30000, if_lastchange = { tv_sec = 0, tv_usec = 0}, if_snd = {ifq_head = 0x0, ifq_tail = 0x540c1ab28406103, ifq_len = 0, ifq_maxlen = 0, ifq_mtx = {lock_object = {lo_name = 0x0, lo_flags = 0, lo_data = 0, lo_witness = 0xfffffe000ee70000}, mtx_lock = 32768}, ifq_drv_head = 0x0, ifq_drv_tail = 0x0, 
ifq_drv_len = 0, ifq_drv_maxlen = 0, altq_type = 0, altq_flags = 0, altq_disc = 0x0, altq_ifp = 0x0, altq_enqueue = 0x0, altq_dequeue = 0x0, altq_request = 0x0, altq_clfier = 0x0, altq_classify = 0x0, altq_tbr = 0x0, altq_cdnr = 0x0}, if_linktask = { ta_link = {stqe_next = 0x0}, ta_pending = 0, ta_priority = 0, ta_func = 0x0, ta_context = 0x0}, if_addr_lock = {lock_object = {lo_name = 0x0, lo_flags = 0, lo_data = 0, lo_witness = 0x0}, mtx_lock = 0}, if_addrhead = {cstqh_first = 0x0, cstqh_last = 0x0}, if_multiaddrs = {cstqh_first = 0x0, cstqh_last = 0x0}, if_amcount = 0, if_addr = 0x0, if_hw_addr = 0x0, if_broadcastaddr = 0x0, if_afdata_lock = {lock_object = {lo_name = 0x0, lo_flags = 0, lo_data = 0, lo_witness = 0x0}, mtx_lock = 0}, if_afdata = { 0x0 <repeats 42 times>}, if_afdata_initialized = 0, if_fib = 0, if_vnet = 0x0, if_home_vnet = 0x0, if_vlantrunk = 0x0, if_bpf = 0x0, if_pcount = 0, if_bridge = 0x0, if_lagg = 0x0, if_pf_kif = 0x0, if_carp = 0x0, if_label = 0x0, if_netmap = 0x0, if_output = 0x0, if_input = 0x0, if_bridge_input = 0x0, if_bridge_output = 0x0, if_bridge_linkstate = 0x0, if_start = 0x0, if_ioctl = 0x0, if_init = 0x0, if_resolvemulti = 0x0, if_qflush = 0x0, if_transmit = 0x0, if_reassign = 0x0, if_get_counter = 0x0, if_requestencap = 0x0, if_counters = {0x0 <repeats 12 times>}, if_hw_tsomax = 0, if_hw_tsomaxsegcount = 0, if_hw_tsomaxsegsize = 0, if_snd_tag_alloc = 0x0, if_snd_tag_modify = 0x0, if_snd_tag_query = 0x0, if_snd_tag_free = 0x0, if_pcp = 0 '\000', if_netdump_methods = 0x0, if_epoch_ctx = {data = {0x0, 0x0}}, if_unused = {0x0, 0x0, 0x0, 0x0}, if_ispare = {0, 0, 0, 0}} (kgdb) p *m $7 = {{m_next = 0x0, m_slist = {sle_next = 0x0}, m_stailq = {stqe_next = 0x0}}, {m_nextpkt = 0x0, m_slistpkt = {sle_next = 0x0}, m_stailqpkt = {stqe_next = 0x0}}, m_data = 0xfffff80004370e5c "E", m_len = 84, m_type = 1, m_flags = 2, {{m_pkthdr = {{ snd_tag = 0x0, rcvif = 0x0}, tags = {slh_first = 0x0}, len = 84, flowid = 0, csum_flags = 0, fibnum = 0, cosqos 
= 0 '\000', rsstype = 0 '\000', {rcv_tstmp = 0, { l2hlen = 0 '\000', l3hlen = 0 '\000', l4hlen = 0 '\000', l5hlen = 0 '\000', spare = 0}}, PH_per = {eight = "\000\000\000\000\000\000\000", sixteen = {0, 0, 0, 0}, thirtytwo = {0, 0}, sixtyfour = {0}, unintptr = {0}, ptr = 0x0}, PH_loc = { eight = "\000\000\000\000\000\000\000", sixteen = {0, 0, 0, 0}, thirtytwo = {0, 0}, sixtyfour = {0}, unintptr = {0}, ptr = 0x0}}, {m_ext = {{ext_count = 33554432, ext_cnt = 0x5400004502000000}, ext_buf = 0x31c014000400000 <error: Cannot access memory at address 0x31c014000400000>, ext_size = 30015754, ext_type = 10, ext_flags = 772609, ext_free = 0xe35bf8eb0000, ext_arg1 = 0x88ace5d1438e2, ext_arg2 = 0xf0e0d0c0b0a0908}, m_pktdat = 0xfffff80004370e58 ""}}, m_dat = 0xfffff80004370e20 ""}}

(In reply to WHR from comment #9)
This is why this happened. We have a race condition. The interface was torn down and the mbuf freed before ipfilter's entry point received its arguments.

At frame 12, *ifp doesn't point to a valid device (its if_dname resolves to <adadone>). It appears that the ifp structure was already freed and reused for a disk I/O.

The problem is in if_tun somewhere.

Can you show the command that was issued when tun0 was deleted? Also the command that was issued to set up the VPN.

Why ppp over ssh? Can you describe this in detail? Do you create a tun interface through ssh and tunnel ppp through it?

I set up VPN tunnels through ssh directly:

LocalCommand /bin/echo > /dev/null; /sbin/ifconfig tun3 10.2.2.6/32 10.2.2.5 mtu 1450; for I in 10.1.1.0 10.1.2.0 10.1.3.0; do route add -net $I/24 10.2.2.5; done; # for I in svn.freebsd.org 192.30.253.112; do route add -host $I 10.2.2.5; done

(In reply to Cy Schubert from comment #11)
> Can you show the command that was issued when tun0 was deleted?

I didn't issue any command at that time; the ppp(8) process simply terminated when the SSH session ended.

> Also the command that was issued to set up the VPN.
A user 'ppp' with the default shell '/usr/local/bin/sshpppd', which contains something like:

#!/bin/sh
...
exec /usr/sbin/ppp -direct incoming.$ppp_user

> Why ppp over ssh?

I use PPP over SSH on FreeBSD to interoperate with other operating systems that have poor support for tun interfaces. I also use PPP as a tunnel link manager, to allocate in-tunnel IP addresses and set routes.

This is not an ipfilter problem.

New description: use after free panic after mbuf freed upon if_tun teardown. This is not an ipfilter bug.
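
The failure class discussed in this thread (one code path still holds a pointer to an interface while another path tears it down and frees it) can be illustrated in user space. The following is a minimal, single-threaded sketch, assuming a hypothetical `struct fake_if` and hypothetical helpers `fake_if_create`, `fake_if_ref`, `fake_if_rele`, `fake_if_teardown`, and `fake_if_input`. None of these are FreeBSD kernel interfaces, and this is not a fix for if_tun; it only shows how taking a reference before use keeps an object alive across a teardown.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an interface object; NOT the kernel's struct ifnet. */
struct fake_if {
    int  refcount;   /* number of holders, including the owner */
    int  dying;      /* set once teardown has started */
    char name[16];
};

/* Counts how many objects were actually freed (demo bookkeeping only). */
static int fake_if_freed;

/* Allocate an interface with one reference held by its owner. */
static struct fake_if *fake_if_create(const char *name)
{
    struct fake_if *ifp = calloc(1, sizeof(*ifp));

    if (ifp == NULL)
        return NULL;
    ifp->refcount = 1;
    strncpy(ifp->name, name, sizeof(ifp->name) - 1);
    return ifp;
}

/* Take a reference. Plain int: a concurrent version would need atomics or a lock. */
static void fake_if_ref(struct fake_if *ifp)
{
    ifp->refcount++;
}

/* Drop a reference; the last holder frees the object. */
static void fake_if_rele(struct fake_if *ifp)
{
    if (--ifp->refcount == 0) {
        free(ifp);
        fake_if_freed++;
    }
}

/* Owner-side teardown: mark the interface dying, then drop the owner's reference. */
static void fake_if_teardown(struct fake_if *ifp)
{
    ifp->dying = 1;
    fake_if_rele(ifp);
}

/* Packet path: returns 0 if accepted, -1 if dropped because the interface is dying. */
static int fake_if_input(struct fake_if *ifp)
{
    return ifp->dying ? -1 : 0;
}

/* Demo: teardown happens while a packet path still uses the interface. */
static int fake_if_demo(void)
{
    struct fake_if *ifp = fake_if_create("tun0");
    int rc;

    if (ifp == NULL)
        return -2;
    fake_if_ref(ifp);        /* in-flight path takes its own reference */
    fake_if_teardown(ifp);   /* owner goes away mid-flight */
    rc = fake_if_input(ifp); /* still valid memory: our reference keeps it alive */
    fake_if_rele(ifp);       /* last reference dropped; the object is freed here */
    return rc;
}
```

Calling `fake_if_demo()` from a `main()` returns -1: the packet is dropped because the interface is dying, but no freed memory is dereferenced. Without the `fake_if_ref()` call, `fake_if_teardown()` would free the object and the later `fake_if_input()` would read freed memory, which is the pattern the `*ifp` dump at frame 12 suggests.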