Bug 233535

Summary: Fix refcount leak in IPv6 MLD code leading to loss of IPv6 connectivity
Product: Base System
Reporter: Slava Shwartsman <slavash>
Component: kern
Assignee: Hans Petter Selasky <hselasky>
Status: Closed FIXED
Severity: Affects Some People
CC: ae, bz, cem, dch, hselasky, karl, kib, marek, menyy, mmacy, net, rainer
Priority: ---
Version: CURRENT
Hardware: Any
OS: Any
Attachments (description, flags):
Fix missing decrement of refcount in IPv6 code. (flags: none)
Fix missing decrement of refcount in IPv6 code (w/ additional debug code) (flags: none)
Fix missing decrement of refcount in IPv6 code (w/ additional debug code) (flags: none)
Fix missing decrement of refcount in IPv6 code (w/ additional debug code) (flags: none)
debug info (flags: none)
Fix MLD refcounting in IPv6 code. (flags: none)
Fix MLD refcounting in IPv6 code. (flags: none)
Fix MLD refcounting in IPv6 code (including additional debugging). (flags: none)
debug info (flags: none)
commands to obtain debug info (flags: none)
Fix MLD refcounting in IPv6 code (including additional debugging). (flags: none)
Fix MLD refcounting in IPv6 code (including additional debugging). (flags: none)
Fix MLD refcounting in IPv6 code (no debug version). (flags: none)

Description Slava Shwartsman freebsd_committer 2018-11-26 16:12:28 UTC
Setup:
2 hosts connected back-to-back 
# uname -rv
13.0-CURRENT FreeBSD 13.0-CURRENT r340922 GENERIC-NODEBUG

Steps to reproduce:
1. Configure IPv6 address on both hosts:
HOST A: ifconfig igb0 inet6 2002::1
HOST B: ifconfig igb0 inet6 2002::2

2. Ping to make sure all works:
# ping6 2002::2
PING6(56=40+8+8 bytes) 2002::1 --> 2002::2
16 bytes from 2002::2, icmp_seq=0 hlim=64 time=0.101 ms
16 bytes from 2002::2, icmp_seq=1 hlim=64 time=0.085 ms
^C
--- 2002::2 ping6 statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/std-dev = 0.085/0.093/0.101/0.008 ms

3. On both hosts, configure the same IP address again:
HOST A: ifconfig igb0 inet6 2002::1
HOST B: ifconfig igb0 inet6 2002::2

4. Check ping again
# ping6 2002::2
PING6(56=40+8+8 bytes) 2002::1 --> 2002::2
^C
--- 2002::2 ping6 statistics ---
7 packets transmitted, 0 packets received, 100.0% packet loss

A few notes:
==================
1. Sometimes step 3 has to be repeated several times before the issue appears.
2. Pinging from the other side may resolve the issue.
3. The issue also reproduces with NICs from other vendors.


tcpdump shows that the other side receives the NS messages but never replies:
# tcpdump -nei igb0 host 2002::1
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on igb0, link-type EN10MB (Ethernet), capture size 262144 bytes
18:10:10.176317 0c:c4:7a:a8:b7:f6 > 33:33:ff:00:00:02, ethertype IPv6 (0x86dd), length 86: 2002::1 > ff02::1:ff00:2: ICMP6, neighbor solicitation, who has 2002::2, length 32
18:10:11.196325 0c:c4:7a:a8:b7:f6 > 33:33:ff:00:00:02, ethertype IPv6 (0x86dd), length 86: 2002::1 > ff02::1:ff00:2: ICMP6, neighbor solicitation, who has 2002::2, length 32
18:10:12.343345 0c:c4:7a:a8:b7:f6 > 33:33:ff:00:00:02, ethertype IPv6 (0x86dd), length 86: 2002::1 > ff02::1:ff00:2: ICMP6, neighbor solicitation, who has 2002::2, length 32
18:10:13.377329 0c:c4:7a:a8:b7:f6 > 33:33:ff:00:00:02, ethertype IPv6 (0x86dd), length 86: 2002::1 > ff02::1:ff00:2: ICMP6, neighbor solicitation, who has 2002::2, length 32
18:10:14.382334 0c:c4:7a:a8:b7:f6 > 33:33:ff:00:00:02, ethertype IPv6 (0x86dd), length 86: 2002::1 > ff02::1:ff00:2: ICMP6, neighbor solicitation, who has 2002::2, length 32
18:10:15.533880 0c:c4:7a:a8:b7:f6 > 33:33:ff:00:00:02, ethertype IPv6 (0x86dd), length 86: 2002::1 > ff02::1:ff00:2: ICMP6, neighbor solicitation, who has 2002::2, length 32
18:10:16.583342 0c:c4:7a:a8:b7:f6 > 33:33:ff:00:00:02, ethertype IPv6 (0x86dd), length 86: 2002::1 > ff02::1:ff00:2: ICMP6, neighbor solicitation, who has 2002::2, length 32
18:10:17.603347 0c:c4:7a:a8:b7:f6 > 33:33:ff:00:00:02, ethertype IPv6 (0x86dd), length 86: 2002::1 > ff02::1:ff00:2: ICMP6, neighbor solicitation, who has 2002::2, length 32
^C
8 packets captured
41 packets received by filter
0 packets dropped by kernel
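The destination addresses in this capture follow from the target address 2002::2. A minimal Python sketch, assuming the standard mappings (solicited-node group from RFC 4291, IPv6 multicast to Ethernet from RFC 2464); the function names are illustrative and not taken from the kernel:

```python
import ipaddress


def solicited_node(addr: str) -> ipaddress.IPv6Address:
    """Solicited-node multicast group: ff02::1:ff00:0/104 plus the low 24 bits."""
    low24 = int(ipaddress.IPv6Address(addr)) & 0xFFFFFF
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(base | low24)


def multicast_mac(group) -> str:
    """Ethernet address for an IPv6 group: 33:33 plus the low 32 bits."""
    low32 = int(ipaddress.IPv6Address(str(group))) & 0xFFFFFFFF
    octets = [0x33, 0x33] + list(low32.to_bytes(4, "big"))
    return ":".join(f"{o:02x}" for o in octets)


group = solicited_node("2002::2")
print(group, multicast_mac(group))
```

For 2002::2 this gives the group and MAC address seen in the capture, so a host that is no longer a member of that group has no reason to answer these solicitations.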
Comment 1 Conrad Meyer freebsd_committer 2018-11-26 16:44:57 UTC
After step 3, what does ifconfig think the configured prefixes are?  Please include netstat -rn (inet6 portion) as well.
Comment 2 Conrad Meyer freebsd_committer 2018-11-26 17:43:13 UTC
(Maybe related to bug 233283.)
Comment 3 Andrey V. Elsukov freebsd_committer 2018-11-26 22:15:36 UTC
I think it is related to DAD (duplicate address detection). But what do you expect to see after these steps?
Comment 4 Slava Shwartsman freebsd_committer 2018-11-27 08:56:33 UTC
(In reply to Conrad Meyer from comment #1)
The same issue appears when the prefix length is set:
HOST A: ifconfig igb0 inet6 2002::1/64
HOST B: ifconfig igb0 inet6 2002::2/64

# ping6 2002::2
PING6(56=40+8+8 bytes) 2002::1 --> 2002::2
16 bytes from 2002::2, icmp_seq=0 hlim=64 time=0.266 ms
16 bytes from 2002::2, icmp_seq=1 hlim=64 time=0.087 ms
^C
--- 2002::2 ping6 statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/std-dev = 0.087/0.176/0.266/0.090 ms
# ifconfig igb0 inet6 2002::1/64
# ping6 2002::2
PING6(56=40+8+8 bytes) fe80::ec4:7aff:fea8:b7f6%igb0 --> 2002::2
^C
--- 2002::2 ping6 statistics ---
54 packets transmitted, 0 packets received, 100.0% packet loss

# netstat -rn
Routing tables

Internet:
Destination        Gateway            Flags     Netif Expire
default            10.209.52.1        UGS        igb0
10.209.52.0/22     link#1             U          igb0
10.209.52.157      link#1             UHS         lo0
127.0.0.1          link#3             UH          lo0

Internet6:
Destination                       Gateway                       Flags     Netif Expire
::/96                             ::1                           UGRS        lo0
::1                               link#3                        UH          lo0
::ffff:0.0.0.0/96                 ::1                           UGRS        lo0
2002::/64                         link#1                        U          igb0
2002::1                           link#1                        UHS         lo0
fe80::/10                         ::1                           UGRS        lo0
fe80::%igb0/64                    link#1                        U          igb0
fe80::ec4:7aff:fea8:b7f6%igb0     link#1                        UHS         lo0
fe80::%lo0/64                     link#3                        U           lo0
fe80::1%lo0                       link#3                        UHS         lo0
ff02::/16                         ::1                           UGRS        lo0


(In reply to Andrey V. Elsukov from comment #3)
I would expect that ping will continue to work.
Comment 5 Andrey V. Elsukov freebsd_committer 2018-11-27 09:39:30 UTC
(In reply to Slava Shwartsman from comment #4)
> (In reply to Andrey V. Elsukov from comment #3)
> I would expect that ping will continue to work.

I'll try to reproduce this. Can you also show the output of the `ifconfig igb0` command? Do both addresses have the "duplicated" flag?
Comment 6 Slava Shwartsman freebsd_committer 2018-11-27 11:25:47 UTC
(In reply to Andrey V. Elsukov from comment #5)
This is the output after the issue reproduced:

# ifconfig igb0
igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=e527bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
        ether 0c:c4:7a:a8:b7:f6
        inet6 fe80::ec4:7aff:fea8:b7f6%igb0 prefixlen 64 scopeid 0x1
        inet6 2002::1 prefixlen 64
        inet 10.209.52.157 netmask 0xfffffc00 broadcast 10.209.55.255
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>


# ifconfig igb0
igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=e527bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
        ether 0c:c4:7a:a8:b7:76
        inet6 fe80::ec4:7aff:fea8:b776%igb0 prefixlen 64 scopeid 0x1
        inet6 2002::2 prefixlen 64
        inet 10.209.52.158 netmask 0xfffffc00 broadcast 10.209.55.255
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
Comment 7 Slava Shwartsman freebsd_committer 2018-12-06 13:06:03 UTC
Any updates?
Comment 8 Andrey V. Elsukov freebsd_committer 2018-12-07 13:09:36 UTC
(In reply to Slava Shwartsman from comment #7)
> Any updates?

Sorry for the long delay. I just tried your test scenario and I am able to reproduce the problem. At first look, there seems to be a race in the multicast/MLD code. The host that doesn't respond to ND6 NS increments the `ip6s_notmember` counter of the ip6 statistics in icmp6_input() for each NS packet, so the ND6 code never gets a chance to send a reply.
Comment 9 Andrey V. Elsukov freebsd_committer 2018-12-08 11:16:41 UTC
It seems the problem is even worse. The system leaves multicast groups after some time without any reconfiguration and stops replying to ND6 NS.
1. ifmcstat before test:
 em0:
	inet 10.9.8.6
	igmpv2
		group 224.0.0.1 mode exclude
			mcast-macaddr 01:00:5e:00:00:01
	inet6 fe80::222:4dff:fe6a:5eb9%em0 scopeid 0x1
	mldv1 flags=2<USEALLOW>
		group ff01::1%em0 scopeid 0x1 mode exclude
			mcast-macaddr 33:33:00:00:00:01
		group ff02::1%em0 scopeid 0x1 mode exclude
			mcast-macaddr 33:33:00:00:00:01
2. ifconfig em0 inet6 fc00::2
3. ifmcstat
em0:
	inet6 fe80::222:4dff:fe6a:5eb9%em0 scopeid 0x1
	mldv1 flags=2<USEALLOW>
		group ff02::2:d4f1:c447%em0 scopeid 0x1 mode exclude
			mcast-macaddr 33:33:d4:f1:c4:47
		group ff02::2:ffd4:f1c4%em0 scopeid 0x1 mode exclude
			mcast-macaddr 33:33:ff:d4:f1:c4
		group ff02::1:ff00:2%em0 scopeid 0x1 mode exclude
			mcast-macaddr 33:33:ff:00:00:02
	inet 10.9.8.6
	igmpv2
		group 224.0.0.1 mode exclude
			mcast-macaddr 01:00:5e:00:00:01
	inet6 fe80::222:4dff:fe6a:5eb9%em0 scopeid 0x1
	mldv1 flags=2<USEALLOW>
		group ff01::1%em0 scopeid 0x1 mode exclude
			mcast-macaddr 33:33:00:00:00:01
		group ff02::1%em0 scopeid 0x1 mode exclude
			mcast-macaddr 33:33:00:00:00:01
4. Wait about 1 minute
5. ifmcstat 
em0:
	inet 10.9.8.6
	igmpv2
		group 224.0.0.1 mode exclude
			mcast-macaddr 01:00:5e:00:00:01
	inet6 fe80::222:4dff:fe6a:5eb9%em0 scopeid 0x1
	mldv1 flags=2<USEALLOW>
		group ff01::1%em0 scopeid 0x1 mode exclude
			mcast-macaddr 33:33:00:00:00:01
		group ff02::1%em0 scopeid 0x1 mode exclude
			mcast-macaddr 33:33:00:00:00:01
6. On the second host: ndp -c && ping6 fc00::2 => no reply
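The comparison between steps 3 and 5 can be automated. A rough Python sketch that diffs the IPv6 groups in two ifmcstat outputs; the parsing assumes the "group <addr>%<ifname> ..." line format shown above, and the helper names are hypothetical:

```python
import re

GROUP_RE = re.compile(r"^\s*group\s+(\S+)", re.MULTILINE)


def mld_groups(ifmcstat_output: str) -> set:
    """Return the IPv6 groups listed by ifmcstat, with the %ifname suffix stripped."""
    groups = set()
    for name in GROUP_RE.findall(ifmcstat_output):
        if ":" in name:  # skip IPv4 groups such as 224.0.0.1
            groups.add(name.split("%")[0])
    return groups


def groups_left(before: str, after: str) -> set:
    """Groups present in the first output but missing from the second."""
    return mld_groups(before) - mld_groups(after)


# Excerpts from steps 3 and 5 above.
AFTER_CONFIG = """
group ff02::2:d4f1:c447%em0 scopeid 0x1 mode exclude
group ff02::2:ffd4:f1c4%em0 scopeid 0x1 mode exclude
group ff02::1:ff00:2%em0 scopeid 0x1 mode exclude
group 224.0.0.1 mode exclude
group ff01::1%em0 scopeid 0x1 mode exclude
group ff02::1%em0 scopeid 0x1 mode exclude
"""
AFTER_WAIT = """
group 224.0.0.1 mode exclude
group ff01::1%em0 scopeid 0x1 mode exclude
group ff02::1%em0 scopeid 0x1 mode exclude
"""

print(sorted(groups_left(AFTER_CONFIG, AFTER_WAIT)))
```

Any solicited-node group (ff02::1:ffXX:XXXX) that shows up in the result means the host will no longer see neighbor solicitations for the corresponding address.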
Comment 10 Andrey V. Elsukov freebsd_committer 2018-12-08 12:02:09 UTC
This looks like a really bad problem for the 12.0 release. Can somebody check and confirm that the system leaves multicast groups? If it is not only my machine, this can break IPv6 connectivity after an upgrade.
Comment 11 Andrey V. Elsukov freebsd_committer 2018-12-13 13:17:36 UTC
There is also a memory leak for the "in6_multi" type. It can be observed with this script:

while true; do
ifconfig mce3 inet6 fe80::15
sleep 2
done
Comment 12 Hans Petter Selasky freebsd_committer 2018-12-17 14:43:21 UTC
Created attachment 200199 [details]
Fix missing decrement of refcount in IPv6 code.

Hi,

Please find attached a patch to try to fix this issue.

--HPS
Comment 13 Andrey V. Elsukov freebsd_committer 2018-12-17 15:24:22 UTC
(In reply to Hans Petter Selasky from comment #12)
> Created attachment 200199 [details]
> Fix missing decrement of refcount in IPv6 code.
> 
> Hi,
> 
> Please find attached a patch to try to fix this issue.

The assertion in in6_mcast.c fires just after boot.
Comment 14 Hans Petter Selasky freebsd_committer 2018-12-17 16:29:09 UTC
Can you show the backtrace?
Comment 15 Hans Petter Selasky freebsd_committer 2018-12-17 16:29:53 UTC
Can you try the patch with MPASS() removed?
Comment 16 Andrey V. Elsukov freebsd_committer 2018-12-18 14:13:51 UTC
Unread portion of the kernel message buffer:
panic: Assertion inm->in6m_refcount > 0 failed at /home/devel/freebsd/base/head/sys/netinet6/in6_mcast.c:636
cpuid = 0
time = 1545140639
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00779ce740
vpanic() at vpanic+0x1a3/frame 0xfffffe00779ce7a0
panic() at panic+0x43/frame 0xfffffe00779ce800
in6m_disconnect() at in6m_disconnect+0x21b/frame 0xfffffe00779ce830
mld_fasttimo() at mld_fasttimo+0x7d7/frame 0xfffffe00779ce900
pffasttimo() at pffasttimo+0x54/frame 0xfffffe00779ce930
softclock_call_cc() at softclock_call_cc+0x141/frame 0xfffffe00779ce9e0
softclock() at softclock+0x7c/frame 0xfffffe00779cea10
ithread_loop() at ithread_loop+0x187/frame 0xfffffe00779cea70
fork_exit() at fork_exit+0x84/frame 0xfffffe00779ceab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00779ceab0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic

__curthread () at ./machine/pcpu.h:230
230		__asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (OFFSETOF_CURTHREAD));
(kgdb) bt
#0  __curthread () at ./machine/pcpu.h:230
#1  doadump (textdump=-1791303680) at /home/devel/freebsd/base/head/sys/kern/kern_shutdown.c:371
#2  0xffffffff80466d7c in db_fncall_generic (addr=<optimized out>, rv=<optimized out>, nargs=<optimized out>, args=<optimized out>)
    at /home/devel/freebsd/base/head/sys/ddb/db_command.c:609
#3  db_fncall (dummy1=<optimized out>, dummy2=<optimized out>, dummy3=<optimized out>, dummy4=<optimized out>)
    at /home/devel/freebsd/base/head/sys/ddb/db_command.c:657
#4  0xffffffff804668b9 in db_command (last_cmdp=<optimized out>, cmd_table=<optimized out>, dopager=<optimized out>)
    at /home/devel/freebsd/base/head/sys/ddb/db_command.c:481
#5  0xffffffff80466634 in db_command_loop () at /home/devel/freebsd/base/head/sys/ddb/db_command.c:534
#6  0xffffffff8046984f in db_trap (type=<optimized out>, code=<optimized out>) at /home/devel/freebsd/base/head/sys/ddb/db_main.c:252
#7  0xffffffff80bf1365 in kdb_trap (type=3, code=0, tf=<optimized out>) at /home/devel/freebsd/base/head/sys/kern/subr_kdb.c:692
#8  0xffffffff81076f76 in trap (frame=0xfffffe00779ce670) at /home/devel/freebsd/base/head/sys/amd64/amd64/trap.c:619
#9  <signal handler called>
#10 kdb_enter (why=0xffffffff8130f382 "panic", msg=<optimized out>) at /home/devel/freebsd/base/head/sys/kern/subr_kdb.c:479
#11 0xffffffff80ba8e10 in vpanic (fmt=<optimized out>, ap=0xfffffe00779ce7e0) at /home/devel/freebsd/base/head/sys/kern/kern_shutdown.c:866
#12 0xffffffff80ba8bb3 in panic (fmt=0xffffffff81e95368 <cnputs_mtx> "\340\066-\201\377\377\377\377") at /home/devel/freebsd/base/head/sys/kern/kern_shutdown.c:804
#13 0xffffffff80dee39b in in6m_disconnect (inm=0xfffff80003a04d00) at /home/devel/freebsd/base/head/sys/netinet6/in6_mcast.c:636
#14 0xffffffff80e02117 in mld_fasttimo_vnet () at /home/devel/freebsd/base/head/sys/netinet6/mld6.c:1611
#15 mld_fasttimo () at /home/devel/freebsd/base/head/sys/netinet6/mld6.c:1317
#16 0xffffffff80c36a04 in pffasttimo (arg=0xffffffff81e95368 <cnputs_mtx>) at /home/devel/freebsd/base/head/sys/kern/uipc_domain.c:521
#17 0xffffffff80bc2101 in softclock_call_cc (c=0xffffffff81314b74, cc=0xffffffff820acb40 <cc_cpu>, direct=<optimized out>)
    at /home/devel/freebsd/base/head/sys/kern/kern_timeout.c:731
#18 0xffffffff80bc24bc in softclock (arg=0xffffffff820acb40 <cc_cpu>) at /home/devel/freebsd/base/head/sys/kern/kern_timeout.c:869
#19 0xffffffff80b6bb07 in intr_event_execute_handlers (p=<optimized out>, ie=<optimized out>) at /home/devel/freebsd/base/head/sys/kern/kern_intr.c:1119
#20 ithread_execute_handlers (p=<optimized out>, ie=<optimized out>) at /home/devel/freebsd/base/head/sys/kern/kern_intr.c:1132
#21 ithread_loop (arg=<optimized out>) at /home/devel/freebsd/base/head/sys/kern/kern_intr.c:1212
#22 0xffffffff80b687f4 in fork_exit (callout=0xffffffff80b6b980 <ithread_loop>, arg=0xfffff800035f1dc0, frame=0xfffffe00779ceac0)
    at /home/devel/freebsd/base/head/sys/kern/kern_fork.c:1009
#23 <signal handler called>
Comment 17 Andrey V. Elsukov freebsd_committer 2018-12-18 14:17:13 UTC
Trying with net.inet6.mld.v1enable enabled and disabled, I discovered that it doesn't panic with MLDv1 disabled. But the underlying problem persists: when the address is configured twice, the host leaves multicast groups and stops responding to ND6 NS.
Comment 18 Hans Petter Selasky freebsd_committer 2018-12-31 13:57:51 UTC
Documenting yet another related crash scenario:

#12 0xffffffff80e292f0 in mld_change_state (inm=0xfffff802eb1f5800, delay=0)
    at /usr/img/freebsd/sys/netinet6/mld6.c:1909
#13 0xffffffff80e1546a in in6_joingroup_locked (ifp=<optimized out>, 
    mcaddr=0xfffffe008da78618, imf=0x0, pinm=0xfffff8000415a960, delay=0)
    at /usr/img/freebsd/sys/netinet6/in6_mcast.c:1321
#14 0xffffffff80e14f74 in in6_joingroup (ifp=0xffffffff81e95368 <cnputs_mtx>, 
    mcaddr=0x80, imf=<optimized out>, pinm=0x80, delay=16)
    at /usr/img/freebsd/sys/netinet6/in6_mcast.c:1248
#15 0xffffffff80e0ce20 in in6_joingroup_legacy (ifp=<optimized out>, 
    mcaddr=0x40002ff, delay=18, errorp=<optimized out>)
    at /usr/img/freebsd/sys/netinet6/in6.c:752
#16 in6_update_ifa_join_mc (ifp=<optimized out>, ifra=<optimized out>, 
    ia=<optimized out>, flags=<optimized out>, in6m_sol=<optimized out>)
    at /usr/img/freebsd/sys/netinet6/in6.c:848
#17 in6_broadcast_ifa (ifp=<optimized out>, ifra=<optimized out>, 
    ia=<optimized out>, flags=<optimized out>)
    at /usr/img/freebsd/sys/netinet6/in6.c:1227
#18 in6_update_ifa (ifp=<optimized out>, ifra=<optimized out>, 
    ia=<optimized out>, flags=<optimized out>)
    at /usr/img/freebsd/sys/netinet6/in6.c:910
#19 0xffffffff80e0ae7b in in6_control (so=<optimized out>, 
    data=<optimized out>, ifp=<optimized out>, td=<optimized out>)
    at /usr/img/freebsd/sys/netinet6/in6.c:564
#20 0xffffffff80cd13db in ifioctl (so=<optimized out>, cmd=2156423451, data=0xfffff80006ea3b00 "mce0", 
    td=0xfffff8011cc9a000) at /usr/img/freebsd/sys/net/if.c:3098
#21 0xffffffff80c3c41b in fo_ioctl (fp=<optimized out>, com=<optimized out>, data=0x1d0, active_cred=0x80, 
    td=<optimized out>) at /usr/img/freebsd/sys/sys/file.h:330
#22 kern_ioctl (td=<optimized out>, fd=<optimized out>, com=2156423451, 
    data=0x1d0 <error: Cannot access memory at address 0x1d0>) at /usr/img/freebsd/sys/kern/sys_generic.c:800
#23 0xffffffff80c3c10d in sys_ioctl (td=0xfffff8011cc9a000, uap=0xfffff8011cc9a3c0)
    at /usr/img/freebsd/sys/kern/sys_generic.c:712
#24 0xffffffff8109deb2 in syscallenter (td=0xfffff8011cc9a000)
    at /usr/img/freebsd/sys/amd64/amd64/../../kern/subr_syscall.c:135
#25 amd64_syscall (td=0xfffff8011cc9a000, traced=0) at /usr/img/freebsd/sys/amd64/amd64/trap.c:1154


#12 0xffffffff80e292f0 in mld_change_state (inm=0xfffff802eb1f5800, delay=0) at /usr/img/freebsd/sys/netinet6/mld6.c:1909
1909		KASSERT(inm->in6m_ifp == ifp, ("%s: bad ifp", __func__));
(kgdb) list
1904			return (0);
1905		/*
1906		 * Sanity check that netinet6's notion of ifp is the
1907		 * same as net's.
1908		 */
1909		KASSERT(inm->in6m_ifp == ifp, ("%s: bad ifp", __func__));
1910	
1911		MLD_LOCK();
1912		mli = MLD_IFINFO(ifp);
1913		KASSERT(mli != NULL, ("%s: no mld_ifsoftc for ifp %p", __func__, ifp));
(kgdb) print *inm
$3 = {
  in6m_addr = {
    __u6_addr = {
      __u6_addr8 = "\377\002\000\004\000\000\000\000\000\000\000\002\340\312\325\032", 
      __u6_addr16 = {767, 1024, 0, 0, 0, 512, 51936, 6869}, 
      __u6_addr32 = {67109631, 0, 33554432, 450218720}
    }
  }, 
  in6m_ifp = 0x0, 
  in6m_ifma = 0xfffff80006969080, 
  in6m_refcount = 1, 
  in6m_state = 0, 
  in6m_timer = 0, 
  in6m_mli = 0xfffff800067e0900, 
  in6m_nrele = {
    sle_next = 0x0
  }, 
  in6m_srcs = {
    rbh_root = 0x0
  }, 
  in6m_nsrc = 0, 
  in6m_scq = {
    mq_head = {
      stqh_first = 0x0, 
      stqh_last = 0xfffff802eb1f5850
    }, 
    mq_len = 0, 
    mq_maxlen = 24
  }, 
  in6m_lastgsrtv = {
    tv_sec = 0, 
    tv_usec = 0
  }, 
  in6m_sctimer = 0, 
  in6m_scrv = 0, 
  in6m_st = {{
      iss_fmode = 0, 
      iss_asm = 0, 
      iss_ex = 0, 
      iss_in = 0, 
      iss_rec = 0
    }, {
      iss_fmode = 2, 
      iss_asm = 1, 
      iss_ex = 1, 
      iss_in = 0, 
      iss_rec = 0
    }}
}
Comment 19 Hans Petter Selasky freebsd_committer 2018-12-31 14:44:41 UTC
Just before the panic above, the following functions were called. It turns out there is a race: mld_fasttimo() can be called during in6_joingroup().

--HPS

acquire_locked - post inc - 0xfffff802eb1f5800->in6m_refcount = 2
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe008da78410
in6_joingroup_locked() at in6_joingroup_locked+0x1b4/frame 0xfffffe008da784a0
in6_joingroup() at in6_joingroup+0x44/frame 0xfffffe008da784d0
in6_update_ifa() at in6_update_ifa+0x1880/frame 0xfffffe008da78680
in6_control() at in6_control+0x9eb/frame 0xfffffe008da78760
ifioctl() at ifioctl+0x57b/frame 0xfffffe008da78830
kern_ioctl() at kern_ioctl+0x29b/frame 0xfffffe008da788a0
sys_ioctl() at sys_ioctl+0x15d/frame 0xfffffe008da78970
amd64_syscall() at amd64_syscall+0x272/frame 0xfffffe008da78ab0
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe008da78ab0
--- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x800fdd5ca, rsp = 0x7fffffffe248, rbp = 0x7fffffffe290 ---
rele_locked - post dec - 0xfffff802eb1f5800->in6m_refcount = 1
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe000055a830
mld_fasttimo() at mld_fasttimo+0x8be/frame 0xfffffe000055a900
pffasttimo() at pffasttimo+0x54/frame 0xfffffe000055a930
softclock_call_cc() at softclock_call_cc+0x140/frame 0xfffffe000055a9e0
softclock() at softclock+0x7c/frame 0xfffffe000055aa10
ithread_loop() at ithread_loop+0x136/frame 0xfffffe000055aa70
fork_exit() at fork_exit+0x84/frame 0xfffffe000055aab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe000055aab0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
panic: mld_change_state: bad ifp
Comment 20 Hans Petter Selasky freebsd_committer 2018-12-31 15:02:16 UTC
Created attachment 200658 [details]
Fix missing decrement of refcount in IPv6 code (w/ additional debug code)

ae@ : Can you test the patch and also provide the last few pages of dmesg before the panic you see? It will help narrow down the issue!
Comment 21 Hans Petter Selasky freebsd_committer 2019-01-02 09:55:49 UTC
Created attachment 200699 [details]
Fix missing decrement of refcount in IPv6 code (w/ additional debug code)

Added some debug prints.
Comment 22 Hans Petter Selasky freebsd_committer 2019-01-03 21:55:56 UTC
Created attachment 200753 [details]
Fix missing decrement of refcount in IPv6 code (w/ additional debug code)

Minor kernel compile fix. Please test.
Comment 23 Bjoern A. Zeeb freebsd_committer 2019-01-07 13:39:07 UTC
(In reply to Hans Petter Selasky from comment #22)

If this goes into head, please be more specific than "IPv6"; the code you fix is in MLD.
Comment 24 Hans Petter Selasky freebsd_committer 2019-01-07 14:35:04 UTC
@bz: Yes, this code goes to head.
Comment 25 Andrey V. Elsukov freebsd_committer 2019-01-07 21:07:59 UTC
Created attachment 200887 [details]
debug info
Comment 26 Andrey V. Elsukov freebsd_committer 2019-01-07 21:09:51 UTC
(In reply to Hans Petter Selasky from comment #22)
> Created attachment 200753 [details]
> Fix missing decrement of refcount in IPv6 code (w/ additional debug code)
> 
> Minor kernel compile fix. Please test.

I attached debug info from the panic; I have not looked at it myself yet.
Comment 27 Hans Petter Selasky freebsd_committer 2019-01-09 14:46:06 UTC
Created attachment 200956 [details]
Fix MLD refcounting in IPv6 code.

Hi,

Please test this patch while watching:

vmstat -m | grep multi

Thank you!

--HPS

Simple leak provoking script:

#!/bin/sh

while true; do
ifconfig mce0 inet6 fe80::15
sleep 1
vmstat -m | grep multi
done
Comment 28 karl 2019-01-09 18:53:05 UTC
Does this indicate that if we are using IPv6 we should *not*, until this is tested/MFC'd, update to any 12+ (RELEASE, STABLE or -HEAD) codebase?
Comment 29 Andrey V. Elsukov freebsd_committer 2019-01-10 01:58:11 UTC
(In reply to karl from comment #28)
> Does this indicate that if we are using IPv6 we should *not*, until this is
> tested/MFC'd, update to any 12+ (RELEASE, STABLE or -HEAD) codebase?

Yes. But you can update and help with testing. :)
Comment 30 Andrey V. Elsukov freebsd_committer 2019-01-10 02:43:40 UTC
(In reply to Hans Petter Selasky from comment #27)
> Created attachment 200956 [details]
> Fix MLD refcounting in IPv6 code.
> 
> Hi,
> 
> Please test this patch while watching:
> 
> vmstat -m | grep multi

It still leaks. The memory leak is an unimportant problem. Much worse is that the system leaves multicast groups when you configure an address several times. When the system leaves multicast groups, it stops responding to ND6 NS and becomes unreachable for its neighbors.

# vmstat -m | grep multi
  ether_multi    61     5K       -      135  16,32,64,128
     in_multi     2     1K       -        3  256
    in6_multi    33     6K       -       73  32,256
# ifmcstat -i re0
re0:
	inet 10.9.8.12
	igmpv2
		group 224.0.0.1 mode exclude
			mcast-macaddr 01:00:5e:00:00:01
	inet6 fe80::1ebd:b9ff:fede:d7d%re0 scopeid 0x2
	mldv1 flags=2<USEALLOW>
		group ff01::1%re0 scopeid 0x2 mode exclude
			mcast-macaddr 33:33:00:00:00:01
		group ff02::1%re0 scopeid 0x2 mode exclude
			mcast-macaddr 33:33:00:00:00:01
# ifconfig re0 inet6 fc00::1
# vmstat -m | grep multi
  ether_multi    76     6K       -      150  16,32,64,128
     in_multi     2     1K       -        3  256
    in6_multi    39     7K       -       81  32,256
# ifmcstat -i re0
re0:
	inet6 fe80::1ebd:b9ff:fede:d7d%re0 scopeid 0x2
	mldv1 flags=2<USEALLOW>
		group ff02::2:d4f1:c447%re0 scopeid 0x2 mode exclude
			mcast-macaddr 33:33:d4:f1:c4:47
		group ff02::2:ffd4:f1c4%re0 scopeid 0x2 mode exclude
			mcast-macaddr 33:33:ff:d4:f1:c4
		group ff02::1:ff00:1%re0 scopeid 0x2 mode exclude
			mcast-macaddr 33:33:ff:00:00:01
	inet 10.9.8.12
	igmpv2
		group 224.0.0.1 mode exclude
			mcast-macaddr 01:00:5e:00:00:01
	inet6 fe80::1ebd:b9ff:fede:d7d%re0 scopeid 0x2
	mldv1 flags=2<USEALLOW>
		group ff01::1%re0 scopeid 0x2 mode exclude
			mcast-macaddr 33:33:00:00:00:01
		group ff02::1%re0 scopeid 0x2 mode exclude
			mcast-macaddr 33:33:00:00:00:01

# ifconfig re0 inet6 fc00::1
# vmstat -m | grep multi
  ether_multi    70     6K       -      165  16,32,64,128
     in_multi     2     1K       -        3  256
    in6_multi    36     6K       -       89  32,256
# ifmcstat -i re0
re0:
	inet 10.9.8.12
	igmpv2
		group 224.0.0.1 mode exclude
			mcast-macaddr 01:00:5e:00:00:01
	inet6 fe80::1ebd:b9ff:fede:d7d%re0 scopeid 0x2
	mldv1 flags=2<USEALLOW>
		group ff01::1%re0 scopeid 0x2 mode exclude
			mcast-macaddr 33:33:00:00:00:01
		group ff02::1%re0 scopeid 0x2 mode exclude
			mcast-macaddr 33:33:00:00:00:01
# ifconfig re0 inet6 fc00::1
# vmstat -m | grep multi
  ether_multi    85     7K       -      180  16,32,64,128
     in_multi     2     1K       -        3  256
    in6_multi    42     7K       -       97  32,256
# ifmcstat -i re0
re0:
	inet6 fe80::1ebd:b9ff:fede:d7d%re0 scopeid 0x2
	mldv1 flags=2<USEALLOW>
		group ff02::2:d4f1:c447%re0 scopeid 0x2 mode exclude
			mcast-macaddr 33:33:d4:f1:c4:47
		group ff02::2:ffd4:f1c4%re0 scopeid 0x2 mode exclude
			mcast-macaddr 33:33:ff:d4:f1:c4
		group ff02::1:ff00:1%re0 scopeid 0x2 mode exclude
			mcast-macaddr 33:33:ff:00:00:01
	inet 10.9.8.12
	igmpv2
		group 224.0.0.1 mode exclude
			mcast-macaddr 01:00:5e:00:00:01
	inet6 fe80::1ebd:b9ff:fede:d7d%re0 scopeid 0x2
	mldv1 flags=2<USEALLOW>
		group ff01::1%re0 scopeid 0x2 mode exclude
			mcast-macaddr 33:33:00:00:00:01
		group ff02::1%re0 scopeid 0x2 mode exclude
			mcast-macaddr 33:33:00:00:00:01

# ifconfig re0 inet6 fc00::1
# ifmcstat -i re0
re0:
	inet 10.9.8.12
	igmpv2
		group 224.0.0.1 mode exclude
			mcast-macaddr 01:00:5e:00:00:01
	inet6 fe80::1ebd:b9ff:fede:d7d%re0 scopeid 0x2
	mldv1 flags=2<USEALLOW>
		group ff01::1%re0 scopeid 0x2 mode exclude
			mcast-macaddr 33:33:00:00:00:01
		group ff02::1%re0 scopeid 0x2 mode exclude
			mcast-macaddr 33:33:00:00:00:01
# vmstat -m | grep multi
  ether_multi    79     6K       -      195  16,32,64,128
     in_multi     2     1K       -        3  256
    in6_multi    39     7K       -      105  32,256

....

# vmstat -m | grep multi
  ether_multi   127    10K       -      315  16,32,64,128
     in_multi     2     1K       -        3  256
    in6_multi    56    11K       -      169  32,256
Comment 31 Bjoern A. Zeeb freebsd_committer 2019-01-11 16:45:12 UTC
(In reply to Andrey V. Elsukov from comment #30)

Do you have a rough idea when you have seen that the first time?  (SVN r# / date / branch)?
Comment 32 Hans Petter Selasky freebsd_committer 2019-01-14 13:28:06 UTC
Created attachment 201125 [details]
Fix MLD refcounting in IPv6 code.

Hi,

I found one more refcount leak, namely when starting and stopping rpcbind.

Please test new patch!

--HPS
Comment 33 Hans Petter Selasky freebsd_committer 2019-01-14 13:45:29 UTC
> It still leaks. The memory leak is an unimportant problem.

No, I don't think so. Every multicast rule generates a multicast address for the network interface. When a rule is dangling, I suspect the multicast programming of the network interface may become incorrect, so the multicast traffic gets dropped.

Can you try playing with promiscuous mode when there is no ping6 response?

--HPS
Comment 34 Andrey V. Elsukov freebsd_committer 2019-01-15 06:07:39 UTC
(In reply to Hans Petter Selasky from comment #32)
> Created attachment 201125 [details]
> Fix MLD refcounting in IPv6 code.
> 
> Hi,
> 
> I found one more refcount leak, namely when starting and stopping rpcbind.
> 
> Please test new patch!

# while true; do
ifconfig re0 inet6 fc00::1
sleep 3
vmstat -m | grep multi
done
  ether_multi    65     5K       -      135  16,32,64,128
     in_multi     2     1K       -        3  256
    in6_multi    35     6K       -       73  32,256
  ether_multi    68     6K       -      150  16,32,64,128
     in_multi     2     1K       -        3  256
    in6_multi    36     6K       -       81  32,256
  ether_multi    71     6K       -      165  16,32,64,128
     in_multi     2     1K       -        3  256
    in6_multi    37     6K       -       89  32,256
  ether_multi    74     6K       -      180  16,32,64,128
     in_multi     2     1K       -        3  256
    in6_multi    38     6K       -       97  32,256
  ether_multi    79     6K       -      195  16,32,64,128
     in_multi     2     1K       -        3  256
    in6_multi    40     7K       -      105  32,256
  ether_multi    75     6K       -      210  16,32,64,128
     in_multi     2     1K       -        3  256
    in6_multi    38     7K       -      113  32,256
  ether_multi    81     7K       -      225  16,32,64,128
     in_multi     2     1K       -        3  256
    in6_multi    40     7K       -      121  32,256
  ether_multi    91     7K       -      240  16,32,64,128
     in_multi     2     1K       -        3  256
    in6_multi    44     8K       -      129  32,256
  ether_multi    91     7K       -      255  16,32,64,128
     in_multi     2     1K       -        3  256
    in6_multi    44     8K       -      137  32,256
  ether_multi    91     7K       -      270  16,32,64,128
     in_multi     2     1K       -        3  256
    in6_multi    44     8K       -      145  32,256
  ether_multi    89     7K       -      285  16,32,64,128
     in_multi     2     1K       -        3  256
    in6_multi    43     8K       -      153  32,256
^C
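The samples above can also be compared mechanically. A minimal sketch in C, assuming the usual "vmstat -m" column order (Type, InUse, MemUse, HighUse, Requests, Size(s)); the helper names parse_inuse() and inuse_growth() are hypothetical and not part of any FreeBSD tool:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one line of "vmstat -m" output. If its type column equals
 * `type`, store the InUse column in *inuse and return 1; otherwise
 * return 0. Assumes the layout "Type InUse MemUse HighUse Requests Size(s)".
 */
static int
parse_inuse(const char *line, const char *type, long *inuse)
{
	char name[64];
	long value;

	if (sscanf(line, " %63s %ld", name, &value) != 2)
		return (0);
	if (strcmp(name, type) != 0)
		return (0);
	*inuse = value;
	return (1);
}

/*
 * Return (last InUse - first InUse) for `type` over `n` sampled lines.
 * A positive result means more allocations are live at the end of the
 * sampling window than at the start.
 */
static long
inuse_growth(const char *const *lines, size_t n, const char *type)
{
	long first = 0, last = 0, cur;
	int seen = 0;

	for (size_t i = 0; i < n; i++) {
		if (!parse_inuse(lines[i], type, &cur))
			continue;
		if (!seen) {
			first = cur;
			seen = 1;
		}
		last = cur;
	}
	return (last - first);
}
```

Fed the first and last in6_multi samples from the run above (35 and 43), inuse_growth() reports 8, while in_multi stays flat at 0.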
Comment 35 Andrey V. Elsukov freebsd_committer 2019-01-15 06:10:25 UTC
(In reply to Bjoern A. Zeeb from comment #31)
> (In reply to Andrey V. Elsukov from comment #30)
> 
> Do you have a rough idea when you have seen that the first time?  (SVN r# /
> date / branch)?

I'm pretty sure that it was broken during epoch-ification. But I failed to bisect it: the changes were made in May, and recent CURRENT fails to build the sources from that time.
Comment 36 Andrey V. Elsukov freebsd_committer 2019-01-15 06:48:58 UTC
(In reply to Hans Petter Selasky from comment #33)
> Can you try playing with promiscuous mode when there is no ping6 response?

This won't help. icmp6_input() will drop multicast packets if the destination address is a group that we didn't join, even if ether_input() handled the packet and put it into the IP6 netisr queue.

# netstat -sp ip6 | grep multicast
	192 multicast packets which we don't join
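The membership check described above can be illustrated with a userspace toy model. This is only a sketch under stated assumptions: struct mcast_table, input_accepts() and addr6() are hypothetical names, not kernel interfaces, and the counter merely mimics the "multicast packets which we don't join" statistic. The address ff02::1:ff00:1 is the solicited-node multicast group for fc00::1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Toy per-interface membership table (hypothetical, not a kernel struct). */
struct mcast_table {
	struct in6_addr groups[8];
	size_t count;
	unsigned long notmember;	/* packets dropped: group not joined */
};

static bool
mcast_joined(const struct mcast_table *t, const struct in6_addr *dst)
{
	for (size_t i = 0; i < t->count; i++)
		if (memcmp(&t->groups[i], dst, sizeof(*dst)) == 0)
			return (true);
	return (false);
}

/*
 * Returns true if the packet would be passed up the stack. How the
 * frame reached this point (promiscuous mode or not) does not matter:
 * a multicast destination that is not in the table is dropped.
 */
static bool
input_accepts(struct mcast_table *t, const struct in6_addr *dst)
{
	if (!IN6_IS_ADDR_MULTICAST(dst))
		return (true);	/* unicast handling is out of scope here */
	if (mcast_joined(t, dst))
		return (true);
	t->notmember++;
	return (false);
}

/* Convenience wrapper around inet_pton(3) for literal addresses. */
static struct in6_addr
addr6(const char *text)
{
	struct in6_addr a;
	int ok = inet_pton(AF_INET6, text, &a);

	assert(ok == 1);
	(void)ok;
	return (a);
}
```

Once the solicited-node group is removed from the table, neighbor solicitations sent to it are counted and dropped, which matches the symptom of ping6 no longer getting a response.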
Comment 37 Hans Petter Selasky freebsd_committer 2019-01-16 19:41:56 UTC
ae@ : 
I'm able to reproduce using "xorp" from ports and the following configuration file on a test host, which basically should disable MLDv2:

# cat /usr/local/etc/xorp.conf
interfaces {
    interface re0 {
        vif re0 {
            address 10.10.10.10 {
                prefix-length: 24
            }
            address fc00::1 {
                prefix-length: 64
            }
        }
    }
}

protocols {
   mld {
     disable: false
     interface re0 {
        vif re0 {
          disable: false
          version: 1
        }
     }
   }
}
Comment 38 Hans Petter Selasky freebsd_committer 2019-01-17 14:21:10 UTC
Created attachment 201210 [details]
Fix MLD refcounting in IPv6 code (including additional debugging).

@ae:

I found more issues:

1) Missing EPOCH enter/exit calls around CK_STAILQ's
2) Some MLD disconnect calls were previously only a negative ref (see @mmacy's commits), and I believe they should only be negative refs.

Can you test new patch?

Please also provide "dmesg" output if it doesn't panic, and the output from "vmstat -m | grep multi".

Thank you!
Comment 39 Andrey V. Elsukov freebsd_committer 2019-01-17 16:06:24 UTC
Created attachment 201211 [details]
debug info
Comment 40 Andrey V. Elsukov freebsd_committer 2019-01-17 16:07:16 UTC
Created attachment 201212 [details]
commands to obtain debug info
Comment 41 Hans Petter Selasky freebsd_committer 2019-01-17 17:27:45 UTC
Created attachment 201220 [details]
Fix MLD refcounting in IPv6 code (including additional debugging).

Hi @ae,

Found one more bug. From the logs you provided, I figured out that the inm leaks when entering mld_v1_process_group_timer(). Looking at the version history, this function should not disconnect the inm, only queue a v1_transmit.

I've uploaded a new patch. Can you re-test?

Thank you!

--HPS

@@ -1488,8 +1493,7 @@ mld_v1_process_group_timer(struct in6_multi_head *inmh, struct in6_multi *inm)
        case MLD_REPORTING_MEMBER:
                if (report_timer_expired) {
                        inm->in6m_state = MLD_IDLE_MEMBER;
-                       in6m_disconnect(inm);
-                       in6m_rele_locked(inmh, inm);
+                       SLIST_INSERT_HEAD(inmh, inm, in6m_nrele);
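The intent of the hunk above can be sketched in userspace with the SLIST macros from <sys/queue.h>. This is a toy model, not the kernel code: struct group, timer_expired() and drain() are hypothetical names, and the `connected` and `refcount` fields only stand in for the interface membership and the reference count of an inm.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/queue.h>

enum state { REPORTING_MEMBER, IDLE_MEMBER };

struct group {
	enum state state;
	bool connected;			/* still attached to the interface */
	int refcount;
	SLIST_ENTRY(group) pending;	/* link on the deferred-work list */
};

SLIST_HEAD(group_head, group);

/*
 * Toy timer step: when the report timer expires, only mark the group
 * idle and queue it for later processing. The membership and the
 * reference count are deliberately left untouched.
 */
static void
timer_expired(struct group_head *head, struct group *g)
{
	if (g->state != REPORTING_MEMBER)
		return;
	g->state = IDLE_MEMBER;
	SLIST_INSERT_HEAD(head, g, pending);
}

/* Drain the deferred list and return the number of groups processed. */
static int
drain(struct group_head *head)
{
	struct group *g;
	int n = 0;

	while ((g = SLIST_FIRST(head)) != NULL) {
		SLIST_REMOVE_HEAD(head, pending);
		(void)g;	/* a real implementation would act on g here */
		n++;
	}
	return (n);
}
```

The point of the sketch is that a timer expiry changes state and schedules work, while the group stays joined, so traffic addressed to it keeps being accepted.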
Comment 42 Hans Petter Selasky freebsd_committer 2019-01-17 17:47:27 UTC
Created attachment 201221 [details]
Fix MLD refcounting in IPv6 code (including additional debugging).

Separate SLIST entries for deferred operation and free, so that they don't race.
Extend epoch to cover all use of inm.
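Why separate SLIST entries matter can be shown with a small userspace sketch. An SLIST link field holds exactly one "next" pointer, so an object can sit on two lists at the same time only if it has one link field per list. The names below (struct group, defer_link, free_link and the helper functions) are hypothetical and chosen for illustration; they are not claimed to match the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/queue.h>

struct group {
	int id;
	SLIST_ENTRY(group) defer_link;	/* link for the deferred-operation list */
	SLIST_ENTRY(group) free_link;	/* link for the pending-free list */
};

SLIST_HEAD(defer_head, group);
SLIST_HEAD(free_head, group);

/* Return true if g is currently linked on the deferred-operation list. */
static bool
on_defer_list(const struct defer_head *h, const struct group *g)
{
	const struct group *it;

	SLIST_FOREACH(it, h, defer_link)
		if (it == g)
			return (true);
	return (false);
}

/* Return true if g is currently linked on the pending-free list. */
static bool
on_free_list(const struct free_head *h, const struct group *g)
{
	const struct group *it;

	SLIST_FOREACH(it, h, free_link)
		if (it == g)
			return (true);
	return (false);
}
```

With a single shared link field, inserting an object into the second list would overwrite the pointer the first list relies on. With two fields, removing an object from one list leaves the other list intact.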
Comment 43 Andrey V. Elsukov freebsd_committer 2019-01-17 19:00:38 UTC
(In reply to Hans Petter Selasky from comment #42)
> Created attachment 201221 [details]
> Fix MLD refcounting in IPv6 code (including additional debugging).
> 
> Separate SLIST entries for deferred operation and free, so that they don't
> race.
> Extend epoch to cover all use of inm.

This looks like a fix. I will do a deeper test and report again, but at first look the system no longer leaves multicast groups and there is no memory leak.
Comment 44 Hans Petter Selasky freebsd_committer 2019-01-18 09:05:39 UTC
Created attachment 201228 [details]
Fix MLD refcounting in IPv6 code (no debug version).

Here is also a no-debug version of the patch.
Comment 45 Andrey V. Elsukov freebsd_committer 2019-01-18 09:33:10 UTC
(In reply to Hans Petter Selasky from comment #42)
> Created attachment 201221 [details]
> Fix MLD refcounting in IPv6 code (including additional debugging).
> 
> Separate SLIST entries for deferred operation and free, so that they don't
> race.
> Extend epoch to cover all use of inm.

It fixes the memory leak and MLD for me. Thanks!
Comment 46 Hans Petter Selasky freebsd_committer 2019-01-18 09:41:31 UTC
ae@ - What is the way forward? Differential revision or will someone approve the patch here?
Comment 47 Bjoern A. Zeeb freebsd_committer 2019-01-18 09:49:09 UTC
(In reply to Hans Petter Selasky from comment #46)

Why don't you just open a review with the final patch, a good commit message, etc., and go the normal way? It is probably easier than anything else.
Comment 48 Hans Petter Selasky freebsd_committer 2019-01-18 11:44:07 UTC
See:
https://reviews.freebsd.org/D18887
Comment 49 commit-hook freebsd_committer 2019-01-24 08:16:44 UTC
A commit references this bug:

Author: hselasky
Date: Thu Jan 24 08:15:42 UTC 2019
New revision: 343392
URL: https://svnweb.freebsd.org/changeset/base/343392

Log:
  Fix duplicate acquiring of refcount when joining IPv6 multicast groups.
  This was observed by starting and stopping rpcbind(8) multiple times.

  PR:			233535
  Differential Revision:	https://reviews.freebsd.org/D18887
  Reviewed by:		bz (net)
  Tested by:		ae
  MFC after:		1 week
  Sponsored by:		Mellanox Technologies

Changes:
  head/sys/netinet6/in6_mcast.c
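How a duplicate acquire turns into a leak over repeated start/stop cycles can be sketched with a toy reference counter. This is an illustration under a stated assumption, namely that the join path takes two references where the leave path drops one; all names here are hypothetical and the code is not taken from in6_mcast.c.

```c
#include <assert.h>

struct group {
	int refcount;	/* references currently held on the membership */
};

static void
group_acquire(struct group *g)
{
	g->refcount++;
}

static void
group_release(struct group *g)
{
	assert(g->refcount > 0);
	g->refcount--;
}

/* Assumed buggy join: takes the reference twice. */
static void
join_double(struct group *g)
{
	group_acquire(g);
	group_acquire(g);
}

/* Join that takes exactly one reference. */
static void
join_single(struct group *g)
{
	group_acquire(g);
}

static void
leave(struct group *g)
{
	group_release(g);
}

/* Run `cycles` join/leave pairs and return the references left over. */
static int
leaked_after_cycles(void (*join)(struct group *), int cycles)
{
	struct group g = { .refcount = 0 };

	for (int i = 0; i < cycles; i++) {
		join(&g);
		leave(&g);
	}
	return (g.refcount);
}
```

In this model every cycle with the double acquire leaves one reference behind, so the object can never be freed; with a single acquire the count returns to zero after each cycle.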
Comment 50 commit-hook freebsd_committer 2019-01-24 08:18:52 UTC
A commit references this bug:

Author: hselasky
Date: Thu Jan 24 08:18:02 UTC 2019
New revision: 343393
URL: https://svnweb.freebsd.org/changeset/base/343393

Log:
  Add debugging sysctl to disable incoming MLD v2 messages similar to the
  existing sysctl for MLD v1 messages.

  PR:			233535
  Differential Revision:	https://reviews.freebsd.org/D18887
  Reviewed by:		bz (net)
  Tested by:		ae
  MFC after:		1 week
  Sponsored by:		Mellanox Technologies

Changes:
  head/sys/netinet6/mld6.c
Comment 51 commit-hook freebsd_committer 2019-01-24 08:26:02 UTC
A commit references this bug:

Author: hselasky
Date: Thu Jan 24 08:25:03 UTC 2019
New revision: 343394
URL: https://svnweb.freebsd.org/changeset/base/343394

Log:
  When detaching a network interface drain the workqueue freeing the inm's
  because the destructor will access the if_ioctl() callback in the ifnet
  pointer which is about to be freed. This prevents use-after-free.

  PR:			233535
  Differential Revision:	https://reviews.freebsd.org/D18887
  Reviewed by:		bz (net)
  Tested by:		ae
  MFC after:		1 week
  Sponsored by:		Mellanox Technologies

Changes:
  head/sys/netinet6/in6_ifattach.c
  head/sys/netinet6/in6_mcast.c
  head/sys/netinet6/in6_var.h
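The ordering requirement in this commit (drain pending work before the interface goes away) can be sketched as a userspace toy model. The names toy_ifnet, free_work, drain_free_queue() and toy_detach() are hypothetical, and the `freed` flag stands in for actually releasing memory so that the sketch stays well defined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/queue.h>

struct toy_ifnet {
	bool freed;		/* stands in for the ifnet memory being released */
	int ioctl_calls;	/* how often the "ioctl callback" was invoked */
};

/* One queued destructor call that still needs its interface. */
struct free_work {
	struct toy_ifnet *ifp;
	SLIST_ENTRY(free_work) link;
};

SLIST_HEAD(free_queue, free_work);

/*
 * Run every queued destructor. Each one touches the interface, so the
 * interface must still be alive; the assertion marks the spot where a
 * use-after-free would otherwise happen.
 */
static int
drain_free_queue(struct free_queue *q)
{
	struct free_work *w;
	int n = 0;

	while ((w = SLIST_FIRST(q)) != NULL) {
		SLIST_REMOVE_HEAD(q, link);
		assert(!w->ifp->freed);
		w->ifp->ioctl_calls++;
		n++;
	}
	return (n);
}

/* Detach: drain first, and only then let the interface go away. */
static void
toy_detach(struct free_queue *q, struct toy_ifnet *ifp)
{
	drain_free_queue(q);
	ifp->freed = true;
}
```

Swapping the two statements in toy_detach() would trip the assertion in drain_free_queue(), which is the toy equivalent of the destructor dereferencing a freed ifnet pointer.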
Comment 52 commit-hook freebsd_committer 2019-01-24 08:35:13 UTC
A commit references this bug:

Author: hselasky
Date: Thu Jan 24 08:34:14 UTC 2019
New revision: 343395
URL: https://svnweb.freebsd.org/changeset/base/343395

Log:
  Fix refcounting leaks in IPv6 MLD code leading to loss of IPv6
  connectivity.

  Looking at past changes in this area like r337866, some refcounting
  bugs have been introduced, one by one. One example is calling
  in6m_disconnect() and in6m_rele_locked() in mld_v1_process_group_timer(),
  where previously no disconnect or refcount decrement was done.
  Calling in6m_disconnect() when it shouldn't be called causes IPv6
  solicitation to stop working, because all the multicast addresses
  receiving the solicitation messages are deleted from the network interface.

  This patch reverts some recent changes while improving the MLD
  refcounting and concurrency model after the MLD code was converted
  to using EPOCH(9).

  List changes:
  - All CK_STAILQ_FOREACH() macros are now properly enclosed into
    EPOCH(9) sections. This simplifies assertion of locking inside
    in6m_ifmultiaddr_get_inm().
  - Corrected bad use of in6m_disconnect() leading to loss of IPv6
    connectivity for MLD v1.
  - Factored out checks for valid inm structure into
    in6m_ifmultiaddr_get_inm().

  PR:			233535
  Differential Revision:	https://reviews.freebsd.org/D18887
  Reviewed by:		bz (net)
  Tested by:		ae
  MFC after:		1 week
  Sponsored by:		Mellanox Technologies

Changes:
  head/sys/netinet6/in6_ifattach.c
  head/sys/netinet6/in6_mcast.c
  head/sys/netinet6/in6_var.h
  head/sys/netinet6/mld6.c
  head/sys/netinet6/mld6_var.h
Comment 53 commit-hook freebsd_committer 2019-02-01 09:06:17 UTC
A commit references this bug:

Author: hselasky
Date: Fri Feb  1 09:05:42 UTC 2019
New revision: 343647
URL: https://svnweb.freebsd.org/changeset/base/343647

Log:
  MFC r343392:
  Fix duplicate acquiring of refcount when joining IPv6 multicast groups.
  This was observed by starting and stopping rpcbind(8) multiple times.

  PR:			233535
  Differential Revision:	https://reviews.freebsd.org/D18887
  Reviewed by:		bz (net)
  Tested by:		ae
  Sponsored by:		Mellanox Technologies

Changes:
_U  stable/12/
  stable/12/sys/netinet6/in6_mcast.c
Comment 54 commit-hook freebsd_committer 2019-02-01 09:07:25 UTC
A commit references this bug:

Author: hselasky
Date: Fri Feb  1 09:06:40 UTC 2019
New revision: 343648
URL: https://svnweb.freebsd.org/changeset/base/343648

Log:
  MFC r343393:
  Add debugging sysctl to disable incoming MLD v2 messages similar to the
  existing sysctl for MLD v1 messages.

  PR:			233535
  Differential Revision:	https://reviews.freebsd.org/D18887
  Reviewed by:		bz (net)
  Tested by:		ae
  Sponsored by:		Mellanox Technologies

Changes:
_U  stable/12/
  stable/12/sys/netinet6/mld6.c
Comment 55 commit-hook freebsd_committer 2019-02-01 09:08:30 UTC
A commit references this bug:

Author: hselasky
Date: Fri Feb  1 09:07:28 UTC 2019
New revision: 343649
URL: https://svnweb.freebsd.org/changeset/base/343649

Log:
  MFC r343394:
  When detaching a network interface drain the workqueue freeing the inm's
  because the destructor will access the if_ioctl() callback in the ifnet
  pointer which is about to be freed. This prevents use-after-free.

  PR:			233535
  Differential Revision:	https://reviews.freebsd.org/D18887
  Reviewed by:		bz (net)
  Tested by:		ae
  Sponsored by:		Mellanox Technologies

Changes:
_U  stable/12/
  stable/12/sys/netinet6/in6_ifattach.c
  stable/12/sys/netinet6/in6_mcast.c
  stable/12/sys/netinet6/in6_var.h
Comment 56 commit-hook freebsd_committer 2019-02-01 09:08:34 UTC
A commit references this bug:

Author: hselasky
Date: Fri Feb  1 09:08:20 UTC 2019
New revision: 343650
URL: https://svnweb.freebsd.org/changeset/base/343650

Log:
  MFC r343395:
  Fix refcounting leaks in IPv6 MLD code leading to loss of IPv6
  connectivity.

  Looking at past changes in this area like r337866, some refcounting
  bugs have been introduced, one by one. One example is calling
  in6m_disconnect() and in6m_rele_locked() in mld_v1_process_group_timer(),
  where previously no disconnect or refcount decrement was done.
  Calling in6m_disconnect() when it shouldn't be called causes IPv6
  solicitation to stop working, because all the multicast addresses
  receiving the solicitation messages are deleted from the network interface.

  This patch reverts some recent changes while improving the MLD
  refcounting and concurrency model after the MLD code was converted
  to using EPOCH(9).

  List changes:
  - All CK_STAILQ_FOREACH() macros are now properly enclosed into
    EPOCH(9) sections. This simplifies assertion of locking inside
    in6m_ifmultiaddr_get_inm().
  - Corrected bad use of in6m_disconnect() leading to loss of IPv6
    connectivity for MLD v1.
  - Factored out checks for valid inm structure into
    in6m_ifmultiaddr_get_inm().

  PR:			233535
  Differential Revision:	https://reviews.freebsd.org/D18887
  Reviewed by:		bz (net)
  Tested by:		ae
  Sponsored by:		Mellanox Technologies

Changes:
_U  stable/12/
  stable/12/sys/netinet6/in6_ifattach.c
  stable/12/sys/netinet6/in6_mcast.c
  stable/12/sys/netinet6/in6_var.h
  stable/12/sys/netinet6/mld6.c
  stable/12/sys/netinet6/mld6_var.h