Bug 242677 - multicast: setsockopt(...IP_DROP_MEMBERSHIP...) doesn't lead to sending IGMP packet after base r349369
Summary: multicast: setsockopt(...IP_DROP_MEMBERSHIP...) doesn't lead to sending IGMP ...
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: CURRENT
Hardware: Any Any
: --- Affects Many People
Assignee: Hans Petter Selasky
URL: https://reviews.freebsd.org/D22848
Keywords: regression
Depends on:
Blocks:
 
Reported: 2019-12-17 07:26 UTC by Aleksandr Fedorov
Modified: 2019-12-25 09:35 UTC (History)
7 users (show)

See Also:
koobs: mfc-stable12?
koobs: mfc-stable11-


Attachments
Test code. (1.73 KB, text/plain)
2019-12-17 07:26 UTC, Aleksandr Fedorov
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Aleksandr Fedorov freebsd_committer 2019-12-17 07:26:53 UTC
Created attachment 210002 [details]
Test code.

Revision 355322 fixes multicast in case of closing the socket.

But the problem remains if you try to exit the multicast group before closing the socket.

If you run test code from attachment.
#./mcasttest
Add membership
igb1:
        inet 192.168.1.55
        igmpv3 rv 2 qi 125 qri 10 uri 3
                group 239.0.0.5 mode exclude
                        mcast-macaddr 01:00:5e:00:00:05
                group 224.0.0.1 mode exclude
                        mcast-macaddr 01:00:5e:00:00:01
Drop membership
igb1:
        inet 192.168.1.55
        igmpv3 rv 2 qi 125 qri 10 uri 3
                group 224.0.0.1 mode exclude
                        mcast-macaddr 01:00:5e:00:00:01
Add membership
igb1:
        inet 192.168.1.55
        igmpv3 rv 2 qi 125 qri 10 uri 3
                group 239.0.0.5 mode exclude
                        mcast-macaddr 01:00:5e:00:00:05
                group 224.0.0.1 mode exclude
                        mcast-macaddr 01:00:5e:00:00:01
Drop membership
igb1:
        inet 192.168.1.55
        igmpv3 rv 2 qi 125 qri 10 uri 3
                group 224.0.0.1 mode exclude
                        mcast-macaddr 01:00:5e:00:00:01


As you can see, the ADD/DROP membership to/from the group is successful.

But, there are no IGMP leave packets on igb1:
# tcpdump -i igb1 -vvv
tcpdump: listening on igb1, link-type EN10MB (Ethernet), capture size 262144 bytes
08:14:31.403346 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.1.55 tell 192.168.1.55, length 28
08:14:33.678193 IP (tos 0xc0, ttl 1, id 0, offset 0, flags [DF], proto IGMP (2), length 40, options (RA))
    192.168.1.55 > igmp.mcast.net: igmp v3 report, 1 group record(s) [gaddr 239.0.0.5 to_ex { }]
08:14:34.526191 IP (tos 0xc0, ttl 1, id 0, offset 0, flags [DF], proto IGMP (2), length 40, options (RA))
    192.168.1.55 > igmp.mcast.net: igmp v3 report, 1 group record(s) [gaddr 239.0.0.5 to_ex { }]
08:14:37.905173 IP (tos 0xc0, ttl 1, id 0, offset 0, flags [DF], proto IGMP (2), length 40, options (RA))
    192.168.1.55 > igmp.mcast.net: igmp v3 report, 1 group record(s) [gaddr 239.0.0.5 to_ex { }]
08:14:39.177149 IP (tos 0xc0, ttl 1, id 0, offset 0, flags [DF], proto IGMP (2), length 40, options (RA))
    192.168.1.55 > igmp.mcast.net: igmp v3 report, 1 group record(s) [gaddr 239.0.0.5 to_ex { }]

It seems that the function inp_leave_group() is broken:

2409 	        if (is_final) {
2410 	                ip_mfilter_remove(&imo->imo_head, imf);

(1) For IP_DROP_MEMBERSHIP is_final is true. Mark socket-layer filter set as INCLUDE {} at t1.

2411 	                imf_leave(imf);
2412 	        } else {
2413 	                if (imf->imf_st[0] == MCAST_EXCLUDE) {
2414 	                        error = EADDRNOTAVAIL;
2415 	                        goto out_inp_locked;
2416 	                }
2417 	                ims = imo_match_source(imf, &ssa->sa);
2418 	                if (ims == NULL) {
2419 	                        CTR3(KTR_IGMPV3, "%s: source 0x%08x %spresent",
2420 	                            __func__, ntohl(ssa->sin.sin_addr.s_addr), "not ");
2421 	                        error = EADDRNOTAVAIL;
2422 	                        goto out_inp_locked;
2423 	                }
2424 	                CTR2(KTR_IGMPV3, "%s: %s source", __func__, "block");
2425 	                error = imf_prune(imf, &ssa->sin);
2426 	                if (error) {
2427 	                        CTR1(KTR_IGMPV3, "%s: merge imf state failed",
2428 	                            __func__);
2429 	                        goto out_inp_locked;
2430 	                }
2431 	        }
2432 	
....

(2) Commit and reap socket filter deltas.

2460 	        imf_commit(imf);
2461 	        imf_reap(imf);
2462 	
2463 	out_inp_locked:
2464 	        INP_WUNLOCK(inp);
2465 	
2466 	        if (is_final && imf) {

2467 	                /*
2468 	                 * Give up the multicast address record to which
2469 	                 * the membership points.
2470 	                 */

(3) in_leavegroup_locked() call igmp_change_state(). But, because source filter deltas already committed and reaped, the call igmp_change_state() doesn't lead to IGMP state transition and sending IGMP packets.

2471 	                (void) in_leavegroup_locked(imf->imf_inm, imf);
2472 	                ip_mfilter_free(imf);
2473 	        }
2474 	
2475 	        IN_MULTI_UNLOCK();
2476 	        return (error);
2477 	}

What do you think about move the in_leavegroup_locked() call before commit and reap?
Something like this:

Index: sys/netinet/in_mcast.c
===================================================================
--- sys/netinet/in_mcast.c      (revision 355800)
+++ sys/netinet/in_mcast.c      (working copy)
@@ -2409,6 +2409,7 @@
        if (is_final) {
                ip_mfilter_remove(&imo->imo_head, imf);
                imf_leave(imf);
+               (void) in_leavegroup_locked(imf->imf_inm, imf);
        } else {
                if (imf->imf_st[0] == MCAST_EXCLUDE) {
                        error = EADDRNOTAVAIL;
@@ -2468,7 +2469,6 @@
                 * Give up the multicast address record to which
                 * the membership points.
                 */
-               (void) in_leavegroup_locked(imf->imf_inm, imf);
                ip_mfilter_free(imf);
        }

With this patch.

# ./mcasttest
Add membership
igb1:
        inet 192.168.1.55
        igmpv3 rv 2 qi 125 qri 10 uri 3
                group 239.0.0.5 mode exclude
                        mcast-macaddr 01:00:5e:00:00:05
                group 224.0.0.1 mode exclude
                        mcast-macaddr 01:00:5e:00:00:01
Drop membership
igb1:
        inet 192.168.1.55
        igmpv3 rv 2 qi 125 qri 10 uri 3
                group 224.0.0.1 mode exclude
                        mcast-macaddr 01:00:5e:00:00:01
Add membership
igb1:
        inet 192.168.1.55
        igmpv3 rv 2 qi 125 qri 10 uri 3
                group 239.0.0.5 mode exclude
                        mcast-macaddr 01:00:5e:00:00:05
                group 224.0.0.1 mode exclude
                        mcast-macaddr 01:00:5e:00:00:01
Drop membership
igb1:
        inet 192.168.1.55
        igmpv3 rv 2 qi 125 qri 10 uri 3
                group 224.0.0.1 mode exclude
                        mcast-macaddr 01:00:5e:00:00:01

And tcpdump shows:
# tcpdump -i igb1 -vvv
tcpdump: listening on igb1, link-type EN10MB (Ethernet), capture size 262144 bytes
10:19:22.059334 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.1.55 tell 192.168.1.55, length 28
10:19:24.232247 IP (tos 0xc0, ttl 1, id 0, offset 0, flags [DF], proto IGMP (2), length 40, options (RA))
    192.168.1.55 > igmp.mcast.net: igmp v3 report, 1 group record(s) [gaddr 239.0.0.5 to_ex { }]
10:19:24.861942 IP (tos 0xc0, ttl 1, id 0, offset 0, flags [DF], proto IGMP (2), length 40, options (RA))
    192.168.1.55 > igmp.mcast.net: igmp v3 report, 1 group record(s) [gaddr 239.0.0.5 to_ex { }]
10:19:29.502570 IP (tos 0xc0, ttl 1, id 0, offset 0, flags [DF], proto IGMP (2), length 40, options (RA))
    192.168.1.55 > igmp.mcast.net: igmp v3 report, 1 group record(s) [gaddr 239.0.0.5 to_in { }]
10:19:31.807624 IP (tos 0xc0, ttl 1, id 0, offset 0, flags [DF], proto IGMP (2), length 40, options (RA))
    192.168.1.55 > igmp.mcast.net: igmp v3 report, 1 group record(s) [gaddr 239.0.0.5 to_in { }]
10:19:34.747646 IP (tos 0xc0, ttl 1, id 0, offset 0, flags [DF], proto IGMP (2), length 40, options (RA))
    192.168.1.55 > igmp.mcast.net: igmp v3 report, 1 group record(s) [gaddr 239.0.0.5 to_ex { }]
10:19:35.595604 IP (tos 0xc0, ttl 1, id 0, offset 0, flags [DF], proto IGMP (2), length 40, options (RA))
    192.168.1.55 > igmp.mcast.net: igmp v3 report, 1 group record(s) [gaddr 239.0.0.5 to_ex { }]
10:19:39.818577 IP (tos 0xc0, ttl 1, id 0, offset 0, flags [DF], proto IGMP (2), length 40, options (RA))
    192.168.1.55 > igmp.mcast.net: igmp v3 report, 1 group record(s) [gaddr 239.0.0.5 to_in { }]
10:19:40.242591 IP (tos 0xc0, ttl 1, id 0, offset 0, flags [DF], proto IGMP (2), length 40, options (RA))
    192.168.1.55 > igmp.mcast.net: igmp v3 report, 1 group record(s) [gaddr 239.0.0.5 to_in { }]
Comment 1 Hans Petter Selasky freebsd_committer 2019-12-17 08:44:38 UTC
> What do you think about move the in_leavegroup_locked() call before commit and reap? Something like this:

This patch probably also applies to the IPv6 multicast code. Did you check?

Is this a regression issue?

--HPS
Comment 2 Hans Petter Selasky freebsd_committer 2019-12-17 09:45:59 UTC
Also check this statistics that you don't have a memory leak:
vmstat -m | grep multi
Comment 3 Hans Petter Selasky freebsd_committer 2019-12-17 10:01:48 UTC
Can you test this patch:

https://reviews.freebsd.org/D22848

Both IPv4 and IPv6, thank you.
Comment 4 Aleksandr Fedorov freebsd_committer 2019-12-17 13:17:01 UTC
It seems that regression was introduced at r349369: https://svnweb.freebsd.org/base/head/sys/netinet/in_mcast.c?r1=347691&r2=349369&pathrev=349369

I tested ipv6 version with your patch, it's look good.

Before patch:
# tcpdump -i igb1 -vvv
tcpdump: listening on igb1, link-type EN10MB (Ethernet), capture size 262144 bytes
16:12:20.141207 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.1.55 tell 192.168.1.55, length 28
16:12:22.359877 IP6 (hlim 1, next-header Options (0) payload length: 36) fe80::aa1e:84ff:fe93:d5f5 > ff02::16: HBH (pa
dn)(rtalert: 0x0000)  [icmp6 sum ok] ICMP6, multicast listener report v2, 1 group record(s) [gaddr ff12::1 to_ex { }]
16:12:24.479711 IP6 (hlim 1, next-header Options (0) payload length: 36) fe80::aa1e:84ff:fe93:d5f5 > ff02::16: HBH (pa
dn)(rtalert: 0x0000)  [icmp6 sum ok] ICMP6, multicast listener report v2, 1 group record(s) [gaddr ff12::1 to_ex { }]
16:12:32.945688 IP6 (hlim 1, next-header Options (0) payload length: 36) fe80::aa1e:84ff:fe93:d5f5 > ff02::16: HBH (pa
dn)(rtalert: 0x0000)  [icmp6 sum ok] ICMP6, multicast listener report v2, 1 group record(s) [gaddr ff12::1 to_ex { }]
16:12:33.793701 IP6 (hlim 1, next-header Options (0) payload length: 36) fe80::aa1e:84ff:fe93:d5f5 > ff02::16: HBH (pa
dn)(rtalert: 0x0000)  [icmp6 sum ok] ICMP6, multicast listener report v2, 1 group record(s) [gaddr ff12::1 to_ex { }]

After patch:
# tcpdump -i igb1 -vvv
tcpdump: listening on igb1, link-type EN10MB (Ethernet), capture size 262144 bytes
15:57:34.351600 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.1.55 tell 192.168.1.55, length 28
15:57:36.607299 IP6 (hlim 1, next-header Options (0) payload length: 36) fe80::aa1e:84ff:fe93:d5f5 > ff02::16: HBH (pa
dn)(rtalert: 0x0000)  [icmp6 sum ok] ICMP6, multicast listener report v2, 1 group record(s) [gaddr ff12::1 to_ex { }]
15:57:37.031254 IP6 (hlim 1, next-header Options (0) payload length: 36) fe80::aa1e:84ff:fe93:d5f5 > ff02::16: HBH (pa
dn)(rtalert: 0x0000)  [icmp6 sum ok] ICMP6, multicast listener report v2, 1 group record(s) [gaddr ff12::1 to_ex { }]
15:57:41.676026 IP6 (hlim 1, next-header Options (0) payload length: 36) fe80::aa1e:84ff:fe93:d5f5 > ff02::16: HBH (pa
dn)(rtalert: 0x0000)  [icmp6 sum ok] ICMP6, multicast listener report v2, 1 group record(s) [gaddr ff12::1 block { }]
15:57:43.158268 IP6 (hlim 1, next-header Options (0) payload length: 36) fe80::aa1e:84ff:fe93:d5f5 > ff02::16: HBH (pa
dn)(rtalert: 0x0000)  [icmp6 sum ok] ICMP6, multicast listener report v2, 1 group record(s) [gaddr ff12::1 block { }]
15:57:46.941264 IP6 (hlim 1, next-header Options (0) payload length: 36) fe80::aa1e:84ff:fe93:d5f5 > ff02::16: HBH (pa
dn)(rtalert: 0x0000)  [icmp6 sum ok] ICMP6, multicast listener report v2, 1 group record(s) [gaddr ff12::1 to_ex { }]
15:57:50.121248 IP6 (hlim 1, next-header Options (0) payload length: 36) fe80::aa1e:84ff:fe93:d5f5 > ff02::16: HBH (pa
dn)(rtalert: 0x0000)  [icmp6 sum ok] ICMP6, multicast listener report v2, 1 group record(s) [gaddr ff12::1 to_ex { }]
15:57:52.029263 IP6 (hlim 1, next-header Options (0) payload length: 36) fe80::aa1e:84ff:fe93:d5f5 > ff02::16: HBH (pa
dn)(rtalert: 0x0000)  [icmp6 sum ok] ICMP6, multicast listener report v2, 1 group record(s) [gaddr ff12::1 block { }]
15:57:54.139271 IP6 (hlim 1, next-header Options (0) payload length: 36) fe80::aa1e:84ff:fe93:d5f5 > ff02::16: HBH (pa
dn)(rtalert: 0x0000)  [icmp6 sum ok] ICMP6, multicast listener report v2, 1 group record(s) [gaddr ff12::1 block { }]

I also checked vmstat -m |grep multi and didn't found any memory leaks.

Thanks!
Comment 5 Hans Petter Selasky freebsd_committer 2019-12-18 12:03:23 UTC
Guido: Please give https://reviews.freebsd.org/D22848 a spin.

--HPS
Comment 6 Hans Petter Selasky freebsd_committer 2019-12-18 12:07:06 UTC
Let me know if there are any more issues.
Comment 7 commit-hook freebsd_committer 2019-12-18 12:07:10 UTC
A commit references this bug:

Author: hselasky
Date: Wed Dec 18 12:06:35 UTC 2019
New revision: 355881
URL: https://svnweb.freebsd.org/changeset/base/355881

Log:
  Leave multicast group before reaping and committing state for both
  IPv4 and IPv6.

  This fixes a regression issue after r349369. When trying to exit a
  multicast group before closing the socket, a multicast leave packet
  should be sent.

  Differential Revision:	https://reviews.freebsd.org/D22848
  PR: 242677
  Reviewed by:	bz (network)
  Tested by:	Aleksandr Fedorov <aleksandr.fedorov@itglobal.com>
  MFC after:	1 week
  Sponsored by:	Mellanox Technologies

Changes:
  head/sys/netinet/in_mcast.c
  head/sys/netinet6/in6_mcast.c
Comment 8 guido 2019-12-18 15:27:27 UTC
(In reply to Hans Petter Selasky from comment #5)
This patch (applied to 12.1) fixes the remaining multicast issues I had.
Comment 9 Kubilay Kocak freebsd_committer freebsd_triage 2019-12-24 00:11:34 UTC
^Triage: 

- Re-open pending MFC
- Set mfc-stable11 to (assuming only stable/12 regressed). If incorrect, please set flag to ? and + on merge to stable/11
Comment 10 commit-hook freebsd_committer 2019-12-25 09:26:17 UTC
A commit references this bug:

Author: hselasky
Date: Wed Dec 25 09:25:21 UTC 2019
New revision: 356069
URL: https://svnweb.freebsd.org/changeset/base/356069

Log:
  MFC r355881:
  Leave multicast group before reaping and committing state for both
  IPv4 and IPv6.

  This fixes a regression issue after r349369. When trying to exit a
  multicast group before closing the socket, a multicast leave packet
  should be sent.

  Differential Revision:	https://reviews.freebsd.org/D22848
  PR: 242677
  Reviewed by:	bz (network)
  Tested by:	Aleksandr Fedorov <aleksandr.fedorov@itglobal.com>
  Sponsored by:	Mellanox Technologies

Changes:
_U  stable/12/
  stable/12/sys/netinet/in_mcast.c
  stable/12/sys/netinet6/in6_mcast.c
Comment 11 Hans Petter Selasky freebsd_committer 2019-12-25 09:35:55 UTC
I think only FreeBSD 12-stable is relevant for this patch.