Hello,

I'm using FreeBSD 12.1-STABLE.

# uname -a
FreeBSD test.test.com 12.1-STABLE FreeBSD 12.1-STABLE #8 1f999e39f46(v2)-dirty: Wed Apr 22 08:40:36 +03 2020 test@test.test.com:/usr/obj/usr/src/amd64.amd64/sys/amd64

After restarting the multicast routing daemon, the daemon can't start and reports this error: "Failed adding VIF 1 (MRT_ADD_VIF) for iface em1: Address already in use". pimd tries to disable all vifs from the kernel (shown in the output below) but still throws the same error. I tried with both mrouted and pimd; both show the same behavior. My opinion is that the kernel doesn't disable the VIFs. I also opened a bug report for netstat -g (bug #246626); I think there is corruption in the multicast stack in the kernel. There is no problem on FreeBSD 11.2.

To reproduce the error:

# pimd
# killall pimd
# pimd -d -s debug
debug level 0xffffffff (dvmrp_detail,dvmrp_prunes,dvmrp_routes,dvmrp_neighbors,dvmrp_timers,igmp_proto,igmp_timers,igmp_members,trace,timeout,packets,interfaces,kernel,cache,rsrr,pim_detail,pim_hello,pim_register,pim_join_prune,pim_bootstrap,pim_asserts,pim_cand_rp,pim_routes,pim_timers,pim_rpf)
11:52:23.035 pimd version 2.3.2 starting ...
11:52:23.035 Got 262144 byte send buffer size in 0 iterations
11:52:23.035 Got 262144 byte recv buffer size in 0 iterations
11:52:23.035 Got 262144 byte send buffer size in 0 iterations
11:52:23.035 Got 262144 byte recv buffer size in 0 iterations
11:52:23.035 Getting vifs from kernel
11:52:23.035 Installing em0 (10.2.4.20 on subnet 10.2.4/24) as vif #0 - rate 0
11:52:23.035 Installing em1 (192.168.58.1 on subnet 192.168.58) as vif #1 - rate 0
11:52:23.035 Installing em2 (192.168.59.1 on subnet 192.168.59) as vif #2 - rate 0
11:52:23.035 Installing em1.1600 (192.168.16.1 on subnet 192.168.16) as vif #3 - rate 0
11:52:23.035 Installing em1.1700 (192.168.17.1 on subnet 192.168.17) as vif #4 - rate 0
11:52:23.035 Disabling all vifs from kernel
11:52:23.035 Getting vifs from /usr/local/etc//pimd.conf
11:52:23.035 Local Cand-BSR address 192.168.59.1, priority 5
11:52:23.035 Local Cand-RP address 192.168.59.1, priority 20, interval 30 sec
11:52:23.035 spt-threshold packets 0 interval 100
11:52:23.035 Local static RP: 169.254.0.1, group 232.0.0.0/8
11:52:23.035 IGMP query interval : 12 sec
11:52:23.035 IGMP querier timeout : 41 sec
11:52:23.035 **Failed adding VIF 1 (MRT_ADD_VIF) for iface em1: Address already in use**
There is no problem on FreeBSD 12.1-p5. The latest SVN base/stable/12 kernel produces this problem.
Problem started after this commit: https://svnweb.freebsd.org/base?view=revision&revision=356621
Notify committer of r356621.
Hello,

Many people are using pfSense, and some of them try to get multicast working using IGMP-proxy or PIMD. That simply does not work. I put a lot of effort into getting multicast to work and it is simply not possible. Reading this bug report, I am 99% sure that this is one of the underlying issues. That is also the verdict of jimp (pfSense Lead Designer). So please fix it with high priority!

During PIMD startup I notice:

1) After processing the first few (vlan) interfaces something goes wrong.

Jun 10 11:53:50 pfSense kernel: vlan4: changing name to 'em0.4'
Jun 10 11:53:49 pfSense kernel: vlan3: changing name to 'em0.6'
Jun 10 11:53:47 pfSense sshd[11688]: Server listening on 0.0.0.0 port 22.
Jun 10 11:53:47 pfSense sshd[11688]: Server listening on :: port 22.
>>> after starting PIMD the three vlans below get a VIF, the ones above DO NOT, so something is wrong here! <<
Jun 10 11:53:47 pfSense kernel: vlan2: changing name to 'lagg0.13'
Jun 10 11:53:47 pfSense kernel: vlan1: changing name to 'lagg0.26'
Jun 10 11:53:47 pfSense kernel: vlan0: changing name to 'lagg0.10'

2) Related things are going wrong with PIMD.

Jun 10 11:54:11 pfSense pimd[52647]: Getting vifs from /var/etc/pimd/pimd.conf
Jun 10 11:54:11 pfSense pimd[52647]: Disabling all vifs from kernel
Jun 10 11:54:11 pfSense pimd[52368]: Disabling all vifs from kernel
>> see here that only three vlans get vifs, exactly the ones from the beginning of the kernel startup and not the ones starting a bit later <<
Jun 10 11:54:11 pfSense pimd[52647]: Installing lagg0.13 (192.168.13.1 on subnet 192.168.13) as vif #2 - rate 0
Jun 10 11:54:11 pfSense pimd[52368]: Installing lagg0.13 (192.168.13.1 on subnet 192.168.13) as vif #2 - rate 0
Jun 10 11:54:11 pfSense pimd[52368]: Installing lagg0.26 (192.168.2.1 on subnet 192.168.2) as vif #1 - rate 0
Jun 10 11:54:11 pfSense pimd[52647]: Installing lagg0.26 (192.168.2.1 on subnet 192.168.2) as vif #1 - rate 0
Jun 10 11:54:11 pfSense pimd[52368]: Installing lagg0.10 (192.168.10.1 on subnet 192.168.10) as vif #0 - rate 0
Jun 10 11:54:11 pfSense pimd[52647]: Installing lagg0.10 (192.168.10.1 on subnet 192.168.10) as vif #0 - rate 0

3) Then there is a problem recognising interfaces.

Jun 10 11:54:11 pfSense pimd[52647]: /var/etc/pimd/pimd.conf:12 - Invalid phyint address 'ix1.116'
Jun 10 11:54:11 pfSense pimd[52647]: /var/etc/pimd/pimd.conf:11 - Invalid phyint address 'lagg0.16'
Jun 10 11:54:11 pfSense pimd[52647]: /var/etc/pimd/pimd.conf:10 - Invalid phyint address 'ix0.14'

4) And the result is:

Jun 10 11:54:11 pfSense pimd[52368]: Cannot forward: no enabled vifs

As already indicated, I expect that this is all related to this bug. I hope to see it fixed soon. The Importance field is definitely not correct!

Sincerely,
Louis
Created this account simply to mention this is also causing me some issues as well. Been trying to get my multicast-traffic to work correctly on pfsense 2.5.0 snapshot for the last month without luck. Their related issue can be found here: https://redmine.pfsense.org/issues/7727 Any ideas would be appreciated. I also have the mentioned errors found in these bug reports. Just wanted to advise and be able to keep an eye on any possible patches.
Sorry, I didn't notice the CC: on the bug. Let me take it (for now) and have a look.
(In reply to Bjoern A. Zeeb from comment #6)

Understood, it's no problem at all :) We appreciate you taking the time to look into it for us!
A quick initial guess (if anyone can test this before me) is this one-line change (should also apply to HEAD):

Index: sys/netinet/ip_mroute.c
===================================================================
--- sys/netinet/ip_mroute.c	(revision 362232)
+++ sys/netinet/ip_mroute.c	(working copy)
@@ -739,7 +739,7 @@ X_ip_mrouter_done(void)
 	    if_allmulti(ifp, 0);
 	}
     }
-    bzero((caddr_t)V_viftable, sizeof(V_viftable));
+    bzero((caddr_t)V_viftable, sizeof(V_viftable) * MAXVIFS);
     V_numvifs = 0;
     V_pim_assert_enabled = 0;
(In reply to Bjoern A. Zeeb from comment #8) The patch works; also hit an epoch panic while testing on HEAD. I'll commit after review and merge to stable/12 a few days after. Sorry for the breakage. /bz
(In reply to Bjoern A. Zeeb from comment #9)

You're the man. Thanks a million for fixing this for us. We understand that we all make mistakes; we are only human. We appreciate you taking the time to review and fix this, it makes a world of difference for us.
A commit references this bug:

Author: bz
Date: Wed Jun 17 21:04:39 UTC 2020
New revision: 362289
URL: https://svnweb.freebsd.org/changeset/base/362289

Log:
  When converting the static arrays to mallocarray() in r356621 I missed one
  place where we now need to multiply the size of the struct with the number
  of entries. This lead to problems when restarting user space daemons, as
  the cleanup was never properly done, resulting in MRT_ADD_VIF EADDRINUSE.
  Properly zero all array elements to avoid this problem.

  PR: 246629, 206583
  Reported by: (many)
  MFC after: 4 days
  Sponsored by: Rubicon Communications, LLC (d/b/a "Netgate")

Changes:
  head/sys/netinet/ip_mroute.c
Sorry for the breakage guys.

In case I need to reproduce this again:

(rc.conf)
vlans_igb0="vlan100 vlan101"
create_args_vlan100="vlan 100"
create_args_vlan101="vlan 101"
ifconfig_vlan100="192.0.2.1/24"
ifconfig_vlan101="203.0.113.1/24"

igmpproxy (install pkg) config (x.conf):
quickleave
phyint vlan100 upstream ratelimit 0 threshold 1
	altnet 192.0.2.0/24
phyint vlan101 downstream ratelimit 0 threshold 1
	altnet 203.0.113.0/24
phyint igb0 disabled
phyint igb1 disabled

(commands)
kldload ip_mroute
igmpproxy -dvvvvvvvv x.conf

wait briefly, ^c and restart again:
igmpproxy -dvvvvvvvv x.conf
A commit references this bug:

Author: bz
Date: Sun Jun 21 11:48:55 UTC 2020
New revision: 362465
URL: https://svnweb.freebsd.org/changeset/base/362465

Log:
  MFC r362289:

  When converting the static arrays to mallocarray() in r356621 I missed one
  place where we now need to multiply the size of the struct with the number
  of entries. This lead to problems when restarting user space daemons, as
  the cleanup was never properly done, resulting in MRT_ADD_VIF EADDRINUSE.
  Properly zero all array elements to avoid this problem.

  PR: 246629, 206583

Changes:
_U  stable/12/
  stable/12/sys/netinet/ip_mroute.c
Should all be fine again; the next snapshots or release, or a stable/12 rebuilt after this commit, should work again as expected. Sorry one more time for the breakage and for not immediately noticing this PR, and thanks to the pfSense people for pointing me at it.
Hello,

Thanks again! I also noticed that there was still a problem. Hopefully it is solved now.

However, one question. As you can see in the boot log I added on 14/6, there are also messages like:

Jun 10 11:54:11 pfSense pimd[52647]: /var/etc/pimd/pimd.conf:12 - Invalid phyint

I wonder whether these errors are related to this issue and are also solved now, or whether they are related to another issue/bug?

Sincerely,
Louis
(In reply to Louis from comment #15)

Sorry, Louis. I cannot say. Do you have the pimd.conf and the related rc.conf snippets, or at least an ifconfig -a or ifconfig -l from a started system? The names pimd complains about, the vlans created on the lagg interfaces, and the physical interfaces you pasted in do not seem to relate to each other, and sadly your comments don't make full sense to me out of context.

It might be wise to take this offline with me if you want, and we can open a different bug report if we think this is a different FreeBSD issue.
Hello,

I support your remark below, however I do not know how to do that :)

"It might be wise to take this offline with me if you want and we can open a different bug report if we think this is a different FreeBSD issue."

Further:
- My pfSense system has 9 vlans; only the first 3 seem to be recognised correctly.
- I did collect the info you were requesting and put it in a text file, however I do not know how to attach a file to this bug report (please let me know).
- I do hope that your latest patch will arrive in the pfSense snapshots, so that I can test it. I hope and assume soon (days).
- My feeling is that there is perhaps more than one bug. We have a small time slot and momentum now in which things can be fixed. If we do not use that, I am afraid that pfSense will not support multicast for many more months. So if there is more than one bug, we should identify that as quickly as possible!

The related bug report for pfSense is known as https://redmine.pfsense.org/issues/10558#change-46850

Louis
A commit references this bug:

Author: bz
Date: Sun Jun 21 22:09:30 UTC 2020
New revision: 362472
URL: https://svnweb.freebsd.org/changeset/base/362472

Log:
  Rather than zeroing MAXVIFS times size of pointer [r362289] (still better
  than sizeof pointer before [r354857]), we need to zero MAXVIFS times the
  size of the struct. All good things come in threes; I hope this is it on
  this one.

  PR: 246629, 206583
  Reported by: kib
  MFC after: ASAP

Changes:
  head/sys/netinet/ip_mroute.c
More eyes more fixes..
A commit references this bug:

Author: bz
Date: Mon Jun 22 10:52:31 UTC 2020
New revision: 362494
URL: https://svnweb.freebsd.org/changeset/base/362494

Log:
  MFC r362472:

  Rather than zeroing MAXVIFS times size of pointer [r362289] (still better
  than sizeof pointer before [r354857]), we need to zero MAXVIFS times the
  size of the struct. All good things come in threes; I hope this is it on
  this one.

  PR: 246629, 206583
  Reported by: kib

Changes:
  stable/12/sys/netinet/ip_mroute.c
*** Bug 248512 has been marked as a duplicate of this bug. ***