I'm running 11-CURRENT r283640. Starting syncthing (net/syncthing) works fine, but killing it produces a panic in mld_change_state(). There is an image of the panic here:
It seems to be reliably reproducible.
Disabling localAnnounceEnabled should avoid it; syncthing is announcing on [ff32::5222]:21026.
This bug is really annoying and makes it close to impossible to use syncthing on a local network. I'm happy to help with debugging if necessary.
Steps to reproduce:
1. turn on syncthing
2. local peer discovery is enabled by default
3. turn off syncthing or restart it
4. the system crashes about 9 times out of 10
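For anyone trying to narrow this down without syncthing, the steps above might be approximated by a small script that joins the announce group and then closes the socket. This is an unverified sketch: it assumes the panic is triggered by the MLD leave that happens when the socket is closed, which nobody in this thread has confirmed. The function names and the interface name in the usage note are made up for illustration; only the group address and port come from the report.

```python
import socket
import struct

GROUP = "ff32::5222"  # group address quoted in the report
PORT = 21026          # port quoted in the report


def build_mreq(group: str, ifindex: int) -> bytes:
    """Pack a struct ipv6_mreq: 16-byte group address, then unsigned int ifindex."""
    return socket.inet_pton(socket.AF_INET6, group) + struct.pack("@I", ifindex)


def join_and_leave(ifname: str) -> None:
    """Join GROUP on the given interface, then close the socket.

    Assumption: closing the socket drops the membership, which is the
    closest userland equivalent to killing syncthing.
    """
    ifindex = socket.if_nametoindex(ifname)
    sock = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("::", PORT))
        sock.setsockopt(
            socket.IPPROTO_IPV6,
            socket.IPV6_JOIN_GROUP,
            build_mreq(GROUP, ifindex),
        )
    finally:
        sock.close()
```

Calling join_and_leave("em0") in a loop (the interface name is a placeholder) would exercise the join/leave path repeatedly. Run it only on a test machine, since an affected kernel may panic.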
The same problem exists on 10.3-RELEASE-p5 using any recent version of Syncthing (from the version available in the 2016Q2 ports branch/pkg repo to the latest version available on Syncthing's website at the time of this posting). Syncthing developer states this is a FreeBSD kernel bug.
Also confirmed that disabling the 'localAnnounceEnabled' option is a temporary workaround, but makes the product somewhat useless in LAN environments.
It's no longer reproducible on 32-bit FreeBSD 11 with the latest syncthing installed by pkg.
Fixed as noted in PR 202978.
Re-open. Will add more context/information shortly.
Canonicalize summary including info from bug 202978 and bug 201913 (duplicates)
Note 1: This was reproducible using net/syncthing > 0.11.18 (including 0.11.23), noticed after a port upgrade between those two versions, until some later version when upstream issue #2090 was resolved.
Note 2: The upstream commit *only* changed the default IPv6 multicast address. This issue is about the underlying problem: userland being able to cause a kernel/host crash.
This probably needs much more attention given the scope of versions that were reported to be affected, and the apparent triviality of the causing factor.
@George, can you cc individuals / re-assign the issue as necessary, please?
*** Bug 201913 has been marked as a duplicate of this bug. ***
*** Bug 202978 has been marked as a duplicate of this bug. ***
Created attachment 173704
panic backtrace 10.2-RELEASE / syncthing: 0.12.17
Add upstream forum thread reference (containing stacktrace) and attach here for completeness
In my comment 6, "until some later version when upstream issue #2090 was resolved" may not be the case, as per Charles's comment 2.
@Charles, can you please provide more detail on your system configuration that is affected? In particular:
- The latest version of FreeBSD you have reproduced the issue on
- The latest version of net/syncthing you have reproduced the issue with (Please also specify whether: port, package, latest or quarterly, or upstream)
If you can provide a gdb backtrace as an attachment, that would also be fantastic.
We don't do anything with the P (Prefix) or T (Transient) bits in IPv6 multicast in FreeBSD. So it's unclear how this could have affected a resolution of the issue, and this is what makes it very difficult to draw any conclusions from the upstream change for #2090 apparently resolving an issue.
Consider: the multicast address scope does not change; the first 16 bits of the address syncthing uses remain FFX2. The FF denotes multicast; the 2 nibble denotes link-local.
However, the bits in nibble X do change. Link-local groups normally set X=0. syncthing pivots between X=1 (ipv6 group is transient and not well known) and X=3 (transient group, based on unicast prefix).
But nothing I've seen in FreeBSD directly references these bits.
Disclaimer: I haven't observed or reproduced the issue myself, and it's been many, many years since I wrote this code. It seems to me that it could have been triggered by a race elsewhere; obviously, this isn't going to show up in the kernel backtrace posted to syncthing's support forums, as is the nature of races.
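To make the nibble layout described above concrete, here is a minimal sketch that splits an IPv6 multicast address into its flags nibble (X) and scope nibble. It assumes the standard 0RPT flag layout, with T as the lowest bit and P as the next one; the function name multicast_fields is invented for this example and is not a FreeBSD or syncthing API.

```python
import ipaddress


def multicast_fields(addr: str) -> dict:
    """Decode the flags and scope nibbles of an IPv6 multicast address."""
    packed = ipaddress.IPv6Address(addr).packed
    if packed[0] != 0xFF:
        raise ValueError(f"{addr} is not an IPv6 multicast address")
    flags = packed[1] >> 4    # the X nibble
    scope = packed[1] & 0x0F  # 2 means link-local
    return {
        "flags": flags,
        "scope": scope,
        "transient": bool(flags & 0x1),     # T bit
        "prefix_based": bool(flags & 0x2),  # P bit
        "link_local": scope == 0x2,
    }
```

For the address in the original report, multicast_fields("ff32::5222") gives flags 3 (both P and T set) and scope 2, while an address starting with ff12 gives flags 1 (T only) with the same scope.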
Is there someone who is able to reproduce this panic and can provide some debug info?
1. Do you use some sort of VPN that creates/destroys interfaces?
2. Can you save a core dump from this panic, and then run
# kgdb /boot/kernel/kernel /var/crash/vmcore.N
(kgdb) l *0xfffffffxxxxxxx
where 0xfffffffxxxxxxx is the address from the panic message "instruction pointer = 0x20:0xfffffffxxxxxxx"
*** Bug 213953 has been marked as a duplicate of this bug. ***