Bug 235005 - r342051 "pfsync: Performance improvement" breaks CARP when used with pfsync
Summary: r342051 "pfsync: Performance improvement" breaks CARP when used with pfsync
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 12.0-STABLE
Hardware: Any Any
: --- Affects Some People
Assignee: Kristof Provost
URL: https://reviews.freebsd.org/D18882
Keywords: regression
Depends on:
Blocks:
 
Reported: 2019-01-16 19:16 UTC by Thomas Steen Rasmussen / Tykling
Modified: 2019-01-21 00:33 UTC (History)
3 users (show)

See Also:
koobs: mfc-stable12?


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Thomas Steen Rasmussen / Tykling 2019-01-16 19:16:10 UTC
After quite a few buildworlds+kernel and reboots I've managed to isolate base r342051 "pfsync: Performance improvement" as the reason why lagg stopped working for me.

I've been building a couple of carp+pf routers/firewalls, originally with 12-BETA2 but they were recently upgraded to 12-STABLE base r342254 which is when both carp nodes started being MASTER instead of one MASTER and one BACKUP node.

The notes from my bisecting are below. All tests are with the same configuration. As you can see, base r342051 is the commit where it broke.

12-STABLE base r339946 MASTER/BACKUP
12-STABLE base r341100 MASTER/BACKUP
12-STABLE base r341677 MASTER/BACKUP
12-STABLE base r341965 MASTER/BACKUP
12-STABLE base r342037 MASTER/BACKUP
12-STABLE base r342050 MASTER/BACKUP
12-STABLE base r342051 MASTER/MASTER
12-STABLE base r342055 MASTER/MASTER
12-STABLE base r342073 MASTER/MASTER
12-STABLE base r342109 MASTER/MASTER
12-STABLE base r342254 MASTER/MASTER

I've further confirmed pfsync to be at fault, when pfsync is not enabled the two nodes are MASTER and BACKUP as they should be. Immediately after I start pfsync the BACKUP node becomes MASTER and logs these messages:

Jan 16 16:34:56 fwclu2b kernel: carp: demoted by -240 to -240 (pfsync bulk done)
Jan 16 16:34:56 fwclu2b kernel: carp: 1 at lagg2.52: BACKUP -> MASTER (preempting a slower master)
Jan 16 16:34:56 fwclu2b kernel: carp: 1 at lagg2.51: BACKUP -> MASTER (preempting a slower master)
Jan 16 16:34:56 fwclu2b kernel: carp: 1 at lagg3: BACKUP -> MASTER (preempting a slower master)

...but the MASTER also stays MASTER, and chaos ensues, nothing works on the network. Stopping pfsync doesn't resolve the situation, only a reboot with pfsync disabled restores normal carp functionality.

I suggest maybe backing out base r342051 while we investigate the cause, if a fix can't be found quickly. I suspect it could have something to do with the pfsync carp demotion code, which the log messages above seem to confirm, but I don't know.

Let me know if further info is needed about my configuration or anything. See also this thread on -stable https://lists.freebsd.org/pipermail/freebsd-stable/2019-January/090421.html which confirms I am not the only one experiencing this.
Comment 1 Kristof Provost freebsd_committer freebsd_triage 2019-01-16 19:30:32 UTC
Can you describe your test setup?
Comment 2 Thomas Steen Rasmussen / Tykling 2019-01-16 19:49:51 UTC
Sure I guess, they are routers connected to many different networks, some of them with carp shared IPs. Real servers connected to real switches.

Both ipv4 and ipv6 configured. carp sits either directly on lagg or on top of vlan interfaces on top of lagg interfaces on top of the actual interfaces.

A bit of a complex setup, anything in particular you are looking for?
Comment 3 Kristof Provost freebsd_committer freebsd_triage 2019-01-16 20:22:13 UTC
Ideally as small a testcase as possible to reproduce the problem. The commit you identified does not directly touch any carp related code, and shouldn’t change pfsync behaviour, so I’ll have to reproduce the problem to be able to investigate it.
Comment 4 Thomas Steen Rasmussen / Tykling 2019-01-16 20:28:43 UTC
Have you tried just setting up two carp nodes and enabling pfsync?
Comment 5 Kristof Provost freebsd_committer freebsd_triage 2019-01-16 20:48:34 UTC
Not yet, no. I’m currently traveling and my computer time is severely limited.
Comment 6 Thomas Steen Rasmussen / Tykling 2019-01-16 21:09:53 UTC
OK, well, if you have trouble reproducing let me know and I will see what I can do :)

Meanwhile - and I don't know what the policy usually is here - but I would suggest backing out this change for now. It is a "only" a performance improvement, and it carries a significant regression.

I can keep the firewalls in this state for a week more at the most. Then I will have to continue with getting them into production. So any testing I need to do has to be before Wednesday, January 23rd.
Comment 7 Pete French 2019-01-16 22:29:46 UTC
Just to add another data point, I hit the same issue as Thomas, on top of the network interfaces directly, not including lagg. I too have IPv6 enabled, along with pf and pfsync. Machines without pfsync behave fine.
Comment 8 Kristof Provost freebsd_committer freebsd_triage 2019-01-17 05:28:58 UTC
So far I'm not able to reproduce this problem (on current HEAD).
I'm using vnet jails (because I obviously don't travel with several test devices on me), but that shouldn't affect matters, as far as I can tell.

Can you confirm that backing out just r342051 on top of the latest HEAD shows the bad behaviour?

Is there anything else in the dmesg output? The only way pfsync should affect carp is through a demotion adjustment callback, which should be logged.

Can you also include the ifconfig output for the carp interfaces on both hosts?

It might also be interesting to check if you're seeing the VRRP messages from the other host on each host.
Comment 9 Thomas Steen Rasmussen / Tykling 2019-01-17 08:21:27 UTC
But it *does* log it, I included the log messages previously. Here they are again:


Jan 16 16:34:56 fwclu2b kernel: carp: demoted by -240 to -240 (pfsync bulk done)
Jan 16 16:34:56 fwclu2b kernel: carp: 1 at lagg2.52: BACKUP -> MASTER (preempting a slower master)
Jan 16 16:34:56 fwclu2b kernel: carp: 1 at lagg2.51: BACKUP -> MASTER (preempting a slower master)
Jan 16 16:34:56 fwclu2b kernel: carp: 1 at lagg3: BACKUP -> MASTER (preempting a slower master)

This is from the BACKUP node the same moment I started pfsync. It then becomes MASTER, but the former MASTER also stays MASTER.

We are of course seeing VRRP messages, because otherwise the MASTER/BACKUP negotiation would not work (when pfsync is stopped or r342051 is not applied).

I will try to disable some interfaces to see if I can get to a less complex setup which still shows the issue.

If I was to include full ifconfig output it would have to be redacted. Lets see if we can make do without for now.
Comment 10 Kristof Provost freebsd_committer freebsd_triage 2019-01-17 08:57:38 UTC
(In reply to Thomas Steen Rasmussen / Tykling from comment #9)
There is indeed a single report from pfsync, which I'd expect. The only way I can currently see that it could keep us in carp master state is for it to keep happening, and that doesn't appear to be the case. As soon as it gets a report from its peer is should degrade to backup, shouldn't it?

Please do supply the ifconfig output. The IP addresses can probably be censored (even though I'll never understand why that would be sensitive information).

Has anyone tried reverting that single commit on HEAD and confirmed that the problem goes away then?
Comment 11 Thomas Steen Rasmussen / Tykling 2019-01-17 20:59:14 UTC
(In reply to Kristof Provost from comment #10)

To try to isolate the problem I have disabled all interfaces except for the pfsync interface and a single real interface with 1 v4 carp IP, and the problem persists. 

This means the issue is:
- not IPv6 related
- not related to multiple carp interfaces
- not related to multiple carp IPs

So a relatively simple test setup should be able to replicate it, unless drivers are somehow involved.

Ifconfig for both systems sync interface (lagg0) and carp interface (lagg2.51) is below:

[tykling@fwclu2a ~]$ ifconfig lagg2.51
lagg2.51: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=600703<RXCSUM,TXCSUM,TSO4,TSO6,LRO,RXCSUM_IPV6,TXCSUM_IPV6>
        ether a0:36:9f:xx:yy:zz
        inet X.Y.Z.232 netmask 0xffffff00 broadcast X.Y.Z.255 
        inet X.Y.Z.231 netmask 0xffffffff broadcast X.Y.Z.231 vhid 1 
        groups: vlan 
        carp: MASTER vhid 1 advbase 1 advskew 100
        vlan: 51 vlanpcp: 0 parent interface: lagg2
        media: Ethernet autoselect
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
[tykling@fwclu2a ~]$ ifconfig lagg0
lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=e507bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
        ether a0:36:9f:xx:yy:zz
        inet 172.16.16.232 netmask 0xffffff00 broadcast 172.16.16.255 
        laggproto failover lagghash l2,l3,l4
        laggport: igb7 flags=5<MASTER,ACTIVE>
        laggport: igb9 flags=0<>
        groups: lagg 
        media: Ethernet autoselect
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
[tykling@fwclu2a ~]$ 

[tykling@fwclu2b ~]$ ifconfig lagg2.51
lagg2.51: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=600703<RXCSUM,TXCSUM,TSO4,TSO6,LRO,RXCSUM_IPV6,TXCSUM_IPV6>
        ether a0:36:9f:xx:yy:zz
        inet X.Y.Z.233 netmask 0xffffff00 broadcast X.Y.Z.255 
        inet X.Y.Z.231 netmask 0xffffffff broadcast X.Y.Z.231 vhid 1 
        groups: vlan 
        carp: MASTER vhid 1 advbase 1 advskew 200
        vlan: 51 vlanpcp: 0 parent interface: lagg2
        media: Ethernet autoselect
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
[tykling@fwclu2b ~]$ ifconfig lagg0
lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=e507bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
        ether a0:36:9f:xx:yy:zz
        inet 172.16.16.233 netmask 0xffffff00 broadcast 172.16.16.255 
        laggproto failover lagghash l2,l3,l4
        laggport: igb7 flags=5<MASTER,ACTIVE>
        laggport: igb9 flags=0<>
        groups: lagg 
        media: Ethernet autoselect
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
[tykling@fwclu2b ~]$ 

I've tried building 12-STABLE HEAD (base r343118) without base r342051 and confirmed that the MASTER/BACKUP negotiation works fine then, even with pfsync enabled.
Comment 12 Kristof Provost freebsd_committer freebsd_triage 2019-01-17 21:09:28 UTC
Just to confirm: this is all on amd64, right?

I can try to reproduce it again this evening.
Comment 13 Thomas Steen Rasmussen / Tykling 2019-01-17 21:15:39 UTC
Yes all amd64
Comment 14 Steven Hartland freebsd_committer freebsd_triage 2019-01-17 23:37:52 UTC
It might be a silly question but I'm wondering if pfsync_update_state_req is never returning true due to the bucketing hence its not entering running the "We've filled a packet" block.

Running a simple dtrace script to monitor pfsync_update_state_req should confirm or deny this.

One of your previous emails showed demotion at -240 which is as if its been undemoted without first being demoted and would explain what your seeing if this is only happening on the machine which started life as with carp at BACKUP.

Given this can you show the output of sysctl net.inet.carp as well as the carp log messages for both machines.

Finally can you do a `tcpdump -s 0 -i lagg2.51 -T carp carp` to see what carp packets each machine is seeing.
Comment 15 Steven Hartland freebsd_committer freebsd_triage 2019-01-18 00:16:51 UTC
I think I found the issue can you try the one line fix from:
https://reviews.freebsd.org/D18882
Comment 16 Kristof Provost freebsd_committer freebsd_triage 2019-01-18 00:38:07 UTC
That looks plausible, yes. I’ll have a closer look in a few hours, when I’m behing a computer rather than my phone.
Comment 17 Thomas Steen Rasmussen / Tykling 2019-01-18 07:51:21 UTC
Hello,

I've built r343129 plus the oneline patch from review D18882 and confirmed that it solves the issue. I now have a "bucketed" pfsync running and carp MASTER/BACKUP negotiation works.

Thank you

/Thomas
Comment 18 commit-hook freebsd_committer freebsd_triage 2019-01-18 08:20:27 UTC
A commit references this bug:

Author: kp
Date: Fri Jan 18 08:19:54 UTC 2019
New revision: 343130
URL: https://svnweb.freebsd.org/changeset/base/343130

Log:
  pf: fix pfsync breaking carp

  Fix missing initialisation of sc_flags into a valid sync state on clone which
  breaks carp in pfsync.

  This regression was introduce by r342051.

  PR:		235005
  Submitted by:	smh@FreeBSD.org
  Pointy hat to:	kp
  MFC after:	3 days
  Differential Revision:	https://reviews.freebsd.org/D18882

Changes:
  head/sys/netpfil/pf/if_pfsync.c
Comment 19 Kristof Provost freebsd_committer freebsd_triage 2019-01-18 08:21:21 UTC
Thanks to you all for the fix and the testing.
Comment 20 commit-hook freebsd_committer freebsd_triage 2019-01-21 00:32:24 UTC
A commit references this bug:

Author: kp
Date: Mon Jan 21 00:32:04 UTC 2019
New revision: 343236
URL: https://svnweb.freebsd.org/changeset/base/343236

Log:
  MFC r343130

  pf: fix pfsync breaking carp

  Fix missing initialisation of sc_flags into a valid sync state on clone which
  breaks carp in pfsync.

  This regression was introduce by r342051.

  PR:		235005
  Submitted by:	smh@FreeBSD.org
  Pointy hat to:	kp

Changes:
_U  stable/11/
  stable/11/sys/netpfil/pf/if_pfsync.c
Comment 21 commit-hook freebsd_committer freebsd_triage 2019-01-21 00:32:29 UTC
A commit references this bug:

Author: kp
Date: Mon Jan 21 00:32:04 UTC 2019
New revision: 343237
URL: https://svnweb.freebsd.org/changeset/base/343237

Log:
  MFC r343130

  pf: fix pfsync breaking carp

  Fix missing initialisation of sc_flags into a valid sync state on clone which
  breaks carp in pfsync.

  This regression was introduce by r342051.

  PR:		235005
  Submitted by:	smh@FreeBSD.org
  Pointy hat to:	kp

Changes:
_U  stable/12/
  stable/12/sys/netpfil/pf/if_pfsync.c