Bug 254236 - pfsync bulk update fails on systems with many CPU cores
Summary: pfsync bulk update fails on systems with many CPU cores
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: Unspecified
Hardware: Any Any
: --- Affects Only Me
Assignee: Kristof Provost
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-03-12 10:34 UTC by topical
Modified: 2021-04-23 15:50 UTC (History)
1 user (show)

See Also:


Attachments
Enforce sending of bucket containing "bulk update" request (555 bytes, text/plain)
2021-03-12 10:34 UTC, topical
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description topical 2021-03-12 10:34:05 UTC
Created attachment 223204 [details]
Enforce sending of bucket containing "bulk update" request

On startup, pfsync initiates a "bulk update" to fetch all states from sync peer:

kernel: pfsync: requesting bulk update

One systems with many cores, this step fails in most cases:

kernel: pfsync: failed to receive bulk update

This is because the "bulk update" is not sent to the sync peer. Internally, all pfsync requests (like "bulk update") are put into "buckets". The content of a bucket is usually sent when the bucket is sufficiently filled (i.e. the generated IP packet just fits into the MTU).

The number of buckets is twice the number of CPU cores by default to reduce lock contention, but you can change this at boot time with "net.pfsync.pfsync_buckets".

If you have many cores and little traffic on the host, buckets won't be filled fast enough and the bucket with the "bulk update" request will not be sent within timeout. If it sent later on, the host is not in bulk update mode anymore and will discard the bulk update response.

I identified the following solutions/workarounds:

1. Enforce sending buckets after generating bulk update request by patching if_pfsync.c
2. Reduce number of buckets (very unreliably and even "4" may be too high if your host is in standby mode and thus receives very little traffic)
3. Force sending of all buckets by temporarily reducing MTU of pfsync interface
Comment 1 Kristof Provost freebsd_committer freebsd_triage 2021-03-12 10:40:42 UTC
Thanks for the report and analysis. At first glance your patch seems like the sane fix for this. I'll try to take a closer look in the next few days.
Comment 2 commit-hook freebsd_committer freebsd_triage 2021-03-17 19:21:51 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=8ad7d25dfc808ca00300f7553a9b28dfc0e99c18

commit 8ad7d25dfc808ca00300f7553a9b28dfc0e99c18
Author:     Kristof Provost <kp@FreeBSD.org>
AuthorDate: 2021-03-15 13:10:55 +0000
Commit:     Kristof Provost <kp@FreeBSD.org>
CommitDate: 2021-03-17 18:18:14 +0000

    pf tests: pfsync bulk update test

    Test that pfsync works as expected with bulk updates. That is, create
    some state before setting up the second firewall. Let that firewall
    request a bulk update so it can catch up, and check that it got the
    state which was created before it enable pfsync.

    PR:             254236
    MFC after:      2 weeks
    Differential Revision:  https://reviews.freebsd.org/D29272

 tests/sys/netpfil/pf/pfsync.sh | 68 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 68 insertions(+)
Comment 3 commit-hook freebsd_committer freebsd_triage 2021-03-17 19:21:52 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=9f2e5184173f6af70306678b018270df9a9600f2

commit 9f2e5184173f6af70306678b018270df9a9600f2
Author:     Thomas Kurschel <topical@gmx.net>
AuthorDate: 2021-03-15 13:28:52 +0000
Commit:     Kristof Provost <kp@FreeBSD.org>
CommitDate: 2021-03-17 18:18:14 +0000

    pfsync: Unconditionally push packets when requesting state updates

    When we request a bulk sync we need to ensure we actually send out that
    request, not just buffer it until we have enough data to send a full
    packet.

    PR:             254236
    MFC after:      2 weeks
    Differential Revision:  https://reviews.freebsd.org/D29271

 sys/netpfil/pf/if_pfsync.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
Comment 4 commit-hook freebsd_committer freebsd_triage 2021-03-31 13:13:44 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=343fee4cd023da0f7ed64e19f3d2351083fe963c

commit 343fee4cd023da0f7ed64e19f3d2351083fe963c
Author:     Thomas Kurschel <topical@gmx.net>
AuthorDate: 2021-03-15 13:28:52 +0000
Commit:     Kristof Provost <kp@FreeBSD.org>
CommitDate: 2021-03-31 13:09:08 +0000

    pfsync: Unconditionally push packets when requesting state updates

    When we request a bulk sync we need to ensure we actually send out that
    request, not just buffer it until we have enough data to send a full
    packet.

    PR:             254236
    MFC after:      2 weeks
    Differential Revision:  https://reviews.freebsd.org/D29271

    (cherry picked from commit 9f2e5184173f6af70306678b018270df9a9600f2)

 sys/netpfil/pf/if_pfsync.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
Comment 5 commit-hook freebsd_committer freebsd_triage 2021-03-31 13:13:45 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=e99aa5c2cf6b0eadcc29c62243d51de0eb36937c

commit e99aa5c2cf6b0eadcc29c62243d51de0eb36937c
Author:     Kristof Provost <kp@FreeBSD.org>
AuthorDate: 2021-03-15 13:10:55 +0000
Commit:     Kristof Provost <kp@FreeBSD.org>
CommitDate: 2021-03-31 13:09:08 +0000

    pf tests: pfsync bulk update test

    Test that pfsync works as expected with bulk updates. That is, create
    some state before setting up the second firewall. Let that firewall
    request a bulk update so it can catch up, and check that it got the
    state which was created before it enable pfsync.

    PR:             254236
    MFC after:      2 weeks
    Differential Revision:  https://reviews.freebsd.org/D29272

    (cherry picked from commit 8ad7d25dfc808ca00300f7553a9b28dfc0e99c18)

 tests/sys/netpfil/pf/pfsync.sh | 68 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 68 insertions(+)
Comment 6 commit-hook freebsd_committer freebsd_triage 2021-03-31 13:13:46 UTC
A commit in branch stable/12 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=aab309b4134a1822fab48d799580e4afdf90ac4a

commit aab309b4134a1822fab48d799580e4afdf90ac4a
Author:     Thomas Kurschel <topical@gmx.net>
AuthorDate: 2021-03-15 13:28:52 +0000
Commit:     Kristof Provost <kp@FreeBSD.org>
CommitDate: 2021-03-31 08:56:46 +0000

    pfsync: Unconditionally push packets when requesting state updates

    When we request a bulk sync we need to ensure we actually send out that
    request, not just buffer it until we have enough data to send a full
    packet.

    PR:             254236
    MFC after:      2 weeks
    Differential Revision:  https://reviews.freebsd.org/D29271

    (cherry picked from commit 9f2e5184173f6af70306678b018270df9a9600f2)

 sys/netpfil/pf/if_pfsync.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
Comment 7 commit-hook freebsd_committer freebsd_triage 2021-03-31 13:13:47 UTC
A commit in branch stable/12 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=f8d706fdd106f94ca048d809d504fef651d4a23e

commit f8d706fdd106f94ca048d809d504fef651d4a23e
Author:     Kristof Provost <kp@FreeBSD.org>
AuthorDate: 2021-03-15 13:10:55 +0000
Commit:     Kristof Provost <kp@FreeBSD.org>
CommitDate: 2021-03-31 08:57:24 +0000

    pf tests: pfsync bulk update test

    Test that pfsync works as expected with bulk updates. That is, create
    some state before setting up the second firewall. Let that firewall
    request a bulk update so it can catch up, and check that it got the
    state which was created before it enable pfsync.

    PR:             254236
    MFC after:      2 weeks
    Differential Revision:  https://reviews.freebsd.org/D29272

    (cherry picked from commit 8ad7d25dfc808ca00300f7553a9b28dfc0e99c18)

 tests/sys/netpfil/pf/pfsync.sh | 68 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 68 insertions(+)