Bug 260499

Summary: [carp] Carp unable recover after demotion by send error
Product: Base System Reporter: Marius Halden <marius.halden>
Component: kernAssignee: Gleb Smirnoff <glebius>
Status: Closed FIXED    
Severity: Affects Only Me CC: glebius
Priority: ---    
Version: CURRENT   
Hardware: Any   
OS: Any   
Attachments:
Description Flags
testcase.sh
none
alternative patch none

Description Marius Halden 2021-12-17 18:47:31 UTC
Created attachment 230203 [details]
testcase.sh

When carp is demoted due to send errors it will never recover as the code path which will decrement the demotion counter appears to be dead.

Expected output from test case when working:
> # sh testcase.sh
> Staring jail test1.
> ...
> carp: 1@epair1b: BACKUP -> MASTER (master timed out)
> net.inet.carp.demotion: 0
> 
> Enabling pf
> pf enabled
> ...
> carp: demoted by 240 to 240 (send error 13 on epair1b)
> net.inet.carp.demotion: 240
> 
> Disabling pf
> pf disabled
> .......
> carp: demoted by -240 to 0 (send ok on epair1b)
> net.inet.carp.demotion: 0
> Cleaning up.

Actual output from test case:
> # sh testcase.sh
> Staring jail test1.
> ...
> carp: 1@epair0b: BACKUP -> MASTER (master timed out)
> net.inet.carp.demotion: 0
> 
> Enabling pf
> pf enabled
> ...
> carp: demoted by 240 to 240 (send error 13 on epair0b)
> net.inet.carp.demotion: 240
> 
> Disabling pf
> pf disabled
> .......
> carp: demoted by 240 to 240 (send error 13 on epair0b)
> net.inet.carp.demotion: 240
> Cleaning up.

Note, the test case uses PF to force ip_output to fail.
Comment 1 Marius Halden 2021-12-17 18:56:13 UTC
A patch is available as review D33536
Comment 2 Gleb Smirnoff freebsd_committer freebsd_triage 2021-12-17 19:40:07 UTC
Created attachment 230207 [details]
alternative patch

Marius, thanks a lot for the test case! I came with alternative patch, that IMHO makes it easier to understand what's going on here.
Comment 3 Marius Halden 2021-12-18 10:29:17 UTC
I tested the patch, and it looks good to me.
Comment 4 commit-hook freebsd_committer freebsd_triage 2021-12-19 01:20:32 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=9a8cf950b259f6833c7562ce941b0cfeae6687e5

commit 9a8cf950b259f6833c7562ce941b0cfeae6687e5
Author:     Gleb Smirnoff <glebius@FreeBSD.org>
AuthorDate: 2021-12-19 01:19:26 +0000
Commit:     Gleb Smirnoff <glebius@FreeBSD.org>
CommitDate: 2021-12-19 01:19:26 +0000

    carp: fix send error demotion recovery

    The problem is that carp(4) would clear the error counter on first
    successful send, and stop counting successes after that.  Fix this
    logic and document it in human language.

    PR:                     260499
    Differential revision:  https://reviews.freebsd.org/D33536

 sys/netinet/ip_carp.c | 27 +++++++++++++++++----------
 1 file changed, 17 insertions(+), 10 deletions(-)
Comment 5 Marius Halden 2021-12-19 12:12:44 UTC
Thanks for committing this.

Are you planting to MFC this to 13-Stable? I couldn’t see an MFC line in the commit.
Comment 6 Gleb Smirnoff freebsd_committer freebsd_triage 2021-12-19 16:03:47 UTC
I will do that after the holidays.
Comment 7 Marius Halden 2022-01-21 11:09:03 UTC
Hi,

How are you coming along with mfc-ing this? From what I can see it hasn't happened yet.
Comment 8 commit-hook freebsd_committer freebsd_triage 2022-01-21 18:43:10 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=f7735c3952d9d54f745f1682ba1f13a70897a8f7

commit f7735c3952d9d54f745f1682ba1f13a70897a8f7
Author:     Gleb Smirnoff <glebius@FreeBSD.org>
AuthorDate: 2021-12-19 01:19:26 +0000
Commit:     Gleb Smirnoff <glebius@FreeBSD.org>
CommitDate: 2022-01-21 18:41:46 +0000

    carp: fix send error demotion recovery

    The problem is that carp(4) would clear the error counter on first
    successful send, and stop counting successes after that.  Fix this
    logic and document it in human language.

    PR:                     260499
    Differential revision:  https://reviews.freebsd.org/D33536

    (cherry picked from commit 9a8cf950b259f6833c7562ce941b0cfeae6687e5)

 sys/netinet/ip_carp.c | 27 +++++++++++++++++----------
 1 file changed, 17 insertions(+), 10 deletions(-)