Bug 262571 - epair(4) interfaces stop forwarding traffic on moderate load
Summary: epair(4) interfaces stop forwarding traffic on moderate load
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 13.1-RELEASE
Hardware: Any Any
: --- Affects Many People
Assignee: freebsd-net (Nobody)
URL: https://lists.freebsd.org/archives/fr...
Keywords:
Depends on:
Blocks:
 
Reported: 2022-03-15 14:08 UTC by Michael Gmelin
Modified: 2022-03-24 17:09 UTC (History)
5 users (show)

See Also:
grembo: maintainer-feedback? (kp)


Attachments
Patch that works around the problem (3.04 KB, patch)
2022-03-15 14:08 UTC, Michael Gmelin
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Michael Gmelin freebsd_committer freebsd_triage 2022-03-15 14:08:19 UTC
Created attachment 232471 [details]
Patch that works around the problem

As discussed on the freebsd-net mailing list[0]. Also affects CURRENT.

When running on multicore systems, epair interfaces stop forwarding traffic even on moderate load and don't recover unless recreated. This is a critical problem, as it breaks vnet jails running non-trivial workloads. The problem can be reproduced easily using a shell script[1].

This was introduced when adding multi-core improvements to epair[2].

It happens because work is scheduled in taskqueue(s) based on a check if mbuf ring buffers are empty, a logic which is racy on multi-core systems. The race is happening between epair_menq() and epair_tx_start_deferred().

The patch attached to this PR addresses the problem, but it needs to be looked at, profiled, and most likely improved by somebody who has a better understanding of both the code in question and writing lock free-code in general.

[0]https://lists.freebsd.org/archives/freebsd-net/2022-March/001449.html
[1]https://people.freebsd.org/~grembo/hang_epair.sh
[2]https://cgit.freebsd.org/src/commit/?id=24f0bfbad57b9c3cb9b543a60b2ba00e4812c286
Comment 1 commit-hook freebsd_committer freebsd_triage 2022-03-16 23:30:25 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=66acf7685bcd8cf23b6c658a991637238a01859e

commit 66acf7685bcd8cf23b6c658a991637238a01859e
Author:     Michael Gmelin <grembo@FreeBSD.org>
AuthorDate: 2022-03-16 22:08:55 +0000
Commit:     Kristof Provost <kp@FreeBSD.org>
CommitDate: 2022-03-16 22:08:55 +0000

    if_epair: fix race condition on multi-core systems

    As an unwanted side effect of the performance improvements in
    24f0bfbad57b9, epair interfaces stop forwarding traffic on higher
    load levels when running on multi-core systems.

    This happens due to a race condition in the logic that decides when to
    place work in the task queue(s) responsible for processing the content
    of ring buffers.

    In order to fix this, a field named state is added to the epair_queue
    structure. This field is used by the affected functions to signal each
    other that something happened in the underlying ring buffers that might
    require work to be scheduled in task queue(s), replacing the existing
    logic, which relied on checking if ring buffers are empty or not.

    epair_menq() does:
      - set BIT_MBUF_QUEUED
      - queue mbuf
      - if testandset BIT_QUEUE_TASK:
          enqueue task

    epair_tx_start_deferred() does:
      - swap ring buffers
      - process mbufs
      - clear BIT_QUEUE_TASK
      - if testandclear BIT_MBUF_QUEUED
          enqueue task

    PR:             262571
    Reported by:    Johan Hendriks <joh.hendriks@gmail.com>
    MFC after:      3 days
    Differential Revision:  https://reviews.freebsd.org/D34569

 sys/net/if_epair.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)
Comment 2 commit-hook freebsd_committer freebsd_triage 2022-03-16 23:40:31 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=bb9ad300f029d57c84804702daa2652542a2535f

commit bb9ad300f029d57c84804702daa2652542a2535f
Author:     Michael Gmelin <grembo@FreeBSD.org>
AuthorDate: 2022-03-16 22:08:55 +0000
Commit:     Kristof Provost <kp@FreeBSD.org>
CommitDate: 2022-03-16 23:38:33 +0000

    if_epair: fix race condition on multi-core systems

    As an unwanted side effect of the performance improvements in
    24f0bfbad57b9, epair interfaces stop forwarding traffic on higher
    load levels when running on multi-core systems.

    This happens due to a race condition in the logic that decides when to
    place work in the task queue(s) responsible for processing the content
    of ring buffers.

    In order to fix this, a field named state is added to the epair_queue
    structure. This field is used by the affected functions to signal each
    other that something happened in the underlying ring buffers that might
    require work to be scheduled in task queue(s), replacing the existing
    logic, which relied on checking if ring buffers are empty or not.

    epair_menq() does:
      - set BIT_MBUF_QUEUED
      - queue mbuf
      - if testandset BIT_QUEUE_TASK:
          enqueue task

    epair_tx_start_deferred() does:
      - swap ring buffers
      - process mbufs
      - clear BIT_QUEUE_TASK
      - if testandclear BIT_MBUF_QUEUED
          enqueue task

    PR:             262571
    Approved by:    re (gjb, early MFC)
    Reported by:    Johan Hendriks <joh.hendriks@gmail.com>
    MFC after:      3 days
    Differential Revision:  https://reviews.freebsd.org/D34569

    (cherry picked from commit 66acf7685bcd8cf23b6c658a991637238a01859e)

 sys/net/if_epair.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)
Comment 3 commit-hook freebsd_committer freebsd_triage 2022-03-16 23:48:33 UTC
A commit in branch releng/13.1 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=1ddb6c5518e02953f3fe37d4da27a15ebd7a5a12

commit 1ddb6c5518e02953f3fe37d4da27a15ebd7a5a12
Author:     Michael Gmelin <grembo@FreeBSD.org>
AuthorDate: 2022-03-16 22:08:55 +0000
Commit:     Kristof Provost <kp@FreeBSD.org>
CommitDate: 2022-03-16 23:40:40 +0000

    if_epair: fix race condition on multi-core systems

    As an unwanted side effect of the performance improvements in
    24f0bfbad57b9, epair interfaces stop forwarding traffic on higher
    load levels when running on multi-core systems.

    This happens due to a race condition in the logic that decides when to
    place work in the task queue(s) responsible for processing the content
    of ring buffers.

    In order to fix this, a field named state is added to the epair_queue
    structure. This field is used by the affected functions to signal each
    other that something happened in the underlying ring buffers that might
    require work to be scheduled in task queue(s), replacing the existing
    logic, which relied on checking if ring buffers are empty or not.

    epair_menq() does:
      - set BIT_MBUF_QUEUED
      - queue mbuf
      - if testandset BIT_QUEUE_TASK:
          enqueue task

    epair_tx_start_deferred() does:
      - swap ring buffers
      - process mbufs
      - clear BIT_QUEUE_TASK
      - if testandclear BIT_MBUF_QUEUED
          enqueue task

    PR:             262571
    Approved by:    re (gjb, early MFC)
    Reported by:    Johan Hendriks <joh.hendriks@gmail.com>
    MFC after:      3 days
    Differential Revision:  https://reviews.freebsd.org/D34569

    (cherry picked from commit 66acf7685bcd8cf23b6c658a991637238a01859e)
    (cherry picked from commit bb9ad300f029d57c84804702daa2652542a2535f)

 sys/net/if_epair.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)
Comment 4 commit-hook freebsd_committer freebsd_triage 2022-03-17 05:45:20 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=0bf7acd6b7047537a38e2de391a461e4e8956630

commit 0bf7acd6b7047537a38e2de391a461e4e8956630
Author:     Kristof Provost <kp@FreeBSD.org>
AuthorDate: 2022-03-17 02:35:13 +0000
Commit:     Kristof Provost <kp@FreeBSD.org>
CommitDate: 2022-03-17 05:43:47 +0000

    if_epair: build fix

    66acf7685b failed to build on riscv (and mips). This is because the
    atomic_testandset_int() (and friends) functions do not exist there.
    Happily those platforms do have the long variant, so switch to that.

    PR:             262571
    MFC after:      3 days

 sys/net/if_epair.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)
Comment 5 commit-hook freebsd_committer freebsd_triage 2022-03-20 05:20:34 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=f6138d93b5115ff560b24200d1ea002cdc46bb64

commit f6138d93b5115ff560b24200d1ea002cdc46bb64
Author:     Kristof Provost <kp@FreeBSD.org>
AuthorDate: 2022-03-17 02:35:13 +0000
Commit:     Kristof Provost <kp@FreeBSD.org>
CommitDate: 2022-03-20 00:25:03 +0000

    if_epair: build fix

    66acf7685b failed to build on riscv (and mips). This is because the
    atomic_testandset_int() (and friends) functions do not exist there.
    Happily those platforms do have the long variant, so switch to that.

    PR:             262571
    MFC after:      3 days

    (cherry picked from commit 0bf7acd6b7047537a38e2de391a461e4e8956630)

 sys/net/if_epair.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)
Comment 6 commit-hook freebsd_committer freebsd_triage 2022-03-20 05:21:37 UTC
A commit in branch stable/12 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=fb3644ab2afe777fdd2539bc996a390443f052f1

commit fb3644ab2afe777fdd2539bc996a390443f052f1
Author:     Michael Gmelin <grembo@FreeBSD.org>
AuthorDate: 2022-03-16 22:08:55 +0000
Commit:     Kristof Provost <kp@FreeBSD.org>
CommitDate: 2022-03-20 00:24:51 +0000

    if_epair: fix race condition on multi-core systems

    As an unwanted side effect of the performance improvements in
    24f0bfbad57b9, epair interfaces stop forwarding traffic on higher
    load levels when running on multi-core systems.

    This happens due to a race condition in the logic that decides when to
    place work in the task queue(s) responsible for processing the content
    of ring buffers.

    In order to fix this, a field named state is added to the epair_queue
    structure. This field is used by the affected functions to signal each
    other that something happened in the underlying ring buffers that might
    require work to be scheduled in task queue(s), replacing the existing
    logic, which relied on checking if ring buffers are empty or not.

    epair_menq() does:
      - set BIT_MBUF_QUEUED
      - queue mbuf
      - if testandset BIT_QUEUE_TASK:
          enqueue task

    epair_tx_start_deferred() does:
      - swap ring buffers
      - process mbufs
      - clear BIT_QUEUE_TASK
      - if testandclear BIT_MBUF_QUEUED
          enqueue task

    PR:             262571
    Reported by:    Johan Hendriks <joh.hendriks@gmail.com>
    MFC after:      3 days
    Differential Revision:  https://reviews.freebsd.org/D34569

    (cherry picked from commit 66acf7685bcd8cf23b6c658a991637238a01859e)

 sys/net/if_epair.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)
Comment 7 commit-hook freebsd_committer freebsd_triage 2022-03-20 05:21:38 UTC
A commit in branch stable/12 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=b1a3f8dccb6203036b7ee81201fd5b5a8de36f0d

commit b1a3f8dccb6203036b7ee81201fd5b5a8de36f0d
Author:     Kristof Provost <kp@FreeBSD.org>
AuthorDate: 2022-03-17 02:35:13 +0000
Commit:     Kristof Provost <kp@FreeBSD.org>
CommitDate: 2022-03-20 00:25:06 +0000

    if_epair: build fix

    66acf7685b failed to build on riscv (and mips). This is because the
    atomic_testandset_int() (and friends) functions do not exist there.
    Happily those platforms do have the long variant, so switch to that.

    PR:             262571
    MFC after:      3 days

    (cherry picked from commit 0bf7acd6b7047537a38e2de391a461e4e8956630)

 sys/net/if_epair.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)
Comment 8 commit-hook freebsd_committer freebsd_triage 2022-03-20 15:38:14 UTC
A commit in branch releng/13.1 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=81c1d7c9c727184dacb31df563fb42d208ccde46

commit 81c1d7c9c727184dacb31df563fb42d208ccde46
Author:     Kristof Provost <kp@FreeBSD.org>
AuthorDate: 2022-03-17 02:35:13 +0000
Commit:     Kristof Provost <kp@FreeBSD.org>
CommitDate: 2022-03-20 15:25:40 +0000

    if_epair: build fix

    66acf7685b failed to build on riscv (and mips). This is because the
    atomic_testandset_int() (and friends) functions do not exist there.
    Happily those platforms do have the long variant, so switch to that.

    PR:             262571
    MFC after:      3 days
    Approved by:    re (gjb)

    (cherry picked from commit 0bf7acd6b7047537a38e2de391a461e4e8956630)

 sys/net/if_epair.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)
Comment 9 Mark Johnston freebsd_committer freebsd_triage 2022-03-24 17:01:47 UTC
Can this PR be closed?
Comment 10 Kristof Provost freebsd_committer freebsd_triage 2022-03-24 17:09:58 UTC
Yes.