Bug 274316 - excessive memory consumed by static_single_cpu_mask
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: CURRENT
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: Bjoern A. Zeeb
Reported: 2023-10-06 21:32 UTC by Ed Maste
Modified: 2024-03-04 23:30 UTC

bz: mfc-stable14+
bz: mfc-stable13+

Description Ed Maste freebsd_committer freebsd_triage 2023-10-06 21:32:29 UTC
In sys/compat/linuxkpi/common/src/linux_compat.c we have:
static cpumask_t static_single_cpu_mask[MAXCPU];

cpumask_t *
lkpi_get_static_single_cpu_mask(int cpuid)
{

        KASSERT((cpuid >= 0 && cpuid < MAXCPU), ("%s: invalid cpuid %d\n",
            __func__, cpuid));

        return (&static_single_cpu_mask[cpuid]);
}

When testing with an (admittedly excessive) MAXCPU=65536, this array is huge, because cpumask_t itself also scales with MAXCPU. On arm64, andrew found that it is responsible for 512M of the 566M in .bss.
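
The 512M figure can be checked with a little arithmetic. The following is a minimal sketch, assuming each mask is stored as whole 64-bit words; the function names are made up for illustration and do not exist in the tree.

```c
#include <stdint.h>

/*
 * Bytes in one CPU mask that holds maxcpu bits, rounded up to whole
 * 64-bit words (assumption: 64-bit longs, as on arm64 and amd64).
 */
uint64_t
mask_bytes(uint64_t maxcpu)
{
	return ((maxcpu + 63) / 64 * sizeof(uint64_t));
}

/* Bytes in an array of maxcpu such masks, one mask per CPU. */
uint64_t
mask_array_bytes(uint64_t maxcpu)
{
	return (maxcpu * mask_bytes(maxcpu));
}
```

With maxcpu = 65536, mask_bytes() gives 8192 and mask_array_bytes() gives 536870912, i.e. 512M. The array grows quadratically with MAXCPU because both the element count and the element size scale with it.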
Comment 1 Bjoern A. Zeeb freebsd_committer freebsd_triage 2023-10-06 22:50:21 UTC
People have lately been switching such code over; that has not happened here yet, I assume. Should I do it? Is mp_ncpus still the correct value to use for run-time scaling?
Comment 2 Ed Maste freebsd_committer freebsd_triage 2023-10-07 00:02:42 UTC
We can switch to run-time sizing for these, but there is a very interesting trick used by Linux that we can adopt to significantly reduce the memory used.

We currently have MAXCPU cpuset_ts, each with a single CPU bit set. Note that all but one of the longs making up each cpuset_t are zero.

If n = __bitset_words(MAXCPU), then cpuset_t wraps an array long __bits[n]. If we lay out (n-1) zero longs, 0x1, (n-1) zero longs, 0x2, (n-1) zero longs, and so on, then by pointing at an appropriate offset we can use that same 0x1 word for each of:

0000000000000000 0000000000000000 ... 0000000000000000 0000000000000001
0000000000000000 0000000000000000 ... 0000000000000001 0000000000000000
...
0000000000000000 0000000000000001 ... 0000000000000000 0000000000000000
0000000000000001 0000000000000000 ... 0000000000000000 0000000000000000

and so on for 0x2, etc.
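
The layout described in this comment can be sketched in user space as follows. This is only an illustration under stated assumptions, not the code that was eventually committed: it uses uint64_t words instead of long, a hypothetical SKETCH_MAXCPU of 256, and made-up names throughout. It returns a pointer to n words rather than a cpuset_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define	SKETCH_MAXCPU	256	/* hypothetical; small for illustration */
#define	WORD_BITS	64
#define	NWORDS		(SKETCH_MAXCPU / WORD_BITS)	/* n words per mask */
/* 64 groups of ((n-1) zeros + one set word), then (n-1) trailing zeros. */
#define	TABLE_WORDS	(WORD_BITS * NWORDS + NWORDS - 1)

static uint64_t single_cpu_table[TABLE_WORDS];

/* Place the word (1 << b) at the end of group b; all other words are zero. */
void
single_cpu_table_init(void)
{
	for (size_t i = 0; i < TABLE_WORDS; i++)
		single_cpu_table[i] = 0;
	for (unsigned b = 0; b < WORD_BITS; b++)
		single_cpu_table[(size_t)b * NWORDS + NWORDS - 1] =
		    UINT64_C(1) << b;
}

/*
 * Return a pointer to NWORDS words in which only bit cpuid is set.
 * The set word for (cpuid % 64) must land at word index (cpuid / 64)
 * of the returned mask, so start that many words before it.
 */
const uint64_t *
single_cpu_mask(int cpuid)
{
	assert(cpuid >= 0 && cpuid < SKETCH_MAXCPU);
	size_t word = (size_t)cpuid / WORD_BITS;
	size_t bit = (size_t)cpuid % WORD_BITS;

	return (&single_cpu_table[bit * NWORDS + NWORDS - 1 - word]);
}

/* Return 1 if every CPU's mask has exactly its own bit set, else 0. */
int
single_cpu_masks_all_ok(void)
{
	for (int cpu = 0; cpu < SKETCH_MAXCPU; cpu++) {
		const uint64_t *mask = single_cpu_mask(cpu);
		for (size_t w = 0; w < NWORDS; w++) {
			uint64_t want = (w == (size_t)cpu / WORD_BITS) ?
			    UINT64_C(1) << (cpu % WORD_BITS) : 0;
			if (mask[w] != want)
				return (0);
		}
	}
	return (1);
}

/* Total size of the shared table in bytes. */
size_t
single_cpu_table_bytes(void)
{
	return (sizeof(single_cpu_table));
}
```

For this 256-CPU sketch the shared table is 259 words (2072 bytes), compared with 256 masks of 32 bytes each (8192 bytes) for the one-mask-per-CPU array. The table size grows linearly with n rather than with MAXCPU times n.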
Comment 3 Mark Johnston freebsd_committer freebsd_triage 2023-10-12 13:25:03 UTC
(In reply to Ed Maste from comment #2)
Does this mean that some of the cpusets won't be aligned?  I don't quite understand.
Comment 4 Ed Maste freebsd_committer freebsd_triage 2023-10-12 18:54:34 UTC
(In reply to Mark Johnston from comment #3)
They won't be aligned to sizeof(cpuset_t) indeed, but I expect they need only 8-byte alignment (i.e., alignof(long)), right?
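
The alignment point can be illustrated with a stand-in type. This is a sketch under assumptions: struct sketch_cpuset is a hypothetical type with the same shape as a bitset of longs, not the real cpuset_t, and the equality of the two alignments holds on the common ABIs but is not something the C standard promises.

```c
#include <stdalign.h>
#include <stddef.h>

#define	SKETCH_NWORDS	4	/* hypothetical word count per set */

/* Stand-in for a bitset type: nothing but an array of longs. */
struct sketch_cpuset {
	unsigned long bits[SKETCH_NWORDS];
};

/* Size of the stand-in type in bytes. */
size_t
sketch_cpuset_size(void)
{
	return (sizeof(struct sketch_cpuset));
}

/* Alignment the compiler requires for the stand-in type. */
size_t
sketch_cpuset_align(void)
{
	return (alignof(struct sketch_cpuset));
}

/*
 * Return 1 if a set starting word_index longs into a long-aligned table
 * still meets the alignment requirement of the stand-in type.
 */
int
word_offset_is_aligned(size_t word_index)
{
	return ((word_index * sizeof(unsigned long)) %
	    alignof(struct sketch_cpuset) == 0);
}
```

On typical platforms sketch_cpuset_align() equals alignof(unsigned long), so a set starting at any word offset into the table is sufficiently aligned even though it is not at a multiple of sketch_cpuset_size().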
Comment 5 Bjoern A. Zeeb freebsd_committer freebsd_triage 2023-10-23 23:29:55 UTC
https://reviews.freebsd.org/D42345
Comment 6 commit-hook freebsd_committer freebsd_triage 2023-12-22 00:23:47 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=488e8a7faca51a71987fbf00cd36cfcd19269db7

commit 488e8a7faca51a71987fbf00cd36cfcd19269db7
Author:     Bjoern A. Zeeb <bz@FreeBSD.org>
AuthorDate: 2023-10-23 23:14:35 +0000
Commit:     Bjoern A. Zeeb <bz@FreeBSD.org>
CommitDate: 2023-12-22 00:22:04 +0000

    LinuxKPI: reduce impact of large MAXCPU

    Start scaling arrays dynamically instead of using MAXCPU, resulting in
    extra allocations on startup but reducing the overall memory footprint.
    For the static single CPU mask we provide two versions to further save
    memory depending on a low or high CPU count system.  The threshold to
    switch is currently at 128 CPUs on 64bit platforms.
    More detailed comments on the implementations can be found in the code.

    If I am not wrong on a MAXCPU=65536 system the memory footprint should
    roughly go down from 512M to 1.5M for the static single CPU mask.

    Submitted by:   olce (most of this final version)
    Sponsored by:   The FreeBSD Foundation
    PR:             274316
    Differential Revision: https://reviews.freebsd.org/D42345

 sys/compat/linuxkpi/common/include/asm/processor.h |   2 +-
 sys/compat/linuxkpi/common/src/linux_compat.c      | 106 +++++++++++++++++++--
 2 files changed, 99 insertions(+), 9 deletions(-)
Comment 7 Mark Linimon freebsd_committer freebsd_triage 2023-12-27 12:28:46 UTC
^Triage: assign to committer that resolved; set possible MFC flags.
Comment 8 commit-hook freebsd_committer freebsd_triage 2024-02-18 21:12:29 UTC
A commit in branch stable/14 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=7730aec6b7c8ac6c6e4ca31577b8af0c15ebb3ec

commit 7730aec6b7c8ac6c6e4ca31577b8af0c15ebb3ec
Author:     Bjoern A. Zeeb <bz@FreeBSD.org>
AuthorDate: 2023-10-23 23:14:35 +0000
Commit:     Bjoern A. Zeeb <bz@FreeBSD.org>
CommitDate: 2024-02-18 16:41:24 +0000

    LinuxKPI: reduce impact of large MAXCPU

    Start scaling arrays dynamically instead of using MAXCPU, resulting in
    extra allocations on startup but reducing the overall memory footprint.
    For the static single CPU mask we provide two versions to further save
    memory depending on a low or high CPU count system.  The threshold to
    switch is currently at 128 CPUs on 64bit platforms.
    More detailed comments on the implementations can be found in the code.

    If I am not wrong on a MAXCPU=65536 system the memory footprint should
    roughly go down from 512M to 1.5M for the static single CPU mask.

    Submitted by:   olce (most of this final version)
    Sponsored by:   The FreeBSD Foundation
    PR:             274316
    Differential Revision: https://reviews.freebsd.org/D42345

    (cherry picked from commit 488e8a7faca51a71987fbf00cd36cfcd19269db7)

 sys/compat/linuxkpi/common/include/asm/processor.h |   2 +-
 sys/compat/linuxkpi/common/src/linux_compat.c      | 106 +++++++++++++++++++--
 2 files changed, 99 insertions(+), 9 deletions(-)
Comment 9 commit-hook freebsd_committer freebsd_triage 2024-02-19 08:09:36 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=90aaf46d520816e7a92d88fc159fe8694a5e1e32

commit 90aaf46d520816e7a92d88fc159fe8694a5e1e32
Author:     Bjoern A. Zeeb <bz@FreeBSD.org>
AuthorDate: 2023-10-23 23:14:35 +0000
Commit:     Bjoern A. Zeeb <bz@FreeBSD.org>
CommitDate: 2024-02-19 08:01:58 +0000

    LinuxKPI: reduce impact of large MAXCPU

    Start scaling arrays dynamically instead of using MAXCPU, resulting in
    extra allocations on startup but reducing the overall memory footprint.
    For the static single CPU mask we provide two versions to further save
    memory depending on a low or high CPU count system.  The threshold to
    switch is currently at 128 CPUs on 64bit platforms.
    More detailed comments on the implementations can be found in the code.

    If I am not wrong on a MAXCPU=65536 system the memory footprint should
    roughly go down from 512M to 1.5M for the static single CPU mask.

    Submitted by:   olce (most of this final version)
    Sponsored by:   The FreeBSD Foundation
    PR:             274316
    Differential Revision: https://reviews.freebsd.org/D42345

    (cherry picked from commit 488e8a7faca51a71987fbf00cd36cfcd19269db7)

 sys/compat/linuxkpi/common/include/asm/processor.h |   2 +-
 sys/compat/linuxkpi/common/src/linux_compat.c      | 106 +++++++++++++++++++--
 2 files changed, 99 insertions(+), 9 deletions(-)