Hi, I have upgraded FreeBSD from 12.1-RELEASE to 13.0-BETA3 (fresh install) and I can't assign different CPU cores to host userland and jail at the same time anymore. # cat /boot/loader.conf init_script="/etc/init.sh" # cat /etc/init.sh #!/bin/sh cpuset -C -c -l 0 -p 1 # cat /etc/jail.conf path = "/usr/jail/${name}"; exec.clean; exec.start = "sh /etc/rc"; exec.poststart = "cpuset -l 1 -j ${name}"; exec.stop = "sh /etc/rc.shutdown jail"; host.hostname = "${name}"; mount.devfs; test { } # cat /usr/jail/test/etc/rc.conf dumpdev="NO" cron_flags="-J 30" sendmail_enable="NONE" syslogd_flags="-ss" # service jail onestart Starting jails:test: created ELF ldconfig path: /lib /usr/lib /usr/lib/compat 32-bit compatibility ldconfig path: /usr/lib32 Updating motd:. Creating and/or trimming log files. Clearing /tmp (X related). Updating /var/run/os-release done. Starting syslogd. /etc/rc: WARNING: failed to start syslogd Starting cron. Sat Feb 20 18:40:41 CET 2021 cpuset: setaffinity: Resource deadlock avoided jail: test: cpuset -l 1 -j test: failed test: removed .
Sorry about that; I think this is my preferred method to solve it: https://reviews.freebsd.org/D28952 The main need that I have is preventing unprivileged users who are restricted to a subset of available CPUs from bypassing that restriction by attaching (allowed by MAC policy) to a jail with a wider mask. The patch above restores the system root's ability to administer such a setup as yours, and allows the previous behavior entirely (i.e. unprivileged users) with a MAC policy.
You could workaround this initial problem by making your exec.poststart an exec.created, but you would need D28952 on top of that for jail creation to succeed as the next step won't be able to attach without widening its mask pre-attach. exec.created is debatably the more appropriate place for resource-control type stuff anyways, though it doesn't matter much at all for cpuset. We're still discussing in that review and elsewhere whether this can be fixed quickly in a low-impact way besides reverting the original change; we should arrive at something before -rc1.
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=60c4ec806dfd0f79edf8ca3abc04bbb69c0418f7 commit 60c4ec806dfd0f79edf8ca3abc04bbb69c0418f7 Author: Kyle Evans <kevans@FreeBSD.org> AuthorDate: 2021-02-26 21:46:47 +0000 Commit: Kyle Evans <kevans@FreeBSD.org> CommitDate: 2021-03-01 18:38:31 +0000 jail: allow root to implicitly widen its cpuset to attach The default behavior for attaching processes to jails is that the jail's cpuset augments the attaching processes, so that it cannot be used to escalate a user's ability to take advantage of more CPUs than the administrator wanted them to. This is problematic when root needs to manage jails that have disjoint sets with whatever process is attaching, as this would otherwise result in a deadlock. Therefore, if we did not have an appropriate common subset of cpus/domains for our new policy, we now allow the process to simply take on the jail set *if* it has the privilege to widen its mask anyways. With the new logic, root can still usefully cpuset a process that attaches to a jail with the desire of maintaining the set it was given pre-attachment while still retaining the ability to manage child jails without jumping through hoops. A test has been added to demonstrate the issue; cpuset of a process down to just the first CPU and attempting to attach to a jail without access to any of the same CPUs previously resulted in EDEADLK and now results in taking on the jail's mask for privileged users. PR: 253724 Reviewed by: jamie (also discussed with) MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D28952 lib/libc/tests/sys/cpuset_test.c | 203 ++++++++++++++++++++++++++++++++++++++- sys/kern/kern_cpuset.c | 8 ++ 2 files changed, 210 insertions(+), 1 deletion(-)
The above-referenced commit will fix it, with the caveat that you'll need to move the cpuset invocation from exec.poststart to exec.created so that it's in effect when /etc/rc gets invoked -- this should be portable across all releases, and we'll refine jail(8) further. I'll MFC the above in 3 days and get it in to 13.0.
A commit in branch stable/13 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=c4ccb6d1be1f00ebcda9e83f06db55f9d6c152ac commit c4ccb6d1be1f00ebcda9e83f06db55f9d6c152ac Author: Kyle Evans <kevans@FreeBSD.org> AuthorDate: 2021-02-26 21:46:47 +0000 Commit: Kyle Evans <kevans@FreeBSD.org> CommitDate: 2021-03-04 02:04:10 +0000 jail: allow root to implicitly widen its cpuset to attach The default behavior for attaching processes to jails is that the jail's cpuset augments the attaching processes, so that it cannot be used to escalate a user's ability to take advantage of more CPUs than the administrator wanted them to. This is problematic when root needs to manage jails that have disjoint sets with whatever process is attaching, as this would otherwise result in a deadlock. Therefore, if we did not have an appropriate common subset of cpus/domains for our new policy, we now allow the process to simply take on the jail set *if* it has the privilege to widen its mask anyways. With the new logic, root can still usefully cpuset a process that attaches to a jail with the desire of maintaining the set it was given pre-attachment while still retaining the ability to manage child jails without jumping through hoops. A test has been added to demonstrate the issue; cpuset of a process down to just the first CPU and attempting to attach to a jail without access to any of the same CPUs previously resulted in EDEADLK and now results in taking on the jail's mask for privileged users. PR: 253724 (cherry picked from commit 60c4ec806dfd0f79edf8ca3abc04bbb69c0418f7) lib/libc/tests/sys/cpuset_test.c | 203 ++++++++++++++++++++++++++++++++++++++- sys/kern/kern_cpuset.c | 8 ++ 2 files changed, 210 insertions(+), 1 deletion(-)
A commit in branch releng/13.0 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=5fdc5c5dbac400a2ff05820ba8a63cdec8603c92 commit 5fdc5c5dbac400a2ff05820ba8a63cdec8603c92 Author: Kyle Evans <kevans@FreeBSD.org> AuthorDate: 2021-02-26 21:46:47 +0000 Commit: Kyle Evans <kevans@FreeBSD.org> CommitDate: 2021-03-04 02:41:48 +0000 jail: allow root to implicitly widen its cpuset to attach The default behavior for attaching processes to jails is that the jail's cpuset augments the attaching processes, so that it cannot be used to escalate a user's ability to take advantage of more CPUs than the administrator wanted them to. This is problematic when root needs to manage jails that have disjoint sets with whatever process is attaching, as this would otherwise result in a deadlock. Therefore, if we did not have an appropriate common subset of cpus/domains for our new policy, we now allow the process to simply take on the jail set *if* it has the privilege to widen its mask anyways. With the new logic, root can still usefully cpuset a process that attaches to a jail with the desire of maintaining the set it was given pre-attachment while still retaining the ability to manage child jails without jumping through hoops. A test has been added to demonstrate the issue; cpuset of a process down to just the first CPU and attempting to attach to a jail without access to any of the same CPUs previously resulted in EDEADLK and now results in taking on the jail's mask for privileged users. PR: 253724 Approved by: re (gjb) (cherry picked from commit 60c4ec806dfd0f79edf8ca3abc04bbb69c0418f7) (cherry picked from commit c4ccb6d1be1f00ebcda9e83f06db55f9d6c152ac) lib/libc/tests/sys/cpuset_test.c | 203 ++++++++++++++++++++++++++++++++++++++- sys/kern/kern_cpuset.c | 8 ++ 2 files changed, 210 insertions(+), 1 deletion(-)
Leaving this open a little longer; the first half went in time for -rc1 that fixes this setup if you cpuset it before running rc (exec.created). There's still a second change in progress that restores it (mostly) by effectively using the jail's initial cpuset for spawning off rc: https://reviews.freebsd.org/D29008
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=466df976babed65f8a8de9e36d7f016a444609af commit 466df976babed65f8a8de9e36d7f016a444609af Author: Kyle Evans <kevans@FreeBSD.org> AuthorDate: 2021-03-04 19:28:53 +0000 Commit: Kyle Evans <kevans@FreeBSD.org> CommitDate: 2021-03-04 19:28:53 +0000 jail(8): reset to root cpuset before attaching to run commands Recent changes have made it such that attaching to a jail will augment the attaching process' cpu mask with the jail's cpuset. While this is convenient for allowing the administrator to cpuset arbitrary programs that will attach to a jail, this is decidedly not convenient for executing long-running daemons during jail creation. This change inserts a reset of the process cpuset to the root cpuset between the fork and attach to execute a command. This allows commands executed to have the widest mask possible, and the administrator can cpuset(1) it back down inside the jail as needed. With this applied, one should be able to change a jail's cpuset at exec.poststart in addition to exec.created. The former was made difficult if jail(8) itself was running with a constrained set, as then some processes may have been spawned inside the jail with a non-root set. The latter is the preferred option so that processes starting in the jail are constrained appropriately up front. Note that all system commands are still run with the process' initial cpuset applied. PR: 253724 MFC after: 3 days Reviewed by: jamie Differential Revision: https://reviews.freebsd.org/D29008 usr.sbin/jail/command.c | 33 +++++++++++++++++++++++++++++++++ 1 file changed, 33 insertions(+)
A commit in branch stable/13 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=bdd61b6914f1f961b5f414b2d5cc623a5a829b89 commit bdd61b6914f1f961b5f414b2d5cc623a5a829b89 Author: Kyle Evans <kevans@FreeBSD.org> AuthorDate: 2021-03-04 19:28:53 +0000 Commit: Kyle Evans <kevans@FreeBSD.org> CommitDate: 2021-03-07 21:39:29 +0000 jail(8): reset to root cpuset before attaching to run commands Recent changes have made it such that attaching to a jail will augment the attaching process' cpu mask with the jail's cpuset. While this is convenient for allowing the administrator to cpuset arbitrary programs that will attach to a jail, this is decidedly not convenient for executing long-running daemons during jail creation. This change inserts a reset of the process cpuset to the root cpuset between the fork and attach to execute a command. This allows commands executed to have the widest mask possible, and the administrator can cpuset(1) it back down inside the jail as needed. With this applied, one should be able to change a jail's cpuset at exec.poststart in addition to exec.created. The former was made difficult if jail(8) itself was running with a constrained set, as then some processes may have been spawned inside the jail with a non-root set. The latter is the preferred option so that processes starting in the jail are constrained appropriately up front. Note that all system commands are still run with the process' initial cpuset applied. PR: 253724 (cherry picked from commit 466df976babed65f8a8de9e36d7f016a444609af) usr.sbin/jail/command.c | 33 +++++++++++++++++++++++++++++++++ 1 file changed, 33 insertions(+)
A commit in branch releng/13.0 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=7a1f6858dabfdb24d9f58a52fd5e6e1fe0ceead1 commit 7a1f6858dabfdb24d9f58a52fd5e6e1fe0ceead1 Author: Kyle Evans <kevans@FreeBSD.org> AuthorDate: 2021-03-04 19:28:53 +0000 Commit: Kyle Evans <kevans@FreeBSD.org> CommitDate: 2021-03-11 19:24:28 +0000 jail(8): reset to root cpuset before attaching to run commands Recent changes have made it such that attaching to a jail will augment the attaching process' cpu mask with the jail's cpuset. While this is convenient for allowing the administrator to cpuset arbitrary programs that will attach to a jail, this is decidedly not convenient for executing long-running daemons during jail creation. This change inserts a reset of the process cpuset to the root cpuset between the fork and attach to execute a command. This allows commands executed to have the widest mask possible, and the administrator can cpuset(1) it back down inside the jail as needed. With this applied, one should be able to change a jail's cpuset at exec.poststart in addition to exec.created. The former was made difficult if jail(8) itself was running with a constrained set, as then some processes may have been spawned inside the jail with a non-root set. The latter is the preferred option so that processes starting in the jail are constrained appropriately up front. Note that all system commands are still run with the process' initial cpuset applied. PR: 253724 Approved by: re (gjb) (cherry picked from commit 466df976babed65f8a8de9e36d7f016a444609af) (cherry picked from commit bdd61b6914f1f961b5f414b2d5cc623a5a829b89) usr.sbin/jail/command.c | 33 +++++++++++++++++++++++++++++++++ 1 file changed, 33 insertions(+)
Thanks for the report!