At present IPC functionality is turned off by default (it could be turned on by setting sysctl jail.sysvipc_allowed to 1), because memory zones are shared between main host and every jail. For this reason many applications that use IPC can't be secure jailed. Attached patches provide as follows: - private memory zones for main host and every jail, - adds some sysctls and now ipcs(1) don't have to be set-gid on kmem group, because it don't use kvm(3) library anymore, - sysctl jail.sysvipc_allowed is now turned on by default, Only functionality was changed, but there have to be many style of old code fixes. I could port those patches on 5.x if FreeBSD team will be interested. Fix: ===> kernel patch ===> ipcs(1) patch
Responsible Changed From-To: freebsd-bugs->pjd Originator is now committer.
If it helps, I'm prepared to pay a small amount of money to get this bug fixed. Any way to bump up the priority? Regards, Mark Blackman Exonetric
+1 I think this patch is a fabulous idea. Needs to be updated for 10+. Zabbix and Postgres require sysvipc to work.
Created attachment 157658 [details] IPC key_t space separation between jails After the patch is applied; - IPC objects created on parent jail, are invisible to children. - IPC objects created on neighbor jail, are also invisible each other. - IPC objects craeted on child jail, are VISIBLE from parent. - IPC key_t spaces are separated between jails. If you see the key_t named object from parent, it's shown as IPC_PRIVATE.
Created attachment 157661 [details] IPC key_t space separation between jails (fix) Sorry, typo in the patch.
One feature of this patch is the ipc_cleanup_for_prison function. That provides a needed hook for jail removal that I'm going to make more general, using the same OSD system that has get/set hooks. It's in a different place than ipc_cleanup_for_prison was, to make sure it's not called prematurely in some cases. A patch is coming soon for that, and then some further IPC patches that will use that. This won't be disregarding kikuchan's patch, but building on it a bit.
Looking at https://www.freebsd.org/releases/11.0R/schedule.html, do you think that has a chance to make it? It would be nice coupled with the recent VNET fixups.
Yes, I think it will make it. It was getting the schedule reminder email that made me move this to the front burner.
Created attachment 169450 [details] Add PR_METHOD_REMOVE to jails, pass prison to PR_METHOD_CREATE Here's the "jail part" of my patch. It adds PR_METHOD_REMOVE to jails, called whenever a jail is removed from the user view (though it might still exist as a "dying" jail). That's sometimes at the time of the jail_remove system call, but often later if there are processes attached to the jail that need to go away first. It also passes the new jail to PR_METHOD_CHECK, which I'll need. The next step will use these jail changes. I'll probably be putting this in as two commits before I add the rest. I'll let it sit a while in case anyone wants to find something I've missed, but I know it's not the part that people are really looking for.
Created attachment 169452 [details] Add PR_METHOD_REMOVE to jails, pass prison to PR_METHOD_CREATE Oops - a small but important typo there - use this one instead. Come on Jamie, test *before* hitting "submit".
Created attachment 169480 [details] IPC key_t space separation, using OSD methods This is the companion to my previous patch - use both of them. It's based off kikuchan's work, but uses the jail OSD methods and some new jail parameters. Notable differences are: Instead of allow.sysvipc, there's a new set of independent parameters: sysvmsg, syssem, and sysvshm. Like ip4, ip6, and linux, they have the possible values "disable", "inherit", and "new". If not specified, they'll get a default based off allow.sysvipc, either "disable" or "inherit". - "disable" is like allow.nosysvipc, where nothing works but just returns ENOSYS. - "inherit" is like allow.sysvipc in the current state of affairs, where there's a single key space shared across all jails, and they can all see each other's objects. As with ip4/etc, this is the way of saying the jail shouldn't mess with this feature. - "new" is the way kikuchan's patch runs, with jails able to see only their own and descendants' objects, and each jail has its own key space. Each prison is associated with a "root prison" for the associated IPC type, which is the highest-level prison that it has inherited from (prison0 by default, itself for "new"). Any two jails with the same root will see the same keys. It may seem cumbersome to have the three separate parameters, but this allows their code to remain independent (even though there's a good bit of duplication), which is necessary because they're loaded as separate modules and a system may have only one of them loaded. In the future, I plan to add per-jail limits to the IPC object, with parameters such as sysvmsg.mni, so you can let jails use IPC without using up the available slots. Then these separate parameters might looks a little more sensible. But that's something for after I close out this bug. Go ahead and give it a try - I'd like to know how it works. I've created, statted, and removed a few msgs, and that seemed to have worked. But I haven't actually run any software that really uses these things.
A commit references this bug: Author: jamie Date: Mon Apr 25 04:24:01 UTC 2016 New revision: 298565 URL: https://svnweb.freebsd.org/changeset/base/298565 Log: Add a new jail OSD method, PR_METHOD_REMOVE. It's called when a jail is removed from the user perspective, i.e. when the last pr_uref goes away, even though the jail mail still exist in the dying state. It will also be called if either PR_METHOD_CREATE or PR_METHOD_SET fail. PR: 48471 MFC after: 5 days Changes: head/sys/kern/kern_jail.c head/sys/sys/jail.h
A commit references this bug: Author: jamie Date: Mon Apr 25 04:27:59 UTC 2016 New revision: 298566 URL: https://svnweb.freebsd.org/changeset/base/298566 Log: Pass the current/new jail to PR_METHOD_CHECK, which pushes the call until after the jail is found or created. This requires unlocking the jail for the call and re-locking it afterward, but that works because nothing in the jail has been changed yet, and other processes won't change the important fields as long as allprison_lock remains held. Keep better track of name vs namelc in kern_jail_set. Name should always be the hierarchical name (relative to the caller), and namelc the last component. PR: 48471 MFC after: 5 days Changes: head/sys/kern/kern_jail.c
A commit references this bug: Author: jamie Date: Mon Apr 25 17:06:51 UTC 2016 New revision: 298585 URL: https://svnweb.freebsd.org/changeset/base/298585 Log: Encapsulate SYSV IPC objects in jails. Define per-module parameters sysvmsg, sysvsem, and sysvshm, with the following bahavior: inherit: allow full access to the IPC primitives. This is the same as the current setup with allow.sysvipc is on. Jails and the base system can see (and moduly) each other's objects, which is generally considered a bad thing (though may be useful in some circumstances). disable: all no access, same as the current setup with allow.sysvipc off. new: A jail may see use the IPC objects that it has created. It also gets its own IPC key namespace, so different jails may have their own objects using the same key value. The parent jail (or base system) can see the jail's IPC objects, but not its keys. PR: 48471 Submitted by: based on work by kikuchan98@gmail.com MFC after: 5 days Changes: head/sys/kern/sysv_msg.c head/sys/kern/sysv_sem.c head/sys/kern/sysv_shm.c head/usr.sbin/jail/jail.8
A commit references this bug: Author: jamie Date: Mon Apr 25 22:30:11 UTC 2016 New revision: 298597 URL: https://svnweb.freebsd.org/changeset/base/298597 Log: Fix the logic in r298585: shm_prison_cansee returns an errno, so is the opposite of a boolean. PR: 48471 Changes: head/sys/kern/sysv_shm.c
A commit references this bug: Author: jamie Date: Tue Apr 26 18:17:45 UTC 2016 New revision: 298656 URL: https://svnweb.freebsd.org/changeset/base/298656 Log: Redo the changes to the SYSV IPC sysctl functions from r298585, so they don't (mis)use sbufs. PR: 48471 Changes: head/sys/kern/sysv_msg.c head/sys/kern/sysv_sem.c head/sys/kern/sysv_shm.c
A commit references this bug: Author: jamie Date: Wed Apr 27 02:25:21 UTC 2016 New revision: 298683 URL: https://svnweb.freebsd.org/changeset/base/298683 Log: Delay revmoing the last jail reference in prison_proc_free, and instead put it off into the pr_task. This is similar to prison_free, and in fact uses the same task even though they do something slightly different. This resolves a LOR between the process lock and allprison_lock, which came about in r298565. PR: 48471 Changes: head/sys/kern/kern_jail.c head/sys/sys/jail.h
A commit references this bug: Author: jamie Date: Sat Apr 30 03:19:07 UTC 2016 New revision: 298833 URL: https://svnweb.freebsd.org/changeset/base/298833 Log: MFC r298565: Add a new jail OSD method, PR_METHOD_REMOVE. It's called when a jail is removed from the user perspective, i.e. when the last pr_uref goes away, even though the jail mail still exist in the dying state. It will also be called if either PR_METHOD_CREATE or PR_METHOD_SET fail. MFC r298683: Delay removing the last jail reference in prison_proc_free, and instead put it off into the pr_task. This is similar to prison_free, and in fact uses the same task even though they do something slightly different. MFC r298566: Pass the current/new jail to PR_METHOD_CHECK, which pushes the call until after the jail is found or created. This requires unlocking the jail for the call and re-locking it afterward, but that works because nothing in the jail has been changed yet, and other processes won't change the important fields as long as allprison_lock remains held. Keep better track of name vs namelc in kern_jail_set. Name should always be the hierarchical name (relative to the caller), and namelc the last component. MFC r298668: Use crcopysafe in jail_attach. PR: 48471 Changes: _U stable/10/ stable/10/sys/kern/kern_jail.c stable/10/sys/sys/jail.h
A commit references this bug: Author: jamie Date: Sat Apr 30 04:02:32 UTC 2016 New revision: 298835 URL: https://svnweb.freebsd.org/changeset/base/298835 Log: MFC r298584: Note the existence of module-specific jail paramters, starting with the linux.* parameters when linux emulation is loaded. MFC r298585: Encapsulate SYSV IPC objects in jails. Define per-module parameters sysvmsg, sysvsem, and sysvshm, with the following bahavior: inherit: allow full access to the IPC primitives. This is the same as the current setup with allow.sysvipc is on. Jails and the base system can see (and moduly) each other's objects, which is generally considered a bad thing (though may be useful in some circumstances). disable: all no access, same as the current setup with allow.sysvipc off. new: A jail may see use the IPC objects that it has created. It also gets its own IPC key namespace, so different jails may have their own objects using the same key value. The parent jail (or base system) can see the jail's IPC objects, but not its keys. PR: 48471 Changes: _U stable/10/ stable/10/sys/kern/sysv_msg.c stable/10/sys/kern/sysv_sem.c stable/10/sys/kern/sysv_shm.c stable/10/usr.sbin/jail/jail.8
batch change: For bugs that match the following - Status Is In progress AND - Untouched since 2018-01-01. AND - Affects Base System OR Documentation DO: Reset to open status. Note: I did a quick pass but if you are getting this email it might be worthwhile to double check to see if this bug ought to be closed.
This looks resolved (commits + merges). If this is not 'complete', please re-open with (specific) details on what is required to complete this issue as reported.