I recently upgraded one of my stable/11 machines to head (so what will be stable/12 soon), and ran into an issue with my nested jails setup.

I run standard (i.e. non-vnet) jails, by assigning e.g. 172.16.4.2 to the parent jail, and starting a jail with that IP address from within that jail (basically, so I can delegate a jailed setup to other people). That used to work on stable/11, but now it fails with ‘IPv4 addresses clash’.

To the extent that I understand the relevant code, it looks like we try to verify that no other jail uses the address we’re trying to assign to the new jail. I think the block below tries to ensure we start at either the host system or the vnet jail that’s hosting us. I suspect that’s just wrong, because we don’t do that if VIMAGE is not enabled (and there’s no need either, because this check will already have been done for the parent jail).

    #ifdef VIMAGE
            for (; tppr != &prison0; tppr = tppr->pr_parent)
                    if (tppr->pr_flags & PR_VNET)
                            break;
    #endif

I can work around the problem by resetting 'tppr = ppr;' just before the FOREACH_PRISON_DESCENDANT() loop. (The IPv6 code has exactly the same problem.)

Presumably the trigger for this is the enabling of VIMAGE in CURRENT.
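To make the effect of that loop concrete, here is a minimal user-space sketch of the walk described in the report. It is not the kernel code: the `struct prison` below is a cut-down stand-in, the `PR_VNET` value is illustrative, and the function names `clash_scope_vimage` and `clash_scope_parent` are hypothetical. It only shows where the clash search would start in each case.

```c
#include <stddef.h>

/* Illustrative flag value; the real PR_VNET constant lives in the kernel headers. */
#define PR_VNET 0x1u

/* Cut-down stand-in for the kernel's struct prison (assumption). */
struct prison {
    struct prison *pr_parent;   /* NULL only for the root jail */
    unsigned int   pr_flags;
    const char    *pr_name;
};

/*
 * Mirrors the #ifdef VIMAGE loop: climb from the parent jail until we
 * reach either the root or the nearest ancestor that is a vnet jail.
 * The clash search then covers every descendant of the returned jail.
 */
static const struct prison *
clash_scope_vimage(const struct prison *ppr, const struct prison *root)
{
    const struct prison *tppr = ppr;

    for (; tppr != root; tppr = tppr->pr_parent)
        if (tppr->pr_flags & PR_VNET)
            break;
    return tppr;
}

/*
 * Mirrors the workaround from the report ('tppr = ppr;'): search only
 * below the parent jail itself.
 */
static const struct prison *
clash_scope_parent(const struct prison *ppr)
{
    return ppr;
}
```

With a non-vnet parent jail directly under the host, the first function returns the root, so the parent jail itself is among the descendants that get compared against the new jail's address. The second returns the parent, so only its children are compared.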
I'm also using the same IP for many jails, it's convenient. I hope this feature won't be removed.
Created attachment 197577
Automated test

I wrote up a quick automated test for this problem. It just checks that we can start (well, that jail doesn't return errors) a nested jail.
While writing this test case I also noticed something else unexpected: trying to terminate the nested jail from the host (prison0) results in 'jail: "nestedjail" not found'. Doing so from within the base jail (which started "nestedjail") does succeed.
(In reply to Kristof Provost from comment #3) Were you trying to remove "basejail.nestedjail" or just "nestedjail"? You need the fully qualified name to operate on the nested jail from the system level.
(In reply to Jamie Gritton from comment #4) Just "nestedjail". That explains it, thanks.
(In reply to Kristof Provost from comment #0)

As per the jail(8) man page:

    ip4.addr
        A list of IPv4 addresses assigned to the jail. If this is set, the jail is restricted to using only these addresses. Any attempts to use other addresses fail, and attempts to use wildcard addresses silently use the jailed address instead. For IPv4 the first address given will be used as the source address when source address selection on unbound sockets cannot find a better match.
        !!!!!!!!! It is only possible to start multiple jails with the same IP address if none of the jails has more than this single overlapping IP address assigned to itself. !!!!!!!!!

Everything else is unfortunately a bug and cannot work deterministically and might lead to all kinds of problems. Robert and I spent a lot of time on this before multi-IP jails went in and the conclusion was "you cannot get that right".

It had also been "documented" in the original jail code:
https://svnweb.freebsd.org/base/head/sys/kern/kern_jail.c?annotate=185435#l534
https://svnweb.freebsd.org/base/head/sys/kern/kern_jail.c?annotate=185435#l209

Now for nested jails that brings a problem of its own, as I can see "delegate 10 addresses to customer A" and "customer A delegates 1 address to customers {m,n,o,p,..}" himself, starting a nested jail. That sounds great in theory; in practice it's a recipe for disaster. Just a vnet jail as base jail and then single-IP jails for the nested ones is probably the best idea?

No matter what people had done in the past in any overlapping-services cases, they got nondeterministic, random behaviour. Given people also use jails as security compartments, fixing this to the documented behaviour is the only solution I can see; sorry.
(In reply to Bjoern A. Zeeb from comment #6)

Hmm, I was thinking some more about this. What definitely should not work is:

    basejail: 192.0.2.2,192.0.2.3,192.0.2.4,192.0.2.5
    basejail.foo: 192.0.2.3,192.0.2.4

Now both jails have more than one address.

I have to think some more about nested jails. Let's start flat. With just 1st level jails (as in the old days without nested jails):

    jailA: 192.0.2.1
    jailB: 192.0.2.1

is fine.

    jailA: 192.0.2.1,192.0.2.2
    jailB: 192.0.2.1,192.0.2.33

is not fine.

    jailA: 192.0.2.1
    jailB: 192.0.2.1,192.0.2.33

is not fine either, I believe.

I think the conclusion is that if jailA is a child of jailB it's equally not fine, as for the PCB it's a flat space.

Jamie, can you go back and read the old pre-nested jails code I previously cited and independently confirm that my memory serves me right? Otherwise we might have to download any of 7.2/7.3/7.4-R and check the behaviour there to be sure.
(In reply to Bjoern A. Zeeb from comment #7)

That's right - duplicates are allowed only if both jails have exactly one IP address. In particular, the code contained:

    ((ip4s > 0 && tpr->pr_ip4s > 1) || (ip4s > 1 && tpr->pr_ip4s > 0))

While the specifics of the test changed with the hierarchical jail code, the fact remained that only single-IP jails could have the same address as other single-IP jails.

Note that while we don't allow a child jail to have a partial match of IP addresses, we do allow it to have a full match (by not specifying ip4.addr). Otherwise the only choice for a child jail would be to have no IP addresses.

Why are single-address jails allowed to be duplicates, and why only of other single-address jails? I understand the general nature of the problem, but the discussion was before my time, and these subtleties are beyond me.
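The rule quoted above can be tried out against the flat examples from comment #7 with a small user-space sketch. This is only a model under stated assumptions: addresses are plain 32-bit integers, the helpers `shares_address` and `ip4_clash` and the `ADDR` macro are hypothetical names, and the gate condition is copied from the expression cited in this comment rather than from the current kernel source.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: 192.0.2.<last> as a host-order 32-bit value. */
#define ADDR(last) (0xC0000200u | (uint32_t)(last))

/* True if the two address lists have at least one address in common. */
static bool
shares_address(const uint32_t *a, size_t na, const uint32_t *b, size_t nb)
{
    for (size_t i = 0; i < na; i++)
        for (size_t j = 0; j < nb; j++)
            if (a[i] == b[j])
                return true;
    return false;
}

/*
 * Model of the pre-hierarchy rule: an overlap is only an error when at
 * least one of the two jails has more than one address. Two single-IP
 * jails may share their one address.
 */
static bool
ip4_clash(const uint32_t *ip4, size_t ip4s,
    const uint32_t *other, size_t other_ip4s)
{
    /* Same shape as the cited test, with other_ip4s for tpr->pr_ip4s. */
    if (!((ip4s > 0 && other_ip4s > 1) || (ip4s > 1 && other_ip4s > 0)))
        return false;
    return shares_address(ip4, ip4s, other, other_ip4s);
}
```

Under this model, jailA and jailB both holding only 192.0.2.1 do not clash, while either of the two multi-address examples from comment #7 does, because the gate condition is satisfied and the lists overlap on 192.0.2.1.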
Created attachment 197613
Patch to fix the original bug (i.e. to keep the reported "bug" broken)

Here's the patch to make the non-VIMAGE case behave the same as the VIMAGE case. This is how I should have made the original VIMAGE change way back when.
(In reply to Jamie Gritton from comment #8)

For single-IP jails, laddr operations on inaddr_any are automagically translated to the jail's single IP address; hence there can be no duplicates and thus no undefined behaviour.

Multi-IP jails will bind to inaddr_any, and if you have two jails with overlapping IP ranges your packets/connections will go to one or the other in nondeterministic ways.
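A toy model of that translation, as described in this comment, may help. It is not the kernel's implementation: `effective_bind_addr` and `WILDCARD` are hypothetical names, and addresses are again plain 32-bit integers. The only fact relied on is that INADDR_ANY has the value 0.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for INADDR_ANY, which is the all-zero address. */
#define WILDCARD 0u

/*
 * Model of the local-address rewrite: a wildcard bind inside a jail
 * with exactly one address becomes a bind to that address. In a jail
 * with several addresses the wildcard is left alone, which is where
 * two overlapping jails can end up competing for the same traffic.
 */
static uint32_t
effective_bind_addr(uint32_t requested, const uint32_t *jail_addrs,
    size_t naddrs)
{
    if (requested == WILDCARD && naddrs == 1)
        return jail_addrs[0];
    return requested;
}
```

In this model two single-IP jails sharing 192.0.2.1 never hold a wildcard binding at all; each ends up with an explicit binding to that one address. A multi-IP jail keeps the wildcard, so an overlapping neighbour's wildcard binding covers the same addresses.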
(In reply to Bjoern A. Zeeb from comment #6) Okay, thanks for the clarification. I can probably re-do my config around vnet jails.
(In reply to Bjoern A. Zeeb from comment #10)

The remapping of single-IP bindings from inaddr_any to the address does indeed make me wonder about your previous note:

> jailA: 192.0.2.1
> jailB: 192.0.2.1,192.0.2.33
> is not fine either I believe.
>
> I think the conclusion is if jailA is a child of jailB it's equally not fine as for the PCB it's a flat space.

This makes me think it would be fine if jailA is a child of jailB, at least as far as inaddr_any binding goes. It would be analogous to the case of jailA under the base system. But it could still be a problem as far as localhost remapping goes. Of course, any child jail is going to have such a problem when a subset of parent addresses is disallowed; the two jails will always try to bind localhost to the first address in both their lists.
I've committed r339211 as a result of this bug. The commit doesn't mention the PR, because I didn't actually *fix* the problem, so much as I formalized the (to many people) new way of doing things as the in-fact-proper *old* way of doing things. It does at least clear up the discrepancy between VIMAGE and non-VIMAGE jails, but the 12 release will give a few people an unwelcome surprise. Hopefully a very few people.
I think https://github.com/freebsd/poudriere/issues/657 is the same problem. All Poudriere is trying to do is pass 127.0.0.1 down to a nested jail. I'm not sure where the regression is here but it used to work and now doesn't.
What makes no sense here is that the restriction is not applied evenly with nested jails. The restriction quoted here would mean that *any jail must have the same IPs as the host/prison0*. If we properly exempt that then why do we not exempt nested jails in the same way? PRISON0->PRISON1 shouldn't be any more special than PRISON1->PRISON2.
(In reply to Bryan Drewery from comment #15)

> What makes no sense here is that the restriction is not applied evenly with
> nested jails.
>
> The restriction quoted here would mean that *any jail must have the same IPs
> as the host/prison0*. If we properly exempt that then why do we not exempt
> nested jails in the same way? PRISON0->PRISON1 shouldn't be any more special
> than PRISON1->PRISON2.

Where PRISON2 is nested in PRISON1.