Created attachment 216500
SO_REUSEPORT_LB.diff

Currently, the SO_REUSEPORT_LB socket option does not work as intended inside a jail.
Can you explain why you want this feature? It seems to me that this was explicitly disallowed for security reasons. For example, suppose a host provides jails and also runs some load-balanced service: currently, a jailed user cannot start a rogue service that joins the load-balanced group. With your patch this seems possible.
(In reply to Andrey V. Elsukov from comment #1)
Without this, the following are impossible:
* running a load-balanced service in a single jail
* running a load-balanced service across multiple jails
It also rules out minimizing downtime when upgrading services that run in a jail.
(In reply to Andrey V. Elsukov from comment #1)
> E.g. You have host that provides jails and some load-balanced service, and
> jailed user can not run some bad service to join to load-balanced service.
> With your patch this seems possible.

Shouldn't VNET solve this problem?
We can augment LB groups with a credential, and require all members of the group to belong to the same jail. This prevents jailed sockets from surreptitiously joining a group in the host.
(In reply to Mark Johnston from comment #4)
That's a great idea, but what about LB services that belong to different jails? I think this should be a sysctl-managed switch, e.g.:
- forbid jailed sockets from joining LB groups;
- allow sockets to join LB groups within the same jail;
- allow all sockets to join LB groups.
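To make the three proposed settings concrete, here is a minimal userland sketch of the membership check they imply. It is not kernel code and is not taken from the attached patch: the enum, the struct names, the function lb_may_join() and the use of a bare jail ID (0 meaning the host) in place of a real credential are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical values for the proposed sysctl switch (names are illustrative). */
enum lb_jail_policy {
	LB_JAIL_DENY = 0,	/* jailed sockets may not join any LB group */
	LB_JAIL_SAME = 1,	/* a socket may join only a group owned by its own jail */
	LB_JAIL_ANY = 2		/* any socket may join any LB group */
};

/* Simplified stand-in for a credential: just a jail ID, 0 meaning the host. */
struct lb_cred {
	int jail_id;
};

/* Simplified stand-in for an LB group tagged with its creator's credential. */
struct lb_group {
	struct lb_cred owner;
};

/* Return true if a socket with credential `cred` may join group `grp`. */
bool
lb_may_join(enum lb_jail_policy policy, const struct lb_group *grp,
    const struct lb_cred *cred)
{
	assert(grp != NULL && cred != NULL);

	switch (policy) {
	case LB_JAIL_DENY:
		/* Only host sockets may use LB groups, and only host-owned ones. */
		return (cred->jail_id == 0 && grp->owner.jail_id == 0);
	case LB_JAIL_SAME:
		/* The credential check from comment #4: same jail only. */
		return (cred->jail_id == grp->owner.jail_id);
	case LB_JAIL_ANY:
		return (true);
	}
	return (false);		/* unknown policy value: refuse */
}
```

Under LB_JAIL_SAME, a socket in jail 1 can join a group created in jail 1 but not one created on the host or in jail 2. As noted elsewhere in the thread, jails in different VNETs have separate network stacks, so this kind of check would only be meaningful for sockets that share one.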
(In reply to Dmitry Wagin from comment #5) Hmm. When is it useful to run identical services in different jails with a shared IP address? Note, the policies you suggested are not really possible to implement with VNET jails.
(In reply to Mark Johnston from comment #6)
> When is it useful to run identical services in different jails with a shared IP address?

For example:
1. Scaling a single-threaded application (pgbouncer)
2. HA and live update (almost) (nginx, pgbouncer, syslog-ng etc.)

We use images that launch a single copy of the application.

> Note, the policies you suggested are not really possible to implement with VNET jails.

Of course, this should not work with jails in different VNETs. But it should probably work for a setup like this:

Parent jail (VNET):
* Child jail 1
* Child jail 2
(In reply to Dmitry Wagin from comment #7)
> For example:
> 1. Scaling a single-threaded application (pgbouncer)

Ok.

> 2. HA and live update (almost) (nginx, pgbouncer, syslog-ng etc.)

Well, you do not need two jails to be in the same LB group for this. Each one creates its own LB group on the same shared IP. All traffic goes to one group; when the services in that group are stopped, the kernel will automatically push new connections to the second group.
(In reply to Mark Johnston from comment #8)
> Well, you do not need two jails to be in the same LB group for this. Each one creates its own LB group on the same shared IP. All traffic goes to one group; when the services are stopped, the kernel will automatically push new connections to the second group.

But it is necessary when items 1 and 2 are combined, which is the case most of the time.
Chiming in here, as I discovered that dnsdist with multiple listeners didn't load-balance requests as SO_REUSEPORT_LB should in a VNET jail, although this worked in a non-jail context. I think it would be nice to have this in a (VNET) jail, or at least a mention in setsockopt(2) that it isn't allowed by design.
Oh never mind, I see that commit d93ec8cb1324d04d7cae19fb7fa98ade2ff33c80 would fix this but isn’t MFC’ed (yet)
I suspect that the commit won't be MFCed to 13 since the inpcb layer has changed substantially since then, i.e., there are too many risky merge conflicts. I think this report can be closed now. There is a request to allow LB groups to span multiple (classic) jails, but it is not obvious to me how this should be implemented. Feel free to submit a follow-up PR with a proposal for how it should work.