Bug 248029

Summary: Allow ability to use socket option SO_REUSEPORT_LB in jail
Product: Base System Reporter: Dmitry Wagin <dmitry.wagin>
Component: kern Assignee: Mark Johnston <markj>
Status: Closed FIXED    
Severity: Affects Some People CC: ae, drtr0jan, emaste, markj, pi, ruben, zlei
Priority: --- Keywords: patch
Version: 12.0-STABLE   
Hardware: Any   
OS: Any   
See Also: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=247956
Attachments:
Description Flags
SO_REUSEPORT_LB.diff none

Description Dmitry Wagin 2020-07-16 21:25:05 UTC
Created attachment 216500
SO_REUSEPORT_LB.diff

Currently, the SO_REUSEPORT_LB socket option does not work as intended inside a jail.
Comment 1 Andrey V. Elsukov freebsd_committer freebsd_triage 2020-07-17 08:09:09 UTC
Can you explain the reason you want this feature?

It seems to me that this was explicitly disallowed for security reasons.
E.g., you have a host that provides jails and some load-balanced service, and a jailed user cannot run a rogue service that joins the load-balanced service. With your patch this seems possible.
Comment 2 Dmitry Wagin 2020-07-17 08:25:36 UTC
(In reply to Andrey V. Elsukov from comment #1)

Without this, the following are impossible:
* running a load-balanced service in a single jail
* running a load-balanced service across multiple jails

The same applies to minimizing downtime when upgrading services that run in a jail.
Comment 3 Dmitry Wagin 2020-07-17 11:04:59 UTC
(In reply to Andrey V. Elsukov from comment #1)
> E.g., you have a host that provides jails and some load-balanced service,
> and a jailed user cannot run a rogue service that joins the load-balanced
> service. With your patch this seems possible.

Shouldn't VNET solve this problem?
Comment 4 Mark Johnston freebsd_committer freebsd_triage 2022-10-17 19:49:22 UTC
We can augment LB groups with a credential, and require all members of the group to belong to the same jail.  This prevents jailed sockets from surreptitiously joining a group in the host.
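
As a rough illustration of the idea in this comment (not the committed kernel change), a group could record the jail of its creator and refuse sockets from any other jail. The struct, its fields, and both function names below are hypothetical, and jail ID 0 stands in for the host:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userland model; not the kernel's actual data structures. */
struct lb_group {
	int		jail_id;	/* jail of the creator; 0 models the host */
	unsigned short	port;		/* local port shared by all members */
	int		nmembers;	/* sockets currently in the group */
};

/* A socket may join only a group on its port that its own jail created. */
static bool
lb_group_may_join(const struct lb_group *grp, int sock_jail_id,
    unsigned short port)
{
	assert(grp != NULL);
	return (grp->port == port && grp->jail_id == sock_jail_id);
}

/*
 * Join the group if permitted.  On a mismatch nothing changes, and the
 * caller would have to create a separate group for the socket.
 */
static bool
lb_group_join(struct lb_group *grp, int sock_jail_id, unsigned short port)
{
	if (!lb_group_may_join(grp, sock_jail_id, port))
		return (false);
	grp->nmembers++;
	return (true);
}
```

In this model a socket from jail 5 that asks for port 8080 is never added to a host-owned group on port 8080, which is the property that addresses the concern raised in comment #1.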
Comment 5 Dmitry Wagin 2022-10-18 15:02:04 UTC
(In reply to Mark Johnston from comment #4)

It's a perfect idea, but what about LB services that belong to different jails?
I think there should be a sysctl-managed switch, e.g.:
- forbid jailed sockets from joining LB groups;
- allow sockets to join LB groups only with sockets from the same jail;
- allow sockets to join LB groups with any sockets.
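
The three levels proposed above could be modeled roughly as follows. This is only a sketch of the proposal: no such sysctl exists as far as this report shows, the enum and function names are made up for illustration, and jail ID 0 stands in for the host:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical policy levels; the names are illustrative only. */
enum lb_jail_policy {
	LB_JAIL_DENY = 0,	/* jailed sockets may not join LB groups */
	LB_JAIL_SAME = 1,	/* sockets may join groups of their own jail */
	LB_JAIL_ANY = 2		/* sockets may join any group */
};

/* Decide whether a socket in sock_jail may join a group owned by group_jail. */
static bool
lb_policy_allows(enum lb_jail_policy policy, int group_jail, int sock_jail)
{
	assert(group_jail >= 0 && sock_jail >= 0);
	switch (policy) {
	case LB_JAIL_DENY:
		/* Only host sockets joining host groups are accepted. */
		return (sock_jail == 0 && group_jail == 0);
	case LB_JAIL_SAME:
		return (sock_jail == group_jail);
	case LB_JAIL_ANY:
		return (true);
	}
	return (false);
}
```

Only the third level would let a jailed socket share a group with the host or with another jail, which is the case comment #1 warns about.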
Comment 6 Mark Johnston freebsd_committer freebsd_triage 2022-10-18 15:12:43 UTC
(In reply to Dmitry Wagin from comment #5)
Hmm.  When is it useful to run identical services in different jails with a shared IP address?  Note, the policies you suggested are not really possible to implement with VNET jails.
Comment 7 Dmitry Wagin 2022-10-18 20:51:42 UTC
(In reply to Mark Johnston from comment #6)
> When is it useful to run identical services in different jails with a shared IP address?

For example:
1. Scaling a single-threaded application (pgbouncer)
2. HA and live update (almost) (nginx, pgbouncer, syslog-ng etc.)

We use images that each launch a single copy of the application.

> Note, the policies you suggested are not really possible to implement with VNET jails.

Of course, this should not work with jails in different VNETs.

But for such a setup it should probably work:

Parent jail (VNET):
 * Child jail 1
 * Child jail 2
Comment 8 Mark Johnston freebsd_committer freebsd_triage 2022-10-18 22:03:37 UTC
(In reply to Dmitry Wagin from comment #7)
> For example:
> 1. Scaling a single-threaded application (pgbouncer)

Ok.

> 2. HA and live update (almost) (nginx, pgbouncer, syslog-ng etc.)

Well, you do not need two jails to be in the same LB group for this.  Each one creates its own LB group on the same shared IP.  All traffic goes to one group; when the services are stopped, the kernel will automatically push new connections to the second group.
Comment 9 Dmitry Wagin 2022-10-18 22:38:47 UTC
(In reply to Mark Johnston from comment #8)

> Well, you do not need two jails to be in the same LB group for this.  Each one creates its own LB group on the same shared IP.  All traffic goes to one group; when the services are stopped, the kernel will automatically push new connections to the second group.

But it is necessary when items 1 and 2 are combined, which is the case most of the time.
Comment 10 ruben 2023-05-04 12:20:43 UTC
Chiming in here as I discovered that dnsdist with multiple listeners didn’t load balance requests as per SO_REUSEPORT_LB in a vnet jail, but this worked in a non-jail context. I think it would be nice to have in a (vnet) jail, or at least a mention in setsockopt(2) that it isn’t allowed by design.
Comment 11 ruben 2023-05-04 12:30:10 UTC
Oh never mind, I see that commit d93ec8cb1324d04d7cae19fb7fa98ade2ff33c80 would fix this but isn’t MFC’ed (yet)
Comment 12 Mark Johnston freebsd_committer freebsd_triage 2023-05-04 14:08:08 UTC
I suspect that the commit won't be MFCed to 13 since the inpcb layer has changed substantially since then, i.e., there are too many risky merge conflicts.

I think this report can be closed now.  There is a request to allow LB groups to span multiple (classic) jails, but it is not obvious to me how this should be implemented.  Feel free to submit a follow-up PR with a proposal for how it should work.