Bug 219655 - Per-VNET soacceptqueue/somaxconn and numopensockets possible?
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 11.0-RELEASE
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: freebsd-net
URL:
Keywords: feature, needs-qa, patch
Depends on:
Blocks:
 
Reported: 2017-05-30 13:54 UTC by john.leo
Modified: 2017-06-01 00:57 UTC

See Also:


Attachments
make per-VNET soacceptqueue/somaxconn and numopensockets (2.88 KB, patch)
2017-05-31 14:22 UTC, Eugene Grosbein

Description john.leo 2017-05-30 13:54:19 UTC
I'm currently running FreeNAS 11 RC1.  I've set the kern.ipc.soacceptqueue tunable to allow 2048 connections, but the change doesn't propagate to the jails on my system.  Below is the error I received, saying the listen queue is full.

May 19 18:43:24 maverick kernel: sonewconn: pcb 0xfffff8035bef6740: Listen queue overflow: 193 already in queue awaiting acceptance (138 occurrences)
May 19 19:02:16 maverick kernel: sonewconn: pcb 0xfffff8035bef6740: Listen queue overflow: 193 already in queue awaiting acceptance (200 occurrences)
May 19 19:03:17 maverick kernel: sonewconn: pcb 0xfffff8035bef6740: Listen queue overflow: 193 already in queue awaiting acceptance (209 occurrences)
May 19 19:04:17 maverick kernel: sonewconn: pcb 0xfffff8035bef6740: Listen queue overflow: 193 already in queue awaiting acceptance (199 occurrences)
May 19 19:05:17 maverick kernel: sonewconn: pcb 0xfffff8035bef6740: Listen queue overflow: 193 already in queue awaiting acceptance (202 occurrences)

Here is the netstat output from my main FreeNAS system:
tcp4  0/0/2048                         127.0.0.1.8542
tcp4  0/0/2048                         127.0.0.1.8600
tcp4  0/0/2048                         127.0.0.1.8500
tcp4  0/0/2048                         127.0.0.1.8400

And the netstat output for my jail:

tcp4  0/0/128                          192.168.0.20.12348
tcp6  0/0/128                          *.51413
tcp4  0/0/128                          *.51413
tcp4  0/0/128                          *.9091

If I run sysctl kern.ipc.soacceptqueue in the jail, it shows the following:

# sysctl kern.ipc.soacceptqueue
kern.ipc.soacceptqueue: 2048
Comment 1 Fabian Keil 2017-05-30 15:10:02 UTC
At least on vanilla FreeBSD, kern.ipc.soacceptqueue merely specifies an upper limit;
it does not prevent the application from requesting a smaller one.

Many applications use a hardcoded value like 128 without checking
if a higher value would work.

For details see the "listen" and "getsockopt" man pages.
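
The clamping described above can be sketched as a small userland model. This is not kernel code: effective_backlog() and first_overflow_qlen() are hypothetical helper names, and the overflow rule (a new connection is dropped once the queue length exceeds 3 * qlimit / 2) is an assumption about sonewconn() that is not stated in this report.

```c
#include <assert.h>

/*
 * Userland model (not kernel code) of how listen(2) treats its backlog
 * argument on FreeBSD: a value below zero or above the
 * kern.ipc.soacceptqueue limit is silently replaced by the limit.
 */
static int
effective_backlog(int requested, int limit)
{
	assert(limit >= 0);
	if (requested < 0 || requested > limit)
		return (limit);
	return (requested);
}

/*
 * Assumed overflow rule: sonewconn() is taken to reject a connection once
 * the queue length exceeds 3 * qlimit / 2. Returns the smallest queue
 * length at which that would happen.
 */
static int
first_overflow_qlen(int qlimit)
{
	return (3 * qlimit / 2 + 1);
}
```

Under these assumptions, an application that hardcodes a backlog of 128 keeps 128 even when the limit is 2048, which would match the 0/0/128 netstat lines, and a backlog of 128 gives a first overflow at 193 queued connections, the figure in the kernel messages.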
Comment 2 john.leo 2017-05-30 15:35:24 UTC
I've checked multiple jails on my system, including CouchPotato, Plex and Transmission, and all have the same maximum of 128, yet sysctl kern.ipc.soacceptqueue reports 2048.  I find it highly unlikely that three separate applications are applying limits to the jails.
Comment 3 Eugene Grosbein freebsd_committer 2017-05-31 02:43:06 UTC
kern.ipc.soacceptqueue is a SYSCTL_PROC defined in sys/kern/uipc_socket.c without CTLFLAG_VNET, so it is currently not VIMAGE/VNET-aware.
Comment 4 john.leo 2017-05-31 12:29:44 UTC
That makes more sense.  Is there a way around this or a way to increase the connection queue in a jail that has VIMAGE enabled?
Comment 5 Eugene Grosbein freebsd_committer 2017-05-31 13:21:19 UTC
(In reply to john.leo from comment #4)

You could just patch sys/sys/socket.h and increase the value on the line "#define SOMAXCONN 128" (it is used only to specify the initial value for the sysctl).

Or, if you are curious enough, you can add the CTLFLAG_VNET flag to the declaration of soacceptqueue in sys/kern/uipc_socket.c and see whether it works or crashes :-)

That is, until SomeOne (TM) prepares a complete solution.
Comment 6 Eugene Grosbein freebsd_committer 2017-05-31 14:22:30 UTC
Created attachment 183099 [details]
make per-VNET soacceptqueue/somaxconn and numopensockets

An attempt to make the sysctls soacceptqueue (somaxconn) and numopensockets per-VNET/VIMAGE instead of global.
Comment 7 Eugene Grosbein freebsd_committer 2017-05-31 14:23:33 UTC
(In reply to john.leo from comment #4)

You may also try the attached patch. Beware: it has only been compile-tested.
Comment 8 john.leo 2017-05-31 14:34:48 UTC
Thanks, would this be for the main OS itself or for use in the jail?  Also, I'm having trouble finding that file within the OS; I couldn't find it at the path you provided.
Comment 9 Eugene Grosbein freebsd_committer 2017-05-31 14:40:22 UTC
(In reply to john.leo from comment #8)

The patch is for sys/kern/uipc_socket.c, that is, for the kernel. Jails do not have their own kernels; there is only one kernel in a running system. You need to rebuild and reinstall the kernel after applying the patch, then reboot the system for the changes to take effect.
Comment 10 Eugene Grosbein freebsd_committer 2017-05-31 14:41:21 UTC
That is, for the main OS and /usr/src/sys/kern/uipc_socket.c
Comment 11 john.leo 2017-05-31 14:44:40 UTC
Thanks, for some reason it is showing that /usr/src is an empty directory on my system.  I think I'd also like to set up a dev system to test this out rather than run it in production.
Comment 12 Bjoern A. Zeeb freebsd_committer 2017-05-31 14:48:09 UTC
Making these variables per-VNET is not necessarily a good idea;  it means a VNET-jail consumer could possibly DoS the system by exhausting resources, without the administrator having an easy way to prevent it.

Need to be very careful.  If this goes into HEAD, I'd hope there will be a way to "cap" the values, or at least to reject excessive requests by some metric.
Comment 13 Eugene Grosbein freebsd_committer 2017-05-31 15:12:08 UTC
(In reply to Bjoern A. Zeeb from comment #12)

These variables are currently global, but that does not mean the limits they impose are "global" in any way: static u_int somaxconn is just the default for the per-socket backlog limit so->so_qlimit (struct socket *so), and this change makes it possible to assign different defaults per jail.

Yes, increasing such a limit allows a jailed root to get more space in the queue of not-yet-accepted sockets, but there are already plenty of ways to consume such resources (e.g. by creating a listening socket and making tons of local connections). Perhaps this sysctl should be made read-only for jailed root, if possible.

V_numopensockets is purely informational.
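
The relationship described in this comment, where a per-VNET somaxconn only bounds each socket's own so_qlimit, can be sketched in userland as follows. This is a simplified model under stated assumptions, not the attached patch: struct model_vnet, struct model_socket and model_listen() are hypothetical names, and the real struct vnet and struct socket are far larger.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a VNET instance; the real struct vnet differs. */
struct model_vnet {
	unsigned int somaxconn;	/* per-VNET copy of kern.ipc.soacceptqueue */
};

/* Hypothetical stand-in for struct socket, reduced to the fields used here. */
struct model_socket {
	const struct model_vnet *vnet;	/* VNET the socket belongs to */
	unsigned int so_qlimit;		/* per-socket backlog limit */
};

/*
 * Model of listen(): the requested backlog is clamped to the limit of the
 * socket's own VNET, so one jail's setting does not affect sockets that
 * belong to another VNET.
 */
static void
model_listen(struct model_socket *so, int backlog)
{
	assert(so != NULL && so->vnet != NULL);
	if (backlog < 0 || (unsigned int)backlog > so->vnet->somaxconn)
		so->so_qlimit = so->vnet->somaxconn;
	else
		so->so_qlimit = (unsigned int)backlog;
}
```

In this model, a host VNET with somaxconn 2048 and a jail VNET left at 128 would give a socket requesting 1024 a so_qlimit of 1024 on the host and 128 in the jail. The value is still only a per-socket bound, not a system-wide budget, which is the point both comment #12 and this reply turn on.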