Summary: | 13.0-BETA4: VNET: Stopping jails: Freed UMA keg (rtentry) was not empty (1 items). Lost 1 pages of memory. | ||
---|---|---|---|
Product: | Base System | Reporter: | rashey |
Component: | kern | Assignee: | Alexander V. Chernikov <melifaro> |
Status: | Closed FIXED | ||
Severity: | Affects Some People | CC: | kp, melifaro, trashcan, zarychtam |
Priority: | --- | ||
Version: | 13.0-STABLE | ||
Hardware: | amd64 | ||
OS: | Any |
Description
rashey
2021-03-03 22:01:23 UTC
These leaks appear to be correlated with this error:
> in6_purgeaddr: err=65, destination address delete failed
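For anyone checking whether their own logs show the same pairing, the following is a minimal sketch that scans log text for both messages. The leak pattern is built from the message quoted in this report's summary, and the `in6_purgeaddr` pattern from the line above. The function name `scan_log`, the sample text, and the `kernel:` prefix are assumptions for illustration, not taken from the report.

```python
import re

# Pattern built from the message quoted in this report's summary line.
LEAK_RE = re.compile(
    r"Freed UMA keg \((?P<keg>[^)]+)\) was not empty \((?P<items>\d+) items\)\."
    r"\s+Lost (?P<pages>\d+) pages of memory\."
)
# Pattern built from the in6_purgeaddr line quoted in the description.
PURGE_RE = re.compile(
    r"in6_purgeaddr: err=(?P<err>\d+), destination address delete failed"
)


def scan_log(text):
    """Return (leaks, purge_errors) found in a block of log text."""
    leaks = [
        {"keg": m["keg"], "items": int(m["items"]), "pages": int(m["pages"])}
        for m in LEAK_RE.finditer(text)
    ]
    purge_errors = [int(m["err"]) for m in PURGE_RE.finditer(text)]
    return leaks, purge_errors


# Hypothetical excerpt; the "kernel:" prefix is an assumption about log layout.
SAMPLE = """\
kernel: in6_purgeaddr: err=65, destination address delete failed
kernel: Freed UMA keg (rtentry) was not empty (1 items).  Lost 1 pages of memory.
"""

if __name__ == "__main__":
    print(scan_log(SAMPLE))
```

Feeding it the contents of `/var/log/messages` after stopping a jail should give an empty `leaks` list on a system that does not show the problem.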
(In reply to rashey from comment #0)
Could you consider clarifying your jail setup a bit? What interfaces are set up in the jail? Do you instantiate loopback interface there? Is there any chance you could provide `ifconfig` and `netstat -rn` output from the jail before the shutdown?

(In reply to Alexander V. Chernikov from comment #2)
I provided minimal configuration that can be used to reproduce the issue. My full jail configuration for testing purpose looks like this:

    # cat /etc/jail.conf
    path = "/usr/jail/${name}";
    exec.clean;
    exec.prestart = "ifconfig epair${epairid} create";
    exec.prestart += "ifconfig epair${epairid}a inet6 ifdisabled up";
    exec.prestart += "ifconfig bridge0 addm epair${epairid}a";
    exec.created = "cpuset -l 1 -j ${name}";
    exec.start = "ifconfig epair${epairid}b ether 02:ef:a4:c1:60:0${epairid}";
    exec.start += "ifconfig epair${epairid}b inet ${ipaddress} netmask ${netmask}";
    exec.start += "route add default ${gateway}";
    exec.start += "sh /etc/rc";
    exec.stop = "sh /etc/rc.shutdown jail";
    exec.poststop = "ifconfig epair${epairid}a destroy";
    host.hostname = "${name}";
    mount.fstab = "/etc/fstab.${name}";
    mount.devfs;
    vnet;
    vnet.interface = "epair${epairid}b";

    test {
        $epairid = 1;
        $ipaddress = 192.168.0.101;
        $netmask = 255.255.255.0;
        $gateway = 192.168.0.1;
    }

    # jexec test ifconfig
    lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
            options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
            inet6 ::1 prefixlen 128
            inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
            inet 127.0.0.1 netmask 255.0.0.0
            groups: lo
            nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
    pflog0: flags=0<> metric 0 mtu 33160
            groups: pflog
    epair1b: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=8<VLAN_MTU>
            ether 02:ef:a4:c1:60:01
            hwaddr 02:be:06:3e:c3:0b
            inet 192.168.0.101 netmask 255.255.255.0 broadcast 192.168.0.255
            groups: epair
            media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
            status: active
            nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>

    # jexec test netstat -rn
    Routing tables

    Internet:
    Destination        Gateway            Flags     Netif Expire
    default            192.168.0.1        UGS     epair1b
    192.168.0.0/24     link#3             U       epair1b
    192.168.0.101      link#3             UHS         lo0
    127.0.0.1          link#1             UH          lo0

    Internet6:
    Destination        Gateway            Flags     Netif Expire
    ::/96              ::1                UGRS        lo0
    ::1                link#1             UH          lo0
    ::ffff:0.0.0.0/96  ::1                UGRS        lo0
    fe80::/10          ::1                UGRS        lo0
    fe80::%lo0/64      link#1             U           lo0
    fe80::1%lo0        link#1             UHS         lo0
    ff02::/16          ::1                UGRS        lo0

Thank you for the clarification! I've raised https://reviews.freebsd.org/D29116 to address this issue.

A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=b1d63265ac399112b3bca36c3d75df1a3c2c8102

commit b1d63265ac399112b3bca36c3d75df1a3c2c8102
Author: Alexander V. Chernikov <melifaro@FreeBSD.org>
AuthorDate: 2021-03-08 21:35:41 +0000
Commit: Alexander V. Chernikov <melifaro@FreeBSD.org>
CommitDate: 2021-03-10 21:10:14 +0000

Flush remaining routes from the routing table during VNET shutdown.

Summary:
This fixes rtentry leak for the cloned interfaces created inside the VNET.

PR: 253998
Reported by: rashey at superbox.pl
MFC after: 3 days

Loopback teardown order is `SI_SUB_INIT_IF`, which happens after `SI_SUB_PROTO_DOMAIN` (route table teardown). Thus, any route table operations are too late to schedule. As the intent of the vnet teardown procedures to minimise the amount of effort by doing global cleanups instead of per-interface ones, address this by adding a relatively light-weight routing table cleanup function, `rib_flush_routes()`. It removes all remaining routes from the routing table and schedules the deletion, which will happen later, when `rtables_destroy()` waits for the current epoch to finish.

Test Plan:
```
set_skip:set_skip_group_lo -> passed [0.053s]
tail -n 200 /var/log/messages | grep rtentry
```

Reviewers: #network, kp, bz
Reviewed By: kp
Subscribers: imp, ae
Differential Revision: https://reviews.freebsd.org/D29116

    sys/net/route.c           | 15 ---------------
    sys/net/route.h           |  2 +-
    sys/net/route/route_ctl.c | 36 ++++++++++++++++++++++++++++++++++++
    sys/netinet/ip_input.c    |  6 +-----
    sys/netinet6/ip6_input.c  |  5 +++--
    5 files changed, 41 insertions(+), 23 deletions(-)

A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=8aafa7a0276302a0dcc3d0bd78b4d3842dfd1640

commit 8aafa7a0276302a0dcc3d0bd78b4d3842dfd1640
Author: Alexander V. Chernikov <melifaro@FreeBSD.org>
AuthorDate: 2021-03-08 21:35:41 +0000
Commit: Alexander V. Chernikov <melifaro@FreeBSD.org>
CommitDate: 2021-03-13 20:19:17 +0000

Flush remaining routes from the routing table during VNET shutdown.

Summary:
This fixes rtentry leak for the cloned interfaces created inside the VNET.

Loopback teardown order is `SI_SUB_INIT_IF`, which happens after `SI_SUB_PROTO_DOMAIN` (route table teardown). Thus, any route table operations are too late to schedule. As the intent of the vnet teardown procedures to minimise the amount of effort by doing global cleanups instead of per-interface ones, address this by adding a relatively light-weight routing table cleanup function, `rib_flush_routes()`. It removes all remaining routes from the routing table and schedules the deletion, which will happen later, when `rtables_destroy()` waits for the current epoch to finish.

Test Plan:
```
set_skip:set_skip_group_lo -> passed [0.053s]
tail -n 200 /var/log/messages | grep rtentry
```

PR: 253998
Reported by: rashey at superbox.pl
Reviewed By: kp
Differential Revision: https://reviews.freebsd.org/D29116

(cherry picked from commit b1d63265ac399112b3bca36c3d75df1a3c2c8102)

    sys/net/route.c           | 15 ---------------
    sys/net/route.h           |  2 +-
    sys/net/route/route_ctl.c | 36 ++++++++++++++++++++++++++++++++++++
    sys/netinet/ip_input.c    |  6 +-----
    sys/netinet6/ip6_input.c  |  5 +++--
    5 files changed, 41 insertions(+), 23 deletions(-)

Is there any chance to MFC the patch to releng/13.0 before RELEASE build begin?

A commit in branch releng/13.0 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=6f4f8a441aaab2e23a8e70ed0689daa05cec3ef4

commit 6f4f8a441aaab2e23a8e70ed0689daa05cec3ef4
Author: Alexander V. Chernikov <melifaro@FreeBSD.org>
AuthorDate: 2021-03-08 21:35:41 +0000
Commit: Alexander V. Chernikov <melifaro@FreeBSD.org>
CommitDate: 2021-03-28 20:40:48 +0000

Flush remaining routes from the routing table during VNET shutdown.

Summary:
This fixes rtentry leak for the cloned interfaces created inside the VNET.

Loopback teardown order is `SI_SUB_INIT_IF`, which happens after `SI_SUB_PROTO_DOMAIN` (route table teardown). Thus, any route table operations are too late to schedule. As the intent of the vnet teardown procedures to minimise the amount of effort by doing global cleanups instead of per-interface ones, address this by adding a relatively light-weight routing table cleanup function, `rib_flush_routes()`. It removes all remaining routes from the routing table and schedules the deletion, which will happen later, when `rtables_destroy()` waits for the current epoch to finish.

Test Plan:
```
set_skip:set_skip_group_lo -> passed [0.053s]
tail -n 200 /var/log/messages | grep rtentry
```

PR: 253998
Reported by: rashey at superbox.pl
Reviewed By: kp
Approved by: re (gjb)
Differential Revision: https://reviews.freebsd.org/D29116

(cherry picked from commit 8aafa7a0276302a0dcc3d0bd78b4d3842dfd1640)

    sys/net/route.c           | 15 ---------------
    sys/net/route.h           |  2 +-
    sys/net/route/route_ctl.c | 36 ++++++++++++++++++++++++++++++++++++
    sys/netinet/ip_input.c    |  6 +-----
    sys/netinet6/ip6_input.c  |  5 +++--
    5 files changed, 41 insertions(+), 23 deletions(-)

Given the fix has landed in 13.0-RC4 I'm going to close this one. Thank you for reporting the issue!
Please do reopen if you still see the behaviour / have other concerns w.r.t the change.