Bug 221202 - Mbufs leak since r321253
Summary: Mbufs leak since r321253
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: CURRENT
Hardware: arm64 Any
Importance: --- Affects Some People
Assignee: Sean Bruno
URL:
Keywords: regression
Depends on:
Blocks:
 
Reported: 2017-08-03 13:58 UTC by gergely.czuczy
Modified: 2017-08-13 16:36 UTC
CC List: 4 users

See Also:


Description gergely.czuczy 2017-08-03 13:58:56 UTC
Hello,

While running some port builds over an NFS-mounted /usr/ports on my 
aarch64 VM after r321253, I've noticed that sooner or later I get 
an error saying "kernel: [zone: mbuf_cluster] kern.ipc.nmbclusters limit 
reached". I've tried raising the kern.ipc.nmbclusters sysctl, but after 
some time the error always came back.

I checked the mbuf counters periodically with ``netstat -m'' during 
the build and noticed that both the mbufs in use and the mbuf clusters 
in use keep increasing; I have never seen them decrease:

279943/392/280335 mbufs in use (current/cache/total)
130819/253/131072/131072 mbuf clusters in use (current/cache/total/max)
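A minimal shell sketch of how the two counters above could be tracked during a build. The helper names mbufs_current and clusters_pct are hypothetical, and the parsing assumes the exact ``netstat -m'' line format quoted above; this is an illustration, not the procedure the reporter actually used.

```shell
#!/bin/sh
# Hypothetical helper: read `netstat -m` output on stdin and print the
# "current" mbuf count. Assumes the line format quoted in the report:
#   current/cache/total mbufs in use (current/cache/total)
mbufs_current() {
    awk '/ mbufs in use/ { split($1, f, "/"); print f[1] }'
}

# Hypothetical helper: print the total mbuf clusters as an integer
# percentage of the configured maximum. Assumes the line format:
#   current/cache/total/max mbuf clusters in use (current/cache/total/max)
clusters_pct() {
    awk '/ mbuf clusters in use/ { split($1, f, "/"); printf "%d\n", f[3] * 100 / f[4] }'
}

# Example polling loop (needs a FreeBSD host, so it is left commented out):
# while :; do netstat -m | mbufs_current; sleep 60; done
```

Fed the two lines above, mbufs_current prints 279943 and clusters_pct prints 100, which matches the "limit reached" message: the cluster total has hit the maximum.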

According to 
https://svnweb.freebsd.org/base/head/?view=log&pathrev=321253, the 
previous commit to HEAD was r321248. I checked that build as well and 
couldn't reproduce the issue; the mbuf counters were stable.

After getting the kernel error message, the builds got stuck, some 
processes became unresponsive, and the only way I was aware of to clear 
the mbufs was to reboot the VM.

As far as I know, building over NFS triggers the leak. However, I 
haven't tested other kinds of network activity, which might trigger it too.

Could you please take a look into it?

If any more information is needed, please let me know, I still have my 
test VMs at hand, I can do a few tests.

Best regards,
Gergely


PS1: This might affect other architectures as well; I haven't tested it on anything other than aarch64.
Comment 1 Mark Linimon freebsd_committer 2017-08-07 03:27:58 UTC
Over to committer of r321253.
Comment 2 Sean Bruno freebsd_committer 2017-08-07 17:48:24 UTC
Adding Andrew Turner so he can yell at me.
Comment 3 Sean Bruno freebsd_committer 2017-08-07 20:06:54 UTC
https://github.com/mattmacy/networking/commit/c364bff57fa46be1f2e0a08c94ae4354a7a4303f

Can you try this patch and see if mbufs stop leaking on your host?
Comment 4 commit-hook freebsd_committer 2017-08-08 01:39:48 UTC
A commit references this bug:

Author: sbruno
Date: Tue Aug  8 01:39:37 UTC 2017
New revision: 447530
URL: https://svnweb.freebsd.org/changeset/ports/447530

Log:
  Pointyhat to me.

  A stray '.' somehow made it past my testing.

  Do *not* bump portrevision as this only affects the packaging/stage
  of these ports on mips/armv6 or other cross compiled targets.

  PR:		221202
  Reported by:	antoine

Changes:
  head/lang/python33/Makefile
  head/lang/python34/Makefile
  head/lang/python35/Makefile
Comment 5 gergely.czuczy 2017-08-08 04:16:20 UTC
sbruno, the patch applied:
root@marvin:/tank/rpi3/src# patch -p1 < ../build/c364bff57fa46be1f2e0a08c94ae4354a7a4303f.patch
Hmm...  Looks like a unified diff to me...
The text leading up to this was:
--------------------------
|From c364bff57fa46be1f2e0a08c94ae4354a7a4303f Mon Sep 17 00:00:00 2001
|From: Matthew Macy <mmacy@nextbsd.org>
|Date: Mon, 7 Aug 2017 12:55:12 -0700
|Subject: [PATCH] don't leak mbufs when clusters > segments
|
|---
| sys/net/iflib.c | 18 +++++++++++++++---
| 1 file changed, 15 insertions(+), 3 deletions(-)
|
|diff --git a/sys/net/iflib.c b/sys/net/iflib.c
|index fd8f517b22a62..e1de60b0e2b94 100644
|--- a/sys/net/iflib.c
|+++ b/sys/net/iflib.c
--------------------------
Patching file sys/net/iflib.c using Plan A...
Hunk #1 succeeded at 267 (offset 2 lines).
Hunk #2 succeeded at 2930 (offset -36 lines).
Hunk #3 succeeded at 3282 (offset 5 lines).
done

I will test it, and let you know of the results in a few hours.
Comment 6 gergely.czuczy 2017-08-08 08:47:58 UTC
Various builds have been running for quite a while, and the mbuf counters are stable. This seems to fix the issue; I'm unable to reproduce it with the patch applied.
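
One way to make the "counters are stable" check less subjective is to compare a series of samples. The sketch below is an assumption-laden illustration: the helper name trend is hypothetical, and it simply treats any series that is not strictly increasing as stable, which is a simplification of what a real leak check would need.

```shell
#!/bin/sh
# Hypothetical helper: read one counter sample per line on stdin.
# Prints "growing" if every sample is strictly larger than the one
# before it, otherwise "stable". Fewer than two samples counts as stable.
trend() {
    awk '
        NR > 1 && ($1 + 0) <= prev { flat = 1 }   # a non-increase was seen
        { prev = $1 + 0 }                          # remember this sample
        END {
            if (NR > 1 && !flat) print "growing"
            else print "stable"
        }
    '
}

# Example usage with samples collected in a file, one count per line
# (the file name is a placeholder):
# trend < mbuf_samples.txt
```

Samples that only ever rise, as in the original report, would be reported as growing; samples that go up and down under load would be reported as stable.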
Comment 7 commit-hook freebsd_committer 2017-08-10 03:43:35 UTC
A commit references this bug:

Author: sbruno
Date: Thu Aug 10 03:43:23 UTC 2017
New revision: 322338
URL: https://svnweb.freebsd.org/changeset/base/322338

Log:
  Don't leak mbufs if clusters exceeds the number of segments.  This would
  leak mbufs over time causing crashes.

  PR:		221202
  Submitted by:	Matt Macy <matt@mattmacy.io>
  Reported by:	gergely.czuczy@harmless.hu
  Sponsored by:	Limelight Networks

Changes:
  head/sys/net/iflib.c
Comment 8 commit-hook freebsd_committer 2017-08-13 16:36:09 UTC
A commit references this bug:

Author: sunpoet
Date: Sun Aug 13 16:35:04 UTC 2017
New revision: 447898
URL: https://svnweb.freebsd.org/changeset/ports/447898

Log:
  MFH: r447129 r447530

  Add a code block for the qemu-user enabled cross build environment.  When using
  this environment in poudriere, CC is not set to the default of /usr/bin/cc and
  a cross-compile toolchain is used.  We need to hand edit this so that the run
  time configuration for python matches what the FreeBSD base system provides.

  PR:		208282
  Submitted by:	manu
  Approved by:	portmgr (mat)

  Pointyhat to me.

  A stray '.' somehow made it past my testing.

  Do *not* bump portrevision as this only affects the packaging/stage
  of these ports on mips/armv6 or other cross compiled targets.

  PR:		221202
  Reported by:	antoine

  Approved by:	ports-secteam (zi)

Changes:
_U  branches/2017Q3/
  branches/2017Q3/lang/python27/Makefile
  branches/2017Q3/lang/python33/Makefile
  branches/2017Q3/lang/python34/Makefile
  branches/2017Q3/lang/python35/Makefile