Bug 248008 - i386 system can hang with many processes sleeping on btalloc post base r358097
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: CURRENT
Hardware: i386 Any
Importance: --- Affects Some People
Assignee: freebsd-bugs (Nobody)
URL:
Keywords: regression
Depends on:
Blocks:
 
Reported: 2020-07-16 02:30 UTC by Rick Macklem
Modified: 2020-07-26 22:46 UTC
CC List: 4 users

See Also:
koobs: maintainer-feedback? (jeff)


Attachments
Ryan Libby's suggestion #2 way to fix this (718 bytes, patch)
2020-07-16 02:34 UTC, Rick Macklem

Description Rick Macklem freebsd_committer 2020-07-16 02:30:08 UTC
I think the patch is not complete.  It looks like the problem is that
for systems that do not have UMA_MD_SMALL_ALLOC, we do
        uma_zone_set_allocf(vmem_bt_zone, vmem_bt_alloc);
but we haven't set an appropriate free function.  This is probably why
UMA_ZONE_NOFREE was originally there.  When NOFREE was removed, it was
appropriate for systems with uma_small_alloc.

So by default we get page_free as our free function.  That calls
kmem_free, which calls vmem_free ... but we do our allocs with
vmem_xalloc.  I'm not positive, but I think the problem is that in
effect we vmem_xalloc -> vmem_free, not vmem_xfree.

Three possible fixes:
 1: The one you tested, but this is not best for systems with
    uma_small_alloc.
 2: Pass UMA_ZONE_NOFREE conditional on UMA_MD_SMALL_ALLOC.
 3: Actually provide an appropriate vmem_bt_free function.

I think we should just do option 2 with a comment, it's simple and it's
what we used to do.  I'm not sure how much benefit we would see from
option 3, but it's more work.
Comment 1 Rick Macklem freebsd_committer 2020-07-16 02:34:39 UTC
Created attachment 216478
Ryan Libby's suggestion #2 way to fix this

When doing a kernel build over NFS, I could frequently get the system
to hang with many processes sleeping on btalloc.
Ryan Libby made the following suggestions:
I think the patch is not complete.  It looks like the problem is that
for systems that do not have UMA_MD_SMALL_ALLOC, we do
        uma_zone_set_allocf(vmem_bt_zone, vmem_bt_alloc);
but we haven't set an appropriate free function.  This is probably why
UMA_ZONE_NOFREE was originally there.  When NOFREE was removed, it was
appropriate for systems with uma_small_alloc.

So by default we get page_free as our free function.  That calls
kmem_free, which calls vmem_free ... but we do our allocs with
vmem_xalloc.  I'm not positive, but I think the problem is that in
effect we vmem_xalloc -> vmem_free, not vmem_xfree.

Three possible fixes:
 1: The one you tested, but this is not best for systems with
    uma_small_alloc.
 2: Pass UMA_ZONE_NOFREE conditional on UMA_MD_SMALL_ALLOC.
 3: Actually provide an appropriate vmem_bt_free function.

I think we should just do option 2 with a comment, it's simple and it's
what we used to do.  I'm not sure how much benefit we would see from
option 3, but it's more work.

This patch implements #2 and seems to fix the problem. The problem
was not reproducible on an amd64 system with memory set to 1 GB.
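
For readers without the attachment open, here is a rough sketch of the
shape option 2 could take. It is not the attached patch. Only
UMA_ZONE_NOFREE and UMA_MD_SMALL_ALLOC come from the discussion above;
the flag values, BT_ZONE_BASE_FLAGS, VMEM_BT_ZONE_FLAGS and
bt_zone_may_free() are hypothetical stand-ins so the fragment compiles
outside the kernel.

```c
#include <stdint.h>

/*
 * Placeholder values for illustration only. The real definitions live
 * in the kernel's UMA headers and are not reproduced here.
 */
#define UMA_ZONE_NOFREE		0x2u
#define BT_ZONE_BASE_FLAGS	0x1u	/* hypothetical: flags the zone already passes */

/*
 * Option 2: add NOFREE only where UMA_MD_SMALL_ALLOC is not defined,
 * since those are the systems that install vmem_bt_alloc without a
 * matching free function.
 */
#ifdef UMA_MD_SMALL_ALLOC
#define VMEM_BT_ZONE_FLAGS	(BT_ZONE_BASE_FLAGS)
#else
#define VMEM_BT_ZONE_FLAGS	(BT_ZONE_BASE_FLAGS | UMA_ZONE_NOFREE)
#endif

/* Hypothetical helper: nonzero if slabs may ever reach the free routine. */
static int
bt_zone_may_free(uint32_t flags)
{
	return ((flags & UMA_ZONE_NOFREE) == 0);
}
```

Built without UMA_MD_SMALL_ALLOC defined, as on i386 in this report,
the zone flags include NOFREE, so the default free function would never
be called for boundary-tag slabs.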
Comment 2 Rick Macklem freebsd_committer 2020-07-16 02:43:37 UTC
Sorry, the comments got messed up. I guess I shouldn't have done the
attachment before submitting or something.

Anyhow, here is what has gone on.
Since r358097 (it took a while to bisect to this commit), a kernel build
over NFS on an i386 system hangs with many processes sleeping on
"btalloc". I also got the hang once doing the build on UFS.

Ryan Libby made the following suggestions and I implemented #2, which
fixed the hangs.

   I think the patch is not complete.  It looks like the problem is that
   for systems that do not have UMA_MD_SMALL_ALLOC, we do
        uma_zone_set_allocf(vmem_bt_zone, vmem_bt_alloc);
   but we haven't set an appropriate free function.  This is probably why
   UMA_ZONE_NOFREE was originally there.  When NOFREE was removed, it was
   appropriate for systems with uma_small_alloc.

   So by default we get page_free as our free function.  That calls
   kmem_free, which calls vmem_free ... but we do our allocs with
   vmem_xalloc.  I'm not positive, but I think the problem is that in
   effect we vmem_xalloc -> vmem_free, not vmem_xfree.

   Three possible fixes:
    1: The one you tested, but this is not best for systems with
    uma_small_alloc.
    2: Pass UMA_ZONE_NOFREE conditional on UMA_MD_SMALL_ALLOC.
    3: Actually provide an appropriate vmem_bt_free function.

   I think we should just do option 2 with a comment, it's simple and it's
   what we used to do.  I'm not sure how much benefit we would see from
   option 3, but it's more work.
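
The suspected alloc/free mismatch can be illustrated with a toy
userland model. This only sketches the hypothesis quoted above, which
its author describes as uncertain. It is not a confirmed diagnosis and
not kernel code. The struct and the toy_* functions are hypothetical,
and the assumption that the mismatched free parks space somewhere the
xalloc path cannot see is an illustration device, not something stated
in the thread.

```c
#include <stddef.h>

/* Toy arena: tracks only free byte counts, no real addresses. */
struct toy_arena {
	size_t arena_free;	/* bytes available to toy_xalloc() */
	size_t cache_held;	/* bytes parked outside the arena */
};

/* Stands in for vmem_xalloc: takes space straight from the arena. */
static int
toy_xalloc(struct toy_arena *a, size_t size)
{
	if (a->arena_free < size)
		return (-1);	/* a real caller might sleep here */
	a->arena_free -= size;
	return (0);
}

/* Stands in for vmem_xfree: returns space straight to the arena. */
static void
toy_xfree(struct toy_arena *a, size_t size)
{
	a->arena_free += size;
}

/* Stands in for the mismatched free: space never rejoins the arena. */
static void
toy_free(struct toy_arena *a, size_t size)
{
	a->cache_held += size;
}
```

In this model, a toy_xalloc() followed by toy_free() leaves
arena_free depleted, so the next toy_xalloc() of the same size fails,
while pairing toy_xalloc() with toy_xfree() lets it succeed. Option 3
amounts to guaranteeing the second pairing.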
Comment 3 Kubilay Kocak freebsd_committer freebsd_triage 2020-07-16 03:55:46 UTC
^Triage: Request feedback from base r358097 committer
Comment 4 Rick Macklem freebsd_committer 2020-07-26 22:46:41 UTC
Just FYI, the email discussion that preceded this PR is here:
http://docs.FreeBSD.org/cgi/mid.cgi?QB1PR01MB3364709D9D16CD0E550DAC7CDD6A0