Bug 185727 - Devices fail to probe on 128GB or larger memory machines
Summary: Devices fail to probe on 128GB or larger memory machines
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: Unspecified
Hardware: Any Any
Importance: Normal Affects Only Me
Assignee: Alan Cox
URL:
Keywords:
Duplicates: 194455
Depends on:
Blocks:
 
Reported: 2014-01-13 01:40 UTC by Alfred Perlstein
Modified: 2015-10-11 23:38 UTC (History)
CC List: 5 users

See Also:
ngie: mfc-stable10?
ngie: mfc-stable9?


Attachments
file.diff (719 bytes, patch)
2014-01-13 01:40 UTC, Alfred Perlstein
New patch (12.50 KB, patch)
2014-11-20 18:06 UTC, Alan Cox

Description Alfred Perlstein freebsd_committer freebsd_triage 2014-01-13 01:40:00 UTC
FreeBSD machines with large amounts of memory (apparently 128GB or more) wind up without enough free memory below 4GB for many devices to work.

This results in USB not working (keyboard/mouse doesn't attach), and some HBAs fail as well.  I think other devices may fail as well, but I've not experienced that yet.

Fix: Alan Cox posted the attached patch (hack2.patch); however, according to him it still needs some work:

> The only issue with this patch is that it will pessimize the speed of
> physical memory allocation on amd64 machines with small amounts of
> memory.  I need to augment the attached patch, which just changes some
> #define's, with some changes to vm_phys.c to avoid creating excess free
> page queues on small memory machines.
>
> Alan

Patch attached with submission follows:
How-To-Repeat: Boot FreeBSD 10.x or FreeBSD-current on a machine with 128GB or more of memory. Only some hardware configurations are affected.
Comment 1 Alfred Perlstein freebsd_committer freebsd_triage 2014-01-13 01:49:00 UTC
Responsible Changed
From-To: freebsd-bugs->alc@freebsd.org

Assign to alc, fix originator.
Comment 2 Mark Linimon freebsd_committer freebsd_triage 2014-01-13 02:16:30 UTC
Responsible Changed
From-To: alc@freebsd.org->alc

Canonicalize assignment.
Comment 3 vsjcfm 2014-01-18 22:31:09 UTC
Does this problem exist in 9.x too?
And about the patch - will it cause performance degradation on systems
with 128g+ ram?
Comment 4 Alfred Perlstein freebsd_committer freebsd_triage 2014-01-19 14:28:07 UTC
On 1/18/14 2:31 PM, Anton Sayetsky wrote:
> Does this problem exist in 9.x too?
> And about the patch - will it cause performance degradation on systems
> with 128g+ ram?
>
Probably not, however it may cause problems on machines with less. I'd 
start to worry about 16GB or even 32GB and less.

-Alfred
Comment 5 vsjcfm 2014-01-19 20:52:19 UTC
2014/1/19 Alfred Perlstein <alfred@freebsd.org>:
> Probably not, however it may cause problems on machines with less. I'd start
> to worry about 16GB or even 32GB and less.
I'm planning to deploy a system with a 100+ TiB zpool and 128-256 GiB of RAM.
At least now I can try this workaround. It will not be done tomorrow,
but I'll send a follow-up later.
Comment 6 Kurt Lidl freebsd_committer freebsd_triage 2014-03-28 19:45:56 UTC
Is there any forward progress on resolving this issue?

We just ran into this same issue on a 128GB machine - the USB
subsystem failed because it could not allocate memory in
the first 4GB of space.

We're running with the patch noted in this bug report, to
no avail.

Thanks for any updates.
Comment 7 Alfred Perlstein freebsd_committer freebsd_triage 2014-04-19 15:54:39 UTC
Kurt,

Does the patch work for you or no?
Comment 8 Kurt Lidl freebsd_committer freebsd_triage 2014-04-20 07:28:44 UTC
On 4/19/14 10:54 AM, Alfred Perlstein wrote:
> Kurt,
>
> Does the patch work for you or no?

More or less.  We have other things that demand
memory allocations below the 4GB
mark.  With a little juggling, we were able to
make this work on our 128GB machines.

Sorry that's sorta a wishy-washy answer, but
it's the best I can give right now.

-Kurt
Comment 9 Conrad Meyer 2014-10-18 20:50:26 UTC
We have 96-256 GB systems and use a spiritually similar patch in our tree at Isilon. I filed bug 194455, but maybe it's just a total duplicate of this bug/patch.

What can I do to help get this into HEAD? We'd prefer not to keep carrying a local patch for this. Thanks.
Comment 10 Conrad Meyer 2014-10-18 20:57:55 UTC
(In reply to Alfred Perlstein from comment #4)
> On 1/18/14 2:31 PM, Anton Sayetsky wrote:
> > And about the patch - will it cause performance degradation [...]
> >
> Probably not, however it may cause problems on machines with less. I'd 
> start to worry about 16GB or even 32GB and less.

If this is a real concern, we could make it a tunable. E.g.:

  - If the tunable is set to zero, we only have the ISADMA and DEFAULT regions, like today.

  - If the tunable is set to somewhere in [16 MB, 4 GB], we have ISADMA, DEFAULT, HIGHMEM regions.

Defaulting the tunable to 4GB will make most systems work without fiddling ("safe default"); defaulting to off will preserve current behavior and avoid performance impact. I'm fine with either default and you could argue it either way.

Have we actually measured a performance impact on small memory systems? It would be good to know if there's actually a problem before we tilt at windmills.

Finally, I'm happy to draft a patch for the tunable proposal if it seems reasonable and necessary. Thanks.
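
To make the proposal above concrete, here is a minimal C sketch of how such a tunable might select the free lists. It is not taken from any patch: the names freelist_count, freelist_for, and the FL_* constants are hypothetical, and the sketch assumes that HIGHMEM means memory at or above the tunable boundary.

```c
#include <stdint.h>

/* Hypothetical free-list identifiers, named after the regions in the comment. */
enum freelist { FL_ISADMA, FL_DEFAULT, FL_HIGHMEM };

#define	BOUNDARY_ISADMA	(16ULL << 20)	/* 16 MB */
#define	BOUNDARY_4G	(4ULL << 30)	/* 4 GB */

/*
 * Number of free lists for a given tunable value (hypothetical helper):
 *   0                -> 2 lists (ISADMA, DEFAULT), i.e. the old behavior
 *   [16 MB, 4 GB]    -> 3 lists (ISADMA, DEFAULT, HIGHMEM)
 *   anything else    -> -1 (rejected as invalid)
 */
int
freelist_count(uint64_t tunable)
{
	if (tunable == 0)
		return (2);
	if (tunable >= BOUNDARY_ISADMA && tunable <= BOUNDARY_4G)
		return (3);
	return (-1);
}

/* Map a physical address to its free list under the given tunable. */
enum freelist
freelist_for(uint64_t pa, uint64_t tunable)
{
	if (pa < BOUNDARY_ISADMA)
		return (FL_ISADMA);
	/* Assumption: pages at or above the boundary go to HIGHMEM. */
	if (freelist_count(tunable) == 3 && pa >= tunable)
		return (FL_HIGHMEM);
	return (FL_DEFAULT);
}
```

With the tunable at 4 GB, a page at physical address 8 GB would land in HIGHMEM and pages between 16 MB and 4 GB would stay in DEFAULT, which is what would keep low memory in reserve for devices that need it. With the tunable at zero, both fall into DEFAULT.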
Comment 11 Enji Cooper freebsd_committer freebsd_triage 2014-10-19 07:05:48 UTC
*** Bug 194455 has been marked as a duplicate of this bug. ***
Comment 12 Enji Cooper freebsd_committer freebsd_triage 2014-10-19 07:06:39 UTC
There's additional discussion in bug 194455 that could be potentially valuable in resolving this issue.
Comment 13 Alan Cox freebsd_committer freebsd_triage 2014-11-20 18:06:40 UTC
Created attachment 149648 [details]
New patch

I finally got a chance to work on this problem and develop a patch that I'm not ashamed to commit.  This patch also addresses a related problem on small MIPS systems where we create a set of free lists that will never contain any pages.  Nonetheless, we will search those free lists on every page allocation:

 sysctl vm.phys_free
vm.phys_free:
DOMAIN 0:

FREE LIST 0:

  ORDER (SIZE)  |  NUMBER
                |  POOL 0  |  POOL 1  |  POOL 2
--            -- --      -- --      -- --      --
   8 (  1024K)  |       0  |       0  |       0
   7 (   512K)  |       0  |       0  |       0
   6 (   256K)  |       0  |       0  |       0
   5 (   128K)  |       0  |       0  |       0
   4 (    64K)  |       0  |       0  |       0
   3 (    32K)  |       0  |       0  |       0
   2 (    16K)  |       0  |       0  |       0
   1 (     8K)  |       0  |       0  |       0
   0 (     4K)  |       0  |       0  |       0

FREE LIST 1:

  ORDER (SIZE)  |  NUMBER
                |  POOL 0  |  POOL 1  |  POOL 2
--            -- --      -- --      -- --      --
   8 (  1024K)  |      21  |       0  |       0
   7 (   512K)  |       0  |       1  |       0
   6 (   256K)  |       1  |       0  |       0
   5 (   128K)  |       1  |       1  |       0
   4 (    64K)  |       4  |       0  |       0
   3 (    32K)  |       3  |       0  |       0
   2 (    16K)  |       1  |       0  |       0
   1 (     8K)  |       0  |       1  |       0
   0 (     4K)  |       1  |       0  |       0
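
The cost of the empty FREE LIST 0 above can be illustrated with a small C sketch. This is a simplified model, not the vm_phys.c code: struct freelist, find_block, and the probe counter are hypothetical, and the model assumes the lists are searched in index order and looks at only one pool.

```c
#define	NORDER	9	/* orders 0..8, as in the sysctl output */
#define	NPOOL	3	/* pools 0..2, as in the sysctl output */

/* Hypothetical model of one free list: a block count per order and pool. */
struct freelist {
	int count[NORDER][NPOOL];
};

/*
 * Search the lists in index order for a free block of at least the
 * requested order in the given pool.  Returns the index of the list that
 * satisfied the request, or -1 if none did.  *probes is set to the number
 * of queues that were examined along the way.
 */
int
find_block(const struct freelist *lists, int nlists, int pool, int order,
    int *probes)
{
	*probes = 0;
	for (int fl = 0; fl < nlists; fl++) {
		for (int o = order; o < NORDER; o++) {
			(*probes)++;
			if (lists[fl].count[o][pool] > 0)
				return (fl);
		}
	}
	return (-1);
}
```

In this model, a 4K request from pool 0 against the two lists shown above examines all nine empty queues of list 0 before succeeding on the first queue of list 1. If list 0 were never created, the same request would succeed after a single probe.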
Comment 14 Conrad Meyer 2014-11-20 18:16:33 UTC
Alan,

Can you throw this up on phabricator?

+ * Create the DMA32 free list only if the number of physical pages above
+ * physical address 4G is at least 16M, which amounts to 64GB of physical
+ * memory.
+ */
+#define	VM_DMA32_THRESHOLD	16777216

How did you arrive at "only above 64GB"?

Thanks!
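
For reference, the arithmetic in the quoted comment can be checked with a short C sketch. Only VM_DMA32_THRESHOLD and its value come from the patch; PAGE_SIZE_BYTES, threshold_bytes, and want_dma32 are hypothetical names, and a 4 KB base page size is assumed. The sketch verifies the numbers only and says nothing about why 64GB was chosen.

```c
#include <stdint.h>

#define	PAGE_SIZE_BYTES		4096ULL		/* assumed 4 KB base page */
#define	VM_DMA32_THRESHOLD	16777216ULL	/* value quoted from the patch */

/* Size in bytes of the memory represented by the threshold page count. */
uint64_t
threshold_bytes(void)
{
	/* 2^24 pages * 2^12 bytes per page = 2^36 bytes = 64 GB */
	return (VM_DMA32_THRESHOLD * PAGE_SIZE_BYTES);
}

/*
 * Hypothetical predicate: would the DMA32 free list be created, given the
 * number of physical pages that lie above physical address 4G?
 */
int
want_dma32(uint64_t pages_above_4g)
{
	return (pages_above_4g >= VM_DMA32_THRESHOLD);
}
```

As an example, 124 GB of memory above the 4G mark is 32505856 pages, which passes the test, while 28 GB above 4G is 7340032 pages, which does not.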
Comment 15 Alfred Perlstein freebsd_committer freebsd_triage 2014-12-07 12:26:13 UTC
Alan,

This has been added for review here:

https://reviews.freebsd.org/D1274
Comment 16 commit-hook freebsd_committer freebsd_triage 2014-12-31 00:55:03 UTC
A commit references this bug:

Author: alc
Date: Wed Dec 31 00:54:40 UTC 2014
New revision: 276439
URL: https://svnweb.freebsd.org/changeset/base/276439

Log:
  The physical memory allocator supports the use of distinct free lists for
  managing pages from different address ranges.  Generally speaking, this
  feature is used to increase the likelihood that physical pages are
  available that can meet special DMA requirements or can be accessed through
  a limited-coverage direct mapping (e.g., MIPS).  However, prior to this
  change, the configuration of the free lists was static, i.e., it was
  determined at compile time.  Consequently, free lists could be created
  for address ranges that held no actual pages, for example, on 32-bit MIPS-
  based systems with 512 MB or less of physical memory.  This change makes
  the creation of the free lists dynamic, i.e., it is based on the available
  physical memory at boot time.

  On 64-bit x86-based systems with 64 GB or more of physical memory, create
  free lists for managing pages with physical addresses below 4 GB.  This
  change is to address reported problems with initializing devices that
  require the allocation of physical pages below 4 GB on some systems with
  128 GB or more of physical memory.

  PR:		185727
  Differential Revision:	https://reviews.freebsd.org/D1274
  Reviewed by:	jhb, kib
  MFC after:	3 weeks
  Sponsored by:	EMC / Isilon Storage Division

Changes:
  head/sys/amd64/include/vmparam.h
  head/sys/mips/include/vmparam.h
  head/sys/vm/vm_phys.c
  head/sys/vm/vm_phys.h
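
The idea behind the dynamic configuration described in the log can be sketched in C as follows. This is an illustration under stated assumptions, not the committed vm_phys.c code: struct phys_seg, bytes_in_range, and count_needed_freelists are hypothetical names, and the sketch assumes one candidate free list per address range delimited by an ascending list of boundaries.

```c
#include <stdint.h>

/* Hypothetical description of one physical memory segment: [start, end). */
struct phys_seg {
	uint64_t start;
	uint64_t end;
};

/* Count the bytes of all segments that fall inside [lo, hi). */
uint64_t
bytes_in_range(const struct phys_seg *segs, int nsegs, uint64_t lo,
    uint64_t hi)
{
	uint64_t total = 0;

	for (int i = 0; i < nsegs; i++) {
		uint64_t s = segs[i].start > lo ? segs[i].start : lo;
		uint64_t e = segs[i].end < hi ? segs[i].end : hi;

		if (s < e)
			total += e - s;
	}
	return (total);
}

/*
 * Count the free lists worth creating: one for each boundary-delimited
 * range that actually contains memory.  bounds[] must be ascending; the
 * last range extends from the final boundary upward.
 */
int
count_needed_freelists(const struct phys_seg *segs, int nsegs,
    const uint64_t *bounds, int nbounds)
{
	uint64_t lo = 0;
	int needed = 0;

	for (int i = 0; i <= nbounds; i++) {
		uint64_t hi = (i < nbounds) ? bounds[i] : UINT64_MAX;

		if (bytes_in_range(segs, nsegs, lo, hi) > 0)
			needed++;
		lo = hi;
	}
	return (needed);
}
```

Taking the MIPS example from the log with a single boundary at 512 MB, a machine with 256 MB of memory needs only one free list in this model, whereas a static configuration would have created two.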
Comment 17 Glen Barber freebsd_committer freebsd_triage 2015-07-08 18:32:11 UTC
To originators/assignees of this PR:

A commit to the tree references this PR, however the PR is still in a non-closed state.

Please review this PR and close as appropriate, or if closing the PR requires a merge to stable/10, please let re@ know as soon as possible.

Thank you.

Glen
Comment 18 commit-hook freebsd_committer freebsd_triage 2015-07-16 14:42:36 UTC
A commit references this bug:

Author: kib
Date: Thu Jul 16 14:42:00 UTC 2015
New revision: 285634
URL: https://svnweb.freebsd.org/changeset/base/285634

Log:
  MFC r276439 (by alc):
  Make the creation of the free lists dynamic, i.e., it is based on the
  available physical memory at boot time. For amd64 systems with 64 GB
  or more of physical memory, create free lists for managing pages with
  physical addresses below 4 GB.

  PR:	185727
  Requested by:	alc
  Approved by:	re (gjb)

Changes:
_U  stable/10/
  stable/10/sys/amd64/include/vmparam.h
  stable/10/sys/mips/include/vmparam.h
  stable/10/sys/vm/vm_phys.c
  stable/10/sys/vm/vm_phys.h
Comment 19 Alan Cox freebsd_committer freebsd_triage 2015-10-11 23:38:08 UTC
My fix was applied to the 10.x branch before the 10.2 release, and I'm not inclined to adapt the fix to the 9.x branch, so I'm closing this bug.