Bug 197789

Summary: (zfs+i386 No PAE) panic: kmem_malloc(36864): kmem_map too small: 431976448 total allocated
Product: Base System
Reporter: Michelle Sullivan <michelle>
Component: kern
Assignee: freebsd-fs (Nobody) <fs>
Status: New
Severity: Affects Some People
CC: michelle, mmoll, ota
Priority: ---
Version: 9.3-RELEASE
Hardware: i386
OS: Any

Description Michelle Sullivan 2015-02-18 15:14:48 UTC
Cores etc:

http://flashback.sorbs.net/packages/crash/9.3-i386/ (specifically: core.txt.0.1424274909)

93i386 dumped core - see /var/crash/vmcore.0

Wed Feb 18 16:53:59 CET 2015

FreeBSD 93i386 9.3-RELEASE-p9 FreeBSD 9.3-RELEASE-p9 #0: Tue Jan 27 10:20:56 UTC 2015     root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  i386

panic: kmem_malloc(36864): kmem_map too small: 431976448 total allocated

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd"...

Unread portion of the kernel message buffer:
panic: kmem_malloc(36864): kmem_map too small: 431976448 total allocated
cpuid = 3
KDB: stack backtrace:
#0 0xc0b0f96f at kdb_backtrace+0x4f
#1 0xc0ad65af at panic+0x16f
#2 0xc0d62caa at kmem_malloc+0x28a
#3 0xc0d563b7 at page_alloc+0x27
#4 0xc0d58a60 at uma_large_malloc+0x50
#5 0xc0abdf3c at malloc+0x8c
#6 0xc86e3f50 at zfs_kmem_alloc+0x20
#7 0xc8627ce4 at zio_data_buf_alloc+0x44
#8 0xc85913a0 at arc_get_data_buf+0x250
#9 0xc8591a07 at arc_buf_alloc+0x97
#10 0xc859cddc at dbuf_new_size+0x6c
#11 0xc85b9b01 at dnode_set_blksz+0x321
#12 0xc85a12dc at dmu_object_set_blocksize+0x5c
#13 0xc861dc42 at zfs_grow_blocksize+0x72
#14 0xc8655513 at zfs_freebsd_write+0xf93
#15 0xc0fc5835 at VOP_WRITE_APV+0x115
#16 0xc0b882d2 at vn_write+0x1f2
#17 0xc0b85d6a at vn_io_fault+0x9a
Uptime: 21m12s
Physical memory: 3499 MB
Dumping 539 MB: 524 508 492 476 460 444 428 412 396 380 364 348 332 316 300 284 268 252 236 220 204 188 172 156 140 124 108 92 76 60 44 28 12

Reading symbols from /boot/kernel/zfs.ko...Reading symbols from /boot/kernel/zfs.ko.symbols...done.
Loaded symbols for /boot/kernel/zfs.ko
Reading symbols from /boot/kernel/opensolaris.ko...Reading symbols from /boot/kernel/opensolaris.ko.symbols...done.
Loaded symbols for /boot/kernel/opensolaris.ko
Reading symbols from /boot/kernel/fdescfs.ko...Reading symbols from /boot/kernel/fdescfs.ko.symbols...done.
Loaded symbols for /boot/kernel/fdescfs.ko
Reading symbols from /boot/kernel/nullfs.ko...Reading symbols from /boot/kernel/nullfs.ko.symbols...done.
Loaded symbols for /boot/kernel/nullfs.ko
#0  doadump (textdump=1) at pcpu.h:250
250	pcpu.h: No such file or directory.
	in pcpu.h
(kgdb) #0  doadump (textdump=1) at pcpu.h:250
#1  0xc0ad62f5 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:454
#2  0xc0ad65f2 in panic (fmt=<value optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:642
#3  0xc0d62caa in kmem_malloc (map=0xc1bdd090, size=36864, flags=2050)
    at /usr/src/sys/vm/vm_kern.c:305
#4  0xc0d563b7 in page_alloc (zone=0x0, bytes=36864, pflag=0xf104964f "\002", 
    wait=2050) at /usr/src/sys/vm/uma_core.c:994
#5  0xc0d58a60 in uma_large_malloc (size=36864, wait=2050)
    at /usr/src/sys/vm/uma_core.c:3071
#6  0xc0abdf3c in malloc (size=36864, mtp=0xc86e511c, flags=2050)
    at /usr/src/sys/kern/kern_malloc.c:527
#7  0xc86e3f50 in zfs_kmem_alloc (size=34816, kmflags=2050)
    at /usr/src/sys/modules/opensolaris/../../cddl/compat/opensolaris/kern/opensolaris_kmem.c:74
#8  0xc8627ce4 in zio_data_buf_alloc (size=34816)
    at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:286
#9  0xc85913a0 in arc_get_data_buf (buf=0xda1808e8)
    at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:2794
#10 0xc8591a07 in arc_buf_alloc (spa=0xc8420000, size=34816, tag=0xda351cc0, 
    at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1497
#11 0xc859cddc in dbuf_new_size (db=0xda351cc0, size=34816, tx=0xca85e880)
    at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:967
#12 0xc85b9b01 in dnode_set_blksz (dn=0xd1be3960, size=34816, ibs=0, 
    at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c:1347
#13 0xc85a12dc in dmu_object_set_blocksize (os=0xc8c1d000, object=366, 
    size=34393, ibs=0, tx=0xca85e880)
    at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c:1595
#14 0xc861dc42 in zfs_grow_blocksize (zp=0xd1644000, size=34393, 
    at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c:1480
#15 0xc8655513 in zfs_freebsd_write (ap=0xf1049a98)
    at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:1010
#16 0xc0fc5835 in VOP_WRITE_APV (vop=0xc86cd8a0, a=0xf1049a98)
    at vnode_if.c:983
#17 0xc0b882d2 in vn_write (fp=0xca95aab8, uio=0xf1049c20, 
    active_cred=0xca611100, flags=1, td=0xc991d000) at vnode_if.h:413
#18 0xc0b85d6a in vn_io_fault (fp=0xca95aab8, uio=0xf1049c20, 
    active_cred=0xca611100, flags=0, td=0xc991d000)
    at /usr/src/sys/kern/vfs_vnops.c:911
#19 0xc0b22ba9 in dofilewrite (td=0xc991d000, fd=7, fp=0xca95aab8, 
    auio=0xf1049c20, offset=-1, flags=0) at file.h:295
#20 0xc0b22eb8 in kern_writev (td=0xc991d000, fd=7, auio=0xf1049c20)
    at /usr/src/sys/kern/sys_generic.c:463
#21 0xc0b22f3f in sys_write (td=0xc991d000, uap=0xf1049ccc)
    at /usr/src/sys/kern/sys_generic.c:379
#22 0xc0f9ef23 in syscall (frame=0xf1049d08) at subr_syscall.c:135
#23 0xc0f88751 in Xint0x80_syscall ()
    at /usr/src/sys/i386/i386/exception.s:270
#24 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)
Comment 1 Michelle Sullivan 2015-02-19 00:59:31 UTC
Getting lots of cores for the same issue (URL in the original report).

Set loader.conf to:


System stays up longer, but still dies.
Comment 2 Michelle Sullivan 2015-02-19 14:43:44 UTC
The solution (it's stayed up so far) seems to be removing a processor...  Running on one processor it's been up 12 hours, where before it was around an hour (45 minutes after starting poudriere).
Comment 3 Michelle Sullivan 2015-02-19 14:48:25 UTC
Correction... that was not the last thing I did...  I changed from using MFSSIZE to TMPFS.
Comment 4 Michelle Sullivan 2015-05-17 18:03:35 UTC
Ok, I've got it down to something a little clearer...

Changing vm.kmem_size_max has no effect except delaying the issue, and neither does changing vm.kmem_size.

What does stop the panic is reducing the CPU count to one CPU. So far with that change I've built 800+ packages using poudriere (with an 'svn update' beforehand); before it, the run didn't even complete, and across several boots it managed around 20 packages before panicking, if it even got past the sanity-check phase.

Interestingly, setting the max ARC size to 40M is ignored: with multiple CPUs it's over 400M within minutes, before it panics...  With one CPU it grows to 80(ish)M and doesn't panic...
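For reference, the limits being discussed would normally be applied as loader.conf tunables. A sketch using the values mentioned in this report (illustrative only; whether vfs.zfs.arc_max is actually honored with multiple CPUs is exactly what this bug questions):

```
# /boot/loader.conf -- tunables discussed in this report (FreeBSD 9.x names)
vfs.zfs.arc_max="40M"     # cap on ARC size; reportedly ignored with multiple CPUs
vm.kmem_size="512M"       # kmem sizing; reportedly only delays the panic
vm.kmem_size_max="512M"
```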

Some 40 or 50 cores are publicly available at the link I posted earlier...

Both 9.2 and 9.3 are affected... let's hope someone patches it before 9.4...  I wouldn't have a clue where to look for the issue or I'd take a shot at it myself.
Comment 5 Michael Moll freebsd_committer 2015-05-17 18:56:32 UTC
Michelle, did you compile that kernel with raised KVA_PAGES as described in https://wiki.freebsd.org/ZFSTuningGuide#i386 ?
Comment 6 Michelle Sullivan 2015-05-17 19:14:31 UTC
No I haven't, quite deliberately: I saw that advice later, the systems are set to update using freebsd-update, and I don't want a custom kernel to bugger up the patching. I also wanted to see if I could get to the bottom of the cause, or at least make it reproducible. I see no reason why ZFS on i386 shouldn't work without the need to recompile the kernel (or else ZFS should be removed, so that anyone wishing to use it has to compile it in anyway). And I've finally got progress: single CPU, no panic; multiple CPUs, reproducible panic. Can it be looked at and resolved? Because in reality it has to be some form of bug: the ARC should not exhaust memory, and it should be constrained by the limits in loader.conf.
Comment 7 Michael Moll freebsd_committer 2015-05-17 20:05:45 UTC
It has been quite a while since I used ZFS on i386, but from what I remember:
- Default kernels can allocate 512MB max. as kmem (ALL kmem, not only ARC!)
  o That means ARC should be limited to 256MB or so, to still have room
    for other kernel tasks and some safety buffer.
- Limiting the memory down to such values will make ZFS _very_ slow.
- In general ZFS was not really designed for 32 bit systems anyway.
- I used ZFS on i386 successfully with 4GB of RAM by setting:
  o options KVA_PAGES=512 in the custom kernel
  o vm.kmem_size and vm.kmem_size_max to 1536MB in loader.conf

IMHO, at the end of the day the only advice here can be to move on to amd64
or, if that's not possible, to use a custom kernel with increased KVA_PAGES.
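The setup Michael describes would look roughly like this (a sketch based on the values in this comment; note that KVA_PAGES is a kernel build option and cannot be set from loader.conf):

```
# Custom i386 kernel configuration (rebuild and install the kernel afterwards):
include GENERIC
ident   ZFSI386
options KVA_PAGES=512          # enlarge the kernel virtual address space

# /boot/loader.conf
vm.kmem_size="1536M"
vm.kmem_size_max="1536M"
```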
Comment 8 Michelle Sullivan 2015-05-17 21:07:05 UTC
100% right with the 512MB limit.

Setting arc_max to 40M is ignored... I have that set now with a single CPU and it's showing (in "top"):

Total: 52M 1858K MFU, 24M MRU, 400K Anon, 1975K Header, 24M Other

If I set more than one CPU in the VM, the 'Total' gets to around 467M and then it panics (same loader.conf settings).

This, I believe, is the bug (or a bug)...

Running ZFS on i386 should not be recommended; I'm 100% with you on that. However:

1/ It is available in default kernels.
2/ KVA_PAGES needs to be set for default kernels.
3/ I'm not using ZFS in production anywhere - it's used for poudriere otherwise I'd disable it completely. 

We have a couple of things/statements here:

1/ Its available for use by default, recommended or not.
2/ It seems to work (albeit slowly) on i386 with the correct tuning.
3/ More than one CPU and it doesn't work.
4/ KVA_PAGES is not set by default.

At least some of these should be resolved... and this is how I see it:

1/ Either ZFS stays enabled by default and KVA_PAGES=512 is set by default, or
2/ ZFS is disabled by default, with a warning that KVA_PAGES needs to be set when enabling it.


A/ Someone should look into and resolve (if possible) the fact that arc_max is not respected when multiple CPUs are present.
B/ The documentation covering (1) and/or (2) should be updated. There is currently the page you linked; it should probably be expanded to state either that ZFS is disabled by default and KVA_PAGES must be added when enabling it, or that ZFS (and KVA_PAGES) is enabled by default and whatever risk that entails. It should probably also note that ZFS is really not for i386, as it wasn't designed for 32-bit and will be really slow. [Does that make sense?]


Not trying to be a pain here, but the default should work (even if that means "disable ZFS in the default kernel")...  and I really think that one CPU working but two CPUs panicking is a bug, possibly an important one that may even exist on amd64, just unnoticed because of the platform difference.


Comment 9 Michelle Sullivan 2015-06-18 20:33:25 UTC
I think I might be one step closer to the cause.

I've noticed that when using a memory-based disk on both i386 and amd64, the memory used (in some or all cases?) doesn't appear in the memory stats shown in 'top', yet the memory is 'missing'. Could this be fooling the VM manager into thinking there is free memory when there is not, and therefore breaking the memory-pressure handling?
Comment 10 ota 2019-09-16 01:54:58 UTC
It looks like this bug report is too old to investigate further.

On the other hand, I run 12.0-RELEASE/STABLE and 13-CURRENT on an i386 system with 512MB of RAM, ZFS, tmpfs, and a non-PAE kernel, and I haven't seen this type of error for years.

It is also possible that this bug has since been fixed.