The login_getcapstr function (and other functions in lib/libutil/login_cap.c) call cgetstr under the covers, which according the the manpage mallocs memory on the fly. However, the memory isn't free'd if certain functions are called multiple times, like pam_sm_acct_mgmt. One of the patches Isilon has had for some time doe the following to plug a leak in pam_nologin: $ git diff lib/libpam/modules/pam_nologin/pam_nolo diff --git a/lib/libpam/modules/pam_nologin/pam_nologin.c b/lib/libpam/modules/pam_nologin/pam_nologin.c index 1be63d2..b4a1421 100644 --- a/lib/libpam/modules/pam_nologin/pam_nologin.c +++ b/lib/libpam/modules/pam_nologin/pam_nologin.c @@ -38,6 +38,7 @@ __FBSDID("$FreeBSD$"); #include <sys/types.h> +#include <sys/cdefs.h> #include <sys/stat.h> #include <fcntl.h> #include <login_cap.h> @@ -97,6 +98,8 @@ pam_sm_acct_mgmt(pam_handle_t *pamh, int flags, nologin = login_getcapstr(lc, "nologin", nologin_def, nologin_def); fd = open(nologin, O_RDONLY, 0); + if (nologin != nologin_def) + free(__DECONST(char *, nologin)); if (fd < 0) { login_close(lc); return (PAM_SUCCESS); But this is not the right place to fix the issue probably. Memory needs to be handled better in lib/libutil/login_cap.c .
Ugh, this probably affects pam_unix(8) as well. The fault is evenly divided between the modules and libutil, since it uses cgetstr() almost everywhere and never free(3)s the result, even when it can (e.g. login_getcapnum(3), which by the way uses strtoq(3) incorrectly).
Wait. Actually, the login_cap(3) man page says: Once a program no longer wishes to use a login_cap_t object, login_close() may be called to free all resources used by the login class. The login_close() function may be passed a NULL pointer with no harmful side-effects. so this should not be a problem. Or did I miss something?
The man page lies :( login_getcapstr et al call cgetstr but don't track the allocation from it to free later from login_close. Some of the things are freed (class, style, cap) but if you call login_getcapstr, login _getcaptime, login_getcapnum or login_getcapsize there is no tracking. There are also a few login_* functions which call login_getcapstr and so will also leak. I am working on a patch to track them and free later but it's pretty prototypical at this stage.
Isn't cgetclose(3) supposed to free memory allocated by cgetstr(3)?
It would appear to lie also, or at least I can't work out how it would free anything :) int cgetclose(void) { if (pfp != NULL) { (void)fclose(pfp); pfp = NULL; } dbp = NULL; gottoprec = 0; slash = 0; return(0); } int cgetstr(char *buf, const char *cap, char **str) { ... if ((mem = malloc(SFRAG)) == NULL) { errno = ENOMEM; return (-2); /* couldn't even allocate the first fragment */ } ...
Created attachment 149711 [details] Fix & proto-test case I tried writing a test based on http://cvsweb.netbsd.org/bsdweb.cgi/src/tests/lib/libc/gen/t_dir.c?rev=1.6&content-type=text/x-cvsweb-markup&only_with_tag=MAIN but that relies on malloc() calling sbrk() which isn't true on FreeBSD. I used the jemalloc stats but it shows more deallocated than allocated so either I'm using it wrong or the stats aren't useful for this purpose.
Created attachment 149937 [details] Fix and test case against FreeBSD current
I fixed some bugs with my previous fix - cap_mkdb would crash processing the termcap file. I pushed them to a branch on github - https://github.com/DanielO/freebsd/tree/cgetstr-leak root@freebsdhead:~ # valgrind --leak-check=full /usr/obj/src/FreeBSD-HEAD/usr.bin/cap_mkdb/cap_mkdb -l -f /tmp/termcap /src/FreeBSD-HEAD/share/termcap/termcap ==95888== Memcheck, a memory error detector ==95888== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al. ==95888== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info ==95888== Command: /usr/obj/src/FreeBSD-HEAD/usr.bin/cap_mkdb/cap_mkdb -l -f /tmp/termcap /src/FreeBSD-HEAD/share/termcap/termcap ==95888== ==95888== ==95888== HEAP SUMMARY: ==95888== in use at exit: 2,285 bytes in 2 blocks ==95888== total heap usage: 5,131 allocs, 5,129 frees, 6,569,582 bytes allocated ==95888== ==95888== 16 bytes in 1 blocks are definitely lost in loss record 1 of 2 ==95888== at 0x1007293: malloc (in /usr/local/lib/valgrind/vgpreload_memcheck-amd64-freebsd.so) ==95888== by 0x12A1B0F: strdup (strdup.c:47) ==95888== by 0x400D49: main (cap_mkdb.c:120) ==95888== ==95888== LEAK SUMMARY: ==95888== definitely lost: 16 bytes in 1 blocks ==95888== indirectly lost: 0 bytes in 0 blocks ==95888== possibly lost: 0 bytes in 0 blocks ==95888== still reachable: 2,269 bytes in 1 blocks ==95888== suppressed: 0 bytes in 0 blocks ==95888== Reachable blocks (those to which a pointer was found) are not shown. ==95888== To see them, rerun with: --leak-check=full --show-reachable=yes ==95888== ==95888== For counts of detected and suppressed errors, rerun with: -v ==95888== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0) root@freebsdhead:~ # valgrind --leak-check=full /usr/obj/src/FreeBSD-HEAD/lib/libc/tests/gen/getcap_test cgetstr_leak ==96919== Memcheck, a memory error detector ==96919== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al. ==96919== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info ==96919== Command: /usr/obj/src/FreeBSD-HEAD/lib/libc/tests/gen/getcap_test cgetstr_leak ==96919== getcap_test: WARNING: Running test cases outside of kyua(1) is unsupported getcap_test: WARNING: No isolation nor timeout control is being applied; you may get unexpected failures; see atf-test-case(4) OK: leaked 0 bytes passed ==96919== ==96919== HEAP SUMMARY: ==96919== in use at exit: 7,267 bytes in 40 blocks ==96919== total heap usage: 3,103 allocs, 3,063 frees, 4,208,879 bytes allocated ==96919== ==96919== LEAK SUMMARY: ==96919== definitely lost: 0 bytes in 0 blocks ==96919== indirectly lost: 0 bytes in 0 blocks ==96919== possibly lost: 0 bytes in 0 blocks ==96919== still reachable: 7,267 bytes in 40 blocks ==96919== suppressed: 0 bytes in 0 blocks ==96919== Reachable blocks (those to which a pointer was found) are not shown. ==96919== To see them, rerun with: --leak-check=full --show-reachable=yes ==96919== ==96919== For counts of detected and suppressed errors, rerun with: -v ==96919== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)