Bug 244493 - databases/lmdb: issue with MDB_USE_POSIX_MUTEX
Summary: databases/lmdb: issue with MDB_USE_POSIX_MUTEX
Status: Closed FIXED
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: Xin LI
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-02-28 10:48 UTC by Michał Kępień
Modified: 2020-06-16 21:56 UTC (History)

See Also:
bugzilla: maintainer-feedback? (delphij)


Attachments
Sample program reproducing the issue (685 bytes, text/plain)
2020-02-28 10:48 UTC, Michał Kępień

Description Michał Kępień 2020-02-28 10:48:42 UTC
Created attachment 212019
Sample program reproducing the issue

Hi there,

ports r519246 switched LMDB from MDB_USE_POSIX_SEM to
MDB_USE_POSIX_MUTEX.  Unfortunately, it seems that there are some edge
cases in which LMDB does not play nicely with FreeBSD's process-shared
mutexes.

The particular problem I observed is that when a single process reopens
an LMDB environment (that is, an environment is opened, closed, then
opened again by the same process), then other processes trying to access
the reopened environment fail to acquire the reader table lock:
pthread_mutex_lock() returns EINVAL (22).  AFAICT, this happens because
libthr is unable to find the shared memory segment with the relevant
part of the LMDB lockfile mmap()'d.
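
For readers less familiar with the mechanism, here is a minimal sketch
of a process-shared robust mutex living in an mmap()'d lock file, which
is roughly the arrangement MDB_USE_POSIX_MUTEX relies on.  It is not
LMDB code: map_lock(), init_shared_robust() and lock_shared() are
hypothetical helper names, and the one-mutex file layout is an
assumption made for brevity.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: map a file that holds a single pthread_mutex_t. */
static pthread_mutex_t *map_lock(const char *path)
{
    assert(path != NULL);
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return NULL;
    if (ftruncate(fd, sizeof(pthread_mutex_t)) != 0) {
        close(fd);
        return NULL;
    }
    void *p = mmap(NULL, sizeof(pthread_mutex_t),
                   PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                       /* the mapping outlives the descriptor */
    return p == MAP_FAILED ? NULL : p;
}

/* Initialise the mapped mutex as process-shared and robust. */
static int init_shared_robust(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    if (rc == 0)
        rc = pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/*
 * Lock the mutex.  EOWNERDEAD (previous owner died) is recovered with
 * pthread_mutex_consistent(); any other error, such as the EINVAL seen
 * in this report, is handed back to the caller unchanged.
 */
static int lock_shared(pthread_mutex_t *m)
{
    int rc = pthread_mutex_lock(m);
    if (rc == EOWNERDEAD)
        rc = pthread_mutex_consistent(m);
    return rc;
}
```

In this sketch the second process would only call map_lock() and
lock_shared(); a non-zero return from lock_shared() corresponds to the
failure that surfaces through mdb_txn_begin() in the report.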

I attached the simplest test case I could come up with.  To reproduce
the problem, first compile lmdb-mutex.c:

    cc -I/usr/local/include -L/usr/local/lib -llmdb lmdb-mutex.c -o lmdb-mutex

Then, run the first instance of the program.  It should start fine and
sleep for 30 seconds.

Before the first instance exits, start a second instance of the program.
It should fail with:

    Assertion failed: (mdb_txn_begin(env, 0, MDB_RDONLY, &txn) == MDB_SUCCESS), function main, file lmdb-mutex.c, line 14.

The sample program works fine in the above scenario if LMDB is compiled
with MDB_USE_POSIX_SEM.  It also works fine on other operating systems
using MDB_USE_POSIX_MUTEX.

One workaround I could come up with is enabling ASLR - it causes mmap()
to return different addresses for the LMDB lockfile mapping upon each
call to mdb_env_open(), causing libthr to use different off-pages for
the read table mutexes for the "old" and "new" environment (IIUC).

Note that LMDB never calls pthread_mutex_destroy() for the read table
lock when an environment is closed, which I believe prevents the shared
memory segment for the "old" environment from being released.  But
please do not take my word for it: I do not understand
sys/kern/kern_umtx.c, libthr, and LMDB internals well enough to fully
explain what is happening (though I sure would like to find out!).

To see an example occurrence of this problem in the wild, install BIND
(e.g. dns/bind911), put the following into named.conf:

    options {
        allow-new-zones yes;
    };

and then run "named -g -c named.conf".  After it starts up, run
"named-nzd2nzf _default.nzd".  It will fail with:

    named-nzd2nzf: mdb_txn_begin: Invalid argument

My humble suggestion would be to revert the LMDB port back to
MDB_USE_POSIX_SEM for the time being, unless someone can immediately see
what the problem is and is able to fix it.

Hope this helps, please let me know if I can be of any further
assistance with this issue.
Comment 1 commit-hook freebsd_committer freebsd_triage 2020-06-16 21:52:05 UTC
A commit references this bug:

Author: delphij
Date: Tue Jun 16 21:51:56 UTC 2020
New revision: 539379
URL: https://svnweb.freebsd.org/changeset/ports/539379

Log:
  databases/lmdb: in mdb_env_close0(), destroy robust mutexes if we are
  the only remaining user.

  When closing an lmdb database, all memory and file descriptor resources
  are released, including the shared memory pages that contained the
  robust mutex.

  However, before this commit, lmdb did not destroy the robust mutexes
  before unmapping the pages that contained them.  This would create a
  problem when an application opens and closes a database, then opens
  it again.

  According to libthr(3), by default, a shared lock backed by a mapped
  file in memory is automatically destroyed on the last unmap of the
  corresponding file's page, which is allowed by POSIX.

  After unmapping the shared pages, the kernel writes off all active
  robust mutexes associated with these pages.  However, the userland
  threading library still keeps its record of these objects
  (pshared_lookup in thr_pshared.c of libthr), because they were never
  explicitly destroyed; the record exists so that libthr does not have
  to ask the kernel every time it looks them up.

  Now, a later re-open of the database might map the lock file at the
  same memory location.  Because the threading library has remembered
  the robust mutex object, it would just reuse it even though it is
  already invalid from the kernel's point of view.  Unfortunately,
  regular lock operations would still work for this process.

  Should another lmdb process open the same database, it would see
  another process holding a file lock and therefore attempt to obtain
  the robust mutex, but that would fail because the kernel no longer
  recognizes the mutex.

  Explicitly destroy the mutex if we are the last remaining user to ensure
  the mutex is always in a known defined state.

  OpenLDAP ITS #9278

  With debugging help from:	kib
  PR:				244493
  MFH:				2020Q2

Changes:
  head/databases/lmdb/Makefile
  head/databases/lmdb/files/patch-mdb.c
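
The ordering described in the log above (destroy the mutex only when we
are the last user, then unmap) can be sketched as follows.  This is an
illustration under stated assumptions, not the contents of
patch-mdb.c: close_shared_lock() is a hypothetical name, and "last
user" is approximated here by a non-blocking exclusive fcntl() lock on
the first byte of the lock file.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical teardown for a process-shared mutex mapped from fd.
 * Returns 1 if the mutex was destroyed (no other process holds a lock
 * on the file), 0 if it was left alone, -1 on error.
 */
static int close_shared_lock(int fd, pthread_mutex_t *m)
{
    assert(m != NULL);

    struct flock fl = {
        .l_type = F_WRLCK,       /* exclusive lock ...              */
        .l_whence = SEEK_SET,
        .l_start = 0,
        .l_len = 1,              /* ... on the first byte only      */
    };
    int destroyed = 0;

    /* F_SETLK does not block: it fails if another process holds a lock. */
    if (fcntl(fd, F_SETLK, &fl) == 0) {
        /* Destroy BEFORE unmapping so the threading library drops its
         * cached record of the mutex at this address. */
        if (pthread_mutex_destroy(m) != 0)
            return -1;
        destroyed = 1;
    }

    if (munmap(m, sizeof(*m)) != 0)
        return -1;
    return destroyed;
}
```

The caller still owns fd and closes it afterwards; note that closing any
descriptor for the file also releases this process's fcntl() locks on it.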
Comment 2 commit-hook freebsd_committer freebsd_triage 2020-06-16 21:55:07 UTC
A commit references this bug:

Author: delphij
Date: Tue Jun 16 21:54:59 UTC 2020
New revision: 539380
URL: https://svnweb.freebsd.org/changeset/ports/539380

Log:
  MFH: r539379

  databases/lmdb: in mdb_env_close0(), destroy robust mutexes if we are
  the only remaining user.

  When closing an lmdb database, all memory and file descriptor resources
  are released, including the shared memory pages that contained the
  robust mutex.

  However, before this commit, lmdb did not destroy the robust mutexes
  before unmapping the pages that contained them.  This would create a
  problem when an application opens and closes a database, then opens
  it again.

  According to libthr(3), by default, a shared lock backed by a mapped
  file in memory is automatically destroyed on the last unmap of the
  corresponding file's page, which is allowed by POSIX.

  After unmapping the shared pages, the kernel writes off all active
  robust mutexes associated with these pages.  However, the userland
  threading library still keeps its record of these objects
  (pshared_lookup in thr_pshared.c of libthr), because they were never
  explicitly destroyed; the record exists so that libthr does not have
  to ask the kernel every time it looks them up.

  Now, a later re-open of the database might map the lock file at the
  same memory location.  Because the threading library has remembered
  the robust mutex object, it would just reuse it even though it is
  already invalid from the kernel's point of view.  Unfortunately,
  regular lock operations would still work for this process.

  Should another lmdb process open the same database, it would see
  another process holding a file lock and therefore attempt to obtain
  the robust mutex, but that would fail because the kernel no longer
  recognizes the mutex.

  Explicitly destroy the mutex if we are the last remaining user to ensure
  the mutex is always in a known defined state.

  OpenLDAP ITS #9278

  With debugging help from:	kib
  PR:				244493
  Approved by:			ports-secteam

Changes:
_U  branches/2020Q2/
  branches/2020Q2/databases/lmdb/Makefile
  branches/2020Q2/databases/lmdb/files/patch-mdb.c