Bug 261196 - databases/db5: __db_pthread_mutex_lock fails with EINVAL on armv7
Status: Open
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: Matthias Andree
URL:
Keywords:
Depends on: 197227 205001
Blocks:
 
Reported: 2022-01-14 13:19 UTC by Robert Clausecker
Modified: 2022-01-15 20:24 UTC (History)
0 users

See Also:
mandree: maintainer-feedback+


Attachments
config.log (223.17 KB, text/plain)
2022-01-15 20:24 UTC, Robert Clausecker

Description Robert Clausecker 2022-01-14 13:19:49 UTC
A build of mail/bogofilter (a consumer of databases/db5) fails in its test suite with the following error:

FAIL: t.probe
=============

BDB2023 pthread lock failed: Invalid argument
BDB0061 PANIC: Invalid argument
BDB0060 PANIC: fatal region error detected; run recovery
bogofilter[5027]: DB_ENV->open, err: -30973, BDB0087 DB_RUNRECOVERY: Fatal error, run database recovery
To recover, run: bogoutil -v --db-recover "./checks.4991.20220114T122654"
FAIL t.probe (exit status: 3)

This error seems to be produced by the function __db_pthread_mutex_lock in db-5.3.28/src/mutex/mut_pthread.c. It appears to occur only when building on armv7 (and possibly armv6, which I cannot test).
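
For context on the "Invalid argument" text: pthread_mutex_lock() reports failure through its return value rather than through errno, and "Invalid argument" is the strerror() text for EINVAL. The following is only a minimal sketch of that reporting pattern; checked_lock is a hypothetical helper written for illustration, not the db5 code.

```c
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of db5): lock a mutex and report any
 * failure. pthread_mutex_lock() returns 0 on success or an error number
 * such as EINVAL on failure; it does not set errno.
 */
static int
checked_lock(pthread_mutex_t *m)
{
	int ret;

	ret = pthread_mutex_lock(m);
	if (ret != 0)
		fprintf(stderr, "pthread lock failed: %s\n", strerror(ret));
	return (ret);
}
```

EINVAL from pthread_mutex_lock() generally points at the mutex object itself (for example, one that is not in a validly initialized state), which would be consistent with a problem in how the mutex is set up rather than in the lock call.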

Unfortunately, the db5 code is quite convoluted and I have not been able to figure out which specific function has been called with an invalid argument. Given that databases/db5 is subject to special treatment on armv6/armv7 (see bug #197227), this latent issue might have been hidden on other architectures.

If desired, I also volunteer to develop a patch to replace the SWP instruction with modern exclusive loads/stores to avoid having to go through the pthreads code path.
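
To illustrate the kind of replacement meant here, below is a minimal sketch of a test-and-set lock written with C11 atomics instead of inline SWP assembly. On armv7, compilers typically lower these operations to exclusive load/store (LDREX/STREX) sequences. The tas_lock type and the function names are hypothetical and chosen for illustration; they are not the db5 mutex interface, and an actual patch would have to fit db5's own test-and-set macros.

```c
#include <stdatomic.h>

/* Hypothetical test-and-set lock; not a db5 type. */
typedef struct {
	atomic_flag locked;
} tas_lock;

/* Put the lock into the unlocked state. */
static void
tas_lock_init(tas_lock *l)
{
	atomic_flag_clear(&l->locked);
}

/*
 * Try to take the lock once. Returns 1 if the lock was acquired and 0
 * if it was already held. Acquire ordering keeps the critical section
 * from being reordered before the lock is taken.
 */
static int
tas_lock_try(tas_lock *l)
{
	return (!atomic_flag_test_and_set_explicit(&l->locked,
	    memory_order_acquire));
}

/* Release the lock; release ordering publishes the critical section. */
static void
tas_lock_release(tas_lock *l)
{
	atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

A caller would spin or back off while tas_lock_try() returns 0 and call tas_lock_release() when done.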
Comment 1 Matthias Andree freebsd_committer 2022-01-15 18:51:42 UTC
I believe we already fixed this for armv6 just shy of seven years ago.
Can you please provide more detail?

In particular, we need to know the ${ARCH} on your system, and whether db5 is using POSIX mutexes:

What do you get from "make -C /usr/ports/databases/db5 -V ARCH"?

What FreeBSD version are you building on?

What do you get from "make -C /usr/ports/databases/db5 -V CONFIGURE_ARGS"?

Can you attach your config.log?
Comment 2 Matthias Andree freebsd_committer 2022-01-15 18:53:12 UTC
(In reply to Matthias Andree from comment #1)
Oh, and the fix from the other PR, 205001, then extended the bugfix to armv6hf, armv7, and whatever else matches armv*.
Comment 3 Robert Clausecker 2022-01-15 20:24:13 UTC
Created attachment 231029
config.log

Hi Matthias,

See attached file for config.log.

The device has ARCH=armv7 and

    make -V CONFIGURE_ARGS
    --enable-cxx --enable-stl --enable-dbm  --enable-compat185 --enable-dump185  --includedir=/usr/local/include/db5  --libdir=/usr/local/lib/db5  --bindir=/usr/local/bin/db5 --with-cryptography=yes --disable-debug --disable-umrw --disable-java --disable-localization --disable-sql --disable-sql_codegen --disable-tcl --without-tcl --enable-posixmutexes --with-mutex=POSIX/pthreads --prefix=/usr/local ${_LATE_CONFIGURE_ARGS}

It is pretty clear that POSIX mutexes are enabled, because error BDB2023 is only produced by the POSIX mutex code path.

> What FreeBSD version are you building on?

I am building on FreeBSD 13.0-RELEASE-p6.