# samba-tool domain backup offline --targetdir .
running backup on dirs: /var/db/samba4/private /var/db/samba4 /usr/local/etc
Starting transaction on /var/db/samba4/private/secrets
(...)

What is actually hanging is a subprocess that samba-tool starts:
/usr/local/bin/tdbbackup -s .copy.tdb /var/db/samba4/private/secrets.ldb

This is a long-standing issue since Samba 4.10 (which introduced this command). I have now tried upgrading to 4.12, but nothing changed.

A discussion on Samba's mailing list suggested this might be caused by an older version of TDB and that the library should be bundled. Building (in Poudriere) with SAMBA4_BUNDLED_TDB=yes, however, produces the following:

# samba-tool domain backup offline --targetdir .
running backup on dirs: /var/db/samba4/private /var/db/samba4 /usr/local/etc
Starting transaction on /var/db/samba4/private/secrets
ERROR(<class 'FileNotFoundError'>): uncaught exception - [Errno 2] No such file or directory: '/root/bin/tdbbackup': '/root/bin/tdbbackup'
  File "/usr/local/lib/python3.7/site-packages/samba/netcmd/__init__.py", line 186, in _run
    return self.run(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/samba/netcmd/domain_backup.py", line 1061, in run
    self.backup_secrets(paths.private_dir, lp, logger)
  File "/usr/local/lib/python3.7/site-packages/samba/netcmd/domain_backup.py", line 954, in backup_secrets
    self.offline_tdb_copy(secrets_path + '.ldb')
  File "/usr/local/lib/python3.7/site-packages/samba/netcmd/domain_backup.py", line 928, in offline_tdb_copy
    tdb_copy(path, backup_path, readonly=True)
  File "/usr/local/lib/python3.7/site-packages/samba/tdb_util.py", line 40, in tdb_copy
    status = subprocess.check_call(tdbbackup_cmd, close_fds=True, shell=False)
  File "/usr/local/lib/python3.7/subprocess.py", line 358, in check_call
    retcode = call(*popenargs, **kwargs)
  File "/usr/local/lib/python3.7/subprocess.py", line 339, in call
    with Popen(*popenargs, **kwargs) as p:
  File "/usr/local/lib/python3.7/subprocess.py", line 800, in __init__
    restore_signals, start_new_session)
  File "/usr/local/lib/python3.7/subprocess.py", line 1551, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
A transaction is still active in ldb context [0x800a3cae0] on /var/db/samba4/private/secrets.ldb
samba412 expired and got removed, can you retry this with a later version?
(In reply to Rene Ladan from comment #1) The problem persists with 4.16.7.
Can someone retry?
(In reply to Michael Osipov from comment #3) Same hangs as before, with Samba 4.19, compiled with SAMBA4_BUNDLED_LDB=yes, but still not bundling TDB. Notice this is still on 2024Q4; I'll have packages for 2025Q1 in a few days. I might then try bundling TDB too.
(In reply to ml from comment #4) Thanks for the confirmation, updating title.
(In reply to ml from comment #4) I tried Samba from 2025Q1, with and without bundled TDB: it still hangs.
(In reply to ml from comment #6) Can you try to trace again and see which process exactly hangs?
(In reply to Michael Osipov from comment #7) The process that hangs is still "/usr/local/bin/tdbbackup -s .copy.tdb /var/db/samba4/private/secrets.ldb -r" and it's stuck in fcntl. I tried "truss"ing it, but it obviously gives no output. If needed, I can compile tdb with debug info and try to attach gdb to it. Anything else that might be useful?
(In reply to ml from comment #8) Better than nothing. We should at least see what is passed to fcntl...
I think I found out what the problem is.

As I said, the process that locks up is "/usr/local/bin/tdbbackup -s .copy.tdb /var/db/samba4/private/secrets.ldb -r". Looking at its main(), it uses getopt to interpret all the options; however, getopt stops at "/var/db/samba4/private/secrets.ldb", so "-r" is never considered. Since "-r" is needed to open the databases in read-only mode, the process tries to open them read/write and locks up, since they are in use by the Samba processes.

The correct way to call tdbbackup would be either
    /usr/local/bin/tdbbackup -s .copy.tdb -r /var/db/samba4/private/secrets.ldb
or
    /usr/local/bin/tdbbackup -r -s .copy.tdb /var/db/samba4/private/secrets.ldb

So the problem lies in /usr/local/lib/python3.11/site-packages/samba/tdb_util.py, where we find:
    tdbbackup_cmd = [toolpath, "-s", ".copy.tdb", file1]
    if readonly:
        tdbbackup_cmd.append("-r")

This should be changed to something like:
    if readonly:
        tdbbackup_cmd = [toolpath, "-r", "-s", ".copy.tdb", file1]
    else:
        tdbbackup_cmd = [toolpath, "-s", ".copy.tdb", file1]

With this change, "samba-tool domain backup offline" goes on and possibly succeeds.

Now, before I provide a patch, I have only one doubt: we change neither tdb_util.py nor tdbbackup.c in our ports, so, unless our getopt works differently from Linux's, this is not a FreeBSD problem and should be reported upstream instead.
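For illustration, the reordering suggested in this comment can be sketched as a small standalone helper. The function name build_tdbbackup_cmd is hypothetical and is not part of Samba's tdb_util.py, and this is not the patch that was eventually committed; it only shows one way to keep every option ahead of the file operand, which is what a getopt that stops at the first non-option needs.

```python
def build_tdbbackup_cmd(toolpath, file1, readonly=False):
    """Build a tdbbackup argument list with all options before the file.

    Hypothetical helper for illustration only; the name and signature
    are assumptions and do not exist in Samba's tdb_util.py.
    """
    cmd = [toolpath, "-s", ".copy.tdb"]
    if readonly:
        # "-r" must come before the file operand, otherwise a getopt that
        # stops at the first non-option never sees it.
        cmd.append("-r")
    # The file operand always goes last.
    cmd.append(file1)
    return cmd


if __name__ == "__main__":
    print(build_tdbbackup_cmd("/usr/local/bin/tdbbackup",
                              "/var/db/samba4/private/secrets.ldb",
                              readonly=True))
```

Appending the operand last, instead of writing out two full lists, means any option added later cannot accidentally end up after the file name.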
I decided to report this upstream: https://bugzilla.samba.org/show_bug.cgi?id=15804
(In reply to ml from comment #11) Thank you, for the time being, can you provide a one-off patch? I'd create the review and get it into ports tree.
Created attachment 257599 [details] Patch against net/samba419 I guess the same could be applied to 4.20 (and 4.21).
man 3 getopt on RHEL8 says:

    If there are no more option characters, getopt() returns -1. Then optind is the index in argv of the first argv-element that is not an option.

but, here is the BUT:

    By default, getopt() permutes the contents of argv as it scans, so that eventually all the nonoptions are at the end. Two other modes are also implemented. If the first character of optstring is '+' or the environment variable POSIXLY_CORRECT is set, then option processing stops as soon as a nonoption argument is encountered. If the first character of optstring is '-', then each nonoption argv-element is handled as if it were the argument of an option with character code 1. (This is used by programs that were written to expect options and other argv-elements in any order and that care about the ordering of the two.) The special argument "--" forces an end of option-scanning regardless of the scanning mode.

This is why it does not fail on Linux. Ugly Linuxism. Going to create a review for this.
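The two scanning modes quoted from the man page can be reproduced with Python's standard getopt module, as a rough analogy: getopt.getopt() stops at the first non-option argument, while getopt.gnu_getopt() accepts options after operands the way glibc does by default. The optstring "rs:" below is a simplified assumption covering only the two options seen in this report, not the real optstring from tdbbackup.c.

```python
import getopt
import os

# gnu_getopt() falls back to stop-at-first-operand behaviour when
# POSIXLY_CORRECT is set, so clear it for a predictable demonstration.
os.environ.pop("POSIXLY_CORRECT", None)

# Argument order used by the unpatched tdb_util.py: "-r" after the file.
ARGS = ["-s", ".copy.tdb", "/var/db/samba4/private/secrets.ldb", "-r"]
OPTSTRING = "rs:"  # assumption: simplified, not tdbbackup's actual optstring

# Stops scanning at the first non-option argument.
posix_opts, posix_rest = getopt.getopt(ARGS, OPTSTRING)

# GNU-style scanning: options and operands may be intermixed.
gnu_opts, gnu_rest = getopt.gnu_getopt(ARGS, OPTSTRING)

print("stop at first operand:", posix_opts, posix_rest)
print("GNU permuting scan:   ", gnu_opts, gnu_rest)
```

In the first case "-r" is left among the remaining arguments and treated as an operand, which matches the hang seen on FreeBSD; in the second it is recognised as an option.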
Created: https://reviews.freebsd.org/D49044
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=1db23ed5ef53c509dd421db6ded1c2f528916361

commit 1db23ed5ef53c509dd421db6ded1c2f528916361
Author:     Andrea Venturoli <ml@netfence.it>
AuthorDate: 2025-02-18 08:10:46 +0000
Commit:     Michael Osipov <michaelo@FreeBSD.org>
CommitDate: 2025-02-18 16:49:02 +0000

    net/samba4{19,20}: "samba-tool domain backup offline" hangs

    PR: 250906
    Tested by: ml@netfence.it
    Approved by: otis (mentor), kiwi, vvd, allanjude
    MFH: 2025Q1
    Differential Revision: https://reviews.freebsd.org/D49044

 net/samba419/Makefile | 2 +-
 net/samba419/files/patch-python_samba_tdb__util.py (new) | 15 +++++++++++++++
 net/samba420/Makefile | 2 +-
 net/samba420/files/patch-python_samba_tdb__util.py (new) | 15 +++++++++++++++
 4 files changed, 32 insertions(+), 2 deletions(-)
A commit in branch 2025Q1 references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=cbfef6eec11b4465f3805ca26988be7e49773675

commit cbfef6eec11b4465f3805ca26988be7e49773675
Author:     Andrea Venturoli <ml@netfence.it>
AuthorDate: 2025-02-18 08:10:46 +0000
Commit:     Michael Osipov <michaelo@FreeBSD.org>
CommitDate: 2025-02-18 16:53:22 +0000

    net/samba419: "samba-tool domain backup offline" hangs

    PR: 250906
    Tested by: ml@netfence.it
    Approved by: otis (mentor), kiwi, vvd, allanjude
    MFH: 2025Q1
    Differential Revision: https://reviews.freebsd.org/D49044

    (cherry picked from commit 1db23ed5ef53c509dd421db6ded1c2f528916361)

 net/samba419/Makefile | 2 +-
 net/samba419/files/patch-python_samba_tdb__util.py (new) | 15 +++++++++++++++
 2 files changed, 16 insertions(+), 1 deletion(-)
Took way too long to fix. All done now. Mille grazie!