Bug 250906 - net/samba4{19,20}: "samba-tool domain backup offline" hangs
Status: Closed FIXED
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: FreeBSD Samba Team
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-11-06 16:36 UTC by ml
Modified: 2025-02-18 16:55 UTC

See Also:
bugzilla: maintainer-feedback? (timur)


Attachments
Patch against net/samba419 (1.73 KB, patch)
2025-02-17 13:52 UTC, ml

Description ml 2020-11-06 16:36:51 UTC
# samba-tool domain backup offline --targetdir .
running backup on dirs: /var/db/samba4/private /var/db/samba4 /usr/local/etc
Starting transaction on /var/db/samba4/private/secrets
(...)

What actually hangs is a subprocess that samba-tool starts:
/usr/local/bin/tdbbackup -s .copy.tdb /var/db/samba4/private/secrets.ldb 



This has been a long-standing issue since Samba 4.10 (which introduced this command).
I have now tried upgrading to 4.12, but nothing changed.

A discussion on Samba's mailing list suggested this might be caused by an older version of TDB and that the library should be bundled.

Building (in Poudriere) with SAMBA4_BUNDLED_TDB=yes, however, produces the following:
# samba-tool domain backup offline --targetdir .
running backup on dirs: /var/db/samba4/private /var/db/samba4 /usr/local/etc
Starting transaction on /var/db/samba4/private/secrets
ERROR(<class 'FileNotFoundError'>): uncaught exception - [Errno 2] No such file or directory: '/root/bin/tdbbackup': '/root/bin/tdbbackup'
  File "/usr/local/lib/python3.7/site-packages/samba/netcmd/__init__.py", line 186, in _run
    return self.run(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/samba/netcmd/domain_backup.py", line 1061, in run
    self.backup_secrets(paths.private_dir, lp, logger)
  File "/usr/local/lib/python3.7/site-packages/samba/netcmd/domain_backup.py", line 954, in backup_secrets
    self.offline_tdb_copy(secrets_path + '.ldb')
  File "/usr/local/lib/python3.7/site-packages/samba/netcmd/domain_backup.py", line 928, in offline_tdb_copy
    tdb_copy(path, backup_path, readonly=True)
  File "/usr/local/lib/python3.7/site-packages/samba/tdb_util.py", line 40, in tdb_copy
    status = subprocess.check_call(tdbbackup_cmd, close_fds=True, shell=False)
  File "/usr/local/lib/python3.7/subprocess.py", line 358, in check_call
    retcode = call(*popenargs, **kwargs)
  File "/usr/local/lib/python3.7/subprocess.py", line 339, in call
    with Popen(*popenargs, **kwargs) as p:
  File "/usr/local/lib/python3.7/subprocess.py", line 800, in __init__
    restore_signals, start_new_session)
  File "/usr/local/lib/python3.7/subprocess.py", line 1551, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
A transaction is still active in ldb context [0x800a3cae0] on /var/db/samba4/private/secrets.ldb
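
The FileNotFoundError in the traceback above is what subprocess raises when the path it is asked to execute does not exist. A minimal sketch of checking a tool path before launching it follows. `resolve_tool` is a hypothetical helper made up for illustration; it is not part of Samba, and how samba-tool actually derives the tdbbackup path is not shown in this report.

```python
import errno
import os
import shutil


def resolve_tool(name_or_path):
    """Return an executable path for name_or_path or raise FileNotFoundError.

    Hypothetical helper for illustration only; Samba's tdb_util.py does
    not contain it.
    """
    # shutil.which() searches PATH for bare names and checks absolute
    # paths directly; it returns None when nothing executable is found.
    found = shutil.which(name_or_path)
    if found is None:
        # Raise with ENOENT (errno 2), the same error subprocess reports
        # in the traceback above.
        raise FileNotFoundError(
            errno.ENOENT, os.strerror(errno.ENOENT), name_or_path
        )
    return found
```

Failing early like this would not fix the wrong path, but it would report the problem before a transaction is started on the database.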
Comment 1 Rene Ladan freebsd_committer freebsd_triage 2022-12-18 12:37:31 UTC
samba412 expired and got removed; can you retry this with a later version?
Comment 2 ml 2022-12-18 15:03:30 UTC
(In reply to Rene Ladan from comment #1)

The problem persists with 4.16.7.
Comment 3 Michael Osipov freebsd_committer freebsd_triage 2025-01-10 16:53:14 UTC
Can someone retry?
Comment 4 ml 2025-01-12 11:57:20 UTC
(In reply to Michael Osipov from comment #3)
Same hang as before, with Samba 4.19 compiled with SAMBA4_BUNDLED_LDB=yes, but still not bundling TDB.
Note that this is still on 2024Q4; I'll have packages for 2025Q1 in a few days.
I might then try bundling TDB too.
Comment 5 Michael Osipov freebsd_committer freebsd_triage 2025-01-12 18:44:25 UTC
(In reply to ml from comment #4)

Thanks for the confirmation, updating title.
Comment 6 ml 2025-01-21 09:49:20 UTC
(In reply to ml from comment #4)

I tried Samba from 2025Q1, with and without bundled TDB: it still hangs.
Comment 7 Michael Osipov freebsd_committer freebsd_triage 2025-01-21 09:51:28 UTC
(In reply to ml from comment #6)
Can you try to trace again and see which process exactly hangs?
Comment 8 ml 2025-01-22 18:24:57 UTC
(In reply to Michael Osipov from comment #7)

The process that hangs is still "/usr/local/bin/tdbbackup -s .copy.tdb /var/db/samba4/private/secrets.ldb -r" and it's stuck in fcntl.

I tried "truss"ing it, but it obviously gives no output.
If needed, I can compile tdb with debug info and try to attach gdb to it.
Anything else that might be useful?
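
A process blocked inside fcntl issues no further system calls, which is why truss shows nothing. A minimal sketch of the underlying situation follows, assuming the hang is a blocking byte-range lock request on a file that another process already holds locked (the report only says the process is "stuck in fcntl"). The child uses a non-blocking request, so it fails immediately instead of hanging. The function names are made up for illustration.

```python
import fcntl
import os
import tempfile


def child_can_lock(path):
    """Fork a child that tries a non-blocking exclusive lock on path.

    Returns True if the child obtained the lock, False if it was refused.
    """
    pid = os.fork()
    if pid == 0:
        # Child: a separate process, so its request conflicts with any
        # lock held by the parent.
        status = 0
        try:
            with open(path, "r+b") as fh:
                # LOCK_NB makes the call fail instead of blocking.
                fcntl.lockf(fh.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            status = 1
        os._exit(status)
    _, wait_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wait_status) == 0


def demo():
    """Return (child result while parent holds the lock, result after release)."""
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"x")
        tmp.flush()
        fcntl.lockf(tmp.fileno(), fcntl.LOCK_EX)   # parent holds the lock
        while_held = child_can_lock(tmp.name)
        fcntl.lockf(tmp.fileno(), fcntl.LOCK_UN)   # parent releases it
        after_release = child_can_lock(tmp.name)
    return while_held, after_release
```

Without LOCK_NB the child's request would simply wait until the parent releases the lock, which matches a process that sits in fcntl indefinitely while the lock holder keeps running.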
Comment 9 Michael Osipov freebsd_committer freebsd_triage 2025-01-22 18:41:25 UTC
(In reply to ml from comment #8)
Better than nothing. We should at least see what is passed to fcntl...
Comment 10 ml 2025-01-26 17:26:39 UTC
I think I found out what the problem is.

As I said, the process that locks is "/usr/local/bin/tdbbackup -s .copy.tdb /var/db/samba4/private/secrets.ldb -r".
Looking at its main(), it uses getopt to interpret all the options; however getopt stops at "/var/db/samba4/private/secrets.ldb", so "-r" is never considered.

Since "-r" is needed to open the databases in read-only mode, the process tries to open them read/write and locks up since they are used by the Samba processes.

The correct way to call tdbbackup would be either
"/usr/local/bin/tdbbackup -s .copy.tdb -r /var/db/samba4/private/secrets.ldb"
or
"/usr/local/bin/tdbbackup -r -s .copy.tdb /var/db/samba4/private/secrets.ldb"
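
The difference can be reproduced with Python's getopt module, which offers both behaviours: `getopt.getopt` stops at the first non-option argument, while `getopt.gnu_getopt` keeps scanning past it. This is only a sketch: the option string "s:r" is an assumption made for illustration and is not copied from tdbbackup.c.

```python
import getopt

# Assumed option string for illustration only: "-s" takes an argument and
# "-r" is a flag. The real optstring in tdbbackup.c may list more options.
OPTSTRING = "s:r"

# The arguments samba-tool passes after the program name.
ARGV = ["-s", ".copy.tdb", "/var/db/samba4/private/secrets.ldb", "-r"]


def parse_posix(argv):
    """Parse options, stopping at the first non-option argument."""
    return getopt.getopt(argv, OPTSTRING)


def parse_gnu(argv):
    """Parse options GNU-style: options after a non-option are still seen.

    Note: gnu_getopt falls back to the stop-early behaviour when the
    POSIXLY_CORRECT environment variable is set.
    """
    return getopt.gnu_getopt(argv, OPTSTRING)


if __name__ == "__main__":
    # Stop-early scanning: "-r" is left among the positional arguments.
    print(parse_posix(ARGV))
    # Permuting scanning: "-r" is recognised as an option.
    print(parse_gnu(ARGV))
```

With stop-early scanning the trailing "-r" ends up in the list of file names rather than in the list of options, which is the behaviour described above.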




So the problem lies in /usr/local/lib/python3.11/site-packages/samba/tdb_util.py, where we find:
    tdbbackup_cmd = [toolpath, "-s", ".copy.tdb", file1]
    if readonly:
        tdbbackup_cmd.append("-r")

This should be changed to something like:
    if readonly:
        tdbbackup_cmd = [toolpath, "-r", "-s", ".copy.tdb", file1]
    else:
        tdbbackup_cmd = [toolpath, "-s", ".copy.tdb", file1]

With this change, "samba-tool domain backup offline" goes on and possibly succeeds.
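
An equivalent way to express the same idea is to build the list so that every option precedes the file name, without duplicating the common part. This is a sketch only: `build_tdbbackup_cmd` is a hypothetical name, not a function in tdb_util.py, and it is not the patch that was committed.

```python
def build_tdbbackup_cmd(toolpath, file1, readonly=False):
    """Build the tdbbackup argument list with all options before the file.

    Hypothetical helper for illustration; the flags "-r" and "-s" are the
    ones quoted in this report.
    """
    cmd = [toolpath]
    if readonly:
        # Must come before the positional argument so that a getopt which
        # stops at the first non-option still sees it.
        cmd.append("-r")
    cmd += ["-s", ".copy.tdb", file1]
    return cmd


if __name__ == "__main__":
    print(build_tdbbackup_cmd(
        "/usr/local/bin/tdbbackup",
        "/var/db/samba4/private/secrets.ldb",
        readonly=True,
    ))
```

The resulting list matches the second of the two correct invocations given above when readonly is true, and the original command when it is false.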



Now, before I provide a patch, I have only one doubt...
We change neither tdb_util.py nor tdbbackup.c in our ports, so, unless our getopt works differently from Linux's, this is not a FreeBSD problem and should be reported upstream instead.
Comment 11 ml 2025-02-13 10:13:29 UTC
I decided to report this upstream:
https://bugzilla.samba.org/show_bug.cgi?id=15804
Comment 12 Michael Osipov freebsd_committer freebsd_triage 2025-02-13 10:20:19 UTC
(In reply to ml from comment #11)

Thank you. For the time being, can you provide a one-off patch? I'd create the review and get it into the ports tree.
Comment 13 ml 2025-02-17 13:52:39 UTC
Created attachment 257599
Patch against net/samba419

I guess the same could be applied to 4.20 (and 4.21).
Comment 14 Michael Osipov freebsd_committer freebsd_triage 2025-02-18 07:53:32 UTC
man 3 getopt on RHEL8 says:

       If there are no more option characters, getopt() returns -1.  Then optind is the index in argv
       of the first argv-element that is not an option.

but, here is the BUT:


       By default, getopt() permutes the contents of argv as it scans, so that eventually all the
       nonoptions are at the end. Two other modes are also implemented. If the first character of
       optstring is '+' or the environment variable POSIXLY_CORRECT is set, then option processing
       stops as soon as a nonoption argument is encountered. If the first character of optstring is
       '-', then each nonoption argv-element is handled as if it were the argument of an option with
       character code 1. (This is used by programs that were written to expect options and other
       argv-elements in any order and that care about the ordering of the two.) The special argument
       "--" forces an end of option-scanning regardless of the scanning mode.

This is why it does not fail on Linux.
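
Two of the scanning modes quoted above can be tried out with Python's `getopt.gnu_getopt`, which follows the same conventions for the default permuting mode, the leading '+', and "--". This is a sketch under assumptions: the option string "s:r" and the short file name are placeholders for illustration, not taken from tdbbackup.c.

```python
import getopt

# Placeholder arguments: options, a file name, then a trailing option.
ARGV = ["-s", ".copy.tdb", "secrets.ldb", "-r"]


def scan_modes():
    """Return the (options, positionals) result for three scanning modes.

    Assumes POSIXLY_CORRECT is not set in the environment; if it is,
    the "permute" case behaves like the "plus" case.
    """
    return {
        # Default: options after a non-option argument are still parsed.
        "permute": getopt.gnu_getopt(ARGV, "s:r"),
        # Leading '+': stop at the first non-option argument.
        "plus": getopt.gnu_getopt(ARGV, "+s:r"),
        # "--" ends option scanning in any mode.
        "dashdash": getopt.gnu_getopt(
            ["-s", ".copy.tdb", "--", "secrets.ldb", "-r"], "s:r"
        ),
    }


if __name__ == "__main__":
    for mode, result in scan_modes().items():
        print(mode, result)
```

Only the default permuting mode treats the trailing "-r" as an option; in the other two cases it is returned as a positional argument.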

Ugly Linuxism. Going to create a review for this.
Comment 15 Michael Osipov freebsd_committer freebsd_triage 2025-02-18 08:52:46 UTC
Created: https://reviews.freebsd.org/D49044
Comment 16 commit-hook freebsd_committer freebsd_triage 2025-02-18 16:49:37 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=1db23ed5ef53c509dd421db6ded1c2f528916361

commit 1db23ed5ef53c509dd421db6ded1c2f528916361
Author:     Andrea Venturoli <ml@netfence.it>
AuthorDate: 2025-02-18 08:10:46 +0000
Commit:     Michael Osipov <michaelo@FreeBSD.org>
CommitDate: 2025-02-18 16:49:02 +0000

    net/samba4{19,20}: "samba-tool domain backup offline" hangs

    PR:             250906
    Tested by:      ml@netfence.it
    Approved by:    otis (mentor), kiwi, vvd, allanjude
    MFH:            2025Q1
    Differential Revision:  https://reviews.freebsd.org/D49044

 net/samba419/Makefile                                    |  2 +-
 net/samba419/files/patch-python_samba_tdb__util.py (new) | 15 +++++++++++++++
 net/samba420/Makefile                                    |  2 +-
 net/samba420/files/patch-python_samba_tdb__util.py (new) | 15 +++++++++++++++
 4 files changed, 32 insertions(+), 2 deletions(-)
Comment 17 commit-hook freebsd_committer freebsd_triage 2025-02-18 16:54:39 UTC
A commit in branch 2025Q1 references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=cbfef6eec11b4465f3805ca26988be7e49773675

commit cbfef6eec11b4465f3805ca26988be7e49773675
Author:     Andrea Venturoli <ml@netfence.it>
AuthorDate: 2025-02-18 08:10:46 +0000
Commit:     Michael Osipov <michaelo@FreeBSD.org>
CommitDate: 2025-02-18 16:53:22 +0000

    net/samba419: "samba-tool domain backup offline" hangs

    PR:             250906
    Tested by:      ml@netfence.it
    Approved by:    otis (mentor), kiwi, vvd, allanjude
    MFH:            2025Q1
    Differential Revision:  https://reviews.freebsd.org/D49044

    (cherry picked from commit 1db23ed5ef53c509dd421db6ded1c2f528916361)

 net/samba419/Makefile                                    |  2 +-
 net/samba419/files/patch-python_samba_tdb__util.py (new) | 15 +++++++++++++++
 2 files changed, 16 insertions(+), 1 deletion(-)
Comment 18 Michael Osipov freebsd_committer freebsd_triage 2025-02-18 16:55:31 UTC
Took way too long to fix. All done now. Mille grazie!