Bug 209787 - net/samba44 : domain provisioning fails "Segmentation fault (core dumped)"
Summary: net/samba44 : domain provisioning fails "Segmentation fault (core dumped)"
Status: Closed FIXED
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: amd64 Any
Importance: --- Affects Many People
Assignee: Timur I. Bakeyev
URL:
Keywords: needs-qa
Depends on:
Blocks:
 
Reported: 2016-05-27 09:42 UTC by dasti
Modified: 2017-12-18 04:24 UTC
CC List: 7 users

See Also:
bugzilla: maintainer-feedback? (timur)


Description dasti 2016-05-27 09:42:58 UTC
FreeBSD 10.3, up to date
Samba 4.4.3_1, installed from the package

- I found a python2.7.core of size 162463744 (not sure what to do with it yet)
- Tried provisioning without "--use-rfc2307" -> same result


the error message
############################################
samba-tool domain provision --use-rfc2307 --interactive
Realm [MYDOMAIN.LAN]: 
 Domain [MYDOMAIN]: 
 Server Role (dc, member, standalone) [dc]: 
 DNS backend (SAMBA_INTERNAL, BIND9_FLATFILE, BIND9_DLZ, NONE) [SAMBA_INTERNAL]: 
 DNS forwarder IP address (write 'none' to disable forwarding) [192.168.1.99]: 8.8.8.8
Administrator password: 
Retype password: 
Looking up IPv4 addresses
Looking up IPv6 addresses
No IPv6 address will be assigned
Setting up share.ldb
Setting up secrets.ldb
Setting up the registry
Setting up the privileges database
Setting up idmap db
Setting up SAM db
Setting up sam.ldb partitions and settings
Setting up sam.ldb rootDSE
Pre-loading the Samba 4 and AD schema
Adding DomainDN: DC=mydomain,DC=lan
Adding configuration container
Setting up sam.ldb schema
Setting up sam.ldb configuration data
Setting up display specifiers
Modifying display specifiers
Adding users container
Modifying users container
Adding computers container
Modifying computers container
Setting up sam.ldb data
Setting up well known security principals
Setting up sam.ldb users and groups
Setting up self join
Segmentation fault (core dumped)
Comment 1 dasti 2016-06-16 07:16:29 UTC
Is there something I can do to help?

4.4.x is the official stable version of Samba and it simply doesn't work on FreeBSD.
Comment 2 Dron 2016-06-16 08:11:34 UTC
Yes, the 4.4 port does not work as a DC.
This issue has also been raised with the Samba developers: https://bugzilla.samba.org/show_bug.cgi?id=11848
Comment 3 Alnis Morics 2016-07-16 11:14:00 UTC
The problem still persists as of Sat Jul 16 13:49:00 EEST 2016

freebsd-version -ku
10.3-RELEASE-p4
10.3-RELEASE-p5

Ports branch: 2016Q3, installed from ports.

From Makefile:
[..]
SAMBA4_VERSION=		4.4.5
[..]
PORTREVISION?=		1

# samba-tool domain provision --use-rfc2307 --interactive
[..]
Setting up self join
Segmentation fault (core dumped)
#

I also downloaded samba-4.4.5 source from samba.org, compiled it myself on a FreeBSD 10.3-RELEASE amd64 machine, and then provisioning and all the initial tests (AD DC default shares, DNS, Kerberos --see https://wiki.samba.org/index.php/Setup_a_Samba_Active_Directory_Domain_Controller#Testing_your_Samba_Domain_Controller) worked OK.
Comment 4 dasti 2016-07-17 03:54:29 UTC
I confirm that Samba 4.4.5_1 has the same problem on FreeBSD 10.3.

Wow, Alnis Morics, do you have a list of all the dependencies? Is it difficult to build directly from the sources?
Comment 5 Alnis Morics 2016-07-17 06:44:56 UTC
(In reply to dasti from comment #4)

No, it wasn't difficult at all. As I had installed the port (which I deinstalled because of this bug), all the dependencies were there. If anything, the configure script should tell you about missing dependencies (see https://wiki.samba.org/index.php/Build_Samba_from_source).

What I meant by this is that the problem probably isn't in the latest Samba 4.4.x source, as suggested above, but in the FreeBSD port. I have not yet done all the configuration that I intended for my server, though.
Comment 6 Timur I. Bakeyev freebsd_committer 2016-07-19 21:09:16 UTC
(In reply to Alnis Morics from comment #5)

Sounds like a port problem, but honestly, I don't see how any of the patches could have such an effect. Most of them just adjust the environment to fit hier(7) better.

Are you sure that your custom build picked up all the external libraries over the bundled ones?
Comment 7 Alnis Morics 2016-07-20 07:46:22 UTC
(In reply to Timur I. Bakeyev from comment #6)

It probably does. As the net/samba44 port didn't work, I first deinstalled only the port itself, retaining all the dependencies, including databases/ldb. When I built Samba from the tarball, the installed ldb was apparently detected, so the bundled one wasn't built. But the ldb from ports didn't load the Kerberos module when I tried ldbsearch. So I deinstalled databases/ldb too, then recompiled and reinstalled Samba. This time it was built with its bundled ldb, and now ldbsearch loads Kerberos. I didn't change the install target directories, so everything got installed under /usr/local/samba.

The Samba binary now uses at least these external libs:

# ldd /usr/local/samba/sbin/samba | grep -v '/usr/local/samba'
	libthr.so.3 => /lib/libthr.so.3 (0x801617000)
	libpopt.so.0 => /usr/local/lib/libpopt.so.0 (0x805838000)
	libtalloc.so.2 => /usr/local/lib/libtalloc.so.2 (0x805a44000)
	libtevent.so.0 => /usr/local/lib/libtevent.so.0 (0x805c51000)
	libc.so.7 => /lib/libc.so.7 (0x800821000)
	libmd.so.6 => /lib/libmd.so.6 (0x80666e000)
	libiconv.so.2 => /usr/local/lib/libiconv.so.2 (0x80687e000)
	libtdb.so.1 => /usr/local/lib/libtdb.so.1 (0x80865a000)
	libpam.so.5 => /usr/lib/libpam.so.5 (0x8098d6000)
	libinotify.so.0 => /usr/local/lib/libinotify.so.0 (0x80abab000)
	libintl.so.8 => /usr/local/lib/libintl.so.8 (0x80c5ef000)
	librt.so.1 => /usr/lib/librt.so.1 (0x80c7fa000)
	libcrypt.so.5 => /lib/libcrypt.so.5 (0x80ca00000)
	libgnutls.so.30 => /usr/local/lib/libgnutls.so.30 (0x80d953000)
	libexecinfo.so.1 => /usr/lib/libexecinfo.so.1 (0x80e11b000)
	libz.so.6 => /lib/libz.so.6 (0x80e31e000)
	libp11-kit.so.0 => /usr/local/lib/libp11-kit.so.0 (0x80f361000)
	libidn.so.11 => /usr/local/lib/libidn.so.11 (0x80f5bd000)
	libtasn1.so.6 => /usr/local/lib/libtasn1.so.6 (0x80f7f0000)
	libnettle.so.6 => /usr/local/lib/libnettle.so.6 (0x80fa02000)
	libhogweed.so.4 => /usr/local/lib/libhogweed.so.4 (0x80fc39000)
	libgmp.so.10 => /usr/local/lib/libgmp.so.10 (0x80fe6c000)
	libelf.so.1 => /usr/lib/libelf.so.1 (0x8100e2000)
	libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x8102f7000)
	libffi.so.6 => /usr/local/lib/libffi.so.6 (0x810505000)

Then there are 66 libs that explicitly contain the name 'samba4'. The remaining 15 that don't, still seem to be Samba-specific:

# ldd /usr/local/samba/sbin/samba | grep '/usr/local/samba' | grep -v 'samba4'
/usr/local/samba/sbin/samba:
	libsamba-util.so.0 => /usr/local/samba/lib/libsamba-util.so.0 (0x80183c000)
	libldb.so.1 => /usr/local/samba/lib/private/libldb.so.1 (0x801ac2000)
	libsamba-credentials.so.0 => /usr/local/samba/lib/libsamba-credentials.so.0 (0x801f06000)
	libsamba-hostconfig.so.0 => /usr/local/samba/lib/libsamba-hostconfig.so.0 (0x80211c000)
	libndr.so.0 => /usr/local/samba/lib/libndr.so.0 (0x8029ac000)
	libsamba-errors.so.1 => /usr/local/samba/lib/libsamba-errors.so.1 (0x803ad5000)
	libdcerpc.so.0 => /usr/local/samba/lib/libdcerpc.so.0 (0x803e11000)
	libsamdb.so.0 => /usr/local/samba/lib/libsamdb.so.0 (0x804831000)
	libndr-standard.so.0 => /usr/local/samba/lib/libndr-standard.so.0 (0x804e00000)
	libdcerpc-binding.so.0 => /usr/local/samba/lib/libdcerpc-binding.so.0 (0x805606000)
	libwbclient.so.0 => /usr/local/samba/lib/libwbclient.so.0 (0x808e9b000)
	libtevent-util.so.0 => /usr/local/samba/lib/libtevent-util.so.0 (0x8090b0000)
	libndr-nbt.so.0 => /usr/local/samba/lib/libndr-nbt.so.0 (0x80b1bf000)
	libndr-krb5pac.so.0 => /usr/local/samba/lib/libndr-krb5pac.so.0 (0x80bbc9000)
	libtevent-unix-util.so.0 => /usr/local/samba/lib/libtevent-unix-util.so.0 (0x80d751000)
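
The split that the two `ldd | grep` commands above perform can also be done in one pass. The following is a minimal Python sketch, not part of the original report. It assumes `ldd` output in the format shown above and the `/usr/local/samba` install prefix mentioned in this comment; the function name `classify_ldd` is hypothetical.

```python
def classify_ldd(output, prefix="/usr/local/samba"):
    """Split `ldd` output lines into bundled and external libraries.

    A library counts as "bundled" when its resolved path lies under
    `prefix` (an assumption based on the install location above).
    """
    bundled, external = [], []
    for line in output.splitlines():
        if "=>" not in line:
            continue  # skip the "binary:" header line and blank lines
        name, _, rest = line.strip().partition(" => ")
        path = rest.split(" (")[0].strip()  # drop the load address
        if path == prefix or path.startswith(prefix + "/"):
            bundled.append(name)
        else:
            external.append(name)
    return bundled, external


# Three lines copied from the listings above, plus the header line.
sample = """/usr/local/samba/sbin/samba:
\tlibldb.so.1 => /usr/local/samba/lib/private/libldb.so.1 (0x801ac2000)
\tlibtalloc.so.2 => /usr/local/lib/libtalloc.so.2 (0x805a44000)
\tlibc.so.7 => /lib/libc.so.7 (0x800821000)
"""
bundled, external = classify_ldd(sample)
print("bundled:", bundled)
print("external:", external)
```

On the sample, `libldb.so.1` is reported as bundled while `libtalloc.so.2` and `libc.so.7` are reported as external, which matches the observation above that ldb came from the Samba tree while talloc came from ports.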
Comment 8 Michael Osipov 2016-07-21 13:36:14 UTC
Struck by this too, here is the output I get:

===
The very last output of samba-tool:
store_acl_blob_fsp: storing blob length 356 on file /var/db/samba4/sysvol/ad001.osipov.eu/Policies
delete_windows_lock_ref_count for file /var/db/samba4/sysvol/ad001.osipov.eu/Policies
Segmentation fault (core dumped)

Running the same operation again with truss:
write(2,"store_acl_blob_fsp: storing blob"...,99) = 99 (0x63)
extattr_set_fd(0xf,0x1,0x827447cbf,0x825859760,0x164) = 356 (0x164)
write(2,"delete_windows_lock_ref_count fo"...,86) = 86 (0x56)
close(15)                                        = 0 (0x0)
umask(0x12)                                      = 0 (0x0)
fcntl(10,F_SETLKW,0x7fffffffcba8)                = 0 (0x0)
fcntl(10,F_SETLKW,0x7fffffffcc48)                = 0 (0x0)
fcntl(12,F_SETLK,0x7fffffffbdf8)                 = 0 (0x0)
fcntl(12,F_SETLKW,0x7fffffffbe78)                = 0 (0x0)
fcntl(12,F_SETLKW,0x7fffffffbee8)                = 0 (0x0)
fcntl(11,F_SETLK,0x7fffffffcd28)                 = 0 (0x0)
fcntl(11,F_SETLKW,0x7fffffffcda8)                = 0 (0x0)
fcntl(11,F_SETLKW,0x7fffffffce18)                = 0 (0x0)
SIGNAL 11 (SIGSEGV)
process killed, signal = 11 (core dumped)

Loading the core dump in GDB (command 'where') gives me:
#0  0x0000000806e5ba6e in ndr_pull_uint8 (ndr=0x833ce82e0, ndr_flags=256,
    v=0x7fffffffcc5f "") at ../librpc/ndr/ndr_basic.c:82
#1  0x0000000806e5f133 in ndr_pull_enum_uint8 (ndr=0x833ce82e0, ndr_flags=256,
    v=0x7fffffffcc5f "") at ../librpc/ndr/ndr_basic.c:346
#2  0x000000080708bf95 in ndr_pull_security_descriptor_revision (
    ndr=0x833ce82e0, ndr_flags=256, r=0x832200480)
    at default/librpc/gen_ndr/ndr_security.c:657
#3  0x000000080708ccb7 in ndr_pull_security_descriptor (ndr=0x833ce82e0,
    ndr_flags=768, r=0x832200480) at default/librpc/gen_ndr/ndr_security.c:768
#4  0x0000000806e692f6 in ndr_pull_struct_blob_all (blob=0x7fffffffce08,
    mem_ctx=0x8021fb100, p=0x832200480,
    fn=0x80708cbb0 <ndr_pull_security_descriptor>) at ../librpc/ndr/ndr.c:1133
#5  0x0000000811761084 in py_security_descriptor_ndr_unpack (
    py_obj=0x82607bf50, args=0x817a7aa10, kwargs=0x82607d398)
    at default/librpc/gen_ndr/py_security.c:1518
..

Both the truss output and the core dump can be provided.
===

Can I assist somehow in resolving this issue? I can set up a VM for testing within an hour.
Comment 9 Michael Osipov 2016-07-21 13:36:58 UTC
Can someone meanwhile mark the port as known broken when compiled with AD DC support?
Comment 10 Juan Garcia 2016-10-04 02:13:47 UTC
I can confirm this behaviour with samba44-4.4.5_1

#0  0x0000000806e5ba6e in ndr_pull_uint8 (ndr=0x80201a1a0, ndr_flags=256, v=0x802019f60 "") at ../librpc/ndr/ndr_basic.c:82
82	../librpc/ndr/ndr_basic.c: No such file or directory.
	in ../librpc/ndr/ndr_basic.c
[New Thread 802006400 (LWP 100922/<unknown>)]
Current language:  auto; currently minimal
(gdb) bt
#0  0x0000000806e5ba6e in ndr_pull_uint8 (ndr=0x80201a1a0, ndr_flags=256, v=0x802019f60 "") at ../librpc/ndr/ndr_basic.c:82
#1  0x0000000807092483 in ndr_pull_dom_sid (ndr=0x80201a1a0, ndr_flags=768, r=0x802019f60) at ../librpc/ndr/ndr_sec_helper.c:332
#2  0x0000000806e692f6 in ndr_pull_struct_blob_all (blob=0x7fffffffd668, mem_ctx=0x802212860, p=0x802019f60, fn=0x8070923f0 <ndr_pull_dom_sid>) at ../librpc/ndr/ndr.c:1133
#3  0x0000000811e0cb94 in py_dom_sid_ndr_unpack (py_obj=0x8156dd090, args=0x8156c2ed0, kwargs=0x8156eb398) at default/librpc/gen_ndr/py_security.c:360
#4  0x0000000800b2cf40 in PyEval_EvalFrameEx () from /usr/local/lib/libpython2.7.so.1
#5  0x0000000800b258b4 in PyEval_EvalCodeEx () from /usr/local/lib/libpython2.7.so.1
#6  0x0000000800b32329 in _PyEval_SliceIndex () from /usr/local/lib/libpython2.7.so.1
#7  0x0000000800b2cc99 in PyEval_EvalFrameEx () from /usr/local/lib/libpython2.7.so.1
#8  0x0000000800b258b4 in PyEval_EvalCodeEx () from /usr/local/lib/libpython2.7.so.1
#9  0x0000000800ab659c in PyFunction_SetClosure () from /usr/local/lib/libpython2.7.so.1
#10 0x0000000800a92864 in PyObject_Call () from /usr/local/lib/libpython2.7.so.1
#11 0x0000000800b2db6f in PyEval_EvalFrameEx () from /usr/local/lib/libpython2.7.so.1
#12 0x0000000800b258b4 in PyEval_EvalCodeEx () from /usr/local/lib/libpython2.7.so.1
#13 0x0000000800ab659c in PyFunction_SetClosure () from /usr/local/lib/libpython2.7.so.1
#14 0x0000000800a92864 in PyObject_Call () from /usr/local/lib/libpython2.7.so.1
#15 0x0000000800b2db6f in PyEval_EvalFrameEx () from /usr/local/lib/libpython2.7.so.1
#16 0x0000000800b258b4 in PyEval_EvalCodeEx () from /usr/local/lib/libpython2.7.so.1
#17 0x0000000800ab659c in PyFunction_SetClosure () from /usr/local/lib/libpython2.7.so.1
#18 0x0000000800a92864 in PyObject_Call () from /usr/local/lib/libpython2.7.so.1
#19 0x0000000800b2db6f in PyEval_EvalFrameEx () from /usr/local/lib/libpython2.7.so.1
#20 0x0000000800b258b4 in PyEval_EvalCodeEx () from /usr/local/lib/libpython2.7.so.1
#21 0x0000000800ab659c in PyFunction_SetClosure () from /usr/local/lib/libpython2.7.so.1
#22 0x0000000800a92864 in PyObject_Call () from /usr/local/lib/libpython2.7.so.1
#23 0x0000000800b2db6f in PyEval_EvalFrameEx () from /usr/local/lib/libpython2.7.so.1
#24 0x0000000800b258b4 in PyEval_EvalCodeEx () from /usr/local/lib/libpython2.7.so.1
#25 0x0000000800b25226 in PyEval_EvalCode () from /usr/local/lib/libpython2.7.so.1
#26 0x0000000800b52554 in PyRun_FileExFlags () from /usr/local/lib/libpython2.7.so.1
#27 0x0000000800b520d1 in PyRun_SimpleFileExFlags () from /usr/local/lib/libpython2.7.so.1
#28 0x0000000800b65206 in Py_Main () from /usr/local/lib/libpython2.7.so.1
#29 0x00000000004007bf in _start ()


When comparing samba43 and 44 for the file which causes the crash I see only these two changes:

https://github.com/samba-team/samba/commit/ffbd9c4584d83c56e58901bc91effa75ebdcbb02
https://github.com/samba-team/samba/commit/2bb0b473c1255152291c3c43f82b1cc45fa83b00


Not sure if that helps anyone, but I can't see what the problem is.
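
The two backtraces in this thread (Comment 8 and the one above) are easier to compare once the Python interpreter frames are stripped. The following is a minimal Python sketch, not from the original reporters. It assumes gdb `bt` output in the format shown above; the name `native_frames` and the list of skipped prefixes are illustrative choices.

```python
import re

# Matches the start of a gdb frame line: "#N  0xADDR in function_name ("
FRAME_RE = re.compile(r"^#(\d+)\s+0x[0-9a-f]+ in (\w+) \(")


def native_frames(backtrace, skip_prefixes=("Py", "_Py", "_start")):
    """Return function names from gdb `bt` output, dropping interpreter frames."""
    names = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line.strip())
        if match and not match.group(2).startswith(skip_prefixes):
            names.append(match.group(2))
    return names


# Frames copied from the backtrace above (most interpreter frames omitted).
sample = '''\
#0  0x0000000806e5ba6e in ndr_pull_uint8 (ndr=0x80201a1a0, ndr_flags=256, v=0x802019f60 "") at ../librpc/ndr/ndr_basic.c:82
#1  0x0000000807092483 in ndr_pull_dom_sid (ndr=0x80201a1a0, ndr_flags=768, r=0x802019f60) at ../librpc/ndr/ndr_sec_helper.c:332
#2  0x0000000806e692f6 in ndr_pull_struct_blob_all (blob=0x7fffffffd668, mem_ctx=0x802212860, p=0x802019f60, fn=0x8070923f0 <ndr_pull_dom_sid>) at ../librpc/ndr/ndr.c:1133
#3  0x0000000811e0cb94 in py_dom_sid_ndr_unpack (py_obj=0x8156dd090, args=0x8156c2ed0, kwargs=0x8156eb398) at default/librpc/gen_ndr/py_security.c:360
#4  0x0000000800b2cf40 in PyEval_EvalFrameEx () from /usr/local/lib/libpython2.7.so.1
#29 0x00000000004007bf in _start ()
'''
for name in native_frames(sample):
    print(name)
```

Applied to both traces, this leaves a short chain that in each case starts in a generated `py_*_ndr_unpack` wrapper from `py_security.c`, passes through `ndr_pull_struct_blob_all`, and ends in `ndr_pull_uint8`.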
Comment 11 Benno Rice freebsd_committer 2017-01-19 23:53:31 UTC
So Andrew Bartlett bailed me up on this bug at Linux.conf.au and I've tried it on a 12-CURRENT box. It looks like the seg fault's gone but there are other issues. I've xxx'ed out some values but:

> uname -a
FreeBSD bjrbsd 12.0-CURRENT FreeBSD 12.0-CURRENT #3 r310121: Fri Dec 16 18:50:43 PST 2016     benno@bjrbsd:/src/obj/src/freebsd/sys/GENERIC-NODEBUG  amd64
> samba-tool domain provision --use-rfc2307 --interactive
Realm [xxx]: FAKEDOMAIN.FAKE
 Domain [FAKEDOMAIN]:
 Server Role (dc, member, standalone) [dc]:
 DNS backend (SAMBA_INTERNAL, BIND9_FLATFILE, BIND9_DLZ, NONE) [SAMBA_INTERNAL]:
 DNS forwarder IP address (write 'none' to disable forwarding) [xxx]:
Administrator password:
Invalid administrator password.
Administrator password:
Retype password:
Looking up IPv4 addresses
Looking up IPv6 addresses
No IPv6 address will be assigned
Setting up secrets.ldb
Setting up the registry
Setting up the privileges database
Setting up idmap db
Setting up SAM db
Setting up sam.ldb partitions and settings
Setting up sam.ldb rootDSE
Pre-loading the Samba 4 and AD schema
Adding DomainDN: DC=fakedomain,DC=fake
Adding configuration container
Setting up sam.ldb schema
Setting up sam.ldb configuration data
Setting up display specifiers
Modifying display specifiers
Adding users container
Modifying users container
Adding computers container
Modifying computers container
Setting up sam.ldb data
Setting up well known security principals
Setting up sam.ldb users and groups
Setting up self join
ERROR(<class 'samba.provision.ProvisioningError'>): Provision failed - ProvisioningError: Your filesystem or build does not support posix ACLs, which s3fs requires.  Try the mounting the filesystem with the 'acl' option.
  File "/usr/local/lib/python2.7/site-packages/samba/netcmd/domain.py", line 461, in run
    nosync=ldap_backend_nosync, ldap_dryrun_mode=ldap_dryrun_mode)
  File "/usr/local/lib/python2.7/site-packages/samba/provision/__init__.py", line 2171, in provision
    skip_sysvolacl=skip_sysvolacl)
  File "/usr/local/lib/python2.7/site-packages/samba/provision/__init__.py", line 1805, in provision_fill
    names.domaindn, lp, use_ntvfs)
  File "/usr/local/lib/python2.7/site-packages/samba/provision/__init__.py", line 1557, in setsysvolacl
    raise ProvisioningError("Your filesystem or build does not support posix ACLs, which s3fs requires.  "

The filesystem in question is ZFS.
Comment 12 Benno Rice freebsd_committer 2017-01-20 04:57:01 UTC
... and it turns out Samba needs some work before it supports ZFS.

Having got my sysvol directory on to a UFS filesystem I'm now getting the seg fault on 12-CURRENT.
Comment 13 Benno Rice freebsd_committer 2017-01-20 06:18:03 UTC
It works if you use gcc (in my case gcc 4.9.4) as the compiler. I suspect that clang isn't zeroing the DATA_BLOB structure on the stack in the py_security_descriptor_ndr_unpack function in the PIDL-generated bin/default/librpc/gen_ndr/py_security.c file.
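
The failure mode suspected here can be illustrated with a toy model. The sketch below is plain Python, not Samba's code: `Blob` and `pull_uint8` are hypothetical stand-ins for a (data, length) structure and a one-byte read, and Python's `TypeError` stands in for the segmentation fault. It only shows why a blob whose fields are inconsistent passes a length-based bounds check and then fails on the read, whereas a zeroed blob is rejected cleanly.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Blob:
    """Toy stand-in for a (data, length) blob; not Samba's DATA_BLOB."""
    data: Optional[bytes] = None
    length: int = 0


def pull_uint8(blob, offset=0):
    """Read one byte, trusting only blob.length for the bounds check."""
    if offset + 1 > blob.length:
        raise ValueError("buffer too small")  # clean, recoverable error
    return blob.data[offset]  # the read happens only after the check


zeroed = Blob()                          # what explicit zeroing would give
garbage = Blob(data=None, length=356)    # length set, data never filled in

for name, blob in (("zeroed", zeroed), ("garbage", garbage)):
    try:
        pull_uint8(blob)
    except ValueError as exc:
        print(name, "-> rejected:", exc)
    except TypeError:
        print(name, "-> bounds check passed, read failed")
```

Whether uninitialized stack contents are actually what happens in the generated `py_security.c` is not established by this thread; the sketch only models the hypothesis stated in this comment.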
Comment 14 Dron 2017-01-20 10:07:08 UTC
Interestingly, if Samba is compiled from sources without the port, provisioning works and no segfault appears. I wrote about that on the Samba bug tracker, https://bugzilla.samba.org/show_bug.cgi?id=11848, and posted the link above.
Comment 15 Alnis Morics 2017-01-30 14:30:56 UTC
I just built Samba 4.5.4 twice, with and without gcc, on 11.0-RELEASE-p7 with a UFS filesystem. In both cases provisioning works, but I am now stuck elsewhere: "net rpc" commands don't work, again in both cases, e.g.:

# net rpc -I 192.168.0.192 rights list -U administrator
Enter administrator's password:
... 
Could not connect to server 192.168.0.192
Connection failed: NT_STATUS_UNSUCCESSFUL
failed to make ipc connection: NT_STATUS_UNSUCCESSFUL
return code = -1
Opening cache file at /usr/local/samba/var/cache/gencache.tdb
Opening cache file at /usr/local/samba/var/lock/gencache_notrans.tdb
tdb(/usr/local/samba/var/lock/gencache_notrans.tdb): allrecord_mutex_lock() failed: Invalid argument
Could not get allrecord lock on gencache_notrans.tdb: Locking error
Freeing parametrics:
#

When I built Samba 4.4.5 some months earlier (see comment #3) there were no problems with provisioning or "net rpc" commands.
Comment 16 Dron 2017-01-30 14:47:08 UTC
Building directly from sources has no issue with domain provisioning, at least with 4.4.x. The problem appears when the samba44 port is used. A samba45 port does not exist yet.

I compiled the port with gcc and indeed there is no problem with domain provisioning. Meanwhile tdb, ldb, talloc and tevent were compiled with clang (left over from a previous build, not recompiled) and seem to have no influence on this bug. It is curious that Samba compiled from sources with clang doesn't segfault and provisioning finishes successfully...
Comment 17 Alnis Morics 2017-01-30 15:06:26 UTC
(In reply to Dron from comment #16)

Just to clarify: of course, a samba45 port doesn't exist yet. I built 4.5.4 from sources. I also described this on the Samba mailing list at https://lists.samba.org/archive/samba/2017-January/206184.html
Comment 18 Alnis Morics 2017-02-10 16:00:33 UTC
It may be worth noting that provisioning works well on the i386 architecture.

(FreeBSD 11.0-RELEASE-p7, samba44-4.4.8_1, installed via pkg, 2017Q1 quarterly)
Comment 19 Dron 2017-03-28 10:40:09 UTC
The issue was with PIDL. On FreeBSD it was provided by a separate port, and the PIDL used came from the 4.3 branch. Switching to the PIDL bundled with Samba resolved the issue.
The 4.5 and 4.6 ports use the bundled version of PIDL.
Comment 20 Michael Osipov 2017-03-28 11:36:04 UTC
(In reply to Dron from comment #19)

How did you find this out?
Comment 21 Timur I. Bakeyev freebsd_committer 2017-03-28 13:07:36 UTC
(In reply to Michael Osipov from comment #20)

That was my guess after seeing code breakage when trying to build samba45 with Pidl-4.6.1; also, I couldn't reproduce the segfaults in my environment, where samba44 was compiled with an external Pidl, but one from the samba44 code base.

I would still like to see more confirmation of this, but in the meantime I will convert samba44 to use the bundled Pidl anyway. samba45 and samba46 already use their own versions for the reasons given above.
Comment 22 Timur I. Bakeyev freebsd_committer 2017-12-18 04:24:22 UTC
A new port version, Samba 4.7, was added to the ports tree with a lot of local and upstream fixes, as well as patches provided by iXsystems Inc.

A huge amount of work has been done, and it seems that AD provisioning is working again on ZFS volumes (I didn't have a UFS volume for testing, but I hope it works there as well).

Please try the new port and report back if any problems with provisioning remain.