Based on some initial testing, a samba port built on a base system built from the git repo dumps core. A samba port built on a base system built from the svn repo does not. The core dump shows:

(lldb) thread backtrace all
* thread #1, name = 'smbd', stop reason = signal SIGILL
  * frame #0: 0x00000008016c7ff6 libsmbconf.so.0`___lldb_unnamed_symbol100$$libsmbconf.so.0 + 118
    frame #1: 0x0000000801bc087c libmessages-dgm-samba4.so`___lldb_unnamed_symbol32$$libmessages-dgm-samba4.so + 108
    frame #2: 0x0000000801bbf4f7 libmessages-dgm-samba4.so`___lldb_unnamed_symbol18$$libmessages-dgm-samba4.so + 615
    frame #3: 0x0000000802eb3e5c libtevent.so.0`tevent_common_invoke_fd_handler + 140
    frame #4: 0x0000000802eb6cdd libtevent.so.0`___lldb_unnamed_symbol40$$libtevent.so.0 + 1901
    frame #5: 0x0000000802eb3071 libtevent.so.0`_tevent_loop_once + 225
    frame #6: 0x0000000802eb4fe1 libtevent.so.0`tevent_req_poll + 49
    frame #7: 0x0000000001031dbe smbd`___lldb_unnamed_symbol25$$smbd + 622
    frame #8: 0x0000000001030728 smbd`main + 2824
    frame #9: 0x000000000102d0f2 smbd`_start + 226

I'm wondering whether the base system built from the git repo exports the system version incorrectly, hence the SIGILL. I jumped from r368387 (svn) to r368820+7d8ff3245227-c255291(main) (git).
As extra info, I tested multiple samba versions (4.12 and 4.13); they all show the same behavior.
I had the same problem on my FreeBSD CURRENT main-c255394-gf20c0e33195, and I tried recompiling net/samba412 with the following options in /etc/make.conf:

```
CC=clang11
CXX=clang++11
CPP=clang-cpp11
```

The rebuilt smbd server works for me. It seems to be a compiler problem.
Created attachment 221054 samba.diff

Default to llvm10 for the samba build. This does not result in core dumps of the daemon. llvm11 from ports is also fine (for now), as there have been patches in base which are not present in the port version. Not sure for how long. I was building llvm10 anyway for mesa-libs.
Thanks for the direction, Yuichiro NAITO. It works fine now!
CC llvm maintainer from base. Maybe he knows the proper solution.
I've reproduced the SIGILL on 13.0-CURRENT main-c255407-g4f4111d2c5ab with samba413-4.13.1_1, and I'm doing some debugging. No clues yet. :)
What seems to happen is that messaging_recv_cb() has a variable length array (aka VLA) 'fds64[]', whose length evaluates to zero here, and a zero-length VLA is undefined behavior:

Program received signal SIGSEGV, Segmentation fault.
0x0000000801c784a7 in messaging_recv_cb (ev=0x805475060, msg=0x7fffffffdbe8 "\035#", msg_len=98, fds=0x7fffffffdbdc, num_fds=0, private_data=0x80546e300) at ../../source3/lib/messages.c:394
394     int64_t fds64[MIN(num_fds, INT8_MAX)];
(gdb) print num_fds
$6 = 0

Digging deeper.
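As a rough sketch of one way to avoid this, assuming a zero count is a legitimate input: clamp the VLA length so it is always at least 1, and keep using the real count for the loop. The function `sum_fds` and the `SKETCH_MIN`/`SKETCH_MAX` macros below are hypothetical stand-ins for illustration, not the Samba code or any attached patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper macros for this sketch only. */
#define SKETCH_MIN(a, b) ((a) < (b) ? (a) : (b))
#define SKETCH_MAX(a, b) ((a) > (b) ? (a) : (b))

/*
 * Hypothetical stand-in for the pattern seen in messaging_recv_cb():
 * copy up to INT8_MAX descriptors into a 64-bit scratch VLA and
 * return their sum.
 */
int64_t sum_fds(const int *fds, size_t num_fds)
{
	/* Number of entries actually used; may legitimately be zero. */
	size_t count = SKETCH_MIN(num_fds, (size_t)INT8_MAX);
	/* Length of the VLA; forced to at least 1 so it is never zero. */
	size_t vla_len = SKETCH_MAX((size_t)1, count);
	int64_t fds64[vla_len];
	int64_t total = 0;

	assert(vla_len >= 1);

	/* Only the first 'count' slots are written and read. */
	for (size_t i = 0; i < count; i++) {
		fds64[i] = fds[i];
		total += fds64[i];
	}
	return total;
}
```

With `num_fds == 0` the array still has one (unused) element, so the declaration is well defined and the loop simply does not run.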
Created attachment 221099 Fix zero-sized VLAs in messaging part of net/samba413

Here is a patch for net/samba413 which should fix the undefined behavior with zero-sized VLAs in source3/lib/messages*.c. I will also attach patches for samba411 and samba412.
Created attachment 221100 Fix zero-sized VLAs in messaging part of net/samba412
Created attachment 221101 Fix zero-sized VLAs in messaging part of net/samba411
My advice would be to upstream these patches to Samba. In fact, they should probably do a full sweep of their source for these possibly zero-sized VLAs, compile the whole of Samba with -fsanitize=undefined, and then do a full regression test. (I tried adding -fsanitize=undefined to the CFLAGS of this port, but I could not get the waf build tools to correctly link the various dynamic libraries. So I will gladly leave that to the waf and/or samba experts. :)

@Dries, could you please check whether one of the patches fixes the crashes for you?
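To illustrate what such a sanitizer run would flag, here is a minimal sketch (a made-up function, `sum_squares`, not taken from Samba). Built with `-fsanitize=undefined`, which in clang and GCC includes the `vla-bound` check, a call with `n == 0` should be reported at run time, while calls with a positive `n` are well defined.

```c
#include <stddef.h>

/*
 * Hypothetical example function, for illustration only. The VLA bound
 * is only known at run time, which is the kind of declaration the
 * sanitizer instruments.
 */
int sum_squares(size_t n)
{
	int squares[n];	/* undefined behavior if n == 0 */
	int total = 0;

	/* Fill the scratch array and add up its contents. */
	for (size_t i = 0; i < n; i++) {
		squares[i] = (int)(i * i);
		total += squares[i];
	}
	return total;
}
```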
I runtime tested the patch for samba413, no more core dumps. Thanks!
I am observing the same thing on my box, which I recently upgraded to sources from the git repo, after upgrading samba: a samba port built on a git-based system doesn't work, whereas the previous samba port (samba412) built on the svn-based system does. I don't see any core dumps though, only the "signal 4" message at startup in /var/log/messages:

Dec 30 14:32:41 pcgyver kernel: pid 19312 (smbd), jid 0, uid 0: exited on signal 4

Good news: the patch provided for samba413 seems to work for me too. Thanks
Reported upstream: https://bugzilla.samba.org/show_bug.cgi?id=14605
Merge request: https://gitlab.com/samba-team/samba/-/merge_requests/1743
The fix was accepted by upstream: https://git.samba.org/samba.git/?p=samba.git;a=commitdiff;h=3e96c95d41e4ccd0bf43b3ee78af644e2bc32e30
Ping? :)
Given the maintainer timeout (> 3 weeks), I think it's fine for you to commit it.
A commit references this bug:

Author: dim
Date: Sat Jan 30 13:22:41 UTC 2021
New revision: 563405
URL: https://svnweb.freebsd.org/changeset/ports/563405

Log:
  net/samba411 net/samba412 net/samba413: Fix zero-sized VLAs

  With recent versions of clang, samba could dump core shortly after startup, terminating with either SIGILL or SIGSEGV.

  Investigation showed that samba is using C99 variable length arrays (VLAs), and in some cases the length of these arrays would become zero. Since this is undefined behavior, various interesting things would happen, often ending in segfaults.

  Fix this by avoiding to use zero as the length for these VLA declarations. A similar patch was also sent upstream, and was accepted and included in subsequent samba releases.

  See also: https://bugzilla.samba.org/show_bug.cgi?id=14605

  Reported by: Dries Michiels <driesm.michiels@gmail.com>
  PR: 252157
  MFH: 2021Q1

Changes:
  head/net/samba411/Makefile
  head/net/samba411/files/patch-source3_lib_messages.c
  head/net/samba412/Makefile
  head/net/samba412/files/patch-source3_lib_messages.c
  head/net/samba413/Makefile
  head/net/samba413/files/patch-source3_lib_messages.c