kf5-syntax-highlighting build fails with undefined symbol error on a FreeBSD 11.2 system with at least one VLAN network interface. I know it is odd for network configuration on the system to affect the build, but it is really what I found after 3 days of debugging.

Here are the error messages:

[94/132] cd /tmp/wrkdirs/usr/ports/textproc/kf5-syntax-highlighting/work/.build/data && /tmp/wrkdirs/usr/ports/textproc/kf5-syntax-highlighting/work/.build/bin/katehighlightingindexer /tmp/wrkdirs/usr/ports/textproc/kf5-syntax-highlighting/work/.build/data/index.katesyntax /tmp/wrkdirs/usr/ports/textproc/kf5-syntax-highlighting/work/syntax-highlighting-5.49.0/data/schema/language.xsd /tmp/wrkdirs/usr/ports/textproc/kf5-syntax-highlighting/work/.build/data/syntax-data.qrc
FAILED: data/index.katesyntax
cd /tmp/wrkdirs/usr/ports/textproc/kf5-syntax-highlighting/work/.build/data && /tmp/wrkdirs/usr/ports/textproc/kf5-syntax-highlighting/work/.build/bin/katehighlightingindexer /tmp/wrkdirs/usr/ports/textproc/kf5-syntax-highlighting/work/.build/data/index.katesyntax /tmp/wrkdirs/usr/ports/textproc/kf5-syntax-highlighting/work/syntax-highlighting-5.49.0/data/schema/language.xsd /tmp/wrkdirs/usr/ports/textproc/kf5-syntax-highlighting/work/.build/data/syntax-data.qrc
/usr/local/lib/qt5/plugins/bearer/libqgenericbearer.so: Undefined symbol "_ZN17QNetworkInterfaceC1ERKS_@Qt_5"
ninja: build stopped: subcommand failed.

I guess this is a memory corruption issue in Qt5 network module, which may provide the kernel a bad pointer and cause the kernel to overwrite data of the runtime linker. The symbol '_ZN17QNetworkInterfaceC1ERKS_' does exist in /usr/local/lib/qt5/libQt5Network.so.5 and /usr/local/lib/qt5/plugins/bearer/libqgenericbearer.so correctly lists libQt5Network.so.5 as its dependency with NEEDED, but the runtime linker rejects the symbol in libQt5Network.so.5 when comparing version tags.

Steps to reproduce the problem:

1. Install FreeBSD 11.2 amd64 and download the ports tree.
   Whether it is a physical machine or a virtual machine doesn't matter.
2. Create a VLAN network interface. It can be done with command 'ifconfig vlan3 create vlan 3 vlandev re0' where 're0' is your network interface.
3. Make sure the runtime linker /libexec/ld-elf.so.1 is compiled with the -O2 option. This is the default, so you don't have to do anything in this step unless you don't use binaries distributed by the FreeBSD project.
4. Install textproc/qt5-xmlpatterns port with portmaster.
5. Build textproc/kf5-syntax-highlighting.

It was tested on FreeBSD 11.2-RELEASE-p3 amd64 with ports revision 479821. I could reproduce it on 3 systems (physical machine, virtual machine, jail on virtual machine) and each of them runs on different hardware.

I mentioned qt5-xmlpatterns above because it is an optional dependency of kf5-syntax-highlighting. kf5-syntax-highlighting can be built without problems when qt5-xmlpatterns is not installed, but it also means that it doesn't link to qt5-network. kf5-syntax-highlighting automatically picks up qt5-xmlpatterns during the configure phase and it is qt5-xmlpatterns that causes kf5-syntax-highlighting to load qt5-network during the build.

The following are results of my debugging. I haven't found the root cause of the problem, but I think these notes may be useful for further debugging.

I started by checking symbol tables of both libqgenericbearer.so and libQt5Network.so.5.

$ pkg which /usr/local/lib/qt5/plugins/bearer/libqgenericbearer.so
/usr/local/lib/qt5/plugins/bearer/libqgenericbearer.so was installed by package qt5-network-5.11.1
$ readelf -aW /usr/local/lib/qt5/plugins/bearer/libqgenericbearer.so
Symbol table (.dynsym) contains 140 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
    69: 0000000000000000    21 FUNC    GLOBAL DEFAULT  UND _ZN17QNetworkInterfaceC1ERKS_@Qt_5 (2)

$ pkg which /usr/local/lib/qt5/libQt5Network.so.5
/usr/local/lib/qt5/libQt5Network.so.5 was installed by package qt5-network-5.11.1
$ readelf -aW /usr/local/lib/qt5/libQt5Network.so.5
Symbol table (.dynsym) contains 2161 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
  1245: 00000000000c7790    21 FUNC    GLOBAL DEFAULT   12 _ZN17QNetworkInterfaceC1ERKS_@@Qt_5 (3)

The plugin links to libQt5Network.so.5 properly:

$ ldd /tmp/wrkdirs/usr/ports/textproc/kf5-syntax-highlighting/work/.build/bin/katehighlightingindexer
/tmp/wrkdirs/usr/ports/textproc/kf5-syntax-highlighting/work/.build/bin/katehighlightingindexer:
        libQt5XmlPatterns.so.5 => /usr/local/lib/qt5/libQt5XmlPatterns.so.5 (0x800a00000)
        libQt5Network.so.5 => /usr/local/lib/qt5/libQt5Network.so.5 (0x801033000)
        libQt5Core.so.5 => /usr/local/lib/qt5/libQt5Core.so.5 (0x801400000)
        libc++.so.1 => /usr/lib/libc++.so.1 (0x801aec000)
        ...
$ ldd /usr/local/lib/qt5/plugins/bearer/libqgenericbearer.so
/usr/local/lib/qt5/plugins/bearer/libqgenericbearer.so:
        libQt5Network.so.5 => /usr/local/lib/qt5/libQt5Network.so.5 (0x80120c000)
        libQt5Core.so.5 => /usr/local/lib/qt5/libQt5Core.so.5 (0x801600000)
        libc++.so.1 => /usr/lib/libc++.so.1 (0x801cec000)
        ...

But the program which throws the undefined symbol error, katehighlightingindexer, doesn't link to libqgenericbearer.so. It suggests that libqgenericbearer.so is loaded by calling dlopen. I set a breakpoint on dlopen in GDB, and yes, it calls it with:

dlopen("/usr/local/lib/qt5/plugins/bearer/libqgenericbearer.so", RTLD_NODELETE | RTLD_LAZY);

The return value of dlopen is correct.
It is properly loaded, and the hash of the version entry is 363045.

(gdb) b dlopen
Function "dlopen" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (dlopen) pending.
(gdb) r 1 2 3
Starting program: /tmp/wrkdirs/usr/ports/textproc/kf5-syntax-highlighting/work/.build/bin/katehighlightingindexer 1 2 3
[New LWP 101325 of process 74133]

Thread 1 hit Breakpoint 1, dlopen (name=0x805415498 "/usr/local/lib/qt5/plugins/bearer/libqgenericbearer.so", mode=4097) at /usr/src/libexec/rtld-elf/rtld.c:3193
warning: Source file is more recent than executable.
3193            return (rtld_dlopen(name, -1, mode));
(gdb) finish
Run till exit from #0 dlopen (name=0x805415498 "/usr/local/lib/qt5/plugins/bearer/libqgenericbearer.so", mode=4097) at /usr/src/libexec/rtld-elf/rtld.c:3193
0x000000080165a731 in ?? () from /usr/local/lib/qt5/libQt5Core.so.5
Value returned is $2 = (void *) 0x80067e000
(gdb) p ((Obj_Entry *)(0x80067e000))->vertab[2]
$3 = {hash = 363045, flags = 0, name = 0x807202678 "Qt_5", file = 0x8072025de "libQt5Network.so.5"}
(gdb) p ((Obj_Entry *)(0x80067e000))->path
$8 = 0x800634f40 "/usr/local/lib/qt5/plugins/bearer/libqgenericbearer.so"

The number '2' seems to come from the '(2)' suffix of the output of readelf. I assume it means the version tag used by the symbol has index 2.

(gdb) b _rtld_bind if $_streq(obj->path, "/usr/local/lib/qt5/plugins/bearer/libqgenericbearer.so") && obj->vertab[2].hash != 363045
Breakpoint 3 at 0x80060f907: file /usr/src/libexec/rtld-elf/rtld.c, line 810.
(gdb) c
Continuing.
[Switching to LWP 101325 of process 74133]

Thread 2 hit Breakpoint 3, _rtld_bind (obj=0x80067e000, reloff=1272) at /usr/src/libexec/rtld-elf/rtld.c:810
810             rlock_acquire(rtld_bind_lock, &lockstate);
(gdb) p obj->vertab[2]
$17 = {hash = 32, flags = 0, name = 0x807202678 "Qt_5", file = 0x8072025de "libQt5Network.so.5"}

The value of the hash field of the version entry has changed from 363045 to 32. The value '32' isn't random.
I always get the same value here. If you follow the execution of the correct _rtld_bind call, you will find it fails to match the version tag at file /usr/src/libexec/rtld-elf/rtld.c, function matched_symbol, line 4329:

4329                 if (obj->vertab[verndx].hash != req->ventry->hash ||
4330                     strcmp(obj->vertab[verndx].name, req->ventry->name)) {
4331                         /*
4332                          * Version does not match. Look if this is a
4333                          * global symbol and if it is not hidden. If
4334                          * global symbol (verndx < 2) is available,
4335                          * use it. Do not return symbol if we are
4336                          * called by dlvsym, because dlvsym looks for
4337                          * a specific version and default one is not
4338                          * what dlvsym wants.
4339                          */
4340                         if ((req->flags & SYMLOOK_DLSYM) ||
4341                             (verndx >= VER_NDX_GIVEN) ||
4342                             (obj->versyms[symnum] & VER_NDX_HIDDEN))
4343                                 return (false);
4344                 }

verndx is 2, and req->ventry->hash is 363045. If obj->vertab[2].hash hasn't been modified, the runtime linker will pick this symbol and the execution can continue.

I tried to set a hardware watchpoint on obj->vertab[2].hash in GDB, but the watchpoint never hit. I also tried to set a software watchpoint on the same address, and the result wasn't always the same. Most of the time it ran forever and I interrupted it after a few minutes, but sometimes it stopped at instructions which should not modify the memory, such as 'mov r15,QWORD PTR fs:0x10' and 'mov r15,rdi'. Therefore, I thought the hash value was modified by the kernel, but 'catch syscall' command in GDB didn't seem to work for me. GDB kept printing 'Thread 2 received signal SIGSYS, Bad system call.' and made the program behave abnormally.
I decided to use DTrace to track the hash value changes for me:

# dtrace -n 'syscall:::entry, syscall:::return /pid == 99608/ { printf("%s %u ==> %x %x %x %x", probefunc, *(unsigned int *)copyin(0x801242230, 4), arg0, arg1, arg2, arg3); }'
dtrace: description 'syscall:::entry, syscall:::return ' matched 2168 probes
CPU     ID                    FUNCTION:NAME
  1  80243                      ioctl:entry ioctl 363045 ==> 8 c0306938 7fffdfffd770 0
  1  80244                     ioctl:return ioctl 32 ==> 0 0 0 0

0x801242230 was the address of the hash variable obtained from GDB. It seems it was an 'ioctl(8, SIOCGIFMEDIA, 0x7fffdfffd730)' call that changed the value. 8 was a socket file descriptor created by calling 'socket(PF_INET, SOCK_DGRAM | SOCK_CLOEXEC, 0)'. 0x7fffdfffd730 looked like a pointer on the stack, as 'procstat -v' said this region grew down.

I stopped debugging here and temporarily removed the VLAN interface with 'ifconfig vlan3 destroy' to let portmaster upgrade kf5-syntax-highlighting and hundreds of other ports for me. The conclusion is that I probably have to read the code of qt5-network in order to figure out what really happens.

I found three ways to work around the problem on affected systems:

1. Remove all VLAN interfaces, which may not be possible if your networking environment requires them.
2. Use Clang 6 shipped with FreeBSD base to recompile /libexec/ld-elf.so.1 with -O1, -O0, or -DDEBUG.
3. Use GCC 8 from ports to recompile /libexec/ld-elf.so.1 with -O0. Using -O1 or -DDEBUG doesn't help when using GCC.

In fact, I didn't replace /libexec/ld-elf.so.1 on the system because it is risky. I did the test by either running the compiled ld-elf.so.1 under /usr/src/libexec/rtld-elf directly as an executable or modifying the interpreter path stored in katehighlightingindexer executable with 'patchelf --set-interpreter' command.
The vlans seem like a red herring to me (unless networking is loading genericbearer based on the presence of vlans .. which seems really peculiar). However, kf5-syntax-highlighting *should not* depend on xmlpatterns and should not change its build based on its presence. So there's some configuration that needs wrangling there at the very least.
kf5-syntax-highlighting prints these messages at the end of configure phase: -- The following OPTIONAL packages have been found: * Qt5Widgets Example application. * Qt5XmlPatterns Compile-time validation of syntax definition files. libqgenericbearer.so is always loaded and _ZN17QNetworkInterfaceC1ERKS_ is always called when katehighlightingindexer is linked to qt5-xmlpatterns. When there is no VLAN interface, the hash value isn't modified and the symbol can be found successfully. When I set a breakpoint on _ZN17QNetworkInterfaceC1ERKS_, it hits multiple times before it exits. When there is a VLAN interface, the hash value is modified and the symbol cannot be found. I set breakpoints on both _ZN17QNetworkInterfaceC1ERKS_@plt in libqgenericbearer.so and _ZN17QNetworkInterfaceC1ERKS_ in libQt5Network.so.5. Only the PLT one is hit because _rtld_bind calls rtld_die, which causes the program to exit early.
Yes, I have the same issue. Vlans are configured and kf5-syntax-highlighting does not build.
(In reply to Yan Batyuto from comment #3)

I've seen something similar (but with lumina and telegram-desktop). Using VLANs I get:

# telegram-desktop
Got keys from plugin meta data ("generic")
QFactoryLoader::QFactoryLoader() checking directory path "/usr/local/bin/bearer" ...
loaded library "/usr/local/lib/qt5/plugins/bearer/libqgenericbearer.so"
/usr/local/lib/qt5/plugins/bearer/libqgenericbearer.so: Undefined symbol "_ZN17QNetworkInterfaceC1ERKS_@Qt_5"

# start-lumina-desktop
.....
[Lumina] Checking User Files
 - Old Version: "1.4.1"
 - Current Version: "1.4.1"
 - Made Changes: false
Finished with user files check
QFactoryLoader::QFactoryLoader() checking directory path "/usr/local/lib/qt5/plugins/accessiblebridge" ...
QFactoryLoader::QFactoryLoader() checking directory path "/usr/local/bin/accessiblebridge" ...
qt.qpa.xcb: QXcbConnection: XCB error: 148 (Unknown), sequence: 206, resource id: 0, major code: 140 (Unknown), minor code: 20
Got Desktop Process Finished: 1
Finished Closing Down Lumina
QLibraryPrivate::unload succeeded on "/usr/local/lib/qt5/plugins/platforminputcontexts/libcomposeplatforminputcontextplugin.so"
QLibraryPrivate::unload succeeded on "/usr/local/lib/qt5/plugins/xcbglintegrations/libqxcb-glx-integration.so"
QLibraryPrivate::unload succeeded on "/usr/local/lib/qt5/plugins/platforms/libqxcb.so"
QLibraryPrivate::unload succeeded on "Xcursor"

# lumina-desktop (just to see the qt5_debug_plugins)
Got keys from plugin meta data ("generic")
QFactoryLoader::QFactoryLoader() checking directory path "/usr/local/bin/bearer" ...
loaded library "/usr/local/lib/qt5/plugins/bearer/libqgenericbearer.so"
/usr/local/lib/qt5/plugins/bearer/libqgenericbearer.so: Undefined symbol "_ZN17QNetworkInterfaceC1ERKS_@Qt_5"
Exit 1

If I destroy the vlans everything works.
(In reply to Ricardo from comment #4) Forgot to mention, it's a stable/11 (FreeBSD 11.2-STABLE #2 r338902)
We can "workaround" with kf5-syntax-highlighting by blocking Qt5Network, which will turn off some build-time schema-validation. That's build-time validation that happens elsewhere anyway, so it doesn't really buy us anything but extra dependencies and build time. This will get it to build again. *But* the bigger issue is that underneath, in Qt5Network, there's a problem with VLANs on FreeBSD. That is what actually needs to be sorted out.
*** Bug 231999 has been marked as a duplicate of this bug. ***
Any Qt5 application with networking (e.g. quassel, kmail, falkon ...) can be crashed by creating a vlan while the application is running; once there's a vlan the application no longer starts. It all comes down to the missing symbols / mismatched symbols as Ting-Wei Lan has described. I've started doing *some* debugging, but it's a giant pain in the butt and I don't understand the symbol versioning very well. readelf(1) tells me there are @@Qt_5 symbols and @Qt_5 symbols (one @ or two) and that *seems* to be related.
*** Bug 232318 has been marked as a duplicate of this bug. ***
(In reply to Adriaan de Groot from comment #8) Note that there are no missing or mismatched symbols here. Both libraries are built correctly and symbols should be successfully resolved. It is the memory corruption issue that overwrites the data of the runtime linker, causing it to reject the symbol early, before comparing strings. If I understand correctly, both @ and @@ are used to denote the version tag. In addition, @@ means it is the default version and the build time linker should choose it if object files don't specify a version. Therefore, an undefined @Qt_5 symbol should be resolved to the @@Qt_5 definition at runtime if there is no memory issue, which is the case when there is no VLAN interface on the system.
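To make the "rejected before comparing strings" point concrete, here is a minimal C sketch of the comparison quoted from matched_symbol: the hash is checked first and strcmp only runs when the hashes agree. The struct and function names (ver_entry, elf_hash, version_matches) are hypothetical stand-ins for illustration, not the rtld source; the hash function is the standard System V ELF hash, which is assumed here to be what fills the hash field of a version entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for one version-table entry (the report's vertab[]). */
struct ver_entry {
    unsigned long hash;   /* ELF hash of the version name */
    const char *name;     /* version name, e.g. "Qt_5" */
};

/* Classic System V ELF hash over a NUL-terminated name. */
unsigned long elf_hash(const char *name)
{
    unsigned long h = 0, g;

    assert(name != NULL);
    while (*name != '\0') {
        h = (h << 4) + (unsigned char)*name++;
        g = h & 0xf0000000UL;
        if (g != 0)
            h ^= g >> 24;
        h &= ~g;
    }
    return h;
}

/*
 * Same shape as the quoted check: a hash mismatch rejects the pair
 * immediately, so identical names are never compared in that case.
 */
bool version_matches(const struct ver_entry *have, const struct ver_entry *want)
{
    assert(have != NULL && want != NULL);
    return have->hash == want->hash && strcmp(have->name, want->name) == 0;
}
```

With this sketch, elf_hash("Qt_5") evaluates to 363045, the value GDB printed before the corruption. An entry whose name is still "Qt_5" but whose hash field has been overwritten with 32 fails version_matches against an intact entry, which mirrors the behaviour described in this comment.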
Ting-Wei Lan, could you file a bug against ld, then? This isn't going to get fixed by us staring at the code of Qt5Network.
A commit references this bug:

Author: adridg
Date: Thu Oct 18 12:19:58 UTC 2018
New revision: 482342
URL: https://svnweb.freebsd.org/changeset/ports/482342

Log:
  Workaround textprof/kf5-syntax-hightlighting build failure.

  (library) Qt5Network crashes in the presence of VLANs. This terminates the build when the host build process runs applications that touch the network -- which happens during schema validation, which is done if the host has XmlPatters installed. Workaround by ignoring XmlPatterns.

  Underlying problem (Qt5Network and VLANs) has not been addressed.

  PR: 231402
  Reported by: Ting-Wei Lan

Changes:
  head/textproc/kf5-syntax-highlighting/Makefile
(In reply to commit-hook from comment #12) It doesn't look like an issue of ld or rtld to me. Both ld and rtld do the right thing, and it is an ioctl call on a socket file descriptor that modifies the internal data structure of rtld. I still believe the problem is in Qt5Network itself, but I haven't spent time debugging the issue further.
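One general way a request-style call such as the SIOCGIFMEDIA ioctl seen in the DTrace output can write to unrelated memory is when the request structure carries an output pointer and a count that the caller left uninitialised or stale. The sketch below is a plain C model of that failure mode only. The names media_req, fake_get_media and query_media_safely are hypothetical, this is neither FreeBSD kernel code nor Qt source, and nothing in this thread confirms that it is the exact mechanism behind the bug.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical, simplified request: an interface name plus an output
 * list and its capacity. It only models the idea of a request that
 * carries a caller-supplied output pointer.
 */
struct media_req {
    char name[16];
    int count;    /* number of entries the caller says list can hold */
    int *list;    /* where the callee writes entries when count > 0 */
};

/* Hypothetical stand-in for the callee: it trusts count and list as given. */
int fake_get_media(struct media_req *req)
{
    /* Arbitrary sample value, chosen only to echo the 32 in the report. */
    static const int supported[] = { 0x20 };
    size_t n = sizeof(supported) / sizeof(supported[0]);

    assert(req != NULL);
    if (req->count > 0 && req->list != NULL) {
        size_t to_copy = (size_t)req->count < n ? (size_t)req->count : n;
        memcpy(req->list, supported, to_copy * sizeof(int));
    }
    req->count = (int)n;   /* report how many entries exist */
    return 0;
}

/* Careful caller: zero the whole request before filling in the name. */
int query_media_safely(const char *ifname)
{
    struct media_req req;

    memset(&req, 0, sizeof(req));
    strncpy(req.name, ifname, sizeof(req.name) - 1);
    return fake_get_media(&req) == 0 ? req.count : -1;
}
```

In this model, a request whose count and list fields hold leftover values makes fake_get_media write through that pointer, wherever it happens to point, while the zeroed request in query_media_safely asks for a count only and writes nothing outside the structure.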
A **workaround** is to add QT_EXCLUDE_GENERIC_BEARER=1 to your environment.
*** Bug 233798 has been marked as a duplicate of this bug. ***
I've been debug-chasing this for a few days in an 11.2 VM. The goal is to allow genericbearer to load -- that is, the environment-variable workaround should not be necessary. As Ting-Wei Lan pointed out originally, everything looks like memory corruption **somewhere**.

- removing the call to get the interfaceFromIndex(0) fixes the problem
- I found a spot in QNetworkInterface where adding qWarning() << "foo" fixes the problem

When built with WITH_DEBUG=yes I get crashes (SEGV), rather than unresolved symbols: another hint that it's memory corruption. In any case: *because* this is corrupting memory from a Qt-internal method that is listing network interfaces, I would like to fix the root cause rather than working around things.
A commit references this bug:

Author: adridg
Date: Mon Dec 24 16:46:18 UTC 2018
New revision: 488276
URL: https://svnweb.freebsd.org/changeset/ports/488276

Log:
  Fix net/qt5-network in the face of VLANs.

  Adding a VLAN to a FreeBSD system caused memory corruption -- usually enough to make rtld fall over with symbol resolution errors, although in DEBUG builds it would just crash. Revamp network interface discovery to not be full of memory gotcha's. An explanation is included in the patches.

  While here, "make makesum" has moved some files around.

  PR: 231402, 233798, 232318
  Reported by: Ting-Wei Lan, Nils Beyer, Marek Zarychta

Changes:
  head/net/qt5-network/Makefile
  head/net/qt5-network/files/patch-qsslcontext_openssl.cpp
  head/net/qt5-network/files/patch-src_network_kernel_qnetworkinterface__unix.cpp
  head/net/qt5-network/files/patch-src_network_socket_qnet__unix__p.h
  head/net/qt5-network/files/patch-src_network_socket_qnet_unix_p.h
  head/net/qt5-network/files/patch-src_network_ssl_qsslcontext__openssl.cpp
  head/net/qt5-network/files/patch-src_plugins_bearer_generic_qgenericengine.cpp