Trying to get graphviz to make an image results in a segmentation fault.

# cat /tmp/test.dot | dot -Tpng
Segmentation fault (core dumped)

Already at latest base and kernel (FreeBSD 12)

# uname -a
FreeBSD NODE001 12.0-ALPHA8 FreeBSD 12.0-ALPHA8 #3 r339012M: Mon Oct 8 20:23:15 UTC 2018 freebsd@NODE005:/usr/obj/usr/src/arm64.aarch64/sys/sopine arm64

Started troubleshooting myself but got a bit stuck at this weird curthread pointer.

# gdb /usr/local/bin/dot dot.core
GNU gdb (GDB) 8.1 [GDB v8.1 for FreeBSD]
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "aarch64-portbld-freebsd12.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/local/bin/dot...done.
[New LWP 100082]
Core was generated by `dot -v -Tpng'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 _thr_rtld_rlock_acquire (lock=0x411dec80) at /usr/src/lib/libthr/thread/thr_rtld.c:125
125 THR_CRITICAL_ENTER(curthread);
(gdb) bt full
#0 _thr_rtld_rlock_acquire (lock=0x411dec80) at /usr/src/lib/libthr/thread/thr_rtld.c:125
        l = 0x411dec80
        curthread = 0x801800000000001
        errsave = <error reading variable errsave (Cannot access memory at address 0x80180000000018d)>
#1 0x00000000402390c4 in rlock_acquire (lock=0x40270090 <rtld_locks>, lockstate=0xffffffffab40) at /usr/src/libexec/rtld-elf/rtld_lock.c:209
No locals.
#2 0x0000000040232bec in _rtld_bind (obj=0x409bf000, reloff=96) at /usr/src/libexec/rtld-elf/rtld.c:789
        lockstate = {lockstate = 1, env = {{_sjb = {5192296858134625181781816927844096, 3541774862153317871616, 281474976688960, 5192296858162646120099333491982448, 5192296858205737714255517928390658, 0, 19849560668804171569190382992, 85264437479619916114209298589211951104, 13844966854071681024, 4624070917402656768, 0, 20249735019133469358624320736, 5192296858152168369465466474459136, 85, 36893769622395796048, 36893488147419103233, 19951829344161559554618773900, 19999637623203440009873805312, 20147593866117581453433744528, 1084182528, 20147734725455328299569884368, 19851772949927161933322518824, 19995557498562240636417077544, 20250678256497908875248176320, 19849563546496247067880435504, 36893488147419103233, 5192296858142299027316480101316352, 11510768301995844169728, 281474976689376, 5192296858162646120099333491982448, 5192296858205737714255517928390658, 0}}}}
        rel = <optimized out>
        defobj = <optimized out>
        def = <optimized out>
        where = <optimized out>
        target = <optimized out>
#3 0x000000004023007c in _rtld_bind_start () at /usr/src/libexec/rtld-elf/aarch64/rtld_start.S:93
No locals.
#4 0x00000000416efd50 in pixman_image_composite32 (op=PIXMAN_OP_SRC, src=0x42631400, mask=0x0, dest=0x42631b00, src_x=0, src_y=0, mask_x=0, mask_y=0, dest_x=0, dest_y=0, width=11, height=11) at pixman.c:686
        src_format = PIXMAN_a8
        mask_format = 0
        dest_format = PIXMAN_a8
        region = {extents = {x1 = 0, y1 = 0, x2 = 11, y2 = 11}, data = 0x0}
        extents = {x1 = 0, y1 = 0, x2 = 11, y2 = 11}
        imp = 0x424efa00
        func = 0x418efa74 <fast_composite_src_memcpy>
        info = {op = 1116379648, src_image = 0xffffffffb010, mask_image = 0x1, dest_image = 0x42631400, src_x = 0, src_y = 0, mask_x = 1113791232, mask_y = 0, dest_x = -2145384446, dest_y = -2145384446, width = 0, height = 1048576, src_flags = 0, mask_flags = 1073741824, dest_flags = 1074791425}
        pbox = 0xffffffffaff0
        n = 0
#5 0x0000000041937648 in pixman_glyph_cache_insert (cache=0x428a9a00, font_key=0x42713e00, glyph_key=0x2a, origin_x=0, origin_y=11, image=0x42631400) at pixman-glyph.c:286
        glyph = 0x4257fbd0
---Type <return> to continue, or q <return> to quit---
        width = 11
        height = 11
#6 0x00000000410cebd0 in ?? () from /usr/local/lib/libcairo.so.2
No symbol table info available.
#7 0x0000ffffffffbb04 in ?? ()
No symbol table info available.
Backtrace stopped: not enough registers or memory available to unwind further
The command cat test.dot | dot -Tpng also does not work on amd64. It does not segfault, but it hangs somewhere. The command dot -Tpng test.dot -o test.png works.
Oops, correction: cat test.dot | dot -Tpng does work on amd64, but only when -o is given: cat test.dot | dot -Tpng -o test.png.
# dot -Tpng test.dot -o test.png
Segmentation fault (core dumped)

As a workaround I made a dot shell script that does ssh to my machine:

ssh test@FreeBSDi7 -C clusterdot $*

On the i7 I have a clusterdot shell script with:

dot $*

This works fine but isn't the best solution, I guess. :-)

It only fails on ARM64. I could also try with ARM, but that will take a while to set up.

---

# dot -v -Tpng test.dot -o test.png
dot - graphviz version 2.40.1 (20161225.0304)
Using render: cairo:cairo
Using device: png:cairo:cairo
libdir = "/usr/local/lib/graphviz"
Activated plugin library: libgvplugin_dot_layout.so.6
Using layout: dot:dot_layout
The plugin configuration file: /usr/local/lib/graphviz/config6 was successfully loaded.
    render : cairo dot dot_json fig gd json json0 map mp pic pov ps svg tk vml vrml xdot xdot_json
    layout : circo dot fdp neato nop nop1 nop2 osage patchwork sfdp twopi
    textlayout : textlayout
    device : canon cmap cmapx cmapx_np dot dot_json eps fig gd gd2 gif gv imap imap_np ismap jpe jpeg jpg json json0 mp pdf pic plain plain-ext png pov ps ps2 svg svgz tk vml vmlz vrml wbmp x11 xdot xdot1.2 xdot1.4 xdot_json xlib
    loadimage : (lib) eps gd gd2 gif jpe jpeg jpg png ps svg xbm
fontname: "Helvetica" resolved to: (ps:pango DejaVu Sans, ) (PangoCairoFcFont) "DejaVu Sans, Book" /usr/local/share/fonts/dejavu/DejaVuSans.ttf
pack info: mode undefined size 0 flags 0 margin 8
pack info: mode node size 0 flags 0
network simplex: 20 nodes 19 edges maxiter=2147483647 balance=2
network simplex: 20 nodes 19 edges 0 iter 0.00 sec
network simplex: 4 nodes 4 edges maxiter=2147483647 balance=2
network simplex: 4 nodes 4 edges 0 iter 0.00 sec
network simplex: 5 nodes 4 edges maxiter=2147483647 balance=2
network simplex: 5 nodes 4 edges 0 iter 0.00 sec
network simplex: 4 nodes 5 edges maxiter=2147483647 balance=2
network simplex: 4 nodes 5 edges 0 iter 0.00 sec
network simplex: 2 nodes 1 edges maxiter=2147483647 balance=1
network simplex: 2 nodes 1 edges 0 iter 0.00 sec
Maxrank = 1, minrank = 0
mincross: pass 0 iter 0 trying 0 cur_cross 0 best_cross 0
mincross oneDegreeRelationshipsDiagram: 0 crossings, 0.00 secs.
network simplex: 3 nodes 2 edges maxiter=2147483647 balance=2
network simplex: 3 nodes 2 edges 0 iter 0.00 sec
routesplines: 1 edges, 3 boxes 0.00 sec
Using render: cairo:cairo
Using device: png:cairo:cairo
dot: allocating a 1337K cairo image surface (657 x 521 pixels)
Segmentation fault (core dumped)
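A possible refinement of the ssh workaround, as a minimal sketch only. An unquoted $* splits every argument again on whitespace, so a file name containing a space would arrive as two arguments, while a quoted "$@" keeps each argument intact. The function names below are hypothetical; only the host test@FreeBSDi7 comes from the comment above. Since ssh joins the remote command into a single string anyway, remote_dot avoids passing file names at all and streams the graph over stdin and the image back over stdout.

```shell
#!/bin/sh
# Hypothetical helper names; only the host name comes from the report.
REMOTE="${REMOTE:-test@FreeBSDi7}"

# Print how many arguments were received.
count_args() { echo "$#"; }

# Unquoted $* splits every argument again on whitespace.
forward_unquoted() { count_args $*; }

# Quoted "$@" passes each argument through unchanged.
forward_quoted() { count_args "$@"; }

# Render on the remote machine: DOT source on stdin, PNG on stdout.
# -C enables ssh compression, as in the original wrapper.
remote_dot() { ssh -C "$REMOTE" dot -Tpng; }
```

Usage would look like remote_dot < test.dot > test.png, assuming key-based login to the remote machine is already set up.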
First tests were on our Sopine cluster and I don't have a 32-bit world for it yet. I do run a slightly customized kernel. Nothing in that part of the kernel has changed, but to be sure I got another brand of ARM64 CPU. So I just tested this on the RPI-III with the corresponding aarch64 image and got the same problem. I also tested on the latest ARM (32-bit) image for RPI-II and that worked, so it's aarch64-specific.

To reproduce:

Boot the aarch64 image on RPI3 or Pine64+/Sopine
pkg install graphviz
echo 'digraph "test" { "test":"testx" } ' | dot -Tpng -otest.png

I think this should move to AARCH64, as it seems arch-specific.
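The reproduction steps can be wrapped in a small script so the result is easy to compare across machines. This is a sketch under assumptions: the function names, the DOT_BIN variable, and the temporary directory are made up for illustration, and the 139 check relies on the common sh convention of reporting death by signal as 128 plus the signal number (SIGSEGV is 11), which not every shell follows.

```shell
#!/bin/sh
# Hypothetical names throughout; DOT_BIN lets you point at another dot binary.
DOT_BIN="${DOT_BIN:-dot}"

# Write the same one-line graph used in the report to the given path.
write_test_graph() {
    printf 'digraph "test" { "test":"testx" }\n' > "$1"
}

# Translate an exit status into a short label.
# 139 = 128 + 11 (SIGSEGV) in sh-style shells.
classify_status() {
    case "$1" in
        0)   echo "ok" ;;
        139) echo "segfault" ;;
        *)   echo "failed($1)" ;;
    esac
}

# Render the test graph to PNG in a scratch directory and print the label.
repro() {
    workdir=$(mktemp -d) || return 1
    write_test_graph "$workdir/test.dot"
    status=0
    "$DOT_BIN" -Tpng -o "$workdir/test.png" "$workdir/test.dot" || status=$?
    classify_status "$status"
    rm -rf "$workdir"
}
```

Calling repro after sourcing the script prints one label per run. On an affected aarch64 system one would expect "segfault" rather than "ok".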
Cf. bug #233204 for a WIP patch.
(In reply to mikael.urankar from comment #5) You, sir, have earned a cookie or a beer or something! That indeed fixed the issue. I will test further, but the first tests seemed to work, so I'll roll it out on our cluster, test some more in a bigger setting, and cross my fingers that all nodes stay stable. :-) I also have a feeling that it will fix some other threading issues as well (e.g. the Python < 3.7 issues). Fixed with: https://github.com/strejda/freebsd/commit/981459604061136fc68c020ff6124fab0d1196aa
(In reply to Stefan Rink from comment #6) Please don't mark FIXED until the fix actually lands in upstream FreeBSD
The final (and much more complex) version of this patch is under review now: https://reviews.freebsd.org/D18417

Michal
All 42 nodes are still up and running with https://github.com/strejda/freebsd/commit/981459604061136fc68c020ff6124fab0d1196aa! Total crashcounter: 0 Will test https://reviews.freebsd.org/D18417 when I have some spare time.
A commit references this bug:

Author: mmel
Date: Sat Dec 15 10:38:10 UTC 2018
New revision: 342113
URL: https://svnweb.freebsd.org/changeset/base/342113

Log:
  Improve R_AARCH64_TLSDESC relocation.
  The original code did not support dynamically loaded libraries and used
  suboptimal access to TLS variables.
  New implementation removes lazy resolving of TLS relocation - due to flaw
  in TLSDESC design is impossible to switch resolver function at runtime
  without expensive locking.

  Due to this, 3 specialized resolvers are implemented:
  - load time resolver for TLS relocation from libraries loaded with main
    executable (thus with known TLS offset).
  - resolver for undefined thread weak symbols.
  - slower lazy resolver for dynamically loaded libraries with fast path for
    already resolved symbols.

  PR: 228892, 232149, 233204, 232311
  MFC after: 2 weeks
  Differential Revision: https://reviews.freebsd.org/D18417

Changes:
  head/libexec/rtld-elf/aarch64/reloc.c
  head/libexec/rtld-elf/aarch64/rtld_start.S
  head/libexec/rtld-elf/amd64/reloc.c
  head/libexec/rtld-elf/arm/reloc.c
  head/libexec/rtld-elf/i386/reloc.c
  head/libexec/rtld-elf/mips/reloc.c
  head/libexec/rtld-elf/powerpc/reloc.c
  head/libexec/rtld-elf/powerpc64/reloc.c
  head/libexec/rtld-elf/riscv/reloc.c
  head/libexec/rtld-elf/rtld.c
  head/libexec/rtld-elf/rtld.h
  head/libexec/rtld-elf/sparc64/reloc.c
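Since the commit concerns R_AARCH64_TLSDESC relocations, one way to see whether a given shared object carries any is to count them in the relocation listing. This is a rough sketch, not part of the fix. The function names are hypothetical, and it assumes a readelf whose -r output prints the relocation type by that name. Whether any particular library from the backtrace actually contains such entries is not established in this thread.

```shell
#!/bin/sh
# Count R_AARCH64_TLSDESC entries in relocation output read from stdin.
# grep -c exits non-zero when the count is 0, so the status is masked.
count_tlsdesc() {
    grep -c 'R_AARCH64_TLSDESC' || true
}

# Print the TLSDESC relocation count for one shared object.
# The path is supplied by the caller.
tlsdesc_in() {
    readelf -r "$1" | count_tlsdesc
}
```

For example, tlsdesc_in /usr/local/lib/libcairo.so.2 would print the number of matching relocation entries for the cairo library that appears in the backtrace.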
Forgotten to close?
(In reply to Walter Schwarzenfeld from comment #11) yes, can you close it please?