Bug 187506 - [PATCH] graphics/opennurbs breaks cad/qcad with ancient ZLIB (was: SEGV when printing with cad/qcad on amd64 system)
Status: Closed FIXED
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: Any Any
Importance: Normal Affects Only Me
Assignee: Michael Reifenberger
URL:
Keywords:
Depends on: 197135
Blocks:
Reported: 2014-03-12 19:50 UTC by denverh
Modified: 2015-02-06 20:17 UTC (History)

Description denverh 2014-03-12 19:50:00 UTC
The current version of cad/qcad, 3.4.6.0, exits with a SEGV when attempting to print a drawing or export to PDF.  This happens on amd64 but not on i386.  In fact, if built in an i386 chroot (as is done for wine), it works on the same amd64 system where a native build fails.

This has been a problem for at least the last few versions, and possibly longer.

Fix: 

This is not a very easy workaround, but cad/qcad can be built as a 32-bit application on an amd64 system, much as one would do for emulators/wine.  You have to set up a 32-bit chroot, build and install it there, then fix up all the symbolic links that were created with absolute paths.  After all that, you need to set an appropriate LD_32_LIBRARY_PATH each time you run qcad.  But it will work that way.
How-To-Repeat:

1. Build and install cad/qcad on an amd64 system
2. Start it up
3. In the file menu select either "Print" or "PDF Export" - you don't have to actually create a drawing
4. If you selected "PDF Export" then follow the prompts for the file to save
5. qcad will exit with a SEGV
Comment 1 Edwin Groothuis freebsd_committer freebsd_triage 2014-03-12 21:55:51 UTC
Responsible Changed
From-To: freebsd-ports-bugs->mr

Over to maintainer (via the GNATS Auto Assign Tool)
Comment 2 Andrew 2014-06-11 00:40:59 UTC
Seeing the same problem in cad/qcad 3.5.1.0, I think (it also happens on the "Open File" operation). My machine had been running an i386 system build until fairly recently, and the SEGV didn't start happening until after I rebuilt to amd64.

I'm not sure that this is a fault in QCAD: backtracing puts the actual segfault in libz/deflate_slow. The call into libz is coming from libpng, which is coming from Qt4, which is coming from KDE. This makes it sound a little like bug 154073, which has a similar backtrace, though that was supposedly patched around a couple of years ago.

Here is what I'm getting:

$ gdb ./qcad
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...(no debugging symbols found)...
(gdb) run
Starting program: /usr/home/morrand/qcad/qcad/work/stage/usr/local/bin/qcad 
(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...[New LWP 100462]
[New Thread 811006400 (LWP 100462/qcad-bin)]
Debug:    RDxfPlugin::init 
Debug:    RExamplePlugin::init 
Debug:    TransactionListenerPlugin::init 
Debug:    TIMER:  1196 ms -  "loading add-ons" 
Warning:  "QFormBuilder was unable to create a custom widget of the class 'QWebView'; defaulting to base class 'QWidget'." 
Debug:    TIMER:  4527 ms -  "initializing add-ons" 
Debug:    TransactionListenerPlugin::postInit 
Warning:  KGlobal::locale(): Warning your global KLocale is being recreated with a valid main component instead of a fake component, this usually means you tried to call i18n related functions before your main component was created. You should not do that since it most likely will not work

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 811006400 (LWP 100462/qcad-bin)]
0x0000000803314824 in deflate_slow (s=0x824f66000, flush=0)
    at /usr/src/lib/libz/deflate.c:1823
1823                _tr_tally_lit(s, s->window[s->strstart-1], bflush);
Current language:  auto; currently minimal
(gdb) bt full
#0  0x0000000803314824 in deflate_slow (s=0x824f66000, flush=0)
    at /usr/src/lib/libz/deflate.c:1823
No locals.
#1  0x0000000803313549 in deflate (strm=0x820f9e6c0, flush=0)
    at /usr/src/lib/libz/deflate.c:905
        bstate = <value optimized out>
        s = (deflate_state *) 0x824f66000
        old_flush = <value optimized out>
#2  0x000000080ea91b25 in png_write_find_filter ()
   from /usr/local/lib/libpng15.so.15
No symbol table info available.
#3  0x000000080ea919cc in png_write_find_filter ()
   from /usr/local/lib/libpng15.so.15
No symbol table info available.
#4  0x000000080ea885f5 in png_write_row () from /usr/local/lib/libpng15.so.15
No symbol table info available.
#5  0x000000080ea88121 in png_write_rows () from /usr/local/lib/libpng15.so.15
No symbol table info available.
#6  0x00000008081f5875 in QPixmap::fromX11Pixmap ()
   from /usr/local/lib/qt4/libQtGui.so.4
No symbol table info available.
#7  0x00000008081f60d7 in QPixmap::fromX11Pixmap ()
   from /usr/local/lib/qt4/libQtGui.so.4
No symbol table info available.
#8  0x00000008081c676c in QImageWriter::write ()
   from /usr/local/lib/qt4/libQtGui.so.4
No symbol table info available.
(and so on)

The only other obvious clue I'm getting is, from inside deflate():

(gdb) print s->l_buf
$9 = (uchf *) 0x25322000 <Address 0x25322000 out of bounds>

...which is the only one that is so flagged when printing *s.
Comment 3 denverh 2014-06-21 20:43:17 UTC
When I analyze a core dump here I see a similar trace.  So is it a problem with libz, qcad, or somewhere in between?  Answering that question is a bit beyond my abilities.
Comment 4 Andrew 2014-07-08 02:26:49 UTC
I'm still digging, but right now it looks like a fault somewhere between libpng and libz, both of which are really coming in more to support Qt4 than to do anything directly related to QCAD itself. (Point: If I open QCAD remotely, through an X Windows connection, the "Open File" dialog appears just fine. It doesn't quite work correctly, but that's another issue.)

What's directly throwing the SEGV is a call to _tr_tally_lit(s, c, flush) (in libz/trees.c, although not quite--see below), where s is the status struct for libz's data stream. _tr_tally_lit is inlined as:

  { uch cc = (c);
    s->d_buf[s->last_lit] = 0;
    s->l_buf[s->last_lit++] = cc;
    s->dyn_ltree[cc].Freq++;
    flush = (s->last_lit == s->lit_bufsize-1);
   }

but at the point it's being called, s->d_buf = 0x0, and s->last_lit = 0, so the code is trying to assign a value to location 0x0. This doesn't work. (As I mentioned above, s->l_buf is out of bounds according to gdb, too, so there seems to be a whole lot of wrong going around on this.)
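
To make the failure mode concrete, here is a minimal sketch of the same two stores with the checks the inlined macro does not perform. The struct mock_state and the function tally_lit_checked are hypothetical stand-ins invented for illustration; they are not zlib's real deflate_state or API, and the dyn_ltree frequency update is left out.

```c
#include <stddef.h>

/* Hypothetical stand-in for the few state fields involved in the crash.
 * This is NOT zlib's real deflate_state layout. */
typedef struct {
    unsigned short *d_buf;       /* distance buffer */
    unsigned char  *l_buf;       /* literal buffer */
    unsigned int    last_lit;    /* next free slot in both buffers */
    unsigned int    lit_bufsize; /* number of slots in each buffer */
} mock_state;

/* Performs the same stores as the inlined macro, but checks first.
 * Returns -1 where the unguarded macro would fault or overrun,
 * otherwise the value the macro would assign to its flush argument. */
int tally_lit_checked(mock_state *s, unsigned char c)
{
    if (s == NULL || s->d_buf == NULL || s->l_buf == NULL)
        return -1;                   /* d_buf == 0x0 is the case seen in gdb */
    if (s->last_lit >= s->lit_bufsize)
        return -1;                   /* would write past the buffers */

    s->d_buf[s->last_lit] = 0;       /* first store in the macro */
    s->l_buf[s->last_lit++] = c;     /* second store */
    return s->last_lit == s->lit_bufsize - 1;
}
```

With d_buf set to NULL and last_lit at 0, the first store in the real macro lands on address 0x0, which matches the fault address implied by the values printed in gdb.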

I have traced it out, and libpng is trying to initialize struct s, so it's not a problem of s->d_buf being used without being initialized, as I'd originally thought. However, I haven't yet nailed down just where it is that s->d_buf turns into a null pointer between initialization and this call.

One other bit of weirdness, though. QCAD depends on graphics/opennurbs, which bundles its own copy of zlib, version 1.2.3 (the current version is 1.2.8). Somehow, QCAD ends up linked to the OpenNURBS copy of zlib rather than the system one for some things, including the trouble code above. I only discovered this after a portsnap run caused GDB to complain about a missing zlib/trees.c. On top of that, QCAD also ships its own copy of OpenNURBS (and, in turn, its own copy of zlib)! It turns out that our port of QCAD rightly ignores its bundled OpenNURBS in favor of the graphics/opennurbs port; unfortunately, that port doesn't ignore its own zlib in favor of the system libz. At least, not for some things. What we get is a mixture of calls to the system libz and calls to OpenNURBS's zlib, that is, two different versions of routines that are supposed to live in the same library. Specifically, libpng calls into deflate(), which calls deflate_slow() (both in the system libz/deflate.c), which eventually calls _tr_tally_lit() in OpenNURBS's zlib/trees.c. That seems fundamentally wrong.
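
One plausible way such a mix can go wrong (not confirmed anywhere in this thread) is that the two versions disagree about the layout of the private state struct. The sketch below uses two hypothetical layouts, state_old and state_new, whose field names are invented for illustration and are not zlib's; it only shows that code compiled against the old layout reads the wrong field from an object built with the new one.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical "old" and "new" layouts of the same private struct.
 * Field names are illustrative only, not zlib's actual members. */
struct state_old {
    void           *strm;
    int             status;
    unsigned char  *l_buf;
    unsigned short *d_buf;
};

struct state_new {
    void           *strm;
    int             status;
    void           *added_in_newer_release; /* unknown to old code */
    unsigned char  *l_buf;
    unsigned short *d_buf;
};

/* How far d_buf moved between the two layouts, in bytes. */
long d_buf_shift(void)
{
    return (long)offsetof(struct state_new, d_buf)
         - (long)offsetof(struct state_old, d_buf);
}

/* Read "d_buf" from a new-layout object at the old layout's offset,
 * as code compiled against the old header would. */
const void *d_buf_seen_by_old_code(const struct state_new *s)
{
    const unsigned char *base = (const unsigned char *)s;
    const void *p;

    memcpy(&p, base + offsetof(struct state_old, d_buf), sizeof p);
    return p;
}
```

On common ABIs the inserted pointer shifts d_buf by one pointer width, so the old code picks up whatever sits in the l_buf slot instead. Whether anything like this is behind the null d_buf here is an open question.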

This may explain why the problem only shows up in QCAD, even though it's in a system (or system-ish, anyway) library: if the problem really just comes down to an incompatibility between the OpenNURBS zlib and the FreeBSD libz, it may only show up in programs using OpenNURBS. (Right now, it looks like that consists of two ports: QCAD, and cad/openvsp.) It also suggests that the problem should be fixable by patching graphics/opennurbs to drop its internal zlib in favor of the system's (as the Blender team apparently did quite recently). I haven't tried that yet, but it might be worth a shot.
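
As for why a 1.2.3/1.2.8 mix would not simply be rejected at initialization: as far as I understand it, zlib's deflateInit2_() only compares the first character of the caller's version string with its own, plus the size of the public z_stream. The function below, init_check_accepts, is a hypothetical mirror of that check written for illustration; it is not zlib code, and the sizes passed in are placeholders.

```c
#include <stddef.h>

/* Mirrors the shape of the compatibility check described above:
 * only the first character of the version string and the size of the
 * public stream struct are compared. Returns 1 if the pair would be
 * accepted, 0 if it would be rejected. */
int init_check_accepts(const char *caller_version,
                       const char *library_version,
                       size_t caller_stream_size,
                       size_t library_stream_size)
{
    if (caller_version == NULL || library_version == NULL)
        return 0;
    if (caller_version[0] != library_version[0])
        return 0;                /* major version digit differs */
    return caller_stream_size == library_stream_size;
}
```

Under that assumption, "1.2.3" against "1.2.8" passes as long as the public struct has the same size, so any difference in the private state would only show up later, at run time.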
Comment 5 Poul-Henning Kamp freebsd_committer freebsd_triage 2015-01-20 21:37:48 UTC
Suffering from this problem, I looked into it a bit today.

The easiest way to keep OpenNURBS from dragging in an old zlib seems to me to be to move its bundled zlib sources out of the way and point it at the copy in src/lib instead.

To that end I'll be trying this trick next time I build a system:

  Index: graphics/opennurbs/Makefile
  ===================================================================
  --- graphics/opennurbs/Makefile (revision 377539)
  +++ graphics/opennurbs/Makefile (working copy)
  @@ -31,6 +31,8 @@
          ${ICONV_CMD} -c -f utf-8 -t ascii ${WRKSRC}/opennurbs_version.h \
                  > ${WRKSRC}/opennurbs_version.h.tmp || ${TRUE}
          ${MV} ${WRKSRC}/opennurbs_version.h.tmp ${WRKSRC}/opennurbs_version.h
  +       ${MV} ${WRKSRC}/zlib ${WRKSRC}/zlib_
  +       ln -s /usr/src/lib/libz ${WRKSRC}/zlib
   
   do-install:
          @${MKDIR} ${STAGEDIR}${EXAMPLESDIR} \

(NB: Copy & Paste merely for illustration)
Comment 6 Poul-Henning Kamp freebsd_committer freebsd_triage 2015-01-21 08:15:14 UTC
My build completed and the fix seems to have worked:  I can export a PDF from qcad when I patch OpenNURBS in that way.

I have no idea where to even look to find out whether this has any downside for OpenNURBS.
Comment 7 denverh 2015-01-21 14:57:36 UTC
I just tried this and it seems to work fine.  Good job,  thanks!
Comment 8 Poul-Henning Kamp freebsd_committer freebsd_triage 2015-01-24 08:40:13 UTC
This patch also seems to solve a similar core-dump problem with openscad.
Comment 9 commit-hook freebsd_committer freebsd_triage 2015-02-06 20:15:09 UTC
A commit references this bug:

Author: pi
Date: Fri Feb  6 20:14:08 UTC 2015
New revision: 378553
URL: https://svnweb.freebsd.org/changeset/ports/378553

Log:
  graphics/opennurbs: link opennurbs against system zlib

  Linking opennurbs against system zlib fixes other ports, see 187506

  PR:		197135, 187506
  Submitted by:	fernando.apesteguia@gmail.com (maintainer)

Changes:
  head/graphics/opennurbs/Makefile
  head/graphics/opennurbs/pkg-plist
Comment 10 Kurt Jaeger freebsd_committer freebsd_triage 2015-02-06 20:17:46 UTC
Fixed with patch from PR 197135