Bug 269167

Summary: converters/wkhtmltopdf crashes with "Bus error"
Product: Ports & Packages
Reporter: SolarCatcher <solarcatcher>
Component: Individual Port(s)
Assignee: Kurt Jaeger <pi>
Status: Open
Severity: Affects Many People
CC: freebsd-bugzilla, grahamperrin, jan.catrysse, me, pat
Priority: ---
Keywords: crash, needs-qa
Version: Latest
Flags: bugzilla: maintainer-feedback? (pi)
Hardware: amd64
OS: Any
See Also: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=269313
Attachments (description, flags):
- Script to instal CentOS 7 version of wkhtmltopdf with all requirements in linux mode (flags: none)
- Script to instal CentOS 7 version of wkhtmltopdf with all requirements in linux mode - corrected (flags: none)

Description SolarCatcher 2023-01-26 12:56:55 UTC
Since the upgrade to the 2023Q1 quarterly packages from the FreeBSD project, I see a lot of crashes of wkhtmltopdf, both on the server and on the desktop.

This is on FreeBSD 13.1-RELEASE-p5 on amd64.

In /var/log/messages I get the info that wkhtmltopdf "exited on signal 10 (core dumped)"

When I run it on the command line, it errors out with the info "Bus error".
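The two symptoms above describe the same event: a process killed by a signal. A minimal sketch of how a wrapper script could report this, assuming Python 3.9 or later is available; the helper names `describe_exit` and `run_and_describe` are made up for illustration and are not part of wkhtmltopdf:

```python
import signal
import subprocess


def describe_exit(returncode: int) -> str:
    """Turn a subprocess return code into a short human-readable summary."""
    if returncode < 0:
        # subprocess reports death by signal as the negated signal number.
        sig = signal.Signals(-returncode)
        return f"killed by {sig.name} (signal {sig.value})"
    return f"exited with status {returncode}"


def run_and_describe(argv: list[str]) -> str:
    """Run a command (hypothetical helper) and describe how it ended."""
    completed = subprocess.run(argv, check=False)
    return describe_exit(completed.returncode)
```

On FreeBSD, SIGBUS is signal 10, so a call such as `run_and_describe(["wkhtmltopdf", url, "out.pdf"])` would report SIGBUS for the crash logged as "exited on signal 10".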

This happens during normal use, converting publicly accessible web pages into PDF. The same processes worked flawlessly for years, up to and including the packages in the 2022Q4 quarterly repo.

It continues to work for some web pages. But in most cases it just crashes.

Anyone else seeing these errors?
Comment 1 Kurt Jaeger freebsd_committer freebsd_triage 2023-01-26 14:28:22 UTC
Thanks for the bug report. Can you name URLs where this crash can be reproduced with wkhtmltopdf?

My use case is rendering of static HTML to PDF, so I have not yet seen the problem.
Comment 2 SolarCatcher 2023-01-26 15:52:32 UTC
Thanks for answering so quickly.

This is one example of a page that leads to the bus error:
https://www.greenjobs.de/angebote/neueste.html?id=100117275&anz=html&frame=1
Comment 3 Kurt Jaeger freebsd_committer freebsd_triage 2023-02-06 16:49:57 UTC
(In reply to SolarCatcher from comment #2)
Some people are interested in deprecating wkhtmltopdf and are
suggesting alternatives. Please have a look here:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=269313#c3

I have found that print/py-weasyprint covers my use case.
Can you check whether one of the suggested tools covers yours?
Comment 4 Kurt Jaeger freebsd_committer freebsd_triage 2023-02-06 17:22:07 UTC
(In reply to SolarCatcher from comment #2)
I reproduced the core-dump on 14.0.
Comment 5 Kurt Jaeger freebsd_committer freebsd_triage 2023-02-06 17:25:40 UTC
(In reply to SolarCatcher from comment #2)
I tested the page with print/py-weasyprint and it somewhat worked.

It was version 57.2, and I still need to update the port to that version.
Comment 6 SolarCatcher 2023-02-09 17:07:31 UTC
Thank you for the additional info. I was not aware that wkhtmltopdf was EOL. In this case, I will need to look for a suitable alternative. I have not yet tried print/py-weasyprint, but from what I read it will not work for us, because we cannot control the HTML/CSS of the pages we need to convert to PDF. weasyprint seems to be a great tool if you can control them and build relatively simple pages. As I have all my machines on quarterly packages, I cannot easily test it for now.

For me the gold standard for HTML to PDF conversion is Prince, and it is even supported on FreeBSD. Unfortunately, it is a bit too costly (prices for installation on a server start at 3,800 USD), so we will have to look around for other options.

If you like, you can close this bug report. I understand there will be no path forward with this software. Thanks for your work on this port. I have used it for a long time.
Comment 7 SolarCatcher 2023-02-10 12:42:46 UTC
While looking around for alternatives, I had the idea that Chromium in headless mode could be a solution for us. I found this little how-to, which looks promising: https://blog.grio.com/2020/08/understanding-pdf-generation-with-headless-chrome.html

On the command-line it already does what it promises. However, the author points out how to create a more clever setup using Node and chromium-launcher. This would add quite a bit of overhead, of course (maybe I could shut this away in a separate jail). In any case, it could be a solution for us and others looking to replace wkhtmltopdf. That's why I wanted to mention it here.
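For readers who want to try the plain command-line variant first, here is a minimal sketch of building the headless call from a script. The `--headless`, `--disable-gpu` and `--print-to-pdf=` switches are standard Chromium options; the binary name `chrome` and the helper name `build_headless_pdf_command` are assumptions for illustration and may differ on your system:

```python
import subprocess


def build_headless_pdf_command(url: str, output: str, browser: str = "chrome") -> list[str]:
    """Build the argument list for a one-shot headless PDF export.

    `browser` is an assumed binary name; adjust it to your installation.
    """
    return [
        browser,
        "--headless",
        "--disable-gpu",
        f"--print-to-pdf={output}",
        url,
    ]


def print_to_pdf(url: str, output: str, timeout_s: float = 60.0) -> None:
    """Run the export and raise if the browser fails or exceeds the timeout."""
    subprocess.run(build_headless_pdf_command(url, output), check=True, timeout=timeout_s)
```

This keeps the browser as a short-lived child process per conversion, which is simpler than the Node-based setup from the how-to but gives less control over page load timing.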
Comment 8 Jan Catrysse 2023-03-08 09:06:58 UTC
I have similar issues.

Doing a simple: "wkhtmltopdf https://www.google.com google.pdf" results in a "Segmentation fault (core dumped)".

The issue seems to be related to JavaScript.

Adding the command-line option "--disable-javascript" avoids the crash. Of course, the page is then rendered without the much-needed JavaScript.
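A minimal sketch of how a script could apply this workaround only when needed: try the normal conversion first and retry with "--disable-javascript" if the process was killed by a signal. The helper names `build_command` and `convert_with_fallback` are hypothetical; only the wkhtmltopdf option itself comes from the observation above:

```python
import subprocess


def build_command(url: str, output: str, javascript: bool = True) -> list[str]:
    """Build the wkhtmltopdf argument list, optionally disabling JavaScript."""
    argv = ["wkhtmltopdf"]
    if not javascript:
        argv.append("--disable-javascript")
    argv += [url, output]
    return argv


def convert_with_fallback(url: str, output: str) -> bool:
    """Convert a page; return True if JavaScript stayed enabled.

    A negative return code means the child was killed by a signal
    (for example a bus error or segmentation fault), so retry without JS.
    """
    first = subprocess.run(build_command(url, output), check=False)
    if first.returncode == 0:
        return True
    if first.returncode < 0:
        subprocess.run(build_command(url, output, javascript=False), check=True)
        return False
    # Ordinary failure (non-zero exit status): report it to the caller.
    raise subprocess.CalledProcessError(first.returncode, first.args)
```

The returned flag lets the caller log which PDFs were produced without JavaScript and may therefore be incomplete.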

This worked fine with previous builds of the same version 0.12.6. I do think this is due to the way this package is built and not per se the source code of wkhtmltopdf.

I do not really understand what the root cause is, but as a great many people are using this package, a solution would be greatly appreciated.
Comment 9 Jan Catrysse 2023-03-08 11:53:41 UTC
I tried reverting some changes in the ports tree… but the result is identical. It would be nice to find the culprit, but I am afraid I am unable to.
Comment 10 Jan Catrysse 2023-03-10 14:04:42 UTC
Information for other users having the same issue:

For the moment, I am using the Linux emulation FreeBSD provides: https://docs.freebsd.org/en/books/handbook/linuxemu/

And the CentOS 8 executable from https://rubygems.org/gems/wkhtmltopdf-binary/
Comment 11 Henk de Leeuw 2024-08-14 18:57:25 UTC
Created attachment 252762
Script to instal CentOS 7 version of wkhtmltopdf with all requirements in linux mode
Comment 12 Henk de Leeuw 2024-08-14 19:04:23 UTC
Since it took me some time to figure out how to install wkhtmltopdf in Linux mode, I created a script that does it all when I finally had everything right.
What it does:
- enable and start Linux compatibility mode
- install CentOS 7 base system with X11 support and other requirements
- install latest CentOS 7 version of wkhtmltopdf rpm
- make a symlink in /usr/local/bin so other programs can find it

Download the attached script, make it executable and run as root.
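For orientation only, the four steps listed above can be sketched as a dry-run plan. This is not the attached script: the package names (`linux_base-c7`, `linux-c7-xorg-libs`), the use of `tar` to unpack the RPM, the install path under /compat/linux, and the helper name `linux_wkhtmltopdf_plan` are all assumptions, and the RPM location is left as a parameter:

```python
def linux_wkhtmltopdf_plan(rpm_path: str) -> list[list[str]]:
    """Return the commands (as argument lists) without running anything.

    Every package name and path below is an assumption, not taken from
    the attached script; review before running anything as root.
    """
    compat = "/compat/linux"
    return [
        # 1. Enable and start Linux compatibility mode.
        ["sysrc", "linux_enable=YES"],
        ["service", "linux", "start"],
        # 2. Install a CentOS 7 base system plus X11 libraries (assumed names).
        ["pkg", "install", "-y", "linux_base-c7", "linux-c7-xorg-libs"],
        # 3. Unpack the wkhtmltopdf RPM into the Linux tree
        #    (assumes the base system's tar can read RPM archives).
        ["tar", "-xf", rpm_path, "-C", compat],
        # 4. Symlink the binary so other programs can find it (assumed path).
        ["ln", "-s", f"{compat}/usr/local/bin/wkhtmltopdf", "/usr/local/bin/wkhtmltopdf"],
    ]


if __name__ == "__main__":
    # Print the plan for review; "wkhtmltox.rpm" is a placeholder file name.
    for argv in linux_wkhtmltopdf_plan("wkhtmltox.rpm"):
        print(" ".join(argv))
```

Printing the plan instead of executing it makes it easy to compare against the attached script before changing a system.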
Comment 13 Henk de Leeuw 2024-08-14 19:55:54 UTC
Created attachment 252763
Script to instal CentOS 7 version of wkhtmltopdf with all requirements in linux mode - corrected

I had experimented with different versions of the installed files, and did not check with a completely clean install at the end.
The script I posted in comment #11 did not work on a clean install; this one does.