Bug 221737 - [NEW PORT REQUEST] graphics/pdfsandwich: A tool to make "sandwich" OCR pdf files
Status: Closed FIXED
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: Any Any
Importance: --- Affects Some People
Assignee: Yuri Victorovich
Reported: 2017-08-23 09:59 UTC by vermaden
Modified: 2017-12-31 20:34 UTC

Attachments
patch (4.10 KB, patch)
2017-12-31 04:59 UTC, Yuri Victorovich

Description vermaden 2017-08-23 09:59:42 UTC
pdfsandwich ( http://www.tobias-elze.de/pdfsandwich/ ) works like a charm when compiled from source; it would be nice to have it in Ports.

% svn checkout svn://svn.code.sf.net/p/pdfsandwich/code/trunk/src pdfsandwich
A    pdfsandwich/pdfsandwich.ml
A    pdfsandwich/Makefile
A    pdfsandwich/changelog
A    pdfsandwich/manual.txt
A    pdfsandwich/pdfsandwich_version
A    pdfsandwich/make_control.pl
A    pdfsandwich/copyright
A    pdfsandwich/ebuild-stub
A    pdfsandwich/make_portfile.pl
A    pdfsandwich/changelog2deb.pl
A    pdfsandwich/configure
A    pdfsandwich/txt2man
Checked out revision 71.

% cd pdfsandwich 

% ./configure --prefix=/opt/pdfsandwich                                                               
./configure: ocamlc: not found
./configure: ocamlopt: not found

ocmalc not found, configuration failed.

# pkg install ocaml

% gmake clean
rm -f *.cmi *.cmo *.cmx *.cma *.cmxa *.o *.a *.so depend pdfsandwich
rm -f pdfsandwich_version.ml pdfsandwich.1.gz
rm -rf pdfsandwich-0.1.6 pdfsandwich-0.1.6.tar.bz2 pdfsandwich_0.1.6_i386 pdfsandwich_0.1.6_i386.deb

% ./configure --prefix=/opt/pdfsandwich

ocamlc found in /usr/local/lib/ocaml
ocamlopt found in /usr/local/lib/ocaml

PREFIX=/opt/pdfsandwich
makefile.installprefix written.

% gmake                                
echo "let pdfsandwich_version=\"0.1.6\";; (*automatically generated from file pdfsandwich_version*)" > pdfsandwich_version.ml
ocamlopt -thread -w s  str.cmxa unix.cmxa threads.cmxa  -c pdfsandwich_version.ml
ocamlopt -thread -w s  str.cmxa unix.cmxa threads.cmxa  -o pdfsandwich pdfsandwich_version.cmx pdfsandwich.ml
# you need gawk for this:
./txt2man -t PDFSANDWICH manual.txt | gzip -9 > pdfsandwich.1.gz

# gmake install
(umask 0022; mkdir -p /opt/pdfsandwich/bin /opt/pdfsandwich/share/doc/pdfsandwich /opt/pdfsandwich/share/man/man1)
install -s pdfsandwich /opt/pdfsandwich/bin
cp copyright changelog /opt/pdfsandwich/share/doc/pdfsandwich
gzip -9 /opt/pdfsandwich/share/doc/pdfsandwich/changelog
cp pdfsandwich.1.gz /opt/pdfsandwich/share/man/man1
chmod 644 /opt/pdfsandwich/share/doc/pdfsandwich/* /opt/pdfsandwich/share/man/man1/*

% find /opt/pdfsandwich 
/opt/pdfsandwich
/opt/pdfsandwich/share
/opt/pdfsandwich/share/man
/opt/pdfsandwich/share/man/man1
/opt/pdfsandwich/share/man/man1/pdfsandwich.1.gz
/opt/pdfsandwich/share/doc
/opt/pdfsandwich/share/doc/pdfsandwich
/opt/pdfsandwich/share/doc/pdfsandwich/changelog.gz
/opt/pdfsandwich/share/doc/pdfsandwich/copyright
/opt/pdfsandwich/bin
/opt/pdfsandwich/bin/pdfsandwich

% ./pdfsandwich             
Fatal error: exception Failure("Could not find program unpaper. Make sure this program exists and can be found in your search path.\nUse command line options to specify a custom binary.")

# pkg install tesseract

% ./pdfsandwich 
Fatal error: exception Failure("Could not find program tesseract. Make sure this program exists and can be found in your search path.\nUse command line options to specify a custom binary.")

# pkg install unpaper

% /opt/pdfsandwich/bin/pdfsandwich 
Fatal error: exception Failure("Could not open file ")

Confirmed dependencies so far: gmake, gawk, ocaml, tesseract, unpaper

Regards,
vermaden
Comment 1 vermaden 2017-08-23 10:02:24 UTC
Also needed: convert (ImageMagick) and gs (Ghostscript)
Comment 2 vermaden 2017-08-23 11:11:33 UTC
Also needed: tesseract-data

To summarize: gmake, gawk, ocaml, tesseract, tesseract-data, unpaper, convert (ImageMagick), gs
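
For anyone reproducing the manual build, a quick way to see which of these programs are still missing from the search path might look like the sketch below. The function name missing_tools is made up for illustration and is not part of pdfsandwich. The list checks the ocamlopt binary rather than the ocaml package name, and tesseract-data is left out because it is a data package, not a program.

```shell
#!/bin/sh
# Sketch only: print every named tool that cannot be found in PATH.
# missing_tools is a hypothetical helper, not something shipped by pdfsandwich.
missing_tools() {
    for tool in "$@"; do
        # command -v is the POSIX way to ask whether a program is resolvable
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Programs named in the comments above (build-time and run-time).
# tesseract-data provides language files only, so it is not checked here.
missing_tools gmake gawk ocamlopt tesseract unpaper convert gs
```

No output means everything in the list was found; each printed name is a tool that still needs to be installed.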
Comment 3 Yuri Victorovich freebsd_committer 2017-12-31 04:59:05 UTC
Created attachment 189268
patch
Comment 4 commit-hook freebsd_committer 2017-12-31 19:00:34 UTC
A commit references this bug:

Author: yuri
Date: Sun Dec 31 18:59:36 UTC 2017
New revision: 457713
URL: https://svnweb.freebsd.org/changeset/ports/457713

Log:
  New port: textproc/pdfsandwich: Command line tool generating "sandwich" OCR pdf files

  PR:		221737
  Submitted by:	myself
  Requested by:	vermaden@interia.pl
  Approved by:	tcberner (mentor)
  Differential Revision:	https://reviews.freebsd.org/D13708

Changes:
  head/textproc/Makefile
  head/textproc/pdfsandwich/
  head/textproc/pdfsandwich/Makefile
  head/textproc/pdfsandwich/distinfo
  head/textproc/pdfsandwich/files/
  head/textproc/pdfsandwich/files/patch-Makefile
  head/textproc/pdfsandwich/pkg-descr
Comment 5 commit-hook freebsd_committer 2017-12-31 20:34:14 UTC
A commit references this bug:

Author: yuri
Date: Sun Dec 31 20:33:08 UTC 2017
New revision: 457726
URL: https://svnweb.freebsd.org/changeset/ports/457726

Log:
  textproc/pdfsandwich: Correction of MASTER_SITES

  Previous commit, sadly, had a wrong SF MASTER_SITES URL.
  Now this is corrected. :-)

  PR:		221737
  Approved by:	tcberner (mentor)
  Differential Revision:	https://reviews.freebsd.org/D13708

Changes:
  head/textproc/pdfsandwich/Makefile