graphics/tesseract-data allows specifying which languages the training data should be built/included for, via the environment variable TESSERACT_LANGS.

Works (bash):
============
# cd /usr/ports/graphics/tesseract-data
# TESSERACT_LANGS="eng osd" make checksum
===> License APACHE20 accepted by the user
===> tesseract-data-3.02_2 depends on file: /usr/local/sbin/pkg - found
The choice of language data to be installed may be overridden by
defining TESSERACT_LANGS.
===> Fetching all distfiles required by tesseract-data-3.02_2 for building
=> SHA256 Checksum OK for tesseract/tesseract-ocr-3.01.osd.tar.gz.
=> SHA256 Checksum OK for tesseract/tesseract-ocr-3.02.eng.tar.gz.

However, if one adds this variable to make.conf instead:
============
# cat /etc/make.conf
TESSERACT_LANGS="eng osd"
# make checksum
===> License APACHE20 accepted by the user
===> tesseract-data-3.02_2 depends on file: /usr/local/sbin/pkg - found
The choice of language data to be installed may be overridden by
defining TESSERACT_LANGS.
===> Fetching all distfiles required by tesseract-data-3.02_2 for building
=> No SHA256 checksum recorded for tesseract/.
=> No suitable checksum found for tesseract/.
*** Error code 1

Stop.
make: stopped in /usr/ports/graphics/tesseract-data

The main problem is that this has to be set in the relevant poudriere make.conf file in order to have the port built with only the needed languages, and this currently does not work.

My most recent poudriere-built package of tesseract-data (using the make.conf configuration) is dated December 5th, 2014, so it worked at least back then. (That package is already version 3.02_2; poudriere never needed to update it until I needed a different language (eng/osd moved out of graphics/tesseract) and asked it to rebuild.)

In make.conf, you should specify TESSERACT_LANGS=eng osd (without quotes) instead of TESSERACT_LANGS="eng osd".
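
The difference between the two cases comes down to who interprets the double quotes. On the command line the shell strips them before make runs, whereas make itself does not treat quotes specially in a variable assignment, so in make.conf they stay in the value. The following is a minimal sh sketch of that difference, not an excerpt from the port: the names env_value and conf_value are made up for illustration, and the make.conf case is only modelled as a literal string rather than read by make.

```shell
#!/bin/sh
# Command-line case: the shell removes the double quotes, so the child
# process (make, in the report) sees the bare value.
env_value=$(TESSERACT_LANGS="eng osd" sh -c 'printf "%s" "$TESSERACT_LANGS"')

# make.conf case (assumption: modelled as a literal string here).
# make keeps the quote characters as part of the variable's value.
conf_value='"eng osd"'

printf 'environment: [%s]\n' "$env_value"   # environment: [eng osd]
printf 'make.conf:   [%s]\n' "$conf_value"  # make.conf:   ["eng osd"]
```

With the quotes left in the value, the language names the port sees no longer match eng and osd, which is consistent with the empty distfile name "tesseract/" in the failing output above.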