PDFBox is capable of handling advanced image formats such as JPEG2000.
While the jar files that provide these advanced capabilities are also open
source, their licences differ enough that they cannot be bundled and shipped
with the default pdfbox.jar download. PDFBox is designed to detect at run
time whether these extra jar files are on its class-path and, if so, to load
and use them dynamically.

This directory in the pdf-box extension for Greenstone helps a user who has
installed Greenstone to download the advanced image processing jars
(accepting them under their specific software licences) and place them in
the relevant lib/jar area within Greenstone. PDFConverterPlugin is coded to
notice if these extra jars are present and, if so, to update the '-cp'
class-path argument it uses when launching pdfbox.jar.

For more details about the JPEG2000 extra jar, and further advanced image
formats/handling that could be included, see:

  https://pdfbox.apache.org/2.0/dependencies.html#optional-components

(relevant snippet below)

TLDR:

  ./DOWNLOAD-ADDITIONAL-JARS.sh

----

JAI Image I/O

PDF supports embedded image files, however support for some formats require
third party libraries which are distributed under terms incompatible with
the Apache 2.0 license:

  Reading JBIG2 images: JBIG2 ImageIO
  Reading JPEG 2000 (JPX) images: JAI Image I/O Tools Core
  Writing TIFF images requires JAI Image I/O Tools Core also.

These libraries are optional and will be loaded if present on the classpath,
otherwise support for these image formats will be disabled and a warning
will be logged when an unsupported image is encountered.

Maven dependencies for these components can be found in parent/pom.xml.
Change the scope of the components if needed. Please make sure that any
third party licenses are suitable for your project.

To include the JBIG2 library the following part can be included in your
project pom.xml:

  <dependency>
      <groupId>org.apache.pdfbox</groupId>
      <artifactId>jbig2-imageio</artifactId>
      ...
  </dependency>

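To include the JAI capabilities the following part can be included in your
project pom.xml:

  <dependency>
      <groupId>com.github.jai-imageio</groupId>
      <artifactId>jai-imageio-core</artifactId>
      ...
  </dependency>

  <dependency>
      <groupId>com.github.jai-imageio</groupId>
      <artifactId>jai-imageio-jpeg2000</artifactId>
      ...
  </dependency>

For more reliable JPEG decoding the following part from the TwelveMonkeys
library can be included in your project pom.xml:

  <dependency>
      <groupId>com.twelvemonkeys.imageio</groupId>
      <artifactId>imageio-jpeg</artifactId>
      ...
  </dependency>

----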
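
As a quick way to check whether the optional jars were actually picked up, the
following is a minimal sketch (not part of Greenstone or PDFBox) that asks the
standard javax.imageio registry which formats it can handle. The class name
OptionalImageSupport and the format names "JPEG2000", "JBIG2" and "TIFF" are
assumptions made for illustration; a given plugin may register under different
names, and recent JDKs may already provide TIFF support on their own.

```java
import javax.imageio.ImageIO;

// Hypothetical helper class, for illustration only.
class OptionalImageSupport {

    /** True if at least one ImageIO reader plugin is registered for the format name. */
    public static boolean hasReader(String formatName) {
        return ImageIO.getImageReadersByFormatName(formatName).hasNext();
    }

    /** True if at least one ImageIO writer plugin is registered for the format name. */
    public static boolean hasWriter(String formatName) {
        return ImageIO.getImageWritersByFormatName(formatName).hasNext();
    }

    public static void main(String[] args) {
        // Assumed format names; adjust them if a plugin registers other names.
        System.out.println("JPEG2000 reader present: " + hasReader("JPEG2000"));
        System.out.println("JBIG2 reader present:    " + hasReader("JBIG2"));
        System.out.println("TIFF writer present:     " + hasWriter("TIFF"));
    }
}
```

To get a meaningful answer, run it with the same '-cp' class-path that is used
when launching pdfbox.jar, since ImageIO only discovers plugins whose jars are
on the class-path.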