pdftohtml(1) General Commands Manual pdftohtml(1)
NAME
pdftohtml - Portable Document Format (PDF) to HTML converter (version
4.00)
SYNOPSIS
pdftohtml [options] PDF-file HTML-dir
DESCRIPTION
Pdftohtml converts Portable Document Format (PDF) files to HTML.
Pdftohtml reads the PDF file, PDF-file, and places an HTML file for
each page, along with auxiliary images, in the directory HTML-dir. The
HTML directory will be created; if it already exists, pdftohtml will
report an error.
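Because pdftohtml creates HTML-dir itself and refuses to run if it
already exists, a wrapper script may want to check for the directory
before starting a conversion. The following is a minimal sketch, not
part of pdftohtml; the helper names build_command and convert are
hypothetical, and it assumes pdftohtml is on the PATH.

```python
import os
import subprocess


def build_command(pdf_file, html_dir, options=()):
    """Return the argv list for: pdftohtml [options] PDF-file HTML-dir."""
    return ["pdftohtml", *options, pdf_file, html_dir]


def convert(pdf_file, html_dir, options=()):
    """Run pdftohtml and return its exit code (hypothetical wrapper)."""
    # pdftohtml creates HTML-dir and reports an error if it already
    # exists, so fail early with a clearer message.
    if os.path.exists(html_dir):
        raise FileExistsError(f"{html_dir} already exists")
    return subprocess.run(build_command(pdf_file, html_dir, options)).returncode
```

A call such as convert("report.pdf", "report-html") would then run
"pdftohtml report.pdf report-html" only when report-html is absent.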
CONFIGURATION FILE
Pdftohtml reads a configuration file at startup. It first tries to
find the user's private config file, ~/.xpdfrc. If that doesn't exist,
it looks for a system-wide config file, typically /usr/local/etc/xpdfrc
(but this location can be changed when pdftohtml is built). See the
xpdfrc(5) man page for details.
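The lookup order described above, together with the '-cfg' option, can
be sketched as follows. This is an illustration only, not the actual
pdftohtml code: the function name resolve_config is hypothetical, the
system-wide path is the typical default and may differ in a given
build, and returning None when no file is found is an assumption.

```python
import os
import posixpath


def resolve_config(home, cfg_override=None,
                   system_path="/usr/local/etc/xpdfrc",
                   exists=os.path.exists):
    """Return the config file path that would be read, or None."""
    # A file given with -cfg is read in place of the other two.
    if cfg_override is not None:
        return cfg_override
    # First choice: the user's private config file, ~/.xpdfrc.
    user_path = posixpath.join(home, ".xpdfrc")
    if exists(user_path):
        return user_path
    # Fallback: the system-wide config file.
    if exists(system_path):
        return system_path
    # Assumption: no config file at all (behavior not stated here).
    return None
```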
OPTIONS
Many of the following options can be set with configuration file
commands. These are listed in square brackets with the description of
the corresponding command line option.
-f number
Specifies the first page to convert.
-l number
Specifies the last page to convert.
-z number
Specifies the initial zoom level. The default is 1.0, which
means 72dpi, i.e., 1 point in the PDF file will be 1 pixel in
the HTML. Using '-z 1.5', for example, will make the initial
view 50% larger.
-r number
Specifies the resolution, in DPI, for background images. This
controls the pixel size of the background image files. The initial
zoom level is controlled by the '-z' option. Specifying a
larger '-r' value will allow the viewer to zoom in farther without
upscaling artifacts in the background.
-skipinvisible
Don't draw invisible text. By default, invisible text (commonly
used in OCR'ed PDF files) is drawn as transparent (alpha=0) HTML
text. This option tells pdftohtml to discard invisible text
entirely.
-allinvisible
Treat all text as invisible. By default, regular (non-invisible)
text is not drawn in the background image, and is instead
drawn with HTML on top of the image. This option tells pdftohtml
to include the regular text in the background image, and
then draw it as transparent (alpha=0) HTML text.
-opw password
Specify the owner password for the PDF file. Providing this
will bypass all security restrictions.
-upw password
Specify the user password for the PDF file.
-q Don't print any messages or errors. [config file: errQuiet]
-cfg config-file
Read config-file in place of ~/.xpdfrc or the system-wide config
file.
-v Print copyright and version information.
-h Print usage information. (-help and --help are equivalent.)
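The relationship between the '-z' and '-r' options can be made
concrete with a little arithmetic. The sketch below assumes only what
is stated above (zoom 1.0 means 72 DPI, so 1 point is 1 pixel) and
ignores any rounding pdftohtml may apply. The function names are
hypothetical, and the "largest zoom without upscaling" figure is an
inference from the '-r' description, not documented behavior.

```python
POINTS_PER_INCH = 72  # zoom 1.0 corresponds to 72 DPI


def initial_view_pixels(width_pt, height_pt, zoom=1.0):
    """Size of the initial HTML view, in pixels, for a given -z value."""
    return (width_pt * zoom, height_pt * zoom)


def background_image_pixels(width_pt, height_pt, dpi):
    """Approximate pixel size of a background image for a given -r value."""
    return (width_pt * dpi / POINTS_PER_INCH,
            height_pt * dpi / POINTS_PER_INCH)


def max_zoom_without_upscaling(dpi):
    """Zoom at which one image pixel maps to one screen pixel (inferred)."""
    return dpi / POINTS_PER_INCH


# A US Letter page is 612 x 792 points.
print(initial_view_pixels(612, 792, zoom=1.5))     # (918.0, 1188.0)
print(background_image_pixels(612, 792, dpi=150))  # (1275.0, 1650.0)
print(max_zoom_without_upscaling(144))             # 2.0
```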
BUGS
Some PDF files contain fonts whose encodings have been mangled beyond
recognition. There is no way (short of OCR) to extract text from these
files.
EXIT CODES
The Xpdf tools use the following exit codes:
0 No error.
1 Error opening a PDF file.
2 Error opening an output file.
3 Error related to PDF permissions.
99 Other error.
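A calling script can translate these exit codes into messages. The
following is a small sketch; the names EXIT_CODES and describe_exit
are hypothetical, and the text for unlisted codes is an assumption.

```python
# Exit codes as listed above.
EXIT_CODES = {
    0: "No error.",
    1: "Error opening a PDF file.",
    2: "Error opening an output file.",
    3: "Error related to PDF permissions.",
    99: "Other error.",
}


def describe_exit(code):
    """Return the documented meaning of an exit code."""
    # Codes not in the table are reported as unknown (assumption).
    return EXIT_CODES.get(code, f"Unknown exit code {code}.")
```

For example, the return code of a subprocess call running pdftohtml
could be passed to describe_exit for logging.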
AUTHOR
The pdftohtml software and documentation are copyright 1996-2017 Glyph
& Cog, LLC.
SEE ALSO
xpdf(1), pdftops(1), pdftotext(1), pdfinfo(1), pdffonts(1),
pdfdetach(1), pdftoppm(1), pdftopng(1), pdfimages(1), xpdfrc(5)
http://www.xpdfreader.com/
10 Aug 2017 pdftohtml(1)