Install HTMLDoc on CentOS 7

Updated on January 4, 2019

HTMLDoc generates PDF and PostScript documents from well-formed HTML (HTML 3.2). This lets you produce PDF files on the fly without spending hours setting up your server environment or paying large sums for commercial software. It can be used for all types of documents, from receipts and invoices to brochures and documentation.

In this tutorial, you will learn what is needed to install HTMLDoc on CentOS 7.

Once HTMLDoc is installed, we will create a simple one-page document with no headers, footers, borders, or extras: in essence, an HTML template that can use the entire printable area of a PDF page.

Preparing CentOS 7 (x64) for HTMLDoc

For this tutorial, we will be working with a Vultr CentOS 7 (x64) server with IPv4. The steps are the same on IPv6-only servers. First, check for updates to the installed packages, since most Linux distributions are not configured to install security patches or system updates automatically. Updating the software, including the kernel, is always advised, especially on a new installation.

To see whether any updates are available for your installed packages, use the YUM package manager. Its check-update subcommand lists the available package updates from all enabled repositories.

yum check-update
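
For scripted use, it helps to know that yum check-update reports its result through its exit status: 0 when no updates are pending, 100 when updates are available, and 1 on error. The following is a minimal sketch of how a script might branch on that status. The function name describe_check_update_status is hypothetical and not part of YUM.

```shell
# Hypothetical helper: translate a `yum check-update` exit status into text.
describe_check_update_status() {
  case "$1" in
    0)   echo "up to date" ;;
    100) echo "updates available" ;;
    *)   echo "error" ;;
  esac
}

# Only query YUM when it is actually present on this machine.
if command -v yum >/dev/null 2>&1; then
  status=0
  # Capture the status without aborting scripts that run under `set -e`.
  yum -q check-update >/dev/null 2>&1 || status=$?
  describe_check_update_status "$status"
fi
```

A cron job or provisioning script could use the "updates available" branch to decide whether to run the update step at all.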

Note: If presented with installation or configuration options, simply use the default choices. When asked Y/N questions, answer Y.

Now that we have checked for updates, we will update all of the system’s software along with its dependencies by issuing the following command:

yum update

Once the system has completed its update processes you will see the following:

Complete!

We have now fully updated our software packages, as well as the kernel itself, and we can now install HTMLDoc:

yum install htmldoc
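
When provisioning a server unattended, the update and install steps can be run without prompts by passing -y to yum. The sketch below wraps them in a function and then checks that the htmldoc binary is on the PATH. The function names are hypothetical, and the epel-release line is an assumption: it is only needed if yum cannot find the htmldoc package in the repositories already enabled on your server.

```shell
# Hypothetical helper: succeeds when the named command is on PATH.
have_command() {
  command -v "$1" >/dev/null 2>&1
}

# Hypothetical wrapper around the tutorial's steps; run it as root.
install_htmldoc() {
  yum -y update
  # Assumption: htmldoc comes from the EPEL repository on this system.
  # Remove this line if `yum install htmldoc` already works without it.
  yum -y install epel-release
  yum -y install htmldoc
}

# Report whether the install succeeded.
if have_command htmldoc; then
  htmldoc --version
else
  echo "htmldoc not found on PATH; run install_htmldoc as root"
fi
```

Calling install_htmldoc and then re-running the final check should print the installed HTMLDoc version.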

You are now ready to start generating PDF documents from HTML markup.

Generating Your First PDF from HTML

Let’s quickly test this newfound capability from the command line. Move to the /tmp/ directory for testing:

cd /tmp/

Now, let’s create a simple HTML document, which we will use to generate a PDF document. We can call it markup-source.html:

nano markup-source.html

Add the following HTML markup to it:

<html>
<head>
<title>My first PDF from HTML</title>
</head>
<body>
This is the body of my first PDF document made from HTML.
</body>
</html>

Press Ctrl + X to exit the Nano editor, enter Y to save the changes, then press Enter to confirm the file name. You can now instruct HTMLDoc to generate a PDF document from your markup-source.html file:

htmldoc --webpage -f postscript-output.pdf markup-source.html

You will now have a new file named postscript-output.pdf with a title of "My first PDF from HTML" and a body of "This is the body of my first PDF document made from HTML". Congratulations, you have learned how to turn simple HTML markup into highly portable PDF documents.
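
The same test can be scripted without an editor. The sketch below writes the markup with a here-document, converts it, and checks that the result starts with the "%PDF" signature. It works in a temporary directory rather than /tmp/ directly, which is a choice made here and not part of the steps above. The --header, --footer, and margin flags are taken from the htmldoc manual as one way to drop headers and footers and use the full page; treat the exact values as assumptions and confirm them with htmldoc --help on your system.

```shell
# Work in a scratch directory so existing files are not overwritten.
workdir=$(mktemp -d)
src="$workdir/markup-source.html"
out="$workdir/postscript-output.pdf"

# Write the tutorial's markup without opening an editor.
cat > "$src" <<'EOF'
<html>
<head>
<title>My first PDF from HTML</title>
</head>
<body>
This is the body of my first PDF document made from HTML.
</body>
</html>
EOF

if command -v htmldoc >/dev/null 2>&1; then
  # "..." leaves the left, center, and right header/footer slots blank;
  # the four margin flags set the page margins to zero.
  htmldoc --webpage --header ... --footer ... \
    --left 0 --right 0 --top 0 --bottom 0 \
    -f "$out" "$src"
  # A PDF file begins with the four bytes "%PDF".
  if [ "$(head -c 4 "$out")" = "%PDF" ]; then
    echo "PDF written to $out"
  fi
else
  echo "htmldoc is not installed; only $src was created"
fi
```

Printers usually cannot print to the very edge of the paper, so zero margins are best suited to documents viewed on screen.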