Designly Blog

How to Manipulate, Split and Concatenate PDF Files Server-Side

Posted in Back End Development by Jay Simons
Published on February 28, 2023

The Portable Document Format (PDF) was first developed by Adobe Systems in 1993, with the aim of creating a file format that could be easily shared and printed across different computer systems, software applications, and devices. At the time, documents were typically created in proprietary file formats that were specific to the software application used to create them, making it difficult to share and view documents across different platforms.

PDFs were designed to be a universal file format that could preserve the formatting, fonts, images, and other elements of a document, regardless of the software used to create it or the device used to view it. The format quickly gained popularity and became a standard for sharing documents online, particularly for academic journals, government reports, and other professional publications.

Serving PDF files from a web server is commonplace, and paid PDF server software exists, but it is quite pricey. In this article, I'll show you a free and easy way to manipulate, split and concatenate PDF documents on your web server.

The software library we are going to use is called qpdf, a very powerful C++ library for PDF manipulation. It also comes with a command-line tool, which you can run as a child process from Node or from whatever application sits behind NGINX or Apache, whichever your weapon of choice may be.

Installing QPDF

You can download and compile the source code yourself, or you can install a prebuilt package on Debian/Ubuntu or macOS. On Windows, the project publishes prebuilt binaries with its releases; failing that, you'd have to compile it manually.

Install on Ubuntu:

apt -y install qpdf

Install on macOS:

brew install qpdf

If you don't have Homebrew installed, run this command:

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

Examples

OK, now that qpdf is installed, here are some examples of what you might want to do with it.

Let's say you have a large PDF with hundreds or even thousands of pages and you want to be able to serve only one page, or a range of pages to the user. Here's how:

qpdf --empty --pages input_file.pdf 50,60-69 -- output_file.pdf
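On a web server, the page range will often come straight from a request parameter, so it's worth validating it before it gets anywhere near a command line. Here's a minimal sketch of a shell wrapper. The function name extract_pages is my own invention, and the check only allows digits, commas and hyphens, so qpdf's fancier range syntax (such as z for the last page) is deliberately rejected:

```shell
# Hypothetical helper: extract a page range from a PDF into a new file.
# Usage: extract_pages INPUT RANGE OUTPUT
extract_pages() {
    input=$1
    range=$2
    output=$3

    # Reject empty ranges, ranges that do not start with a digit,
    # and ranges containing anything other than digits, commas, hyphens.
    case $range in
        ''|[!0-9]*|*[!0-9,-]*)
            echo "invalid page range: $range" >&2
            return 2
            ;;
    esac

    # Same invocation as the example above, with every argument quoted.
    qpdf --empty --pages "$input" "$range" -- "$output"
}
```

Calling extract_pages input_file.pdf 50,60-69 output_file.pdf runs the same command as above, while a range like "1;reboot" is refused before qpdf is ever started.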

Or let's say you want to concatenate (merge) two PDF files into one:

qpdf --empty --pages input_file1.pdf input_file2.pdf -- output_file.pdf

You can even specify page ranges for multiple files:

qpdf --empty --pages input_file1.pdf 50,60-69 input_file2.pdf 1-10 -- output_file.pdf
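If the number of input files isn't known in advance, the merge can be wrapped in a small function. This is only a sketch: merge_pdfs is a hypothetical helper name, and it appends the explicit range 1-z (first page through last) after every input so the command never relies on qpdf working out whether the next argument is a file or a page range.

```shell
# Hypothetical helper: merge any number of PDFs into one.
# Usage: merge_pdfs OUTPUT INPUT1 INPUT2 [INPUT3 ...]
merge_pdfs() {
    output=$1
    shift

    if [ "$#" -lt 2 ]; then
        echo "merge_pdfs: need at least two input files" >&2
        return 2
    fi

    # Rebuild the argument list as: file1 1-z file2 1-z ...
    # Each pass appends the first argument plus its range, then drops it.
    remaining=$#
    while [ "$remaining" -gt 0 ]; do
        set -- "$@" "$1" 1-z
        shift
        remaining=$((remaining - 1))
    done

    qpdf --empty --pages "$@" -- "$output"
}
```

For example, merge_pdfs output_file.pdf input_file1.pdf input_file2.pdf is equivalent to the two-file merge shown earlier, just with the page ranges spelled out.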

Pretty neat, huh? Now let's say you have a password-protected file and you want to send the client a decrypted copy of it:

qpdf --password=password --decrypt secure.pdf unsecure.pdf
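One caveat: a password given on the command line can show up in the process list and in your shell history. Assuming a reasonably recent qpdf (the --password-file option arrived around version 10.2, as far as I know), you can keep the password in a file instead. A rough sketch, with decrypt_pdf being a hypothetical helper name:

```shell
# Hypothetical helper: decrypt a PDF using a password stored in a file.
# Usage: decrypt_pdf INPUT OUTPUT PASSWORD_FILE
# Assumes qpdf supports --password-file (first line of the file is the password).
decrypt_pdf() {
    input=$1
    output=$2
    password_file=$3

    # Fail early if the password file is missing or unreadable.
    if [ ! -r "$password_file" ]; then
        echo "cannot read password file: $password_file" >&2
        return 2
    fi

    qpdf --password-file="$password_file" --decrypt "$input" "$output"
}
```

The password file should be readable only by the user your server process runs as.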

As you can see, QPDF is an amazing tool. There are many more things you can do with it; these examples merely scratch the surface. Not only is QPDF great for automated manipulation of PDF files, but it's also worth installing on your workstation for everyday PDF editing.
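For automated use, it helps to know how qpdf reports problems. According to its documentation, it exits with 0 on success, 2 on errors, and 3 when it succeeded but issued warnings, so a naive "non-zero means failure" check can reject perfectly usable output. Here's a sketch of a wrapper that treats warnings as success (run_qpdf is a made-up name, not part of qpdf):

```shell
# Hypothetical helper: run qpdf and treat "warnings only" as success.
# Usage: run_qpdf [any qpdf arguments...]
run_qpdf() {
    status=0
    qpdf "$@" || status=$?

    case $status in
        0)
            return 0
            ;;
        3)
            # qpdf produced output but had something to complain about.
            echo "qpdf: completed with warnings" >&2
            return 0
            ;;
        *)
            echo "qpdf: failed with exit status $status" >&2
            return "$status"
            ;;
    esac
}
```

Whether warnings should really count as success depends on your use case; for files from untrusted sources you may prefer to log them and inspect the result.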

More Examples

Rotate specific pages to a specified angle (clockwise):

qpdf --rotate=90:2,4,6 --rotate=180:7-8 input.pdf output.pdf

Split a PDF into individual enumerated pages:

qpdf --split-pages input.pdf out_%d.pdf
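The --split-pages option also accepts an optional number (--split-pages=n), in which case each output file gets n pages instead of one. A small sketch, assuming a hypothetical helper called split_pdf that first checks the group size is a positive integer:

```shell
# Hypothetical helper: split a PDF into files of N pages each.
# Usage: split_pdf INPUT PAGES_PER_FILE OUTPUT_PREFIX
split_pdf() {
    input=$1
    pages_per_file=$2
    prefix=$3

    # Accept only positive integers without leading zeros.
    case $pages_per_file in
        ''|0*|*[!0-9]*)
            echo "invalid pages-per-file value: $pages_per_file" >&2
            return 2
            ;;
    esac

    # qpdf substitutes %d in the output name for each generated file.
    qpdf --split-pages="$pages_per_file" "$input" "${prefix}_%d.pdf"
}
```

For instance, split_pdf input.pdf 10 out would ask qpdf for ten-page chunks using the same out_%d.pdf naming pattern as above.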

Well, I hope you enjoyed this article. If so, please give it a like and/or leave a comment. I'd love to hear your feedback. What PDF solutions do you use on your web server? Will you be utilizing this solution?

For more great information about web dev, systems administration and more, please read the Designly Blog.

Further Reading:

  1. QPDF Man Page
  2. QPDF Documentation
