Ghostscript is a powerful tool for manipulating PDF and PS files. But with great power comes great complexity. Here are examples of embedding fonts and reducing image size with it!

Embedding Fonts

Usually, your PDF typesetting program takes care of embedding fonts into a PDF document (pdfLaTeX does), but sometimes PDFs come from stranger sources: my ROOT-generated plots, for example, do not embed their fonts [1].

In a blog post, Karl Rupp summarizes how to embed fonts into PDFs from different sources. To really embed ALL the fonts, including those Ghostscript usually ignores, you have to dive even deeper. Here is the command, which I found in a Stack Overflow reply:

gs -dCompatibilityLevel=1.4 -dPDFSETTINGS=/screen -dCompressFonts=true -dSubsetFonts=true -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=output.pdf -c "<</NeverEmbed [ ]>> setdistillerparams" -f input.pdf

A quicker alternative to Ghostscript is the pdftocairo command from the Poppler PDF library. It converts PDFs to various vector graphics formats [2], but it can also convert from PDF to PDF, embedding the fonts in the process.

pdftocairo input.pdf -pdf output.pdf
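To check whether either command did its job, Poppler's pdffonts tool lists every font in a PDF together with an emb column. The following is only a minimal sketch: count_unembedded is a hypothetical helper name (not part of Poppler), and it assumes the usual pdffonts table layout, with two header lines and each row ending in emb, sub, uni and a two-number object ID.

```shell
# Hypothetical helper: count fonts that pdffonts reports as not embedded.
# Reads pdffonts output on stdin. Assumes two header lines and that every
# row ends in "emb sub uni <obj> <gen>", so emb is the field at NF-4.
count_unembedded() {
    awk 'NR > 2 && NF >= 5 && $(NF-4) == "no" { n++ } END { print n + 0 }'
}

# Usage (requires the Poppler command line tools):
#   pdffonts output.pdf | count_unembedded
```

A result of 0 would suggest that every font listed in output.pdf is embedded.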

Changing Image Quality

For printing a document, you probably want the best quality possible. For uploading it somewhere to share with your friends, file size might matter more than quality. Even in a LaTeX-set document, where text and vector graphics stay compact, the bulk of the bits is usually taken up by raster images (JPG, PNG, …). Ghostscript offers a batch way to reduce the size of all embedded raster images.

Everything revolves around the -dPDFSETTINGS=/… option. It takes different values, ranging from screen as in the command above (equivalent to 72 dpi images) to prepress (300 dpi). A one-liner to get all images of a document down to 150 dpi would be:

gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/ebook -dNOPAUSE -dQUIET -dBATCH -sOutputFile=output.pdf input.pdf

Since I’m lazy and don’t want to memorize this, I made a small wrapper shell script a while ago to reduce a PDF’s size by means of image compression: reducePdfSize.sh.
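For illustration, such a wrapper might look roughly like the sketch below. This is not the reducePdfSize.sh script itself; the function name shrink_pdf, its argument order, and the DRY_RUN variable are all made up for this example.

```shell
# Hypothetical wrapper around the Ghostscript one-liner; not the author's
# reducePdfSize.sh. Usage: shrink_pdf PRESET INPUT.pdf OUTPUT.pdf
# Set DRY_RUN to any non-empty value to print the command instead of running it.
shrink_pdf() {
    preset=$1 input=$2 output=$3
    # Only accept the preset names that -dPDFSETTINGS understands.
    case $preset in
        screen|ebook|printer|prepress|default) ;;
        *) echo "unknown preset: $preset" >&2; return 2 ;;
    esac
    if [ -z "$input" ] || [ -z "$output" ]; then
        echo "usage: shrink_pdf PRESET INPUT.pdf OUTPUT.pdf" >&2
        return 2
    fi
    # If DRY_RUN is non-empty, "echo" is prepended and gs is only printed.
    ${DRY_RUN:+echo} gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 \
        "-dPDFSETTINGS=/$preset" -dNOPAUSE -dQUIET -dBATCH \
        "-sOutputFile=$output" "$input"
}
```

Calling shrink_pdf ebook thesis.pdf thesis-small.pdf would then run the 150 dpi variant, and setting DRY_RUN=1 beforehand shows the full command without touching any file.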

Using pdfimages -all on my thesis, which is 41 MB in total, extracts about 21 MB of images: half of the data in the PDF of my thesis is bitmap images. Running the above Ghostscript command on thesis.pdf with the printer option [3] (-dPDFSETTINGS=/printer instead of /ebook) reduces the 41 MB to 15 MB.
Not bad, right?
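To put a number on it: going from 41 MB to 15 MB is a reduction of roughly 63 %. The same figure can be computed from the two files directly. A small sketch, with size_reduction as a hypothetical helper name and integer arithmetic that truncates the result:

```shell
# Hypothetical helper: print by how many percent AFTER is smaller than BEFORE.
# Usage: size_reduction BEFORE.pdf AFTER.pdf
size_reduction() {
    [ -r "$1" ] && [ -r "$2" ] || { echo "cannot read input files" >&2; return 1; }
    # wc -c prints the size in bytes; $(( )) also strips any padding around it.
    before=$(( $(wc -c < "$1") ))
    after=$(( $(wc -c < "$2") ))
    [ "$before" -gt 0 ] || { echo "empty file: $1" >&2; return 1; }
    # Shell arithmetic is integer-only, so the percentage is truncated.
    echo "$(( (before - after) * 100 / before ))"
}
```

For example, size_reduction thesis.pdf thesis-small.pdf prints the saving in percent, where thesis-small.pdf stands for whatever output name was used.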

  1. But then again, they use very basic fonts which should be available on any system. 

  2. It’s probably also the easiest way to convert your PDF vector graphics to SVG. 

  3. I don’t know what happens additionally to reduce the file size even beyond lossy image compression. If you know, tell me!