PDF Workflow

Bursting a multi-page PDF into single-page files lets you work with each page individually. Comparing the file sizes shows how large each page is, so you can edit and compress individual pages efficiently before putting them back together in a single PDF file. Even in Adobe Acrobat you cannot see the size of individual pages. So, if you want to identify large pages and optimize them, you can use this procedure and replace the pages when you are finished.
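
As a rough sketch of the size comparison described above, the shell function below lists burst pages largest first. It assumes pdftk's default burst naming (pg_0001.pdf, pg_0002.pdf, and so on); the function name page_sizes is only illustrative.

```shell
#!/bin/sh
# Print "<bytes> <file>" for every burst page in a directory, largest first.
# Assumption: pages use pdftk's default burst names (pg_0001.pdf, pg_0002.pdf, ...).
page_sizes() {
    dir=${1:-.}
    for f in "$dir"/pg_*.pdf; do
        [ -f "$f" ] || continue   # skip the literal pattern when nothing matches
        # tr strips the padding some wc implementations add
        printf '%s %s\n' "$(wc -c < "$f" | tr -d ' ')" "$f"
    done | sort -rn
}
```

After `pdftk input.pdf burst`, running `page_sizes . | head -n 5` shows the five largest pages. Once the pages are optimized, `pdftk pg_*.pdf cat output combined.pdf` should put them back together in order.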

  • burst a multi-page pdf file into single files
    • requires pdftk
    • example:
      pdftk input.pdf burst
  • convert the pdf file to ps
    • example:
      pdf2ps input.pdf output.ps
      
      or
      
      pdf2ps -dLanguageLevel=1 input.pdf output.ps

:!: This failed to work for me when text was over an underlying image (the file was very large as a result)

  • options:
    • convert the pdf file to ps, then use ps2pdf to write a new pdf file with image compression optimizations: (note that ps2pdf is a shell script for gs)
      (force zip/flate image compression)
      ps2pdf -dAutoFilterColorImages=false -dColorImageFilter=/FlateEncode input.ps output.pdf
      
      or
      
      (force jpeg compression)
      ps2pdf -dAutoFilterColorImages=false -dColorImageFilter=/DCTEncode input.ps output.pdf
    • use Ghostscript to convert and write a new pdf file with image compression optimizations (see section below):
      gs -dNOPAUSE -sDEVICE=pdfwrite -dAutoFilterColorImages=false -dColorImageFilter=/FlateEncode -dAutoFilterGrayImages=false -dGrayImageFilter=/FlateEncode -sOutputFile=output.pdf input.ps -c quit
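
The per-page procedure can be sketched as a loop that recompresses each burst page and keeps the result only when it is actually smaller. This is a minimal sketch under two assumptions: the pages use pdftk's default pg_*.pdf names, and gs is fed each single-page pdf directly instead of going through ps first. The helper names size_of, keep_smaller and recompress_pages are hypothetical.

```shell
#!/bin/sh
# File size in bytes (tr strips the padding some wc implementations add).
size_of() { wc -c < "$1" | tr -d ' '; }

# Replace ORIG with CANDIDATE only if CANDIDATE is non-empty and smaller;
# otherwise discard CANDIDATE and leave ORIG untouched.
keep_smaller() {
    orig=$1 cand=$2
    [ -s "$cand" ] || return 1
    if [ "$(size_of "$cand")" -lt "$(size_of "$orig")" ]; then
        mv -f "$cand" "$orig"
    else
        rm -f "$cand"
    fi
}

# Recompress every burst page with forced Flate compression.
# Assumption: gs reads each single-page pdf directly, skipping the ps step.
recompress_pages() {
    for page in pg_*.pdf; do
        [ -f "$page" ] || continue
        gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite \
           -dAutoFilterColorImages=false -dColorImageFilter=/FlateEncode \
           -dAutoFilterGrayImages=false -dGrayImageFilter=/FlateEncode \
           -sOutputFile="$page.tmp" "$page" && keep_smaller "$page" "$page.tmp"
    done
}
```

Keeping the original whenever the rewrite is larger guards against cases like the one noted on this page, where the converted file grew instead of shrinking.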

The Ghostscript defaults -dAutoFilterColorImages=true and -dAutoFilterGrayImages=true cause Ghostscript to detect automatically whether JPEG or Flate compression is more suitable for each image. JPEG is good for photographic images. Flate is good for line drawings, cartoons and computer screenshots.

The compression can be forced to JPEG with

    -dAutoFilterColorImages=false -dColorImageFilter=/DCTEncode

Other filters are /FlateEncode (zlib/gzip/pkzip) and /CCITTFaxEncode (ITU-T group 3 fax suitable for monochrome images).
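
To avoid retyping the option pairs, a small helper can print them for both color and gray images. This is only a sketch: force_filter_flags is a hypothetical name, and it deliberately accepts just the two filters used in the examples on this page.

```shell
#!/bin/sh
# Print the gs options that force one compression filter for color and gray images.
# $1 is a filter name without the leading slash: DCTEncode or FlateEncode.
force_filter_flags() {
    case $1 in
        DCTEncode|FlateEncode) ;;
        *) echo "unsupported filter: $1" >&2; return 1 ;;
    esac
    printf '%s\n' \
        "-dAutoFilterColorImages=false -dColorImageFilter=/$1" \
        "-dAutoFilterGrayImages=false -dGrayImageFilter=/$1"
}
```

The output can be spliced into a command line unquoted so the shell splits it into separate options, for example `gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite $(force_filter_flags DCTEncode) -sOutputFile=output.pdf input.ps`.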

To get smaller file sizes, enable image downsampling.

    -dDownsampleColorImages=true -dColorImageDownsampleType=/Average 
    -dColorImageDownsampleThreshold=1.5 -dColorImageResolution=72

This says that if the image resolution is greater than 72*1.5=108dpi, it should be resampled to 72dpi by averaging the pixels. There are similar settings for Gray and Mono images.
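
The threshold arithmetic can be checked with a couple of shell helpers. This is a sketch that follows the "greater than" rule as stated in the text; the function names are hypothetical and awk is used only for the decimal multiplication.

```shell
#!/bin/sh
# Resolution above which an image is downsampled: target dpi times threshold.
downsample_cutoff() {
    awk -v dpi="$1" -v threshold="$2" 'BEGIN { print dpi * threshold }'
}

# Succeeds if an image at $1 dpi would be resampled for target $2 dpi
# and threshold $3, using the "greater than" rule described above.
would_downsample() {
    awk -v img="$1" -v dpi="$2" -v threshold="$3" \
        'BEGIN { exit !(img > dpi * threshold) }'
}
```

For example, `downsample_cutoff 72 1.5` prints 108, so a 150dpi image would be resampled and a 100dpi image would be left alone. The gray equivalents of the options are -dDownsampleGrayImages=true -dGrayImageDownsampleType=/Average -dGrayImageDownsampleThreshold=1.5 -dGrayImageResolution=72.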

Using -dPDFSETTINGS=/screen will set color and gray image downsampling to 72dpi, -dPDFSETTINGS=/ebook will downsample to 150dpi, and -dPDFSETTINGS=/printer will downsample to 300dpi.
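
A preset can be picked from a target resolution with a small wrapper. This is a sketch, not a definitive recipe: preset_for_dpi and shrink_pdf are hypothetical names, only the three resolutions listed above are mapped, and the wrapper assumes gs is given the pdf file directly.

```shell
#!/bin/sh
# Map a target image resolution to the preset named in the text above.
preset_for_dpi() {
    case $1 in
        72)  echo /screen ;;
        150) echo /ebook ;;
        300) echo /printer ;;
        *)   echo "no preset listed for $1 dpi" >&2; return 1 ;;
    esac
}

# Rewrite a pdf with the preset for the requested dpi.
# Usage: shrink_pdf input.pdf output.pdf 150
shrink_pdf() {
    preset=$(preset_for_dpi "$3") || return 1
    gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite \
       -dPDFSETTINGS="$preset" -sOutputFile="$2" "$1"
}
```

Note that the presets change other settings besides image resolution, so the output may differ from a run that only sets the downsampling options by hand.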

You can export all images from a PDF file using the Advanced → Export All Images… function in Adobe Acrobat 6.

  • Last modified: 2008/08/03 00:25