Differences

This shows you the differences between two versions of the page.

docs:pdf:pdf_workflow [2007/05/15 13:18] – created billh
docs:pdf:pdf_workflow [2009/04/24 22:53] (current) billh
Line 1: Line 1:
 ====== PDF Workflow ====== ====== PDF Workflow ======
 +
 +===== Undocumented PDF Common Workflow Issues =====
 +  * <del>split pdf into individual files (burst)</del>
 +  * <del>convert to Postscript</del>
 +  * remove all comments/annotations
 +  * remove or flatten forms
 +  * pdf/x standard (no transparency, no bookmarks/links?, no annotations/forms?)
 +  * document exact pdf procedure for creating valid files with every company program (create a standard)
 +  * flatten all comments/annotations
 +  * print bookmarks list
 +
 ===== Burst PDF pages into individual files ===== ===== Burst PDF pages into individual files =====
-This gives you the ability to work with each page on an individual basis.  You can also see how large each page is by comparing file sizes.  You can edit and compress each page efficiently prior to putting them together in a single pdf file.
+This gives you the ability to work with each page on an individual basis.  You can also see how large each page is by comparing file sizes.  You can edit and compress each page efficiently prior to putting them together in a single pdf file.  Even if you work in Adobe Acrobat, you don't know the size of individual pages.  So, if you want to identify large pages and optimize them, you can use this procedure and replace the pages when you are finished.
   * requires pdftk   * requires pdftk
   * example:<code>   * example:<code>
 pdftk input.pdf burst pdftk input.pdf burst
 </code> </code>
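To compare page sizes after a burst, something like the following sketch may help. It assumes pdftk's default burst naming (pg_0001.pdf, pg_0002.pdf, ...); the function name `list_pages_by_size` is made up for this example and is not part of pdftk.

```shell
# List burst pages largest first (size in bytes, then file name).
# Assumes the default pdftk burst names: pg_0001.pdf, pg_0002.pdf, ...
list_pages_by_size() {
    dir="${1:-.}"
    for f in "$dir"/pg_*.pdf; do
        [ -f "$f" ] || continue
        # wc -c prints the byte count; tr strips padding some systems add
        printf '%s %s\n' "$(wc -c < "$f" | tr -d ' ')" "$f"
    done | sort -rn
}

# Typical use (the pdftk steps require pdftk to be installed):
#   pdftk input.pdf burst
#   list_pages_by_size . | head -n 5
#   pdftk pg_*.pdf cat output combined.pdf
```

Once the large pages have been optimized and written back under the same names, the final `pdftk ... cat output` line puts the pages back together in page order.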
 +
  
 ===== Convert PDF to PS (PostScript) ===== ===== Convert PDF to PS (PostScript) =====
 +  * these tools come with ghostscript
 +  * this step can be the source of problems, so make sure to read the "Troubleshooting / Large Files" section below
   * example:<code>   * example:<code>
 pdf2ps input.pdf output.ps pdf2ps input.pdf output.ps
 +
 +or
 +
 +pdf2ps -dLanguageLevel=1 input.pdf output.ps
 </code> </code>
  
 ===== Optimize individual PDF page ===== ===== Optimize individual PDF page =====
 +  * these tools come with ghostscript
   * burst a multi-page pdf file into single files   * burst a multi-page pdf file into single files
   * options:   * options:
-    * convert the pdf file to ps, then use ps2pdf to write a new pdf file with image compression optimizations:<code>
-ps2pdf <options> input.ps output.pdf
+    * convert the pdf file to ps, then use ps2pdf to write a new pdf file with image compression optimizations: (note that ps2pdf is a shell script for gs)<code>
+(force zip/flate image compression)
 +ps2pdf -dAutoFilterColorImages=false -dColorImageFilter=/FlateEncode input.ps output.pdf 
 + 
 +or 
 + 
 +(force jpeg compression) 
 +ps2pdf -dAutoFilterColorImages=false -dColorImageFilter=/DCTEncode input.ps output.pdf
 </code> </code>
-    * use GhostScript to convert and write a new pdf file with image compression optimizations (see section below)
+    * use GhostScript to convert and write a new pdf file with image compression optimizations (see section below):<code>
 +gs -dNOPAUSE -sDEVICE=pdfwrite -dAutoFilterColorImages=false -dColorImageFilter=/FlateEncode -dAutoFilterGrayImages=false -dGrayImageFilter=/FlateEncode -sOutputFile=output.pdf input.ps -c quit 
 +</code> 
  
 ===== Image Compression with Ghostscript ===== ===== Image Compression with Ghostscript =====
Line 25: Line 53:
   * http://cosmocoffee.info/viewtopic.php?p=213   * http://cosmocoffee.info/viewtopic.php?p=213
  
-The ghostscript defaults -dAutoFilterColorImages=true and -dAutoFilterGrayImage=true cause ghostscript to automatically detect whether JPEG or Flate compression is most suitable for each image. JPEG is good for photo images. Flate is good for line drawings, cartoons and computer screen shots.
+The ghostscript defaults -dAutoFilterColorImages=true and -dAutoFilterGrayImages=true cause ghostscript to automatically detect whether JPEG or Flate compression is most suitable for each image. JPEG is good for photo images. Flate is good for line drawings, cartoons and computer screen shots.
  
 The compression can be forced to JPEG with<code> The compression can be forced to JPEG with<code>
Line 42: Line 70:
 Using -dPDFSETTINGS=/screen will set color and gray image downsampling to 72dpi, -dPDFSETTINGS=/ebook will downsample to 150dpi, and -dPDFSETTINGS=/printer will downsample to 300dpi.  Using -dPDFSETTINGS=/screen will set color and gray image downsampling to 72dpi, -dPDFSETTINGS=/ebook will downsample to 150dpi, and -dPDFSETTINGS=/printer will downsample to 300dpi. 
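As a rough sketch of how the presets above could be used, the helper below maps a target resolution to the matching preset and passes it to gs. The function names `preset_for_dpi` and `downsample_pdf` are invented for this example; only the three presets and resolutions listed above are covered.

```shell
# Map a target resolution (dpi) to the matching -dPDFSETTINGS preset,
# using the values listed above: 72 -> /screen, 150 -> /ebook, 300 -> /printer.
preset_for_dpi() {
    case "$1" in
        72)  echo /screen ;;
        150) echo /ebook ;;
        300) echo /printer ;;
        *)   echo "unsupported dpi: $1" >&2; return 1 ;;
    esac
}

# Rewrite a PDF with the preset for the given dpi (requires gs).
# Arguments: input file, output file, target dpi.
downsample_pdf() {
    preset=$(preset_for_dpi "$3") || return 1
    gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite \
       -dPDFSETTINGS="$preset" \
       -sOutputFile="$2" "$1"
}

# Example:
#   downsample_pdf input.pdf output.pdf 150
```

Going through a small lookup keeps the dpi-to-preset pairing in one place, so a typo in a preset name fails early instead of being handed to gs.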
  
 +===== Troubleshooting / Large Files =====
 +This process failed to work for me on a particular document when text was over an underlying image (the file was very large as a result).  I was able to get everything to work if I opened the pdf file in Adobe Illustrator, then saved it as a PDF, and used this new PDF for the PDF -> PS -> PDF conversion process.  The options I have tested to fix the problems are:
 +  * open pdf file and save as pdf file using Adobe Illustrator, then pdf2ps, then ps2pdf
 +  * open pdf file and save as ps file (via Print -> Save PDF as Postscript) using Apple Preview, then ps2pdf
 +  * open pdf file and save as ps file using Adobe Acrobat, then ps2pdf
 +
 +It should also be noted that resaving the file (even using optimization) in Acrobat 6 did not make a usable pdf file.  The best indicator I can think of for this problem is to watch for unusually large Postscript files, when compared to others in the same file.
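One possible way to watch for unusually large PostScript files is sketched below. It reads "unusually large" as "more than a few times the median size of the .ps files in one directory"; both that reading and the default factor of 3 are assumptions made for this example, and the function name `flag_large_ps` is made up.

```shell
# Print .ps files whose size exceeds FACTOR times the median size
# of all .ps files in DIR (defaults: current directory, factor 3).
# Limitation: file names containing spaces are not handled.
flag_large_ps() {
    dir="${1:-.}"
    factor="${2:-3}"
    for f in "$dir"/*.ps; do
        [ -f "$f" ] || continue
        printf '%s %s\n' "$(wc -c < "$f" | tr -d ' ')" "$f"
    done | sort -n | awk -v k="$factor" '
        { size[NR] = $1; name[NR] = $2 }
        END {
            if (NR == 0) exit
            # input is sorted by size, so the middle row is the median
            median = size[int((NR + 1) / 2)]
            for (i = 1; i <= NR; i++)
                if (size[i] > k * median) print size[i], name[i]
        }'
}

# Example, after converting each burst page with pdf2ps:
#   flag_large_ps . 3
```

The median is used rather than the mean so that a single huge file does not raise the threshold enough to hide itself.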
 +
 +===== Export Images =====
 +You can export all images from a PDF file using the Advanced -> Export All Images... function in Adobe Acrobat 6
  • docs/pdf/pdf_workflow.1179256728.txt.gz
  • Last modified: 2008/08/03 00:25
  • (external edit)