A basic (very) PDF to Tiff script I’ve developed

Why?

  • Creates a directory structure ideal for adding to Sanction/TrialDirector.
  • Files are converted to single-page TIFFs for fastest loading (especially of large documents).
  • This version maintains color (LZW compressed Tiff output), to keep up with modern discovery.
  • Designed to address weaknesses in other solutions (not litigation focused, too slow, single threaded)

Run this script from inside a directory of PDFs. It runs in Cygwin and uses Ghostscript to do the conversion; a Cygwin install with bash and Ghostscript should be the minimum required to run it.
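Before starting a long batch it can help to confirm the required tools are actually on the PATH. The following is only a minimal sketch; check_tools is a hypothetical helper name, not part of the script below, and it assumes the Ghostscript executable is called gs, as it is in the script.

```shell
#!/bin/bash
# Hypothetical helper: report every required command that is not on the PATH.
# Returns 0 when all tools are found, 1 when at least one is missing.
check_tools() {
    local tool missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

A typical call would be check_tools bash gs || exit 1 at the top of a wrapper script, so the run stops before any folders are created.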

I developed it because the AdultPDF converter did not name files the way I wanted and could only run as a single instance (this script can be run in multiple shells to take advantage of multiple CPUs). It is considerably faster than other solutions to this problem.
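Running several shells by hand works, but the same idea can be sketched with a single command that hands each PDF to its own worker. This is an illustration under assumptions, not the script itself: convert_one and convert_all are hypothetical names, the default of 4 workers is arbitrary, and it relies on find and xargs with the -print0, -0 and -P options (present in GNU findutils, which Cygwin ships). Unlike the script below, this sketch only moves a PDF once Ghostscript has succeeded.

```shell
#!/bin/bash
# Hypothetical helper: convert a single PDF using the same Ghostscript
# settings as the main script (24-bit color, LZW, 300 DPI, one TIFF per page).
convert_one() {
    local pdf="$1"
    [ -f "$pdf" ] || return 0          # nothing to do for empty or missing input
    local base="${pdf%.*}"             # file name without its extension
    mkdir -p "To Add/$base" PDFs
    gs -sDEVICE=tiff24nc -sCompression=lzw -r300x300 -dNOPAUSE -dBATCH \
       -sOutputFile="./To Add/$base/$base%04d.tif" "$pdf" </dev/null &&
    mv "To Add/$base/${base}0001.tif" "To Add/$base/$base.tif" &&
    mv "$pdf" PDFs
}
export -f convert_one                  # make the function visible to child bash processes

# Hypothetical driver: convert every PDF in the current directory,
# running up to $1 conversions at once (assumed default: 4).
convert_all() {
    local jobs="${1:-4}"
    find . -maxdepth 1 -type f -iname '*.pdf' -print0 |
        xargs -0 -n 1 -P "$jobs" bash -c 'convert_one "${1#./}"' _
}
```

Calling convert_all 8 from the directory of PDFs would run up to eight Ghostscript processes at a time. Because each file is handed to exactly one worker, two workers never pick up the same PDF, which is something to watch for when starting several copies of a script by hand.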

Currently it is being ported to Python with PyQt for a friendlier experience (no need for Cygwin, or the command line at all).

#!/bin/bash

#alter field separator for loops (newline plus backspace, so filenames containing spaces are not split)
SAVEIFS=$IFS
IFS=$(echo -en "\n\b")
#make output directories; "To Add" is a folder to be dragged into Sanction without having numbers incremented, or into TrialDirector using the filename as ID.
mkdir -p "To Add"
mkdir -p PDFs

#loop through PDFs, converting each to single-page color LZW TIFFs for fast loading and color
#the first TIFF in each folder under "To Add" is named for the PDF; additional TIFFs have 4-digit page numbers
for b in *.[pP][dD][fF] ; do
    #skip the unexpanded pattern when the directory holds no PDFs
    [ -e "$b" ] || continue
    echo "STARTING $b"
    #base name: the file name with its extension stripped
    a="${b%.*}"
    mkdir -p "To Add/$a"
#Convert PDF at 300 DPI, LZW compressed; -dBATCH makes gs exit after the last page
    gs -sDEVICE=tiff24nc -sCompression=lzw -r300x300 -dNOPAUSE -dBATCH -sOutputFile="./To Add/$a/$a%04d.tif" "$b" </dev/null
#Rename first page, no page number
    mv "./To Add/$a/${a}0001.tif" "./To Add/$a/$a.tif"
#move PDF to subfolder
    mv "$b" PDFs
done

#Restore argument separation
IFS=$SAVEIFS
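
The naming scheme the script produces (first page named for the PDF, later pages carrying a 4-digit page number) can be written down as a small helper, for example to check afterwards that a given page exists. This is a sketch only; page_file is a hypothetical name and is not used by the script above.

```shell
#!/bin/bash
# Hypothetical helper: print the path of the TIFF for page N of a document.
# Page 1 carries no number; later pages get a zero-padded 4-digit suffix.
page_file() {
    local base="$1" page="$2"
    if [ "$page" -eq 1 ]; then
        printf 'To Add/%s/%s.tif\n' "$base" "$base"
    else
        printf 'To Add/%s/%s%04d.tif\n' "$base" "$base" "$page"
    fi
}
```

For instance, [ -f "$(page_file "Exhibit 1" 12)" ] tests whether page 12 of a PDF named "Exhibit 1" was converted.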
