Determining when a file is an Open Office Document
Ross Ridge
rridge at csclub.uwaterloo.ca
Fri Jan 19 15:48:14 EST 2007
tubby wrote:
> Now, If only I could something like that on PDF files :)
PDF files should begin with "%PDF-" followed by a version number, e.g.
"%PDF-1.4". The PDF Reference notes that Adobe Acrobat Reader is a bit
more flexible about what it will accept:
    13. Acrobat viewers require only that the header appear
        somewhere within the first 1024 bytes of the file.
    14. Acrobat viewers also accept a header of the form
            %!PS-Adobe-N.n PDF-M.m
So identifying PDF files is pretty easy. If you want to examine the
contents of a PDF file you're better off using PostScript, specifically
Ghostscript, since PDF is essentially PostScript with a special
dictionary of commands.
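
For what it's worth, the header check described above could be sketched
in Python roughly as follows. This is only a minimal sketch, assuming
Python 3; the names looks_like_pdf, file_looks_like_pdf and the strict
flag are made up for illustration and don't come from any library. It
only looks at the header, so it says nothing about whether the rest of
the file is a valid PDF.

```python
import re

# "%PDF-M.m", the header form the PDF Reference asks for.
_PDF_HEADER = re.compile(rb"%PDF-\d+\.\d+")
# "%!PS-Adobe-N.n PDF-M.m", the alternate form Acrobat viewers accept.
_PS_HEADER = re.compile(rb"%!PS-Adobe-\d+\.\d+ PDF-\d+\.\d+")


def looks_like_pdf(data, strict=False):
    """Guess whether a bytes object holds the start of a PDF file.

    strict=True:  the data must begin with "%PDF-M.m".
    strict=False: either header form may appear anywhere within the
                  first 1024 bytes, mimicking Acrobat's leniency.
    """
    head = data[:1024]
    if strict:
        # match() anchors at the first byte of the data.
        return _PDF_HEADER.match(head) is not None
    # search() scans the whole 1024-byte window.
    return (_PDF_HEADER.search(head) is not None
            or _PS_HEADER.search(head) is not None)


def file_looks_like_pdf(path, strict=False):
    """Apply looks_like_pdf() to the first 1024 bytes of a file."""
    with open(path, "rb") as f:
        return looks_like_pdf(f.read(1024), strict)
```

Calling file_looks_like_pdf("some.pdf") (a hypothetical path) with the
default strict=False gives the lenient Acrobat-style check; pass
strict=True if you only want files that start with "%PDF-".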
Ross Ridge