[Tutor] Regex help
Bill Burns
billburns at pennswoods.net
Mon Oct 10 06:10:58 CEST 2005
I'm looking to get the size (width, length) of a PDF file. Every PDF
file has a 'tag' (in the file) that looks similar to this:
Example #1
MediaBox [0 0 612 792]
or this
Example #2
MediaBox [ 0 0 612 792 ]
I figured a regex might be a good way to get this data but the
whitespace (or no whitespace) after the left bracket has me stumped.
If I do this
pattern = re.compile(r'MediaBox \[\d+ \d+ \d+ \d+')
I can find the MediaBox in Example #1 but I have to do this
pattern = re.compile(r'MediaBox \[ \d+ \d+ \d+ \d+')
to find it for Example #2.
How can I make *one* regex that will match both cases?
Thanks for the help,
Bill
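
One possible way to cover both cases, offered as a minimal sketch rather than
a definitive answer: put \s* (zero or more whitespace characters) wherever the
space is optional, and \s+ between the numbers. The name media_box and the
capture groups are illustrative additions, and the pattern assumes the four
values are plain integers, as in the two examples above.

```python
import re

# \s* allows zero or more whitespace characters, so the pattern accepts
# both "[0 0 612 792]" and "[ 0 0 612 792 ]". Assumes integer values only.
MEDIABOX = re.compile(r'MediaBox\s*\[\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*\]')


def media_box(text):
    """Return the four MediaBox numbers as a tuple of ints, or None."""
    match = MEDIABOX.search(text)
    if match is None:
        return None
    return tuple(int(group) for group in match.groups())


print(media_box('MediaBox [0 0 612 792]'))    # Example #1
print(media_box('MediaBox [ 0 0 612 792 ]'))  # Example #2
```

With the groups in place, the width and length would be the third and fourth
values when the first two are 0, as they are in both examples.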