[Tutor] Regex help

Bill Burns billburns at pennswoods.net
Mon Oct 10 12:42:33 CEST 2005


>> I'm looking to get the size (width, length) of a PDF file. Every pdf
>> file has a 'tag' (in the file) that looks similar to this
>>
>> Example #1
>> MediaBox [0 0 612 792]
>>
>> or this
>>
>> Example #2
>> MediaBox [ 0 0 612 792 ]
>>
>> I figured a regex might be a good way to get this data but the
>> whitespace (or no whitespace) after the left bracket has me stumped.
>>
>> If I do this
>>
>> pattern = re.compile(r'MediaBox \[\d+ \d+ \d+ \d+')
>>
>> I can find the MediaBox in Example #1 but I have to do this
>>
>> pattern = re.compile(r'MediaBox \[ \d+ \d+ \d+ \d+')
>>
>> to find it for Example #2.
>>
>> How can I make *one* regex that will match both cases?
> 
> 
> pattern = re.compile(r'MediaBox \[ *\d+ \d+ \d+ \d+')

Bob,

Thanks, that works perfectly!
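
For anyone finding this in the archives, here is a minimal sketch of how
the pattern might be used to pull out the width and length. The function
name mediabox_size is made up for illustration, and the sketch assumes
the four values are plain integers as in my two examples. It also uses
\s* and \s+ instead of literal spaces, which is a slight generalization
of the pattern above so that tabs or extra spaces would match too.

```python
import re

# Capture the four numbers; \s* allows zero or more whitespace
# characters after the left bracket, covering both examples.
MEDIABOX = re.compile(r'MediaBox\s*\[\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)')

def mediabox_size(text):
    """Return (width, length) from the first MediaBox in text, or None.

    Hypothetical helper: assumes integer coordinates only.
    """
    match = MEDIABOX.search(text)
    if match is None:
        return None
    x0, y0, x1, y1 = (int(g) for g in match.groups())
    return (x1 - x0, y1 - y0)

print(mediabox_size('MediaBox [0 0 612 792]'))    # (612, 792)
print(mediabox_size('MediaBox [ 0 0 612 792 ]'))  # (612, 792)
```

This only looks at text it is handed, so the file contents would still
need to be read in first, and a MediaBox with decimal values would not
match.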

Bill

