[Tutor] Regex help

bob bgailer at alum.rpi.edu
Mon Oct 10 08:23:52 CEST 2005


At 09:10 PM 10/9/2005, Bill Burns wrote:
>I'm looking to get the size (width, length) of a PDF file. Every pdf
>file has a 'tag' (in the file) that looks similar to this
>
>Example #1
>MediaBox [0 0 612 792]
>
>or this
>
>Example #2
>MediaBox [ 0 0 612 792 ]
>
>I figured a regex might be a good way to get this data but the
>whitespace (or no whitespace) after the left bracket has me stumped.
>
>If I do this
>
>pattern = re.compile('MediaBox \[\d+ \d+ \d+ \d+')
>
>I can find the MediaBox in Example #1 but I have to do this
>
>pattern = re.compile('MediaBox \[ \d+ \d+ \d+ \d+')
>
>to find it for Example #2.
>
>How can I make *one* regex that will match both cases?

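Put a * after the space that follows the bracket, so that zero or more spaces are accepted there. A raw string keeps the backslashes intact for the regex engine:

pattern = re.compile(r'MediaBox \[ *\d+ \d+ \d+ \d+')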
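
Here is a quick sketch to check that idea against both of your examples. The capturing groups and the media_box helper are my own additions for illustration, and it assumes integer values separated by single spaces, as in the two examples you posted.

```python
import re

# Raw string so the backslashes reach the regex engine unchanged.
# ' *' accepts zero or more spaces after '['.
# The parentheses (capturing groups) are added here only to pull out the numbers.
pattern = re.compile(r'MediaBox \[ *(\d+) (\d+) (\d+) (\d+)')

def media_box(text):
    """Return the four MediaBox numbers as ints, or None if nothing matches."""
    match = pattern.search(text)
    if match is None:
        return None
    return tuple(int(n) for n in match.groups())

print(media_box('MediaBox [0 0 612 792]'))    # Example #1
print(media_box('MediaBox [ 0 0 612 792 ]'))  # Example #2
```

Both calls should give the same four numbers; the last two would be the width and length you are after.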


