Help with regex

MRAB python at mrabarnett.plus.com
Thu Aug 6 12:34:59 EDT 2009


Robert Dailey wrote:
> On Aug 6, 11:02 am, MRAB <pyt... at mrabarnett.plus.com> wrote:
>> Robert Dailey wrote:
>>> Hey guys,
>>> I'm creating a python script that is going to try to search a text
>>> file for any text that matches my regular expression. The thing it is
>>> looking for is:
>>> FILEVERSION #,#,#,#
>>> The # symbol represents any number that can be any length 1 or
>>> greater. Example:
>>> FILEVERSION 1,45,10082,3
>>> The regex should only match the exact above. So far here's what I have
>>> come up with:
>>> re.compile( r'FILEVERSION (?:[0-9]+\,){3}[0-9]+' )
>>> This works, but I was hoping for something a bit cleaner. I'm having
>>> to create a special case portion of the regex for the last of the 4
>>> numbers simply because it doesn't end with a comma like the first 3.
>>> Is there a better, more compact, way to write this regex?
>> The character class \d is equivalent to [0-9], and ',' isn't a special
>> character so it doesn't need to be escaped:
>>
>>      re.compile(r'FILEVERSION (?:\d+,){3}\d+')
> 
> But ',' is a special symbol. It's used in this way:
> {0,3}
> 
> This will match the previous regex 0-3 times. Are you sure commas need
> not be escaped?
> 
> In any case, your suggestions help to clean it up a bit!

By 'special' I mean metacharacters such as '?', '*' and '('. A comma
isn't special in that sense.
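
A quick way to check this is to run both patterns from the thread against
the same input. This is only a minimal sketch; the variable names and the
second test string are made up for illustration:

```python
import re

# Original pattern from the thread, with the comma escaped.
escaped = re.compile(r'FILEVERSION (?:[0-9]+\,){3}[0-9]+')
# Cleaned-up pattern with a bare comma and \d.
# Note: for str patterns in Python 3, \d also matches non-ASCII digits
# unless re.ASCII is passed; for plain ASCII input the two behave alike.
unescaped = re.compile(r'FILEVERSION (?:\d+,){3}\d+')

line = 'FILEVERSION 1,45,10082,3'
m1 = escaped.search(line)
m2 = unescaped.search(line)

# Both patterns should find the same text.
same_match = m1 is not None and m2 is not None and m1.group() == m2.group()

# Only three numbers: neither pattern should match.
short_line_matches = unescaped.search('FILEVERSION 1,45,10082') is not None

print(same_match, short_line_matches)
```

If the comma needed escaping, the second pattern would fail to match or
raise an error when compiled.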

In fact, the {...} quantifier is special only if it's syntactically
correct; otherwise it's treated as literal text, e.g. "a{," and "a{}" are
just literals.
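
The following sketch shows that behaviour as I understand it in CPython's
re module; other regex engines may treat malformed braces differently. The
variable names are only illustrative:

```python
import re

# "{," is not a complete quantifier, so the pattern is 'a', '{', ',' as
# literal characters.
literal_open = re.fullmatch(r'a{,', 'a{,') is not None

# "{}" contains no numbers, so it is also literal text.
literal_empty = re.fullmatch(r'a{}', 'a{}') is not None

# "{0,3}" is well-formed, so it acts as a quantifier on 'a'...
quantified = re.fullmatch(r'a{0,3}', 'aaa') is not None

# ...and therefore no longer matches the literal text "a{0,3}".
not_literal = re.fullmatch(r'a{0,3}', 'a{0,3}') is None

print(literal_open, literal_empty, quantified, not_literal)
```

So the comma only has a meaning inside a well-formed {m,n}; elsewhere in
the pattern it is an ordinary character.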


