regex with specific list of string

Pablo Ziliani pablo at decode.com.ar
Wed Sep 26 13:00:09 EDT 2007


Carsten Haese wrote:
> On Wed, 2007-09-26 at 15:42 +0000, james_027 wrote:
>> hi,
>>
>> how do I regex that could check on any of the value that match any one
>> of these ... 'jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug',
>> 'sep', 'oct', 'nov', 'dec'
>
> Why regex? You can simply check if the given value is contained in the
> set of allowed values:
>
> >>> s = set(['jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug',
> ...          'sep', 'oct', 'nov', 'dec'])
> >>> 'jan' in s

Also, check the calendar module for a locale-aware (rather than hardcoded) version:

>>> import calendar
>>> [calendar.month_abbr[i].lower() for i in range(1,13)]
['jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug', 'sep', 'oct', 'nov', 'dec']
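
Combining this with the set-membership idea quoted above could look roughly like the sketch below. The names MONTHS and is_month_abbr are made up for illustration, and the result depends on the active locale (the default "C" locale gives the English abbreviations):

```python
import calendar

# Build the set of lowercase month abbreviations once. Index 0 of
# calendar.month_abbr is an empty string, so start at 1.
# MONTHS and is_month_abbr are hypothetical names, not part of any library.
MONTHS = frozenset(calendar.month_abbr[i].lower() for i in range(1, 13))

def is_month_abbr(value):
    """Return True if value is a month abbreviation, ignoring case and
    surrounding whitespace."""
    return value.strip().lower() in MONTHS
```

With the default locale, is_month_abbr("Jan") is true and is_month_abbr("january") is false, since only the exact abbreviation is accepted.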

If you still want to use regexes, you can do something like:

>>> import re
>>> pattern = '(?:%s)' % '|'.join(calendar.month_abbr[1:13])
>>> pattern
'(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)'
>>> re.search(pattern, "we are in september", re.IGNORECASE)
<_sre.SRE_Match object at 0xb7ced640>
>>> re.search(pattern, "we are in september", re.IGNORECASE).group()
'sep'
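
If the pattern is going to be used more than once, it may be worth compiling it. The following is only a sketch: MONTH_RE and find_months are hypothetical names, and re.escape is added on the assumption that some locales put punctuation (such as a trailing period) in their abbreviations:

```python
import calendar
import re

# Escape each abbreviation in case the locale's names contain characters
# that are special in a regex, then join them into one alternation.
abbrs = [calendar.month_abbr[i] for i in range(1, 13)]
MONTH_RE = re.compile("(?:%s)" % "|".join(re.escape(a) for a in abbrs),
                      re.IGNORECASE)

def find_months(text):
    """Return every month abbreviation found in text, lowercased.

    find_months is a hypothetical helper; it matches anywhere, including
    inside longer words.
    """
    return [m.lower() for m in MONTH_RE.findall(text)]
```

With the default locale, find_months("From Jan to MAR, then december") gives ['jan', 'mar', 'dec'].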

If you want to make sure that the month name begins a word (so that, for example, the "dec" in "undecided" is not matched), use the following pattern instead:

>>> pattern = r'(?:\b%s)' % r'|\b'.join(calendar.month_abbr[1:13])
>>> pattern
'(?:\\bJan|\\bFeb|\\bMar|\\bApr|\\bMay|\\bJun|\\bJul|\\bAug|\\bSep|\\bOct|\\bNov|\\bDec)'
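
To see what the word boundary changes, here is a small comparison. It is a sketch only; loose, word_start and whole_word are names made up for this example. Putting a single \b in front of the group should be equivalent to repeating it before every alternative, and the whole_word variant (an extra, not in the pattern above) also requires a boundary after the name:

```python
import calendar
import re

names = "|".join(re.escape(calendar.month_abbr[i]) for i in range(1, 13))

# Matches the abbreviation anywhere, even in the middle of a word.
loose = re.compile("(?:%s)" % names, re.IGNORECASE)
# Matches only where the abbreviation starts a word.
word_start = re.compile(r"\b(?:%s)" % names, re.IGNORECASE)
# Matches only when the abbreviation is a word of its own.
whole_word = re.compile(r"\b(?:%s)\b" % names, re.IGNORECASE)

in_middle = loose.search("undecided")                    # finds 'dec'
at_start = word_start.search("undecided")                # no match
prefix = word_start.search("we are in september")        # finds 'sep'
not_whole = whole_word.search("we are in september")     # no match
standalone = whole_word.search("due 3 sep 2007")         # finds 'sep'
```

Which variant is appropriate depends on whether full names such as "september" should count as a match.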

If in doubt, Google for "regular expressions in python" or go to http://docs.python.org/lib/module-re.html


Regards,
Pablo



