Composing regex from a list

Vlastimil Brom vlastimil.brom at gmail.com
Thu Jun 16 09:25:32 EDT 2011


2011/6/16 TheSaint <nobody at nowhere.net.no>:
> Hello,
> Is it possible to compile a regex by supplying a list?
>
> lst= ['good', 'brilliant', 'solid']
> re.compile(r'^'(any_of_lst))
>
> without going into a *for* loop?
>

In simple cases, you can just join the list of alternatives with "|" and
incorporate the result into the pattern, e.g. inside non-capturing
parentheses, (?: ...), so that the leading ^ applies to every alternative;
cf.:
>>>
>>> lst= ['good', 'brilliant', 'solid']
>>> import re
>>> re.findall(r"^(?:"+"|".join(lst)+")", u"solid sample text; brilliant QWERT")
[u'solid']
>>>

[findall is used here only to show the result directly; it is not that
usual with a leading ^, since the pattern can then match only once, at the
start of the string.]

However, if the alternative "words" can contain metacharacters like
[ ] | . ? * + ..., you have to call re.escape(...) on each of them before
joining.
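
A minimal sketch of the escaped variant, assuming every alternative should
match literally; compile_alternatives and its prefix parameter are just
illustrative names here, not part of the re module:

```python
import re

def compile_alternatives(words, prefix=r"^"):
    """Build one compiled pattern matching any of `words` literally.

    `compile_alternatives` and `prefix` are hypothetical names for this sketch.
    """
    # Escape each word so characters such as . + ? [ ] | are matched literally.
    escaped = [re.escape(w) for w in words]
    # The non-capturing group makes the prefix apply to every alternative.
    return re.compile(prefix + "(?:" + "|".join(escaped) + ")")

lst = ["good", "brilliant", "solid", "a.b", "c++"]
pattern = compile_alternatives(lst)

m = pattern.match("c++ sample text")
print(m.group(0) if m else None)   # prints: c++

# Unescaped, "a.b" would also match "axb"; escaped, it does not.
print(pattern.match("axb sample text"))   # prints: None
```

The list comprehension still iterates over the list, of course, but there
is no explicit *for* statement and the pattern is compiled only once.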

Or you can use the newer regex implementation, which has more features:
http://pypi.python.org/pypi/regex

It was just provisionally enhanced with an option for exactly this use case;
cf. "Additional features: Named lists" on the above page. In this case:

>>> import regex # http://pypi.python.org/pypi/regex
>>> regex.findall(r"^\L<options>", u"solid sample text; brilliant QWERT", options=lst)
[u'solid']
>>>

hth,
  vbr


