[Tutor] Regular Expression question
Jay Dorsey
python@jaydorsey.com
Fri Apr 18 14:48:02 2003
Scott Chapman wrote:
> Is it possible to make a regular expression that will match:
> '<html blah>' or '<html>'
> without having to make it into two complete expressions separated by a pipe:
> r'<html[ \t].+?>|<html>'
>
> I want it to require a space or tab and at least one character before the
> closing bracket, after 'html', or just the closing bracket.
>
> Scott
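How about making everything after 'html' a single optional group, so that
either the whole "space or tab plus at least one character" part is there
or none of it is:
'<html([ \t][^>]+)?>'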
>>> import re
>>> x = re.compile('<html([ \t][^>]+)?>')
>>> print x
<_sre.SRE_Pattern object at 0x008B63C0>
>>> y = '<html>'
>>> print x.search(y).group()
<html>
>>> z = '<html blah>'
>>> print x.search(z).group()
<html blah>
>>> a = '<html blah><test>'
>>> print x.search(a).group()
<html blah>
Jay