[Tutor] lstrip() question

Danny Yoo dyoo at hkn.eecs.berkeley.edu
Mon Feb 2 18:58:27 EST 2004



> > But this doesn't seem to quite work if there are multiple leading <br>'s.
>
> > >>> tmp = '<br><br>real estate<br>broker<br>'
> > >>> import re
> > >>> re.sub('^<br>*','',tmp)
> > '<br>real estate<br>broker<br>'
>
>
> What Python version do you have? It seems to be broken.
>
> $ python
> Python 2.3.3 (#1, Dec 30 2003, 08:29:25)
> [GCC 3.3.1 (cygming special)] on cygwin
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import re
> >>> s = '<br><br>foobar'
> >>> re.sub('<br>*', '', s)
> 'foobar'
> >>> tmp = '<br><br>real estate<br>broker<br>'
> >>> re.sub('<br>*', '', tmp)
> 'real estatebroker'


Hi Karl,


No, Python is fine; the regular expression itself is broken.  Here is one
counterexample that should show the problem clearly:

     "<br>>>>hello"

Try running that regular expression on this string, and see what gets
replaced.  The problem with the regex should be a little clearer then.
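
For anyone who wants to run the experiment directly, here is a minimal
sketch.  The variable names are just for illustration; the patterns and
strings are the ones from this thread.

```python
import re

pattern = '<br>*'                 # the pattern from the quoted session
counterexample = '<br>>>>hello'

# '*' repeats only the single character before it (the '>'),
# so the pattern means: '<br' followed by zero or more '>'.
result = re.sub(pattern, '', counterexample)
print(repr(result))

# The same reading explains the original report: with the '^' anchor,
# only the first '<br>' is consumed, because the next character is '<'.
anchored = re.sub('^<br>*', '', '<br><br>real estate<br>broker<br>')
print(repr(anchored))
```

The first call swallows the extra '>' characters along with the tag, which
is more than a "strip the <br>" pattern ought to remove.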



To fix the problem, take a look at:

    http://www.amk.ca/python/howto/regex/

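and look at section 4.2 on "Groups".  Grouping lets a repetition operator
such as * apply to the whole <br> tag instead of a single character, so
using groups properly should fix the issue.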
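
One possible shape for the grouped version is sketched below.  The names
LEADING_BR and strip_leading_br are hypothetical, and the non-capturing
form (?:...) is my choice here; a plain (<br>) group should behave the same
way for this substitution.

```python
import re

# Group the whole tag so that '+' repeats '<br>' rather than just '>'.
# '^' anchors the match to the start of the string.
LEADING_BR = re.compile(r'^(?:<br>)+')

def strip_leading_br(text):
    """Remove any run of '<br>' tags from the start of text."""
    return LEADING_BR.sub('', text)

print(repr(strip_leading_br('<br><br>real estate<br>broker<br>')))
print(repr(strip_leading_br('<br>>>>hello')))
```

With the group in place, only the leading tags are removed: the later
<br>'s stay, and the extra '>' characters in the counterexample are left
alone.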



Hope this helps!



