Python's regular expression help

goldtech goldtech at worldpost.com
Fri Apr 30 00:12:05 CEST 2010


On Apr 29, 11:49 am, Tim Chase <python.l... at tim.thechases.com> wrote:
> On 04/29/2010 01:00 PM, goldtech wrote:
>
> > Trying to start out with simple things but apparently there's some
> > basics I need help with. This works OK:
> > >>> import re
> > >>> p = re.compile('(ab*)(sss)')
> > >>> m = p.match( 'absss' )
>
> > >>> f=r'abss'
> > >>> f
> > 'abss'
> > >>> m = p.match( f )
> > >>> m.group(0)
> > Traceback (most recent call last):
> >    File "<pyshell#15>", line 1, in <module>
> >      m.group(0)
> > AttributeError: 'NoneType' object has no attribute 'group'
>
> 'absss' != 'abss'
>
> Your regexp looks for 3 "s", your "f" contains only 2.  So the
> regexp object doesn't, well, match.  Try
>
>    f = 'absss'
>
> and it will work.  As an aside, using raw-strings for this text
> doesn't change anything, but if you want, you _can_ write it as
>
>    f = r'absss'
>
> if it will make you feel better :)
>
> > How do I implement a regex on a multiline string?  I thought this
> > might work but there's a problem:
>
> > >>> p = re.compile('(ab*)(sss)', re.S)
> > >>> m = p.match( 'ab\nsss' )
> > >>> m.group(0)
> > Traceback (most recent call last):
> >    File "<pyshell#26>", line 1, in <module>
> >      m.group(0)
> > AttributeError: 'NoneType' object has no attribute 'group'
>
> Well, it depends on what you want to do -- regexps are fairly
> precise, so if you want to allow whitespace between the two, you
> can use
>
>    r = re.compile(r'(ab*)\s*(sss)')
>
> If you want to allow whitespace anywhere, it gets uglier, and
> your capture/group results will contain that whitespace:
>
>    r'(a\s*b*)\s*(s\s*s\s*s)'
>
> Alternatively, if you don't want to allow arbitrary whitespace
> but only newlines, you can use "\n*" instead of "\s*"
>
> -tkc

Yes, most of my problem is with my patterns, not with any Python re syntax.
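
A minimal sketch of the difference, using the 'ab\nsss' string from my
example and the \s* pattern suggested above. The variable names (text,
strict, loose) are only illustrative. It also tests the result of match()
for None before calling .group(), which is what the AttributeError in my
tracebacks was really about:

```python
import re

text = 'ab\nsss'

# Original pattern: 'sss' must follow immediately after 'a' and any 'b's
strict = re.compile(r'(ab*)(sss)')
# Suggested pattern: optional whitespace (newlines included) between the groups
loose = re.compile(r'(ab*)\s*(sss)')

strict_match = strict.match(text)
loose_match = loose.match(text)

# match() returns None when the pattern does not match,
# so check before calling .group() or .groups()
if strict_match is None:
    print('strict: no match')
if loose_match is not None:
    print('loose:', loose_match.groups())
```

Note that the newline is consumed by \s* outside the parentheses, so it
does not end up in either captured group.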

I thought re.S would take a multiline string with any spaces or
newlines and make it appear as one line to the regex, so that "\n" would
be ignored in a way... still playing with it. Thanks for the help!
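
P.S. For anyone reading this later: going by the re module docs, re.S
(re.DOTALL) only changes what '.' matches, so that it also matches a
newline. It does not make the regex skip newlines in the input. A small
sketch, assuming the same 'ab\nsss' test string (the variable names are
just for illustration):

```python
import re

text = 'ab\nsss'

# re.S has no effect here because the pattern contains no '.'
no_dot = re.compile(r'(ab*)(sss)', re.S).match(text)

# Without re.S, '.' will not match the newline between 'ab' and 'sss'
dot_plain = re.compile(r'(ab*).(sss)').match(text)

# With re.S, '.' matches the newline as well
dot_all = re.compile(r'(ab*).(sss)', re.S).match(text)

print(no_dot)            # None
print(dot_plain)         # None
print(dot_all.groups())  # ('ab', 'sss')
```

So the flag only helps when the pattern itself has a '.' that is meant to
span the line break.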


