builtin regular expressions?

Nick Craig-Wood nick at craig-wood.com
Mon Oct 2 04:30:07 EDT 2006


Kay Schluehr <kay.schluehr at gmx.net> wrote:
>  I notice two issues here. Only one has anything to do with regular
>  expressions. The other one with 'explicit is better than implicit': the
>  many implicit passing operations of Ruby's case statement. Using
>  pseudo-Python notation this could be refactored into a more explicit
>  and not far less concise style:
> 
>  if line.match( "title=(.*)" ) as m:
>     print "Title is %s"%m.group(1)
>  elif line.match( "track=(.*)" ) as m:
>     print "Track is %s"%m.group(1)
>  elif line.match( "artist=(.*)" ) as m:
>     print "Artist is %s"%m.group(1)
> 
>  Here the result of the test line.match( ) is assigned to the local
>  variable m if bool(line.match( )) == True. Later m can be used in the
>  subsequent block.

Interesting!

This is exactly the area where (for me) python regexps become more
clunky than perl's - not being able to assign and test the match
object in one line.

This leads to the rather wordy

    m = re.match("title=(.*)", line)
    if m:
        print "Title is %s" % m.group(1)
    else:
        m = re.match("track=(.*)", line)
        if m:
            print "Track is %s" % m.group(1)
        else:
            m = re.match("artist=(.*)", line)
            if m:
                print "Artist is %s" % m.group(1)
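
When every branch does the same kind of thing, as here, the nesting can
be flattened without any new syntax by looping over a table of patterns.
A minimal sketch, assuming Python 3 (print as a function); the names
PATTERNS and describe are made up for illustration:

```python
import re

# Hypothetical (label, pattern) table; the three patterns are the ones
# used in the examples in this thread.
PATTERNS = [
    ("Title", re.compile(r"title=(.*)")),
    ("Track", re.compile(r"track=(.*)")),
    ("Artist", re.compile(r"artist=(.*)")),
]

def describe(line):
    """Return 'Label is value' for the first pattern that matches, else None."""
    for label, pattern in PATTERNS:
        m = pattern.match(line)
        if m:
            return "%s is %s" % (label, m.group(1))
    return None

print(describe("track=Blue Monday"))
```

This does not help when each branch needs different handling, which is
the case the if/elif chain is really for.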
    

If you could write

    if re.match("title=(.*)", line) as m:
        print "Title is %s" % m.group(1)
    elif re.match("track=(.*)", line) as m:
        print "Track is %s" % m.group(1)
    elif re.match("artist=(.*)", line) as m:
        print "Artist is %s" % m.group(1)

that would be a significant benefit.

You can of course define a helper class like this:

    import re

    class Matcher:
        """Regexp matcher helper"""
        def match(self, r, s):
            """Do a regular expression match and return the match object (or None)."""
            self.value = re.match(r, s)
            return self.value
        def __getitem__(self, n):
            """Return n'th matched () item."""
            return self.value.group(n)

which makes this bit really quite neat:

    m = Matcher()
    if m.match("title=(.*)", line):
        print "Title is %s" % m[1]
    elif m.match("track=(.*)", line):
        print "Track is %s" % m[1]
    elif m.match("artist=(.*)", line):
        print "Artist is %s" % m[1]
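
One caveat with the helper: indexing it before any successful match
fails with an AttributeError, because there is no match object to ask
for a group. A slightly more defensive sketch, assuming Python 3 (print
as a function); the __init__, the explicit error and the classify
function are my additions for illustration:

```python
import re

class Matcher:
    """Regexp matcher helper that remembers the last match."""

    def __init__(self):
        self.value = None

    def match(self, pattern, string):
        """Match pattern at the start of string; return the match object or None."""
        self.value = re.match(pattern, string)
        return self.value

    def __getitem__(self, n):
        """Return the n'th matched () group of the last successful match."""
        if self.value is None:
            raise IndexError("no successful match to take a group from")
        return self.value.group(n)

def classify(line):
    """Run the same if/elif chain as above, returning the message instead of printing."""
    m = Matcher()
    if m.match("title=(.*)", line):
        return "Title is %s" % m[1]
    elif m.match("track=(.*)", line):
        return "Track is %s" % m[1]
    elif m.match("artist=(.*)", line):
        return "Artist is %s" % m[1]
    return None

print(classify("artist=New Order"))
```

Note that the helper holds mutable state, so one Matcher instance should
not be shared between pieces of code that match independently.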

> Moreover match becomes a string method. No need for extra importing
> re and applying re.compile(). Both can be done in str.match() if
> necessary.

I'm happy with the re module.  Having transitioned from perl to python
some time ago now, I find myself using far fewer regexps thanks to
python's much better built-in string methods.  This is a good thing,
because regexps should be used sparingly and they do degenerate into
line noise quite quickly...
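
For instance, the key=value lines from this thread do not really need a
regexp at all. A rough sketch using str.partition, assuming Python 3
(print as a function); the function name parse_line and the tuple of
accepted keys are just for illustration:

```python
def parse_line(line):
    """Turn 'key=value' into a message using string methods only."""
    # partition splits on the first "=" only, so the value may itself contain "=".
    key, sep, value = line.rstrip("\n").partition("=")
    if sep and key in ("title", "track", "artist"):
        return "%s is %s" % (key.capitalize(), value)
    return None

print(parse_line("title=Blue Monday"))
```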

-- 
Nick Craig-Wood <nick at craig-wood.com> -- http://www.craig-wood.com/nick


