[Tutor] using regular expressions

Karl Pflästerer sigurd@12move.de
Tue Jul 8 17:05:09 2003


On  8 Jul 2003, Lance E Sloan <- lsloan@umich.edu wrote:

> I've converted a program from Perl that processes output from Whois
> servers.  Where the Perl code had read:

>   if(/Parent:\s+(\S+)/) {
>     [use $1 here]
>     ...
>   }

> I translated it to this Python code:

>   if ( re.search( r'Parent:\s+(\S+)', line ) ):
>     val = re.match( r'Parent:\s+(\S+)', line ).group( 1 )
>     [use val here]
>     ...

> The program's not terribly slow, but I feel bad about the inefficiency
> of using the same regex twice.  I tried this Perl-like code, but
> Python didn't like it:

>   if ( ( matches = re.search( r'Parent:\s+(\S+)', line ) ) ):
>     val = matches.group( 1 )
>     [use val here]
>     ...

> I get a SyntaxError at the equal-sign.

Right.  Python is not, for example, C.  In Python, assignment is a
statement, not an expression, so it can't appear in the condition of an
if statement.

> What's the proper Pythonish way to do this?

In Python you can compile your regexp once and reuse the compiled
object, which is usually faster when the same pattern is applied to many
lines.  If you do the search in a loop, it is often better to use a
`try ... except' block instead of `if ...'.

import re
reg = re.compile(r'Parent:\s+(\S+)')
match = reg.search(line)
if match:
    val = match.group(1)
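
As a self-contained sketch of the same pattern: search() returns None
when nothing matches, so the match object can be bound on its own line
and then tested.  The names PARENT_RE and find_parent are made up for
illustration; they are not from the original program.

```python
import re

# Compiled once at import time and reused for every call.
# (PARENT_RE and find_parent are hypothetical names for this sketch.)
PARENT_RE = re.compile(r'Parent:\s+(\S+)')

def find_parent(line):
    """Return the value after 'Parent:', or None if the line has none."""
    match = PARENT_RE.search(line)   # a match object, or None
    if match:
        return match.group(1)
    return None
```

The regexp is written only once and runs only once per line, which is
what the doubled re.search/re.match version was missing.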


If you iterate over the lines, you could write it like this:

import re
reg = re.compile(r'Parent:\s+(\S+)')
for line in output:
    match = reg.search(line)
    try:
        val = match.group(1)
        # ... use val here ...
    except AttributeError:
        pass
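
One thing to watch with the try/except form: an AttributeError raised by
the code that uses val is silently swallowed too.  A minimal sketch of a
loop that avoids this by testing the match object explicitly;
collect_parents and the sample lines are hypothetical and only there to
make the example runnable.

```python
import re

PARENT_RE = re.compile(r'Parent:\s+(\S+)')

def collect_parents(lines):
    """Collect the 'Parent:' value from every line that has one."""
    parents = []
    for line in lines:
        match = PARENT_RE.search(line)
        if match is None:
            continue              # no 'Parent:' field on this line
        parents.append(match.group(1))
    return parents

# Made-up sample data standing in for Whois output.
sample = [
    "OrgName:    Example Org",
    "Parent:     NET-192-0-2-0-1",
    "NetType:    Direct Assignment",
]
print(collect_parents(sample))
```

Which form to prefer is a matter of taste; the explicit test only keeps
unrelated errors in the loop body visible.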


   Karl
-- 
    'Twas brillig, and the slithy toves
        Did gyre and gimble in the wabe;
    All mimsy were the borogoves,
         And the mome raths outgrabe.   "Lewis Carroll" "Jabberwocky"