raw strings

Duncan Booth duncan at rcp.co.uk
Mon Oct 14 04:48:42 EDT 2002


mis6 at pitt.edu (Michele Simionato) wrote in
news:2259b0e2.0210111129.1dc80074 at posting.google.com: 

> Duncan Booth <duncan at rcp.co.uk> wrote in message 
> 
>>> s/regexp1/regexp2/
> 
>>... where regexp1 is a regular expression and regexp2 is a string.
> 
> Maybe regexp2 is not a regular expression, but it certainly is not a
> standard string, since it can contain grouping characters.

The replacement string can contain a backslash followed by a digit to
refer to a group matched by the pattern, so r'(\1)' or '(\\1)' is a
plausible replacement string. But saying this is 'not a standard string'
doesn't make sense. You might as well say that the format strings for the
'%' operator are not standard strings because they can contain sequences
such as '%s', or that DOS filenames are not standard strings because they
can contain backslash characters: '\\1' (or r'\1') is a perfectly valid
DOS filename.
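
To make the point concrete, here is a minimal sketch. The sample text and
variable names are made up for illustration. It shows that the raw and
escaped spellings produce the same ordinary string, and that only re.sub()
gives the backslash-digit sequence a special meaning.

```python
import re

# Two spellings of the same four-character string: ( \ 1 )
raw = r'(\1)'
escaped = '(\\1)'
assert raw == escaped
assert len(raw) == 4

# The string is only "special" once re.sub() interprets it:
# \1 is replaced by whatever group 1 of the pattern matched.
result = re.sub(r'(\w+)', raw, 'hello world')
print(result)

# The same string is equally usable where no regex is involved.
print(raw.upper())
```

The string object carries no regex-specific behaviour of its own; the
interpretation belongs to whichever function consumes it.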

<snip>
> I had the impression that the use of re.sub(), without compiling first 
> the regular expression, was quite inefficient. Now I did some profiling 
> and discovered that it is worse, but only by 10%, practically nothing. 
> Therefore I will use the non-compiled form in the future. 

Yes, the most recently used regular expressions are cached, so when you
pass in a string instead of a compiled expression the system first checks
whether the string is in the cache. If so it can simply retrieve the old
compiled expression and use that.
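
A rough way to check this on your own machine is sketched below. The
pattern, the text, and the iteration count are arbitrary assumptions, and
the timings will vary between Python versions and systems, so treat the
numbers as indicative only.

```python
import re
import timeit

PATTERN = r'(\w+)@(\w+)'          # arbitrary example pattern
TEXT = 'user@example other@host'  # arbitrary example text
compiled = re.compile(PATTERN)

def with_string():
    # Module-level function: looks the pattern string up in re's cache.
    return re.sub(PATTERN, r'\2@\1', TEXT)

def with_compiled():
    # Method on the precompiled pattern object: no cache lookup needed.
    return compiled.sub(r'\2@\1', TEXT)

# Both forms must produce the same result.
assert with_string() == with_compiled()

t_string = timeit.timeit(with_string, number=10000)
t_compiled = timeit.timeit(with_compiled, number=10000)
print('string pattern:   %.4f s' % t_string)
print('compiled pattern: %.4f s' % t_compiled)
```

Any difference between the two timings should come mostly from the cache
lookup, not from recompiling the pattern on every call.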

I think the choice between using strings or compiled expressions is much
more a choice of style than of speed. I would avoid using a literal
string for a regular expression. Instead I would take the pattern out of
whatever function used it and declare a constant somewhere. It is then a
simple step to compile that constant where it is declared. If you
precompile the pattern then you can use methods on the pattern, but if
you don't write in an object-oriented manner then it doesn't make any
great difference whether or not you compile it.
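
As a sketch of that style, the pattern below lives in a module-level
constant that is compiled where it is declared. The name DATE_RE, the date
pattern, and both helper functions are hypothetical, chosen only to
illustrate the layout.

```python
import re

# Hypothetical constant: declared and compiled once, at module level.
DATE_RE = re.compile(r'(\d{4})-(\d{2})-(\d{2})')

def to_day_first(text):
    # Use the sub() method on the pattern object to reorder the groups.
    return DATE_RE.sub(r'\3/\2/\1', text)

def find_years(text):
    # The same constant is reused by other functions via its methods.
    return [m.group(1) for m in DATE_RE.finditer(text)]

print(to_day_first('Posted 2002-10-14'))
print(find_years('2002-10-14 and 2001-01-02'))
```

Keeping the pattern in one named place means that a change to it is made
once, and every function that uses it picks up the change.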



