RE strings (was: Variable Interpolation - status of PEP 215)

Cimarron Taylor cimarron+google at taylors.org
Fri Jun 21 13:44:50 EDT 2002


I should have written that, to me,

   def parseline1(line):
      m = re'^EMP:([^,]*),([^,]*),([^,]*),([^,]*)$'.match(line)
      if m:
          ...

is better than

   import re
   emp_re = re.compile(r'^EMP:([^,]*),([^,]*),([^,]*),([^,]*)$')

   def parseline2(line):
      m = emp_re.match(line)
      if m:
          ...

or

   import re
   def parseline3(line):
      m = re.match(r'^EMP:([^,]*),([^,]*),([^,]*),([^,]*)$', line)
      if m:
          ...


Unless I'm mistaken about how the re module works, parseline3 will
recompile the regex each time it is called.  parseline2 avoids this
but introduces an extra object.  parseline1 should compile the regex
once as well and avoid exposing the regex object.  
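
For comparison, here is a minimal runnable sketch of parseline2 and
parseline3 that returns the captured fields. The sample line and the
EMP_PATTERN name are made up for illustration. One caveat on the cost
claim: the re module documentation says that recently used patterns are
cached, so in practice parseline3 mostly pays for a cache lookup on each
call rather than a full recompile.

```python
import re

# Same pattern as in the post; EMP_PATTERN is a name chosen for this sketch.
EMP_PATTERN = r'^EMP:([^,]*),([^,]*),([^,]*),([^,]*)$'

# parseline2 style: compile once at import time and keep the object around.
emp_re = re.compile(EMP_PATTERN)

def parseline2(line):
    m = emp_re.match(line)
    return m.groups() if m else None

# parseline3 style: hand the pattern string to re.match on every call.
def parseline3(line):
    m = re.match(EMP_PATTERN, line)
    return m.groups() if m else None

sample = 'EMP:Smith,Jane,42,Engineering'  # hypothetical input line
fields2 = parseline2(sample)
fields3 = parseline3(sample)
print(fields2)  # ('Smith', 'Jane', '42', 'Engineering')
print(fields2 == fields3)  # True
```

Both functions give the same result; the difference is only where the
compiled pattern lives and who can see it.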

I should also add that, while I personally like the appearance of
parseline1, I think the most significant benefit of an RE string
is that Python's parser/compiler could detect errors
in the regex before parseline1 was ever executed.  I'm not sure
RE strings are a good idea if the implementation could not do this.
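
To illustrate the timing issue with today's re module (a sketch only;
BAD_PATTERN, parseline_bad and pattern_error are hypothetical names): a
malformed pattern inside a function goes unnoticed when the function is
defined and raises re.error only when the call actually runs.

```python
import re

# Deliberately malformed: the first group is never closed.
BAD_PATTERN = r'^EMP:([^,]*,([^,]*)$'

def parseline_bad(line):
    # Defining this function succeeds; the pattern is not examined
    # until this line executes.
    return re.match(BAD_PATTERN, line)

def pattern_error(pattern):
    """Return the compile error message for pattern, or None if it compiles."""
    try:
        re.compile(pattern)
    except re.error as exc:
        return str(exc)
    return None

error_message = pattern_error(BAD_PATTERN)
print(error_message is not None)  # True
```

A module-level re.compile, as in parseline2, moves the failure to import
time, which is earlier but still at run time.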

Cim


