[Tutor] Regular Expressions: escaping in character classes/character sets

Josh Rosen rosenville at gmail.com
Mon Jul 7 07:44:56 CEST 2008


I was browsing through the source code of Django when I found the  
following regular expression:

tspecials = re.compile(r'[ \(\)<>@,;:\\"/\[\]\?=]')

As it turns out, this line comes from the message module in the Python
standard library's email package.  It seems to be used to check whether
an email header parameter's value contains special characters and
therefore needs to be wrapped in quotes.

What strikes me as odd about this regex is that the parentheses and
the question mark are escaped with backslashes.  I know that's
necessary for matching those characters literally elsewhere in an
expression, but they shouldn't need to be escaped within a character
class, right?  Shouldn't this be functionally equivalent to the much
more readable:

tspecials = re.compile(r'[ ()<>@,;:\\"/\[\]?=]')
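
One way to check this is to compare the two patterns against every
ASCII character.  This is only a minimal sketch, assuming a Python 3
interpreter; the variable names are my own for illustration and do not
come from the email package:

```python
import re

# The pattern as found in the source, and the proposed simpler form.
escaped = re.compile(r'[ \(\)<>@,;:\\"/\[\]\?=]')
unescaped = re.compile(r'[ ()<>@,;:\\"/\[\]?=]')

# Collect the single ASCII characters that each pattern matches.
ascii_chars = [chr(i) for i in range(128)]
matched_by_escaped = {c for c in ascii_chars if escaped.match(c)}
matched_by_unescaped = {c for c in ascii_chars if unescaped.match(c)}

# True if both character classes accept exactly the same characters.
same = matched_by_escaped == matched_by_unescaped
print(same)
print(''.join(sorted(matched_by_escaped)))
```

If both sets come out equal, the backslashes before the parentheses
and the question mark make no difference inside the character class,
at least for ASCII input.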

