raw strings

Duncan Booth duncan at rcp.co.uk
Fri Oct 11 16:00:41 CEST 2002

mis6 at pitt.edu (Michele Simionato) wrote in
news:2259b0e2.0210110528.449ce434 at posting.google.com: 

> Suppose for instance I want to substitute regexp1 with regexp2 in a
> text: in sed or perl I would give a command like
> s/regexp1/regexp2/

... where regexp1 is a regular expression and regexp2 is a string.

> In Python I must write 
> import re
> re.compile(r'regexp1').sub(r'regexp2',text)

You could try writing re.sub(regexp1, replacement, string), or in
your terminology:
   re.sub(r'regexp1', r'regexp2', text)
where regexp2 is not a regular expression.
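
As a small sketch of that equivalence, assuming a made-up pattern and
sample text (neither comes from the original question), the compiled form
and the module-level shortcut give the same result:

```python
import re

# Hypothetical sample data, chosen only for illustration.
text = "spam and eggs and spam"
pattern = r"sp\wm"       # a regular expression
replacement = "ham"      # a replacement string, not a regular expression

# Compiled-pattern form, as in the quoted question.
via_compile = re.compile(pattern).sub(replacement, text)

# Module-level shortcut.
via_module = re.sub(pattern, replacement, text)

print(via_compile)
print(via_module)
```

Both calls take ordinary str values; the r prefix only affects how the
literal is spelled in the source file.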


> For this to work I need a raw_string function such that
> raw_string('regexp')==r'regexp' 

I think you have a fundamental misunderstanding of what a 'raw string'
actually is.

When Python parses your program it converts the characters in a string
constant into a value of type str (or unicode). There are several ways to
write any given string value; for example, a single-character string
containing a newline could be written as any of:

    '\n'   '\x0a'   '\012'

(Not to mention others such as '''
''' or even '\

You are asking for a function which, given the string, works out how the
original constant was written and returns the string which would have
resulted if the original constant had been written with an r prefix. In
other words:

    raw_string('\n') --> '\\n'
    raw_string('\x0a') --> '\\x0a'
    raw_string('\012') --> '\\012'

but in each case the parameter actually passed to raw_string is the same
value, so there is no way to tell which result is required. The result
for a single newline character could even be '\\\n\\n\\\n'.
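
A quick check makes the point concrete. This is only a sketch, written
for a current Python 3 interpreter rather than the version in use when
this was posted, and the helper name show is made up for illustration:

```python
# Three different spellings in the source code, one value at run time.
a = '\n'
b = '\x0a'
c = '\012'
assert a == b == c
assert len(a) == 1

# The raw literals keep the backslash, so they are three different strings.
raw_a = r'\n'
raw_b = r'\x0a'
raw_c = r'\012'
assert raw_a == '\\n' and len(raw_a) == 2
assert len({raw_a, raw_b, raw_c}) == 3

# Hypothetical helper: a function receives only the value, never the
# source spelling, so it sees the same argument in all three cases.
def show(value):
    return repr(value)

print(show(a), show(b), show(c))
```

Since show cannot distinguish a, b and c, neither could any raw_string
function.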

> Is there somebody else who thinks like me ?

There are other people who misunderstand raw strings.
