[Tutor] Python file escaping issue?

Steven D'Aprano steve at pearwood.info
Mon Feb 22 00:01:06 CET 2010


On Mon, 22 Feb 2010 05:22:10 am Sithembewena Lloyd Dube wrote:
> Hi all,
>
> I'm trying to read a file (Python 2.5.2, Windows XP) as follows:
>
> assignment_file = open('C:\Documents and Settings\coderoid\My
> Documents\Downloads\code_sample.txt', 'r+').readlines()
> new_file = open(new_file.txt, 'w+')
> for line in assignment_file:
>     new_file.write(line)
>
> new_file.close()
> assignment_file.close()
>
> When the code runs, the file path has the slashes converted to double
> slashes. When I try to escape them, I just seem to add more slashes.
> What am I missing?

An understanding of how backslash escapes work in Python.

Backslashes in string literals (but not in text you read from a file, 
say) are used to inject special characters into the string, just like C 
and other languages do. These backslash escapes include:

\t tab
\n newline
\f formfeed
\\ backslash

and many others. A backslash that does not start one of these escapes 
is left in the string unchanged.

So when you write a string literal including backslashes and a special 
character, you get this:

>>> s = 'abc\tz'  # tab
>>> print s
abc     z
>>> print repr(s)
'abc\tz'
>>> len(s)
5

But if the escape is not a special character:

>>> s = 'abc\dz'  # nothing special
>>> print s
abc\dz
>>> print repr(s)
'abc\\dz'
>>> len(s)
6

The double backslash is part of the *display* of the string, like the 
quotation marks, and not part of the string itself. The string itself 
only has a single backslash and no quote marks.

So if you write a pathname like this:

>>> path = 'C:\datafile.txt'
>>> print path
C:\datafile.txt
>>> len(path)
15

It *seems* to work, because \d is left as backslash-d. But then you do 
this, and wonder why you can't open the file:

>>> path = 'C:\textfile.txt'
>>> print path
C:      extfile.txt
>>> len(path)
14
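
One way to check whether a path literal has been damaged like this is to 
look for control characters in the resulting string. The following is only 
a minimal sketch: the helper name has_control_chars is made up for this 
illustration, and the paths are the same hypothetical ones used above.

```python
def has_control_chars(path):
    """Return True if path contains any character below ASCII 32
    (tab, newline, formfeed and so on)."""
    return any(ord(ch) < 32 for ch in path)

bad = 'C:\textfile.txt'      # \t is turned into a single tab character
good = 'C:\\textfile.txt'    # \\ is one literal backslash

print(has_control_chars(bad))    # True
print(has_control_chars(good))   # False
print(repr(bad))                 # the repr shows the tab as \t
```

If the check reports True for something that was meant to be a Windows 
path, a backslash escape has almost certainly been interpreted somewhere 
in the literal.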


Some people recommend using raw strings. Raw strings turn off backslash 
processing, so you can do this:

>>> path = r'C:\textfile.txt'
>>> print path
C:\textfile.txt

But raw strings were invented for the regular expression module, not for 
Windows pathnames, and they have a major limitation: you can't end a 
raw string with a backslash.

>>> path = r'C:\directory\'
  File "<stdin>", line 1
    path = r'C:\directory\'
                          ^
SyntaxError: EOL while scanning single-quoted string
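
If you do want to use a raw string for a directory name, one possible 
workaround (a sketch, not the only way to do it) is to leave the trailing 
backslash off the raw string and add it from an ordinary escaped literal. 
The directory name is the hypothetical one from the example above.

```python
# A raw string cannot end in a backslash, so append the trailing
# separator from an ordinary literal, where \\ means one backslash.
directory = r'C:\directory' + '\\'

# The raw form and the doubled-backslash form give the same string.
same = (r'C:\textfile.txt' == 'C:\\textfile.txt')

print(directory)        # C:\directory\
print(len(directory))   # 13
print(same)             # True
```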


The best advice is to remember that Windows accepts both forward slashes 
and backslashes as the path separator, and just write all your paths 
using the forward slash:

'C:/directory/'
'C:/textfile.txt'
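
To see that forward slashes are treated as separators under Windows path 
rules, here is a small sketch using the standard library's ntpath module 
(the module that os.path refers to on Windows, importable on any 
platform). The path is a hypothetical one made up for the illustration.

```python
import ntpath  # Windows path rules, usable on any platform

path = 'C:/directory/textfile.txt'

# normpath converts forward slashes to backslashes under Windows rules.
print(ntpath.normpath(path))   # C:\directory\textfile.txt

# basename and dirname split on either kind of slash.
print(ntpath.basename(path))   # textfile.txt
print(ntpath.dirname(path))    # C:/directory
```

No backslash appears in the source code at all, so there is nothing for 
Python to misinterpret as an escape.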



-- 
Steven D'Aprano

