[Tutor] Python file escaping issue?
Steven D'Aprano
steve at pearwood.info
Mon Feb 22 00:01:06 CET 2010
On Mon, 22 Feb 2010 05:22:10 am Sithembewena Lloyd Dube wrote:
> Hi all,
>
> I'm trying to read a file (Python 2.5.2, Windows XP) as follows:
>
> assignment_file = open('C:\Documents and Settings\coderoid\My
> Documents\Downloads\code_sample.txt', 'r+').readlines()
> new_file = open(new_file.txt, 'w+')
> for line in assignment_file:
>     new_file.write(line)
>
> new_file.close()
> assignment_file.close()
>
> When the code runs, the file path has the slashes converted to double
> slashes. When I try to escape them, I just seem to add more slashes.
> What am I missing?
An understanding of how backslash escapes work in Python.
Backslashes in string literals (but not in text you read from a file,
say) are used to inject special characters into the string, just like C
and other languages do. These backslash escapes include:
\t tab
\n newline
\f formfeed
\\ backslash
and many others. A backslash that doesn't start one of these escapes is
left alone.
So when you write a string literal including backslashes and a special
character, you get this:
>>> s = 'abc\tz' # tab
>>> print s
abc z
>>> print repr(s)
'abc\tz'
>>> len(s)
5
But if the escape is not a special character:
>>> s = 'abc\dz' # nothing special
>>> print s
abc\dz
>>> print repr(s)
'abc\\dz'
>>> len(s)
6
The double backslash is part of the *display* of the string, like the
quotation marks, and not part of the string itself. The string itself
only has a single backslash and no quote marks.
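One way to convince yourself of this is to count the characters instead of
looking at the display. This is only a small sketch; the variable names are
made up for illustration, and it uses print() with parentheses so that it
should also run on Python 3, not just the 2.x shown in the sessions above.

```python
# The literal spells the backslash as \\, but the string holds just one.
s = 'abc\\dz'

length = len(s)                # counts the real characters in the string
backslashes = s.count('\\')    # how many backslashes the string contains
shown = repr(s)                # the *display* form, quotes and all

print(length)         # 6
print(backslashes)    # 1
print(shown)          # 'abc\\dz'
```

The repr is longer than the string because it includes the quote marks and
the doubled backslash; the string itself does not.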
So if you write a pathname like this:
>>> path = 'C:\datafile.txt'
>>> print path
C:\datafile.txt
>>> len(path)
15
It *seems* to work, because \d is left as backslash-d. But then you do
this, and wonder why you can't open the file:
>>> path = 'C:\textfile.txt'
>>> print path
C: extfile.txt
>>> len(path)
14
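If a path refuses to open and you suspect an escape has crept in, one
possible check is to look for control characters in it. A minimal sketch:
find_control_chars is a hypothetical helper of my own, not something from
the standard library.

```python
def find_control_chars(path):
    """Return (index, repr) pairs for every character below ASCII 32."""
    return [(i, repr(ch)) for i, ch in enumerate(path) if ord(ch) < 32]

bad = 'C:\textfile.txt'      # \t has silently become a tab
good = 'C:\\textfile.txt'    # an escaped backslash stays a backslash

print(find_control_chars(bad))     # one hit, at index 2
print(find_control_chars(good))    # []
```

An empty list doesn't prove the path is right, but a non-empty one tells
you exactly where the literal went wrong.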
Some people recommend using raw strings. Raw strings turn off backslash
processing, so you can do this:
>>> path = r'C:\textfile.txt'
>>> print path
C:\textfile.txt
But raw strings were invented for the regular expression module, not for
Windows pathnames, and they have a major limitation: you can't end a
raw string with a backslash.
>>> path = r'C:\directory\'
  File "<stdin>", line 1
    path = r'C:\directory\'
                          ^
SyntaxError: EOL while scanning single-quoted string
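If you do want to use raw strings, the usual workarounds are to leave the
trailing backslash off the raw literal and supply it from an ordinary
string, or to let the path-joining function add the separator. A small
sketch, assuming made-up directory and file names; it uses ntpath (which is
what os.path is on Windows, and can be imported on any platform) so that it
behaves the same wherever you try it.

```python
import ntpath  # the Windows flavour of os.path

# A raw literal and an escaped literal spell the same string.
same = r'C:\textfile.txt' == 'C:\\textfile.txt'

# Adjacent literals are concatenated, so the trailing backslash
# can come from a normal (non-raw) string.
directory = r'C:\directory' '\\'

# Or skip the trailing backslash and let join() add the separator.
full = ntpath.join(r'C:\directory', 'textfile.txt')

print(same)         # True
print(directory)    # C:\directory\
print(full)         # C:\directory\textfile.txt
```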
The best advice is to remember that Windows accepts both forward slashes
and backslashes as the path separator, and just write all your paths
using the forward slash:
'C:/directory/'
'C:/textfile.txt'
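Applied to the code in the question, that might look something like the
sketch below. copy_lines is a hypothetical helper, not the poster's code.
Besides using forward slashes in the usage comment, it quotes the output
filename and closes the file objects rather than the list that readlines()
returns, both of which the original would have tripped over next. It uses
try/finally rather than a with statement so that it should also run on
Python 2.5.

```python
def copy_lines(source_path, dest_path):
    """Copy source_path to dest_path line by line; return the line count."""
    source = open(source_path, 'r')
    try:
        lines = source.readlines()
    finally:
        source.close()

    dest = open(dest_path, 'w')
    try:
        for line in lines:
            dest.write(line)
    finally:
        dest.close()
    return len(lines)

# Usage on Windows, with forward slashes (path taken from the question):
# copy_lines('C:/Documents and Settings/coderoid/My Documents/'
#            'Downloads/code_sample.txt', 'new_file.txt')
```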
--
Steven D'Aprano