file read, binary or text mode

Peter Hansen peter at engcorp.com
Fri Sep 24 09:10:54 EDT 2004


Guyon Morée wrote:
> ok, i have huffman encoding code.
> 
> this is actually built for text, but because python can also read a binary
> file as a string, this applies equally well :)
> 
> but, i was just wondering if this gives any problems if I use text-mode read
> for the binary files and vice versa.
> 
> If I understand correctly now, using binary mode is _always_ save, right?

You're not helping a whole lot here.  What platform are you using?
I'll assume from the headers in your message that it's Windows.
If that's true, then forget about text and binary and ASCII for
a moment, and just consider this.

If you open a file on Windows using "r" or "rt" or the default (which
is "r"), then when you read the file any occurrences of the byte
sequence 13 followed by 10 (that is, CR LF or \r\n or whatever you want
to call it) will be replaced as the file is read by just the 10, or the
LF, or the \n, or whatever you want to call it.
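A rough sketch of that read-side translation, assuming Python 3 (where
default text mode applies this translation on every platform, not only
Windows). The helper name read_text_mode and the sample bytes are made
up for illustration:

```python
import os
import tempfile

def read_text_mode(raw: bytes) -> str:
    """Store raw bytes exactly, then read them back through text mode."""
    fd, path = tempfile.mkstemp()
    try:
        # Binary write: the bytes land on disk exactly as given.
        with os.fdopen(fd, "wb") as f:
            f.write(raw)
        # Text-mode read: Python 3's default (newline=None) turns CR LF into LF.
        with open(path, "r", encoding="ascii") as f:
            return f.read()
    finally:
        os.remove(path)

text = read_text_mode(b"one\r\ntwo\r\n")
print(repr(text))  # 'one\ntwo\n'
```

The file holds 10 bytes, but the string that comes back has only 8
characters, because each CR LF pair collapsed to a single LF.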

If you use "rb" instead of just "r" or the default, then this
translation will not occur and you will retrieve all bytes in
the file just as they are stored there.
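For comparison, a small sketch of the "rb" case, again assuming Python 3
(where binary mode returns a bytes object). The helper roundtrip_binary
and the four-byte payload are hypothetical:

```python
import os
import tempfile

def roundtrip_binary(raw: bytes) -> bytes:
    """Write raw bytes with "wb" and read them back with "rb"."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(raw)
        # "rb" performs no newline translation at all.
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)

# Bytes 13 and 10 sit next to each other here but are data, not a line ending.
payload = bytes([0x00, 13, 10, 0xFF])
result = roundtrip_binary(payload)
print(result == payload, len(result))  # True 4
```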

It's up to you to pick the behaviour you need.  Saying it's
"huffman encoding code" doesn't really help, since that doesn't
refer to any universal standard data representation.  It
seems likely that it's binary (i.e. the translation provided by
not using "rb" is undesirable), but nobody here knows where you
got that file or what it contains.

And in case that doesn't answer the questions above: (1) yes,
it can definitely give problems reading text files as binary
and vice versa, and (2) binary mode applies whenever "b" is
used on Windows, and not otherwise, so if you save a file without
using "wb" you will get the same translation as above but in
the reverse direction (LF or \n gets turned into CR LF or \r\n
on output).
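The write-side translation can be sketched like this, assuming Python 3.
Passing newline="\r\n" is my way of forcing the Windows-style behaviour
on any platform so the effect is reproducible; the helper name and the
latin-1 encoding (chosen so that every character maps to one byte) are
illustrative assumptions:

```python
import os
import tempfile

def write_text_then_read_raw(text: str) -> bytes:
    """Write text through text mode, then inspect the bytes actually stored."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # newline="\r\n" mimics what plain "w" does on Windows: LF becomes CR LF.
        with open(path, "w", encoding="latin-1", newline="\r\n") as f:
            f.write(text)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)

# Three characters of "binary-looking" data, one of which happens to be LF.
raw = write_text_then_read_raw("\x00\n\xff")
print(raw)  # b'\x00\r\n\xff'
```

Three characters went in and four bytes came out, which is exactly the
kind of silent change that would corrupt compressed or otherwise binary
output.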

-Peter


