[Tutor] File transfer
steve at pearwood.info
Sun Oct 31 23:42:45 CET 2010
Chris King quoted Corey Richardson:
> On 10/31/2010 12:03 PM, Corey Richardson wrote:
>> To read from a file, you open it, and then read() it into a string
>> like this:
>> for line in file:
>> string += string + file.readline()
Aiieeee! Worst way to read from a file *EVAR*!!!
Seriously. Don't do this. This is just *wrong*.
(1) You're mixing file iteration (which already reads line by line) with
readline(), which would result in every second line going missing.
Fortunately Python doesn't let you do this:
>>> for line in file:
...     print file.readline()
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
ValueError: Mixing iteration and read methods would lose data
(2) Even if Python let you do it, it would be slow. Never write any loop
with repeated string concatenation:
result = ''
for string in list_of_strings:
    result += string  # No! Never do this! BAD BAD BAD!!!
This can, depending on the version of Python, the operating system, and
various other factors, end up being thousands of times slower than the
alternative:
result = ''.join(list_of_strings)
I'm not exaggerating. There was a bug reported in the Python HTTP
library about six(?) months ago where Python was taking half an hour to
read a file that Internet Explorer or wget could read in under a second.
You might be lucky and never notice the poor performance, but one of
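your users will. This is a Shlemiel the Painter algorithm: each
concatenation can copy everything accumulated so far, so the total work
grows with the square of the amount of data.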
Under some circumstances, *some* versions of Python can correct for the
poor performance and optimize it to run quickly, but not all versions,
and even the ones that do sometimes run into operating system dependent
problems that lead to terrible performance. Don't write Shlemiel the
Painter algorithms.
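Applied to reading a file, the join idiom recommended above might look
something like this. It is only a sketch, assuming Python 3;
read_lines_joined is a made-up helper name, and io.StringIO stands in
for a real text file:

```python
import io

def read_lines_joined(fp):
    """Collect the lines of an open file and join them once at the end."""
    pieces = []
    for line in fp:          # file iteration already reads line by line
        pieces.append(line)  # appending to a list is cheap
    return ''.join(pieces)   # the final string is built in one step

# io.StringIO stands in for a real text file here (an assumption
# made to keep the example self-contained).
sample = io.StringIO("first\nsecond\nthird\n")
joined_text = read_lines_joined(sample)
```

Of course, if all you want is the whole file as one string, fp.read()
does that directly; the list-and-join pattern earns its keep when you
need to process each line before joining.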
The right way to read chunks of data from a file is with the read method:
fp = open("filename", "rb") # open in binary mode
data = fp.read() # read the whole file
If the file is large, and you want to read it in small chunks, read()
takes an optional argument giving the maximum number of bytes to read:
fp.read(64) # read 64 bytes
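A loop built around read(64) could be sketched as below, assuming
Python 3. The name read_in_chunks and the throwaway demo file are
illustrative only, not anything from the original thread:

```python
import os
import tempfile

def read_in_chunks(path, chunk_size=64):
    """Yield successive chunks of at most chunk_size bytes from a file."""
    with open(path, "rb") as fp:       # binary mode, closed automatically
        while True:
            chunk = fp.read(chunk_size)
            if not chunk:              # read() returns b'' at end of file
                break
            yield chunk

# Demonstration on a throwaway file of 150 bytes (hypothetical data).
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as out:
    out.write(b"x" * 150)

chunks = list(read_in_chunks(demo_path, 64))
os.remove(demo_path)
```

The last chunk is simply shorter than the rest, and the empty result
from read() is what signals the end of the file.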
If you want to read text files in lines, you can use the readline()
method, which reads up to and including the next end of line; or
readlines() which returns a list of each line; or just iterate over the
file to get one line at a time.
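The three approaches side by side, as a rough sketch assuming Python 3.
io.StringIO is used in place of a real text file so that the example
does not depend on anything on disk:

```python
import io

sample_text = "alpha\nbeta\ngamma\n"

# readline(): one line per call, trailing newline included.
fp = io.StringIO(sample_text)
first = fp.readline()
second = fp.readline()

# readlines(): every line at once, as a list.
all_lines = io.StringIO(sample_text).readlines()

# Iteration: one line per pass through the loop.
iterated = [line for line in io.StringIO(sample_text)]
```

Iteration is usually the one to reach for, since it never holds more
than one line in memory at a time.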
Chris King went on to ask:
> I don't think readline will work an image. How do you get raw binary
> from a zip? Also make sure you do reply to the tutor list too, not just me.
readline() works fine on binary files, including images, but it won't be
useful because binary files aren't split into lines.
readline() reads until end-of-line, which varies according to the
operating system you are running, but often is \n. A binary file may or
may not contain any end-of-line characters. If it does, then readline()
will read up to the next EOL perfectly fine; if it doesn't, readline()
will happily read the entire file all the way to the end.
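Both cases can be tried without a real image file. This is a small
sketch, assuming Python 3, with io.BytesIO standing in for a file
opened in binary mode and with made-up byte strings as the data:

```python
import io

# Binary data that happens to contain a newline byte (0x0A).
with_eol = io.BytesIO(b"\x89PNG\r\n\x1a\n\x00\x01\x02")
first_piece = with_eol.readline()    # stops right after the first b'\n'

# Binary data with no newline byte at all.
without_eol = io.BytesIO(b"\x00\x01\x02\x03\xff\xfe")
everything = without_eol.readline()  # runs to the end of the data
```

Either way the "lines" you get back mean nothing for binary data, which
is why read() is the better tool here.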
To read a zip file as raw data, just open it as a regular binary file:
f = open("data.zip", "rb")
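To make the difference between the raw bytes and the archive contents
concrete, here is a sketch assuming Python 3 and the standard zipfile
module. The archive and the member name hello.txt are created just for
the demonstration:

```python
import os
import tempfile
import zipfile

# Build a small throwaway archive so the example is self-contained.
tmp_dir = tempfile.mkdtemp()
zip_path = os.path.join(tmp_dir, "data.zip")
with zipfile.ZipFile(zip_path, "w") as zf:
    zf.writestr("hello.txt", "hello world\n")

# Raw bytes: the file exactly as it sits on disk.
with open(zip_path, "rb") as f:
    raw = f.read()

# Decoded contents: what zipfile recovers from those same bytes.
with zipfile.ZipFile(zip_path) as zf:
    names = zf.namelist()
    content = zf.read("hello.txt")
```

The raw bytes are what you would send if you were copying the archive
somewhere; zipfile is only needed when you want what is inside it.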
But this is the wrong way to solve the problem of transferring files
from one computer to another. The right way is to use a transport
protocol that already works, something like FTP or HTTP. The only reason
for dealing with files as bytes is if you want to create your own file
transfer protocol.
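Letting an existing protocol do the work might look roughly like this,
assuming Python 3 and the standard urllib.request module. fetch is a
made-up helper name, and the demonstration uses a file:// URL in place
of a real HTTP server so that it runs without a network:

```python
import os
import pathlib
import shutil
import tempfile
import urllib.request

def fetch(url, dest_path):
    """Copy the resource at url to dest_path, streaming it in blocks."""
    with urllib.request.urlopen(url) as response:
        with open(dest_path, "wb") as out:
            shutil.copyfileobj(response, out)  # copies block by block

# Offline demonstration: a file:// URL stands in for an HTTP server.
work_dir = tempfile.mkdtemp()
source = os.path.join(work_dir, "source.bin")
copy = os.path.join(work_dir, "copy.bin")
with open(source, "wb") as f:
    f.write(bytes(range(256)) * 4)

fetch(pathlib.Path(source).as_uri(), copy)
```

With a real server the call would take an http:// URL instead, and the
library would handle the connection and headers for you.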