urllib2 performance on windows, usb connection

dq dq at gmail.com
Fri Feb 6 18:36:02 EST 2009


dq wrote:
> MRAB wrote:
>> dq wrote:
>>  > Martin v. Löwis wrote:
>>  >>> So does anyone know what the deal is with this?  Why is the same
>>  >>> code so much slower on Windows?  Hope someone can tell me before a
>>  >>> holy war erupts :-)
>>  >>
>>  >> Only the holy war can give an answer here. It certainly has
>>  >> *nothing* to do with Python; Python calls the operating system
>>  >> functions to read from the network and write to the disk almost
>>  >> directly. So it must be the operating system itself that slows it
>>  >> down.
>>  >>
>>  >> To investigate further, you might drop the write operation, and
>>  >> measure only source.read(). If that is slower, then, for some reason,
>>  >> the network speed is bad on Windows. Maybe you have the network
>>  >> interfaces misconfigured? Maybe you are using wireless on Windows, but
>>  >> cable on Linux? Maybe you have some network filtering software running
>>  >> on Windows? Maybe it's just that Windows sucks?-)
>>  >>
>>  >> If the network read speed is fine, but writing slows down, I ask the
>>  >> same questions. Perhaps you have some virus scanner installed that
>>  >> filters all write operations? Maybe Windows sucks?
>>  >>
>>  >> Regards,
>>  >> Martin
>>  >>
>>  >
>>  > Thanks for the ideas, Martin.  I ran a couple of experiments to find
>>  > the culprit, by downloading the same 20 MB file from the same fast
>>  > server. I compared:
>>  >
>>  > 1.  DL to HD vs. USB iPod
>>  > 2.  AV on-access protection on vs. off
>>  > 3.  "source.read()" only vs. "file.write( source.read() )"
>>  >
>>  > The culprit is definitely the write speed on the iPod.  That is,
>>  > everything runs plenty fast (~1 MB/s down) as long as I'm not writing
>>  > directly to the iPod.  This is kind of odd, because if I copy the file
>>  > over from the HD to the iPod using Windows (drag-n-drop), it takes
>>  > about a second or two, so about 10 MB/s.
>>  >
>>  > So the problem is definitely partially Windows, but it also seems that
>>  > Python's file.write() function is not without blame.  It's the
>>  > combination of Windows, iPod and Python's data stream that is slowing
>>  > me down.
>>  >
>>  > I'm not really sure what I can do about this.  I'll experiment a
>>  > little more and see if there's any way around this bottleneck.  If
>>  > anyone has run into a problem like this, I'd love to hear about it...
>>  >
>> You could try copying the file to the iPod using the command line, or
>> copying data from disk to iPod in, say, C, anything but Python. This
>> would allow you to identify whether Python itself has anything to do
>> with it.
> 
> Well, I think I've partially identified the problem.  target.write( 
> source.read() ) runs perfectly fast, copies 20 megs in about a second, 
> from HD to iPod.  However, if I run the same code in a while loop, using 
> a certain block size, say target.write( source.read(4096) ), it takes 
> forever (or at least I'm still timing it while I write this post).
> 
> The mismatch seems to be between urllib2's block size and the write 
> speed of the iPod.  I might try to tweak this a little in the code and 
> see if it has any effect.
> 
> Oh, there we go:   20 megs in 135.8 seconds.  Yeah... I might want to 
> try to improve that...
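
A rough way to time the block-size comparison without the network or the iPod in the loop is sketched below. It is written for Python 3 and uses in-memory streams as stand-ins, so the names `copy_in_blocks` and `time_copy` and the dummy payload are illustrative assumptions, not code from this thread. To see the real effect, the target would have to be a file opened on the USB device.

```python
import io
import time


def copy_in_blocks(source, target, blocksize):
    """Copy source to target in blocksize-byte chunks; return (bytes, writes)."""
    copied = 0
    writes = 0
    while True:
        block = source.read(blocksize)
        if not block:  # an empty result means end of stream
            break
        target.write(block)
        copied += len(block)
        writes += 1
    return copied, writes


def time_copy(data, blocksize):
    """Time one in-memory copy; return (seconds, number of write calls)."""
    source = io.BytesIO(data)
    target = io.BytesIO()
    start = time.perf_counter()
    copied, writes = copy_in_blocks(source, target, blocksize)
    elapsed = time.perf_counter() - start
    assert copied == len(data)
    return elapsed, writes


if __name__ == "__main__":
    payload = b"x" * (4 * 1024 * 1024)  # 4 MB of dummy data
    for blocksize in (4096, 2 ** 16, 2 ** 20):
        seconds, writes = time_copy(payload, blocksize)
        print("blocksize=%8d  writes=%5d  %.4f s" % (blocksize, writes, seconds))
```

The number of write calls is reported alongside the time because a small block size turns one large write into thousands of small ones, which is the likely cost on a slow device.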

After some tweaking of the block size, I managed to get the DL speed up 
to about 900 KB/s.  It's still not quite Ubuntu, but it's a good order 
of magnitude better.  The new DL code is pretty much this:

"""
import urllib2

blocksize = 2 ** 16    # plus or minus a power of 2
source = urllib2.urlopen( 'url://string' )
target = open( pathname, 'wb')
fullsize = float( source.info()['Content-Length'] )
DLd = 0
while DLd < fullsize:
	block = source.read( blocksize )
	if not block:    # stop early if the server sends less than promised
		break
	target.write( block )
	DLd = DLd + len( block )
	# optional:  write some DL progress info
	# somewhere, e.g. stdout
target.close()
source.close()
"""

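If progress reporting is not needed, the same kind of loop is available in the standard library as `shutil.copyfileobj`, which takes the block size as its third argument. A minimal sketch, assuming Python 3 and in-memory streams as stand-ins for the `urlopen()` response and the target file (the `download_to` name is hypothetical):

```python
import io
import shutil


def download_to(source, target, blocksize=2 ** 16):
    """Copy a readable stream to a writable one in blocksize-byte chunks."""
    # shutil.copyfileobj runs the read/write loop internally; the third
    # argument is the chunk size in bytes.
    shutil.copyfileobj(source, target, blocksize)


if __name__ == "__main__":
    # Stand-ins for the urlopen() response and the file on the iPod.
    source = io.BytesIO(b"y" * 200000)
    target = io.BytesIO()
    download_to(source, target)
    print(len(target.getvalue()), "bytes copied")
```

In Python 3 the `urllib2` module used above lives in `urllib.request`, so the real source would come from `urllib.request.urlopen()`.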