urllib2 performance on windows, usb connection
dq
dq at gmail.com
Fri Feb 6 18:36:02 EST 2009
dq wrote:
> MRAB wrote:
>> dq wrote:
>> > Martin v. Löwis wrote:
>> >>> So does anyone know what the deal is with this? Why is the same code
>> >>> so much slower on Windows? Hope someone can tell me before a holy war
>> >>> erupts :-)
>> >>
>> >> Only the holy war can give an answer here. It certainly has *nothing*
>> >> to do with Python; Python calls the operating system functions to read
>> >> from the network and write to the disk almost directly. So it must be
>> >> the operating system itself that slows it down.
>> >>
>> >> To investigate further, you might drop the write operation, and
>> >> measure only source.read(). If that is slower, then, for some reason,
>> >> the network speed is bad on Windows. Maybe you have the network
>> >> interfaces misconfigured? Maybe you are using wireless on Windows, but
>> >> cable on Linux? Maybe you have some network filtering software running
>> >> on Windows? Maybe it's just that Windows sucks?-)
>> >>
>> >> If the network read speed is fine, but writing slows down, I ask the
>> >> same questions. Perhaps you have some virus scanner installed that
>> >> filters all write operations? Maybe Windows sucks?
>> >>
>> >> Regards,
>> >> Martin
>> >>
>> >
>> > Thanks for the ideas, Martin. I ran a couple of experiments to find
>> > the culprit, by downloading the same 20 MB file from the same fast
>> > server. I compared:
>> >
>> > 1. DL to HD vs USB iPod.
>> > 2. AV on-access protection on vs. off
>> > 3. "source.read()" only vs. "file.write( source.read() )"
>> >
>> > The culprit is definitely the write speed on the iPod. That is,
>> > everything runs plenty fast (~1 MB/s down) as long as I'm not writing
>> > directly to the iPod. This is kind of odd, because if I copy the file
>> > over from the HD to the iPod using Windows (drag-n-drop), it takes
>> > about a second or two, so about 10 MB/s.
>> >
>> > So the problem is definitely partially Windows, but it also seems that
>> > Python's file.write() function is not without blame. It's the
>> > combination of Windows, iPod and Python's data stream that is slowing
>> > me down.
>> >
>> > I'm not really sure what I can do about this. I'll experiment a little
>> > more and see if there's any way around this bottleneck. If anyone has
>> > run into a problem like this, I'd love to hear about it...
>> >
>> You could try copying the file to the iPod using the command line, or
>> copying data from disk to iPod in, say, C, anything but Python. This
>> would allow you to identify whether Python itself has anything to do
>> with it.
>
> Well, I think I've partially identified the problem. target.write(
> source.read() ) runs perfectly fast, copies 20 megs in about a second,
> from HD to iPod. However, if I run the same code in a while loop, using
> a certain block size, say target.write( source.read(4096) ), it takes
> forever (or at least I'm still timing it while I write this post).
>
> The mismatch seems to be between urllib2's block size and the write
> speed of the iPod. I might try to tweak this a little in the code and
> see if it has any effect.
>
> Oh, there we go: 20 megs in 135.8 seconds. Yeah... I might want to
> try to improve that...
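
To compare block sizes a bit more systematically, a timing loop along
these lines might help. It is only a sketch: time_copy and
compare_blocksizes are made-up names, the source is an in-memory buffer
standing in for the urllib2 response, and the target is a temporary file
rather than the iPod, so pass a path on the device to measure the real
thing.

```python
import io
import os
import tempfile
import time


def time_copy(data, path, blocksize):
    """Write data to path in blocksize-byte chunks; return elapsed seconds."""
    source = io.BytesIO(data)  # assumption: stands in for the network stream
    start = time.time()
    with open(path, 'wb') as target:
        while True:
            block = source.read(blocksize)
            if not block:  # an empty read means the stream is exhausted
                break
            target.write(block)
    return time.time() - start


def compare_blocksizes(path, sizes=(2 ** 12, 2 ** 16, 2 ** 20),
                       total=2 * 2 ** 20):
    """Time the same copy once per block size; return {blocksize: seconds}."""
    data = b'\0' * total
    return dict((size, time_copy(data, path, size)) for size in sizes)


if __name__ == '__main__':
    fd, path = tempfile.mkstemp()  # hypothetical target; use an iPod path instead
    os.close(fd)
    try:
        for size, secs in sorted(compare_blocksizes(path).items()):
            print('%8d bytes/block: %.3f s' % (size, secs))
    finally:
        os.remove(path)
```

On a local disk the differences will probably be small; the interesting
numbers are the ones taken against the slow device.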

After some tweaking of the block size, I managed to get the DL speed up
to about 900 KB/s. It's still not quite Ubuntu, but it's a good order
of magnitude better. The new DL code is pretty much this:
"""
import urllib2

blocksize = 2 ** 16  # plus or minus a power of 2
source = urllib2.urlopen( 'url://string' )
target = open( pathname, 'wb' )
fullsize = float( source.info()['Content-Length'] )
DLd = 0
while DLd < fullsize:
    # copy one block from the network straight to the target file
    target.write( source.read( blocksize ) )
    DLd = DLd + blocksize
    # optional: write some DL progress info
    # somewhere, e.g. stdout
target.close()
source.close()
"""
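
For what it's worth, the standard library's shutil.copyfileobj() runs
the same kind of read/write loop and takes the block size as its third
argument, so the copy could probably be written more compactly if no
progress output is needed. A rough sketch, where save_stream is a
made-up helper name and the in-memory buffer only stands in for the
object returned by urllib2.urlopen():

```python
import io
import os
import shutil
import tempfile


def save_stream(source, pathname, blocksize=2 ** 16):
    """Copy a file-like source to pathname in blocksize-byte chunks.

    Returns the size of the written file in bytes.
    """
    with open(pathname, 'wb') as target:
        # copyfileobj reads and writes `blocksize` bytes at a time until
        # the source is exhausted, so no Content-Length header is needed.
        shutil.copyfileobj(source, target, blocksize)
    return os.path.getsize(pathname)


if __name__ == '__main__':
    fd, pathname = tempfile.mkstemp()  # hypothetical target path
    os.close(fd)
    try:
        payload = io.BytesIO(b'x' * 200000)  # assumption: stands in for the download
        print(save_stream(payload, pathname, 4096))
    finally:
        os.remove(pathname)
```

Whether this is any faster on the iPod than the hand-written loop is
untested; it only removes the bookkeeping around DLd and fullsize.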