python download

Tim Chase python.list at tim.thechases.com
Sat Feb 27 21:19:03 EST 2010


Aahz wrote:
> monkeys paw  <monkey at joemoney.net> wrote:
>> On 2/23/2010 3:17 PM, Tim Chase wrote:
>>> Sure you don't need this to be 'wb' instead of 'w'?
>> 'wb' does the trick. Thanks all!
>>
>> import urllib2
>> a = open('adobe.pdf', 'wb')
>> i = 0
>> for line in urllib2.urlopen('http://www.whirlpoolwaterheaters.com/downloads/6510413.pdf'):
>>     i = i + 1
>>     a.write(line)
> 
> Using a for loop here is still a BAD IDEA -- line could easily end up
> megabytes in size (though that is statistically unlikely).

Just so the OP has it, dealing with binary files without reading 
the entire content into memory would look something like

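   from urllib2 import urlopen
   CHUNK_SIZE = 1024*4  # 4k, why not?
   OUT_NAME = 'out.pdf'
   # the PDF from the quoted post
   URL = 'http://www.whirlpoolwaterheaters.com/downloads/6510413.pdf'
   a = open(OUT_NAME, 'wb')
   u = urlopen(URL)
   bytes_read = 0
   while True:
     data = u.read(CHUNK_SIZE)  # at most CHUNK_SIZE bytes per read
     if not data: break         # empty string means EOF
     a.write(data)
     bytes_read += len(data)
   u.close()
   a.close()  # make sure the last chunk is flushed to disk
   print "Wrote %i bytes to %s" % (
     bytes_read, OUT_NAME)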
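
For anyone on Python 3, where urllib2 has been folded into urllib.request, a rough equivalent of the same chunked copy might look like the sketch below. The helper names copy_in_chunks and download are made up for illustration, and the BytesIO lines at the bottom are only an offline sanity check standing in for a real URL.

```python
import io
from urllib.request import urlopen

CHUNK_SIZE = 1024 * 4  # 4k, same as above


def copy_in_chunks(src, dst, chunk_size=CHUNK_SIZE):
    """Copy src to dst at most chunk_size bytes at a time.

    Both arguments are binary file-like objects (hypothetical helper,
    not part of any library). Returns the total number of bytes copied.
    """
    bytes_read = 0
    while True:
        data = src.read(chunk_size)
        if not data:  # empty bytes object means EOF
            break
        dst.write(data)
        bytes_read += len(data)
    return bytes_read


def download(url, out_name):
    """Stream url into out_name without holding the whole body in memory."""
    # 'with' closes both the response and the file, even on error
    with urlopen(url) as u, open(out_name, 'wb') as a:
        return copy_in_chunks(u, a)


# Offline check with in-memory streams instead of a network call
src = io.BytesIO(b'x' * 10000)
dst = io.BytesIO()
n = copy_in_chunks(src, dst)
print("Wrote %i bytes" % n)
```

If the byte count isn't needed, shutil.copyfileobj(u, a) from the standard library does the same chunked copy in one call.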

-tkc


