How to overcome the incomplete download with urllib.urlretrieve ?

Matt Nordhoff mnordhoff at mattnordhoff.com
Mon Feb 18 10:47:28 EST 2008


This isn't super-helpful, but...

James Yu wrote:
> This is part of my code that invokes urllib.urlretrieve:
> 
>     for i in link2Visit:
>         localName = i.split('/')
>         i = i.replace(' ', '%20')

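You should use urllib.quote or urllib.quote_plus (the latter replaces
spaces with "+" instead of "%20", which is meant for query strings)
rather than a half-solution that only escapes one character. Quote just
the path part, though: run over the whole URL, quote() would also escape
the ":" after "http".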
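
A rough sketch of what I mean, assuming the URLs are otherwise well-formed
and not already escaped (an existing "%20" would get escaped a second
time). quote_url is a made-up helper name, and the imports try the Python 3
module locations first, falling back to the Python 2 ones:

```python
try:  # Python 3 locations
    from urllib.parse import quote, urlsplit, urlunsplit
except ImportError:  # Python 2 locations
    from urllib import quote
    from urlparse import urlsplit, urlunsplit


def quote_url(url):
    """Percent-escape only the path part of a URL (hypothetical helper)."""
    parts = urlsplit(url)
    # quote() leaves "/" alone by default, so the path structure survives;
    # scheme, host, query and fragment are passed through untouched.
    return urlunsplit((parts.scheme, parts.netloc, quote(parts.path),
                       parts.query, parts.fragment))
```

In the loop that would be something like i = quote_url(i) in place of the
replace() call.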

>         tgtPath = ['d:\\', 'work', 'python', 'grab_n_view']
>         localPath = ''
>         for j in tgtPath:
>             localPath = os.path.join(localPath, j)

You could do something like:

tgtPath = os.path.join('d:\\', 'work', 'python', 'grab_n_view')

(and do it once, before the loop, rather than on every iteration).

>         localPath = os.path.join(localPath, localName[-1])

What about using os.path.basename() instead of the localName business?
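
Putting the last two suggestions together, a minimal sketch. local_path_for
is a hypothetical helper, and I've used posixpath.basename here rather than
os.path.basename since URL paths always use "/" whatever the platform; on
your setup either should give the same result:

```python
import os
import posixpath

try:  # Python 3 location
    from urllib.parse import urlsplit
except ImportError:  # Python 2 location
    from urlparse import urlsplit

# Build the target directory once, outside the loop.
tgtPath = os.path.join('d:\\', 'work', 'python', 'grab_n_view')


def local_path_for(url, directory=tgtPath):
    """Return directory joined with the last path component of url."""
    # Take the basename of the URL's path only, so a query string
    # never ends up in the file name.
    name = posixpath.basename(urlsplit(url).path)
    return os.path.join(directory, name)
```

Compute the local name from the original URL, before escaping it, if you
want the file on disk to keep its spaces.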

>         info = urllib.urlretrieve(i, localPath)
> 
> link2Visit stores the url to some photos.
> After the script finishes running, I got a pile of incomplete JPG files,
> each takes only 413 bytes of my disk space.
> Did I miss something before using urllib.urlretrieve ?

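Try opening one of the files in a text editor. At 413 bytes each, they
are probably not image data at all; it could be a "404 Not Found" error
document or something. urlretrieve doesn't complain about HTTP errors,
it just saves whatever the server sends back.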
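
If you'd rather have the script check for you, here is a small sketch that
looks at the first two bytes of a downloaded file. JPEG data starts with
the bytes FF D8; an HTML error page won't. looks_like_jpeg is a made-up
name, and this is only a sanity check, not real image validation:

```python
def looks_like_jpeg(path):
    """Return True if the file starts with the JPEG start-of-image marker."""
    with open(path, 'rb') as f:
        # Every JPEG stream begins with the two bytes 0xFF 0xD8.
        return f.read(2) == b'\xff\xd8'
```

You could call it on localPath after each urlretrieve and print the URLs
that fail. urlretrieve also returns the response headers as the second item
of its result, so looking at the Content-Type there is another option.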