download complete webpage with python

Larry Bates larry.bates at websafe.com
Sat Dec 8 15:38:25 EST 2007


Gabriel Genellina wrote:
> On Fri, 07 Dec 2007 17:58:43 -0300, yi zhang <zhang1025 at yahoo.com> 
> wrote:
> 
>> urllib.urlretrieve() downloads only the HTML text of a webpage, 
>> not the images it references. How can I download the whole, complete 
>> webpage with Python? Thanks!
> 
> The images are separate from the html document. You have to parse the 
> html text, find the <img> tags, and retrieve them.
> 
Actually IMHO this is even more difficult than it sounds.  JavaScript can change 
the webpage after it loads, so images added by scripts never appear in the HTML 
you download.
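
For the static case, the parse-and-retrieve approach suggested above might 
look roughly like the sketch below. It assumes Python 3, where urlretrieve 
lives in urllib.request; the names ImgCollector, find_image_urls and 
download_page are made up for illustration, as are the "page" directory and 
the numbered file names. It only sees <img> tags present in the HTML as 
served, so anything a script adds later is missed, and it ignores CSS 
backgrounds, stylesheets and the like.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen, urlretrieve


class ImgCollector(HTMLParser):
    """Collect the src attribute of every <img> tag (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def find_image_urls(html, base_url):
    """Return absolute URLs for the <img> tags found in html."""
    collector = ImgCollector()
    collector.feed(html)
    # Relative src values are resolved against the page's own URL.
    return [urljoin(base_url, src) for src in collector.sources]


def download_page(url, dest_dir="page"):
    """Save the HTML and its images under dest_dir (hypothetical layout)."""
    os.makedirs(dest_dir, exist_ok=True)
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    with open(os.path.join(dest_dir, "index.html"), "w", encoding="utf-8") as f:
        f.write(html)
    for n, img_url in enumerate(find_image_urls(html, url)):
        parts = urlparse(img_url)
        if parts.scheme not in ("http", "https"):
            continue  # skip data: URIs and other non-HTTP sources
        name = os.path.basename(parts.path) or "image"
        # Prefix with a counter so identically named images do not collide.
        urlretrieve(img_url, os.path.join(dest_dir, "%d_%s" % (n, name)))
```

Calling download_page("http://example.com/") would write index.html plus the 
images it finds. The saved HTML still points at the original image URLs; 
rewriting those links to the local copies is left out of this sketch.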

Larry


