Downloading/Saving to a Directory
MRAB
python at mrabarnett.plus.com
Thu Nov 28 11:51:53 EST 2013
On 28/11/2013 15:19, TheRandomPast . wrote:
> Hi,
>
> I've created a script that allows me to see how many images are on a
> webpage and their URL however now I want to download all .jpg images
> from this website and save them onto my computer. I've never done this
> before and I've become a little confused as to where I should go next.
> Can some kind person take a look at my code and tell me if I'm
> completely in the wrong direction?
>
> Just to clarify what I want to do is download all .jpg images on
> dogpicturesite.com and save them to a
> directory on my computer.
>
> Sorry if this is a really stupid question.
>
> import traceback
> import sys
> from urllib import urlretrieve
>
> try:
>
>     print ' imagefiles()'
The regex matches only the names of the images. Try matching their
entire URLs.
>     images = re.findall(r'([-\w]+\.(?:jpg))', webpage)
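For example, something like this would capture whole URLs rather than
bare file names (just a sketch; it assumes the page's HTML is in the
string 'webpage', here filled with example markup, and that the image
URLs in the page are absolute):

import re

# 'webpage' is assumed to hold the page's HTML, e.g. from
# urllib2.urlopen(url).read(); this is just example markup.
webpage = '<img src="http://dogpicturesite.com/spaniel.jpg">'
image_urls = re.findall(r'(http://[^\s"\']+\.jpg)', webpage)
print image_urls    # ['http://dogpicturesite.com/spaniel.jpg']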
For each URL, download the image and save it into the folder. You can
make a path for each image by joining the path of the folder with the
name of the image (that's a hint: look in os.path).
>     urlretrieve('http://dogpicturesite.com/', 'C:/images')
>     print "Downloading Images....."
>     time.sleep(5)
>     print "Images Downloaded."
Don't use a 'bare' except. It swallows EVERY exception. Catch only what
you're willing to handle, and let the other exceptions just show
themselves.
> except:
> print "Failed to Download Images"
> raw_input('Press Enter to exit...')
> sys.exit()
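For example, you could catch only the errors you expect from the
download step (a sketch; in Python 2, urllib's urlretrieve reports
download failures as IOError, and ContentTooShortError is a subclass
of it; the URL and path here are made up):

import sys
from urllib import urlretrieve

url = 'http://dogpicturesite.com/spaniel.jpg'   # example URL
path = r'C:\images\spaniel.jpg'                 # example destination

try:
    urlretrieve(url, path)
except IOError as e:
    # Covers urllib's download errors, including ContentTooShortError.
    print "Failed to download %s: %s" % (url, e)
    raw_input('Press Enter to exit...')
    sys.exit(1)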
>
> def main():
>     sys.argv.append('http://dogpicturesite.com/')
>     if len(sys.argv) != 2:
>         print '[-] Image Files'
>         return
>     page = webpage.webpage(sys.argv[1])
>     imagefiles(webpage)
>