Retrieve URLs of all JPEGs at a web page URL
Chris Rebert
clp2 at rebertia.com
Tue Sep 15 19:43:23 EDT 2009
On Tue, Sep 15, 2009 at 7:28 AM, grimmus <graham.colmer at gmail.com> wrote:
> Hi,
>
> I would like to achieve something like what Facebook does when you post a
> link: it shows the images located at the URL you entered so you can choose
> which one to display as a summary.
>
> I was thinking I could loop through the HTML of a page with a regex
> and store all the JPEG URLs in an array. Then, I could open the
> images one by one and save them as thumbnails with something like
> below.
0. Install BeautifulSoup (http://www.crummy.com/software/BeautifulSoup/)
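1. Fetch the page and walk its <img> tags:
    # untested
    from contextlib import closing
    from BeautifulSoup import BeautifulSoup
    import urllib

    page_url = "http://the.url.here"
    # Python 2's urllib.urlopen() result is not a context manager, hence closing()
    with closing(urllib.urlopen(page_url)) as f:
        soup = BeautifulSoup(f.read())
    for img_tag in soup.findAll("img"):
        relative_url = img_tag.get("src")  # tag attributes are read dict-style
        if not relative_url:
            continue  # skip <img> tags that have no src attribute
        img_url = make_absolute(relative_url, page_url)
        save_image_from_url(img_url)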
2. Write make_absolute() and save_image_from_url()
3. Profit.
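
For step 2, one possible sketch of the two helpers is below. It is only an illustration under a few assumptions: URLs are resolved with the standard library's urljoin(), images are saved under the last path segment of their URL, and a JPEG is recognized purely by its file extension. The names is_jpeg(), local_name(), and the dest_dir parameter are hypothetical additions, not part of the original suggestion, and the download part has not been tried against real pages.

```python
import os

try:  # Python 3
    from urllib.parse import urljoin, urlparse
    from urllib.request import urlretrieve
except ImportError:  # Python 2, as used in the snippet above
    from urlparse import urljoin, urlparse
    from urllib import urlretrieve


def make_absolute(relative_url, page_url):
    """Resolve an <img> src value against the URL of the page it came from."""
    return urljoin(page_url, relative_url)


def is_jpeg(img_url):
    """Guess from the path's extension whether the URL points at a JPEG (hypothetical helper)."""
    path = urlparse(img_url).path.lower()
    return path.endswith((".jpg", ".jpeg"))


def local_name(img_url):
    """Derive a local file name from the last path segment of the URL (hypothetical helper)."""
    return os.path.basename(urlparse(img_url).path) or "image.jpg"


def save_image_from_url(img_url, dest_dir="."):
    """Download img_url into dest_dir and return the path that was written."""
    dest_path = os.path.join(dest_dir, local_name(img_url))
    urlretrieve(img_url, dest_path)
    return dest_path
```

To keep only JPEGs, the loop in step 1 could check is_jpeg(img_url) before calling save_image_from_url(img_url). An extension check will miss JPEGs served from extensionless URLs, so inspecting the response's Content-Type header may be the more reliable test.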
Cheers,
Chris
--
http://blog.rebertia.com