[Tutor] weather scraping with Beautiful Soup

Che M pine508 at hotmail.com
Fri Jul 17 19:50:42 CEST 2009




> Date: Fri, 17 Jul 2009 08:09:10 -0400
> Subject: Re: [Tutor] weather scraping with Beautiful Soup
> From: kent37 at tds.net
> To: pine508 at hotmail.com
> CC: tutor at python.org
> 
> On Thu, Jul 16, 2009 at 11:21 PM, Che M<pine508 at hotmail.com> wrote:
> > Hi,
> >
> > I am interested in gathering simple weather data using Beautiful Soup, but
> > am having trouble understanding what I'm doing.  I have searched the
> > archives and so far haven't found enough to get me moving forward.
> >
> > Basically I am trying to start off this example:
> >
> > Grabbing Weather Underground Data with BeautifulSoup
> > http://flowingdata.com/2007/07/09/grabbing-weather-underground-data-with-beautifulsoup/
> >
> > But I get to the exact same problem that this other person got to in this
> > post:
> > http://groups.google.com/group/beautifulsoup/browse_thread/thread/13eb3dbf713b8a4a
> >
> > Unfortunately, that post never gives enough help for me to understand how to
> > solve that person's or my problem.
> >
> > What I want to understand is how to find the bits of data you want--in this
> > case, say, today's average temperature and whether it was clear or
> > cloudy--within a web page, and then indicate that to Beautiful Soup.
> 
> One thing that might help is to use the Lite page, if you are not
> already. It has much less formatting and extraneous information to
> wade through. 


I was not aware Weather Underground had a Lite page; thank you, that
is good to know.  It was easier to figure things out in that HTML.

I am getting closer, but still a bit stuck.  Here is my code for the Lite page:

------------
import urllib2
from BeautifulSoup import BeautifulSoup

url = "http://www.wund.com/cgi-bin/findweather/getForecast?query=Worthington%2C+OH"
page = urllib2.urlopen(url)

soup = BeautifulSoup(page)
daytemp = soup.find("div",id="main").findNext("h3").renderContents()

print "Today's temperature in Worthington is: ", daytemp
-------------

This works, but gives this output:

>>> 
Today's temperature in Worthington is:  
<span>75</span>&nbsp;&#176;F

Of course, I just want the 75, not the HTML tags, etc. around it.  But I am not sure
how to indicate that in Beautiful Soup.  So, for example, if I change the soup.find
line above to this (to incorporate the <span>):

daytemp = soup.find("div",id="main").findNext("h3", "span").renderContents()

then I get the following error:

AttributeError: 'NoneType' object has no attribute 'renderContents'
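
A likely explanation, sketched here under some assumptions: in Beautiful Soup
a plain string passed as the second positional argument is matched against the
tag's class attribute, so findNext("h3", "span") searches for <h3 class="span">
rather than for a <span> inside an <h3>. Nothing matches, the call returns
None, and calling renderContents() on None raises the AttributeError. The
sketch below uses the newer bs4 package on Python 3 (where findNext is spelled
find_next) and a small hypothetical HTML snippet standing in for the Lite page,
so it is an illustration and not the exact page markup.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the relevant part of the Lite page (assumption).
HTML = """
<div id="main">
  <h3><span>75</span>&nbsp;&#176;F</h3>
</div>
"""

soup = BeautifulSoup(HTML, "html.parser")
main = soup.find("div", id="main")

# A string in the second position is treated as a class filter, so this
# looks for <h3 class="span"> and finds nothing.
missing = main.find_next("h3", "span")
print(missing)  # None

# Chaining two searches reaches the <span> that follows the <h3>.
span = main.find_next("h3").find_next("span")
print(span.string)
```

Chaining the two searches keeps each step simple: first locate the heading,
then locate the tag of interest after it.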

(I also don't understand the point of having a <span> tag with no style
content in the page.)

Any help is appreciated.  This still feels kind of arcane, but I want to understand
the general approach to doing this, as later I want to try other weather facts
or screen scraping generally.
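
One way to picture the general approach is: find a stable landmark in the page,
navigate from it to the tag that holds the value, take only the text of that
tag, and convert it, checking for None at each step because the page layout can
change. The sketch below assumes the bs4 package on Python 3; the function name
current_temperature and the SAMPLE markup are hypothetical, chosen only to
mirror the <div id="main"> and <h3><span> structure seen in the output above.

```python
from bs4 import BeautifulSoup


def current_temperature(html):
    """Return the number in the first <h3><span> after div#main, or None."""
    soup = BeautifulSoup(html, "html.parser")

    # Step 1: find a stable landmark in the page.
    main = soup.find("div", id="main")
    if main is None:
        return None

    # Step 2: navigate from the landmark to the tag holding the value.
    heading = main.find_next("h3")
    if heading is None or heading.span is None:
        return None

    # Step 3: keep only the text inside the tag, not the markup around it.
    text = heading.span.get_text(strip=True)

    # Step 4: convert, and tolerate unexpected content.
    try:
        return int(text)
    except ValueError:
        return None


# Hypothetical sample markup (assumption), not the live page.
SAMPLE = '<div id="main"><h3><span>75</span>&nbsp;&#176;F</h3></div>'
print(current_temperature(SAMPLE))
```

Returning None instead of raising makes it easier to reuse the same pattern
for other values on the page, such as the sky conditions.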

Thanks.
CM




> You might also look for a site that has weather data
> formatted for computers. For example, NOAA has forecast data
> available as plain text:
> http://forecast.weather.gov/product.php?site=NWS&issuedby=BOX&product=CCF&format=txt&version=1&glossary=0
> 
> Kent
