[Tutor] reading web page with BeautifulSoup
Ed Owens
eowens0124 at gmx.com
Thu Dec 13 03:11:56 CET 2012
On 12/12/12 9:03 PM, Dave Angel wrote:
> On 12/12/2012 08:47 PM, Ed Owens wrote:
>> >>> from urllib2 import urlopen
>> >>> page = urlopen('w1.weather.gov/obhistory/KDCA.html')
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>>   File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 126, in urlopen
>>     return _opener.open(url, data, timeout)
>>   File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 386, in open
>>     protocol = req.get_type()
>>   File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 248, in get_type
>>     raise ValueError, "unknown url type: %s" % self.__original
>> ValueError: unknown url type: w1.weather.gov/obhistory/KDCA.html
>> Can anyone see what I'm doing wrong here? I have bs4 and urllib2
>> imported, and get the above error when trying to read that page. I
>> can copy the url from the error message into my browser and get the page.
> Like the error says, the URL type is unknown. Prepend the scheme (here,
> http://) to the URL, and it should work fine:
>
> page = urlopen('http://w1.weather.gov/obhistory/KDCA.html')
>
Yep, that was it. Thanks for the help. Now on to fight with BeautifulSoup.
Ed
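
For anyone who hits the same ValueError: a browser quietly adds "http://" to a bare address, but urlopen does not, so the URL has to carry its scheme explicitly. Below is a minimal sketch of a guard for that. The helper name ensure_scheme and its default of "http" are assumptions made up for illustration and are not part of urllib2 or bs4. It only checks for a leading "scheme://" prefix, so it will not recognize forms such as "mailto:".

```python
import re

# Matches a leading scheme such as "http://" or "ftp://".
_SCHEME_RE = re.compile(r'^[A-Za-z][A-Za-z0-9+.-]*://')


def ensure_scheme(url, default="http"):
    """Return url unchanged if it starts with a scheme, else prepend default.

    Hypothetical helper for illustration; 'default' is an assumed parameter.
    """
    if _SCHEME_RE.match(url):
        return url
    return "%s://%s" % (default, url)


# Intended use with the call from this thread (not executed here, since it
# needs network access):
#     page = urlopen(ensure_scheme('w1.weather.gov/obhistory/KDCA.html'))
```

The helper itself has no Python 2 or 3 specific code, so it should work with urllib2.urlopen as used above or with urllib.request.urlopen on Python 3.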