[Tutor] using BeautifulSoup
jonasmg at softhome.net
Mon Mar 27 22:55:31 CEST 2006
jonasmg at softhome.net writes:
>>jonasmg at softhome.net wrote:
>>> Hi!
>>>
>>> I'm trying to use BeautifulSoup to get data from a table (on the right) at:
>>> http://en.wikipedia.org/wiki/United_states
>>>
>>> For example, I would like to get the value for 'Calling code', which should be '+1'.
>>>
>>> ----------------------
>>>
>>> import urllib2
>>> from BeautifulSoup import BeautifulSoup
>>>
>>> url="http://en.wikipedia.org/wiki/United_states"
>>> html = urllib2.urlopen(url).read()
>>> soup = BeautifulSoup()
>>> soup.feed(html)
>
>> You just have to find some kind of ad hoc search that gets you to where
>> you want to be. I would try something like this:
>
>> anchor = soup.fetch('a', dict(href="/wiki/List_of_country_calling_codes"))
>
>> code = anchor.findNext('code')
>> print code.string
>
>> Presumably you want this to work for other country pages as well; you
>> will have to look at the source, see what they have in common and search
>> on that.
>
>> Kent
>
> anchor.findNext('code') fails:
>
> anchor = soup.fetch('a', {'href': '/wiki/List_of_country_calling_codes'})
> print anchor
>
> [<a href="/wiki/List_of_country_calling_codes" title="List of country
> calling codes">Calling code</a>]
>
> anchor.findNext('code')
> []
>
> P.S.: Sorry about my last email; I used the wrong subject.
Solution: use findChild instead of fetch. fetch returns a list of matches (note the brackets in the output above), while findChild returns a single tag:
anchor = soup.findChild('a',
                        dict(href="/wiki/List_of_country_calling_codes"))
print anchor.findNext('code')
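
For anyone who cannot rely on a particular BeautifulSoup version, the same ad hoc search (find the 'Calling code' link, then take the next <code> element) can be sketched with only the standard library. This is a minimal Python 3 sketch, not the method used in this thread: the names NextCodeAfterLink, calling_code and SAMPLE are hypothetical, the SAMPLE markup is made up to resemble the anchor printed above, and it assumes the value sits in the first <code> element after the link.

```python
from html.parser import HTMLParser


class NextCodeAfterLink(HTMLParser):
    """Collect the text of the first <code> element that follows a given link."""

    def __init__(self, href):
        super().__init__()
        self.href = href          # href of the link to look for
        self.seen_anchor = False  # True once the matching <a> has been seen
        self.in_code = False      # True while inside the <code> we want
        self.done = False         # True once that <code> has been closed
        self.parts = []           # text fragments found inside the <code>

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if tag == "a" and dict(attrs).get("href") == self.href:
            self.seen_anchor = True
        elif tag == "code" and self.seen_anchor:
            self.in_code = True

    def handle_data(self, data):
        if self.in_code:
            self.parts.append(data)

    def handle_endtag(self, tag):
        if tag == "code" and self.in_code:
            self.in_code = False
            self.done = True


def calling_code(html, href="/wiki/List_of_country_calling_codes"):
    """Return the text of the first <code> after the link, or None."""
    parser = NextCodeAfterLink(href)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip() or None


# Made-up fragment (an assumption, not the real Wikipedia markup).
SAMPLE = """
<table>
<tr><th><a href="/wiki/List_of_country_calling_codes"
 title="List of country calling codes">Calling code</a></th>
<td><code>+1</code></td></tr>
</table>
"""

print(calling_code(SAMPLE))
```

As with the findChild version, this only works while the page keeps the value in a <code> element after that link; for other country pages the source would have to be checked for the same pattern.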