[Tutor] using BeautifulSoup

jonasmg at softhome.net jonasmg at softhome.net
Mon Mar 27 22:55:31 CEST 2006


jonasmg at softhome.net writes: 

>>jonasmg at softhome.net wrote:
>>> Hi!    
>>> 
>>> I'm trying to use BeautifulSoup to get data from a table (on the right) at:
>>> http://en.wikipedia.org/wiki/United_states    
>>> 
>>> For example, I would like to get the value of 'Calling code', which should be '+1'.
>>> 
>>>  ----------------------    
>>> 
>>> import urllib2
>>> from BeautifulSoup import BeautifulSoup    
>>> 
>>> url="http://en.wikipedia.org/wiki/United_states"
>>> html = urllib2.urlopen(url).read()
>>> soup = BeautifulSoup()
>>> soup.feed(html) 
> 
>> You just have to find some kind of ad hoc search that gets you to where 
>> you want to be. I would try something like this:
> 
>> anchor = soup.fetch('a', dict(href="/wiki/List_of_country_calling_codes"))
> 
>> code = anchor.findNext('code')
>> print code.string
> 
>> Presumably you want this to work for other country pages as well; you 
>> will have to look at the source, see what they have in common and search 
>> on that.
> 
>> Kent
> 
> anchor.findNext('code') fails:  
> 
> anchor = soup.fetch('a', {'href': '/wiki/List_of_country_calling_codes'})
> print anchor  
> 
>  [<a href="/wiki/List_of_country_calling_codes" title="List of country
> calling codes">Calling code</a>]  
> 
> anchor.findNext('code')
> []  
> 
> P.S.: Sorry about my last email, I got the subject wrong.

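Solution: use findChild instead of fetch. fetch returns a list of matches (note the brackets in the output above), while findChild returns a single tag, which is what findNext needs to be called on: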

anchor = soup.findChild('a', dict(href="/wiki/List_of_country_calling_codes"))

print anchor.findNext('code') 
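
For anyone reading this later with a newer BeautifulSoup: below is a minimal sketch of the same idea, assuming the bs4 package is installed (it is not the module used in this thread). The HTML snippet and the calling_code helper are made up for illustration; the real Wikipedia markup may well differ. In bs4, find_all plays the role of fetch (it returns a list of matches) and find returns a single tag, on which find_next can be called.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the infobox row; the real page markup may differ.
HTML = """
<table>
  <tr>
    <td><a href="/wiki/List_of_country_calling_codes">Calling code</a></td>
    <td><code>+1</code></td>
  </tr>
</table>
"""

HREF = "/wiki/List_of_country_calling_codes"


def calling_code(html):
    """Return the text of the first <code> tag after the calling-code link."""
    soup = BeautifulSoup(html, "html.parser")
    # find() returns the first matching tag (or None), not a list.
    anchor = soup.find("a", href=HREF)
    if anchor is None:
        return None
    # find_next() searches forward in the document from the anchor.
    code = anchor.find_next("code")
    return code.string if code is not None else None


# find_all() returns a list of matches, so pick one element before
# calling find_next() on it.
matches = BeautifulSoup(HTML, "html.parser").find_all("a", href=HREF)
first_match_code = matches[0].find_next("code").string

print(calling_code(HTML))
```

Returning None when the link or the code tag is missing is just one possible choice; it keeps the helper usable on country pages that lack this row.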

