BeautifulSoup
Kay Schluehr
kay.schluehr at gmx.net
Thu Oct 30 02:39:08 EDT 2008
On 29 Oct., 17:45, luca72 <lucabe... at libero.it> wrote:
> Hello,
> I am trying to use BeautifulSoup.
> I have this:
> sito = urllib.urlopen('http://www.prova.com/')
> esamino = BeautifulSoup(sito)
> luca = esamino.findAll('tr', align='center')
>
> print luca[0]
>
> >><tr align="center"><th width="5%"><a onclick="t('Only|G|BoT|05','#1');" href="#">#1</a></th><td width="10%">44.4MB</td><td width="90%" align="left"><font color="orange"> Pc-prova.rar </font></td></tr>
>
> I need to get the following information:
> 1)Only|G|BoT|05
> 2)#1
> 3)44.4MB
> 4)Pc-prova.rar
> With print luca[0].a.string I get #1
> With print luca[0].td.string I get 44.4MB
> Can you explain how to get the other two values?
> Thanks
> Luca
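The same way you got `luca`. Note that findAll returns a list, so
work on luca[0], the row you printed:
1,2) luca[0].find("a")["onclick"].split("'") and pick the pieces you
need from the resulting list (they are at index 1 and index 3)
3) luca[0].find("td").string
4) luca[0].find("font").string (add .strip() to drop the surrounding
spaces)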
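
For 1) and 2), here is a minimal sketch of the split step on its own. It
uses only the onclick value copied from the row in your output, so it
runs without BeautifulSoup; the variable names (onclick, parts, name,
number) are just for illustration.

```python
# The onclick value copied from the <tr> shown in the question.
onclick = "t('Only|G|BoT|05','#1');"

# Splitting on the single quote leaves the quoted arguments
# at the odd indices of the resulting list.
parts = onclick.split("'")

name = parts[1]    # first quoted argument
number = parts[3]  # second quoted argument

print(name)
print(number)
```

This relies on the onclick attribute always having the form t('...','...');
if the arguments themselves could contain a single quote, a regular
expression would be a safer choice.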