BeautifulSoup

Brian Brian.Mingus at colorado.edu
Wed Sep 2 03:37:14 EDT 2009


>>> from BeautifulSoup import BeautifulSoup
>>> soup = BeautifulSoup("""<area coords="427,724,432,732" href="http://BioCyc.org/ECOLI/NEW-IMAGE?
... type=GENE-IN-CHROM-BROWSER&object=EG12309" onmouseover="return
... overlib('<b>Gene:</b> yjtD<BR><b>Product:</
... b> predicted rRNA methyltransferase, subunit of predicted rRNA
... methyltransferase<BR><b>Intergenic distances (bp):</
... b> yjjY< +400 yjtD +214 >thrL');"><b>Gene:</b> yjtD<br /
... ><b>Product:</b> predicted rRNA methyltransferase, subunit of
... predicted rRNA methyltransferase<br /><b>Intergenic distances (bp):</
... b> yjjY< +400 yjtD +214 >thrL');" onmouseout="return nd();">
... </area>""")
>>> soup.area["href"]
u'http://BioCyc.org/ECOLI/NEW-IMAGE?\ntype=GENE-IN-CHROM-BROWSER&object=EG12309'
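
Note that the value returned above still contains the line break from the original markup (the "\n" after the "?"). If that line break is only an artifact of how the HTML was wrapped, it can be stripped before the URL is used. The following is a minimal sketch under that assumption; the helper name clean_href is hypothetical and not part of BeautifulSoup, and it removes every whitespace character, which is only appropriate when the URL is not supposed to contain any.

```python
def clean_href(href):
    """Remove whitespace (including line breaks) embedded in an attribute value.

    Hypothetical helper: assumes any whitespace inside the URL is a
    wrapping artifact and can be dropped entirely.
    """
    # str.split() with no arguments splits on any run of whitespace,
    # so joining the pieces drops every space, tab, and newline.
    return "".join(href.split())


# The value as returned by soup.area["href"] in the session above.
raw = "http://BioCyc.org/ECOLI/NEW-IMAGE?\ntype=GENE-IN-CHROM-BROWSER&object=EG12309"
cleaned = clean_href(raw)
print(cleaned)
```

In the session above this would be called as clean_href(soup.area["href"]).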


On Wed, Sep 2, 2009 at 1:25 AM, elsa <kerensaelise at hotmail.com> wrote:

> Hi all,
>
> if I have some HTML that looks like this:
>
> <area coords="427,724,432,732" href="http://BioCyc.org/ECOLI/NEW-IMAGE?
> type=GENE-IN-CHROM-BROWSER&object=EG12309" onmouseover="return
> overlib('<b>Gene:</b> yjtD<BR><b>Product:</
> b> predicted rRNA methyltransferase, subunit of predicted rRNA
> methyltransferase<BR><b>Intergenic distances (bp):</
> b> yjjY< +400 yjtD +214 >thrL');"><b>Gene:</b> yjtD<br /
> ><b>Product:</b> predicted rRNA methyltransferase, subunit of
> predicted rRNA methyltransferase<br /><b>Intergenic distances (bp):</
> b> yjjY< +400 yjtD +214 >thrL');" onmouseout="return nd();">
> </area>
>
> is there an easy way to use BeautifulSoup to extract just the value of
> the href attribute?
>
> Thanks,
>
> elsa