Using Xpath to parse a Yahoo Finance page

Jason Hsu jhsu802701 at
Mon Dec 3 02:23:13 CET 2012

I'm trying to extract the data on "total assets" from Yahoo Finance using Python 2.7 and lxml. 

Here is the test script I set up to isolate the issue:

    import urllib
    import lxml.html

    url_local1 = "" 
    result1 = urllib.urlopen(url_local1)
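    # Assumption: the right-hand side is missing from the post; reading the response body is the likely intent.
    element_html1 = result1.read()
    doc1 = lxml.html.document_fromstring(element_html1)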
    list_row1 = doc1.xpath(u'.//th[div[text()="Total Assets"]]/following-sibling::td/text()')
    print list_row1

    url_local2 = "" 
    result2 = urllib.urlopen(url_local2)
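    # Assumption: the right-hand side is missing from the post; reading the response body is the likely intent.
    element_html2 = result2.read()
    doc2 = lxml.html.document_fromstring(element_html2)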
    list_row2 = doc2.xpath(u'.//td[strong[text()="Total Assets"]]/following-sibling::td/strong/text()')
    print list_row2

I'm able to get the row of data on total assets from the Smartmoney page, but I get just an empty list when I try to parse the Yahoo Finance page.
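
One possible cause worth ruling out, offered only as a guess: if the label text in the page is padded with newlines or spaces, an exact text()="Total Assets" comparison will not match, and the query returns an empty list. The sketch below reproduces that on made-up markup (SAMPLE_HTML and the cell values 100 and 200 are hypothetical, not taken from either real page) and compares the exact match against the XPath 1.0 function normalize-space().

```python
import lxml.html

# Hypothetical markup, NOT copied from either real page: the label text is
# surrounded by newlines and spaces, and the cell values are placeholders.
SAMPLE_HTML = """
<table>
  <tr>
    <td><strong>
      Total Assets
    </strong></td>
    <td><strong>100</strong></td>
    <td><strong>200</strong></td>
  </tr>
</table>
"""

doc = lxml.html.document_fromstring(SAMPLE_HTML)

# Exact comparison: the text node still contains the surrounding whitespace,
# so it is not equal to the bare string "Total Assets".
exact = doc.xpath('.//td[strong[text()="Total Assets"]]'
                  '/following-sibling::td/strong/text()')

# normalize-space() strips leading/trailing whitespace and collapses internal
# runs of whitespace before the comparison is made.
normalized = doc.xpath('.//td[strong[normalize-space(text())="Total Assets"]]'
                       '/following-sibling::td/strong/text()')

print(exact)
print(normalized)
```

On this sample the first query comes back empty and the second returns the two cell values. Whether the real page behaves the same way depends on its actual markup, which can be checked by printing repr() of the text nodes returned by a looser query such as .//strong/text().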
