Issues with a longer xpath expression
python
mailtomanage at 163.com
Thu Feb 21 20:24:09 EST 2013
I am having issues with the urllib and lxml.html modules.
Here is my original code:
import urllib
import lxml.html
down='http://v.163.com/special/visualizingdata/'
file=urllib.urlopen(down).read()
root=lxml.html.document_fromstring(file)
xpath_str="//div[@class='down s-fc3 f-fl']/a"
urllist=root.xpath(xpath_str)
for url in urllist:
    print url.get("href")
When run, it returns this output:
http://mov.bn.netease.com/movieMP4/2012/12/A/7/S8H1TH9A7.mp4
http://mov.bn.netease.com/movieMP4/2012/12/D/9/S8H1ULCD9.mp4
http://mov.bn.netease.com/movieMP4/2012/12/4/P/S8H1UUH4P.mp4
http://mov.bn.netease.com/movieMP4/2012/12/B/V/S8H1V8RBV.mp4
http://mov.bn.netease.com/movieMP4/2012/12/6/E/S8H1VIF6E.mp4
http://mov.bn.netease.com/movieMP4/2012/12/B/G/S8H1VQ2BG.mp4
But, when I change the line
xpath_str='//div[@class="down s-fc3 f-fl"]//a'
into
xpath_str='//div[@class="col f-cb"]//div[@class="down s-fc3 f-fl"]//a'
that is to say,
urllist=root.xpath('//div[@class="col f-cb"]//div[@class="down s-fc3 f-fl"]//a')
I do not receive any output. What is the flaw in this code?
It is strange that the shorter expression works but the longer one does not, even though they have the same XPath structure!
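One way to narrow this down is to count what each step of the longer expression matches on its own. The sketch below is only an illustration under stated assumptions: the markup in SAMPLE is hypothetical (it is not the real page), and it uses the standard library's xml.etree.ElementTree instead of lxml so that it runs without extra installs. It assumes one possible cause, namely that the outer div carries an extra class, in which case the exact comparison @class="col f-cb" matches nothing. The helper names count_matches, has_classes and links_under are made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup, NOT the real page: the outer div has an extra class
# token, so its class attribute is not exactly "col f-cb".
SAMPLE = """
<html><body>
  <div class="col f-cb extra">
    <div class="down s-fc3 f-fl"><a href="http://example.com/a.mp4">a</a></div>
    <div class="down s-fc3 f-fl"><a href="http://example.com/b.mp4">b</a></div>
  </div>
</body></html>
""".strip()

def count_matches(root, path):
    """Return how many elements a path matches, to locate the failing step."""
    return len(root.findall(path))

def has_classes(element, *names):
    """True if every name is one of the element's whitespace-separated classes."""
    tokens = element.get("class", "").split()
    return all(name in tokens for name in names)

def links_under(root, outer_classes, inner_classes):
    """Collect href values of <a> tags under divs matched by class token."""
    hrefs = []
    for outer in root.iter("div"):
        if not has_classes(outer, *outer_classes):
            continue
        for inner in outer.iter("div"):
            if inner is not outer and has_classes(inner, *inner_classes):
                hrefs.extend(a.get("href") for a in inner.iter("a"))
    return hrefs

root = ET.fromstring(SAMPLE)

# Test each step separately: the step that returns 0 is the one to inspect.
inner_only = count_matches(root, ".//div[@class='down s-fc3 f-fl']")
outer_only = count_matches(root, ".//div[@class='col f-cb']")

# Matching by class token instead of by the exact attribute string.
hrefs = links_under(root, ("col", "f-cb"), ("down", "s-fc3", "f-fl"))
```

If the outer step alone returns nothing on the real page, printing the class attribute of the divs that do exist there should show how it differs from "col f-cb". The same step-by-step check can be done with root.xpath in lxml.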