Hello,
I am trying to parse the page http://www.liqucn.com.
My Python code:
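# coding=utf-8
from lxml import etree  # tostring lives in lxml.etree, not in the lxml package
import requests

# Download the page and parse it with lxml's HTML parser
resp = requests.get('http://www.liqucn.com')
html = resp.text
page = etree.HTML(html)
# Serialize the parsed tree back to markup
print(etree.tostring(page))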
#-------------------------------------
I found that a lot of the HTML is missing from the output after '<span class="app_ico"><a
href="http://www.liqucn.com/yx/17773.shtml" target="_blank"
lz_src="http://images.liqucn.com/mini/60x60/h005/h79/img201208220736340_60x60.png
" '. I am very confused by this result. Could you open this URL, view the
source code, and then run my code? Thank you!
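One way to narrow this down might be to look at what the parser itself reports. The sketch below is only an illustration, not a diagnosis: the `sample` bytes are a made-up stand-in for the real page, and the idea of passing bytes (for example `resp.content` instead of `resp.text`) so the parser can read the encoding from the document is an assumption about a possible cause.

```python
from lxml import etree

# Hypothetical stand-in for the downloaded page (not the real site content).
# With requests, the bytes in resp.content could be passed here instead.
sample = (
    b'<html><body><span class="app_ico">'
    b'<a href="http://www.liqucn.com/yx/17773.shtml" target="_blank">link</a>'
    b'</span><p>after</p></body></html>'
)

# Use an explicit parser object so its error log can be inspected afterwards.
parser = etree.HTMLParser()
page = etree.HTML(sample, parser)
serialized = etree.tostring(page)

# Each log entry records where libxml2 had to recover from broken markup.
for entry in parser.error_log:
    print(entry.line, entry.column, entry.message)

print(serialized)
```

If the log shows an error near the point where the output stops, that position in the page source is a good place to look first.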
Hi all,
I have started to port[0] the documentation to sphinx. I read[1] on
the mailing list that this is a task where help would be
appreciated.
Because the documentation is already in rst format, an initial build
with sphinx was easy and the result[2] looks quite promising.
The next steps will be:
* remove markup specific to the current build system and replace it
with sphinx idioms
* fix PDF generation
* use sphinx for API documentation (with sphinx autodoc where possible)
* theme/layout
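
To give an idea of the autodoc, PDF and theme steps, a minimal conf.py could look roughly like the sketch below. This is only an illustration and not the configuration used in the port[0]: the root document name, the theme, and the LaTeX title and author strings are all assumptions.

```python
# conf.py: minimal sketch, values marked "assumed" are not from the actual port

project = "lxml"

# autodoc pulls API documentation from the docstrings of importable modules
extensions = [
    "sphinx.ext.autodoc",
]

# assumed name of the root .rst document (without the extension)
master_doc = "index"

# assumed placeholder theme; the theme/layout step would replace this
html_theme = "alabaster"

# PDF output goes through the LaTeX builder:
# (start document, target .tex file, title, author, document class)
latex_documents = [
    (master_doc, "lxml.tex", "lxml Documentation", "lxml developers", "manual"),
]
```

With autodoc enabled, an .rst file can then use a directive such as `.. automodule:: lxml.etree` with the `:members:` option, provided the module can be imported at build time.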
If you have additional ideas or things I should pay attention to,
please let me know.
Peter
[0]: https://github.com/hoffmann/lxml
[1]: http://thread.gmane.org/gmane.comp.python.lxml.devel/5821/focus=5822
[2]: http://vps.peter-hoffmann.com/lxml-sphinx/