Alexander Kozlovsky wrote:
Hello all!
I'm very new to lxml. I think I may have found a bug.
AFAIK, lxml does not expose a direct interface to CDATA sections. But when I use the etree.HTML function, I get the content of <script> as a CDATA section!
>>> html = etree.HTML('<script> alert("Hello!"); </script>')
>>> etree.tostring(html)
'<html><head><script><![CDATA[ alert("Hello!"); ]]></script></head></html>'
The problem is that I cannot retrieve the content of the <script> tag, because lxml does not expose it:
>>> script = html.find('.//script')
>>> len(script)
0
>>> print script.text
None

EXPECTED:
>>> print script.text
alert("Hello!");
Is this really a bug, or am I misunderstanding something?
This is a bug in libxml2. It has been fixed in the latest version (nightly build?), so updating should resolve it.

--
Ian Bicking | ianb@colorstudy.com | http://blog.ianbicking.org
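
For anyone checking whether their installation is affected, here is a minimal sketch. It assumes a reasonably recent lxml, which exposes the libxml2 version as etree.LIBXML_VERSION and etree.LIBXML_COMPILED_VERSION, and it uses the print function rather than the print statement from the session above. The thread does not say which libxml2 release contains the fix, so the sketch only reports the version and repeats the original test.

```python
from lxml import etree

# libxml2 version lxml is running against, and the one it was compiled against.
# Both are (major, minor, patch) tuples.
print("libxml2 (runtime):  %d.%d.%d" % etree.LIBXML_VERSION)
print("libxml2 (compiled): %d.%d.%d" % etree.LIBXML_COMPILED_VERSION)

# Repeat the test from the original report.
html = etree.HTML('<script> alert("Hello!"); </script>')
script = html.find('.//script')

# On an affected libxml2 this is None; with the fix it should be the script body.
script_text = script.text
print(repr(script_text))
```

If the last line prints None, the libxml2 in use still wraps the script content in a CDATA node that lxml does not surface through .text.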