What is the difference between etree.XML and etree.HTML?
Hi,

In the following example, XML and HTML work equally well. Does anybody have an example showing when they behave differently? Thanks.

    from lxml import etree
    tree = etree.XML('<foo><bar>abc</bar></foo>')
    tree = etree.HTML('<foo><bar>abc</bar></foo>')
    print(type(tree))
    r = tree.xpath('//bar')
    print([x.tag for x in r])

--
Regards,
Peng
On 20.12.2017 at 00:55, Peng Yu wrote:
    from lxml import etree
    tree = etree.XML('<html><p>abc</html>')
    print(type(tree))
    r = tree.xpath('//p')
    print([x.tag for x in r])

gives:

    line 1, column 20

whereas

    from lxml import etree
    tree = etree.HTML('<html><p>abc</html>')
    print(type(tree))
    r = tree.xpath('//p')
    print([x.tag for x in r])

gives
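For readers who want to try the comparison themselves, here is a minimal sketch that runs both inputs from this thread through both parsers. It assumes lxml is installed; the variable names are illustrative only, and the exact wording of the error message may vary with the lxml/libxml2 version.

```python
from lxml import etree

well_formed = '<foo><bar>abc</bar></foo>'
malformed = '<html><p>abc</html>'  # <p> is never closed

# Well-formed input: both parsers succeed, but the trees differ.
# etree.XML keeps <foo> as the root, while etree.HTML wraps the
# content in an <html> document element.
xml_root = etree.XML(well_formed)
html_root = etree.HTML(well_formed)
print(xml_root.tag)   # foo
print(html_root.tag)  # html
print(len(html_root.xpath('//bar')))  # //bar still matches inside the wrapper

# Malformed input: the XML parser is strict and raises an exception.
xml_error = None
try:
    etree.XML(malformed)
except etree.XMLSyntaxError as exc:
    xml_error = exc
    print('XML parser rejected the input:', exc)

# The HTML parser recovers from the unclosed <p> and still builds a tree.
recovered = etree.HTML(malformed)
print([p.text for p in recovered.xpath('//p')])  # ['abc']
```

So the original example only looks identical because `//bar` searches at any depth; comparing `tree.tag` or serializing the trees with `etree.tostring` should make the difference visible.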
participants (2)
- Markus Schöpflin
- Peng Yu