[lxml-dev] Working with <?xml-stylesheet ... ?>
Hi,
I'd like to quickly/efficiently get a list of all <?xml-stylesheet ?> processing instructions in a given document.
I have managed to find it via root_tree.getprevious(), but it seems I need to search through the siblings here to find the <?xml-stylesheet ?> if indeed there is one.
I'm using the HTML parser.
Is there a more natural API?
Also, serialising using lxml.html.tostring() seems to lose the <?xml-stylesheet ?> PI. Is this by design?
Cheers,
Martin
--
Author of `Professional Plone Development`, a book for developers who want to work with Plone.
See http://martinaspeli.net/plone-book
Martin Aspeli wrote:
Hi,
I'd like to quickly/efficiently get a list of all <?xml-stylesheet ?> processing instructions in a given document.
I have managed to find it via root_tree.getprevious(), but it seems I need to search through the siblings here to find the <?xml-stylesheet ?> if indeed there is one.
I'm using the HTML parser.
Is there a more natural API?
Also, serialising using lxml.html.tostring() seems to lose the <?xml-stylesheet ?> PI. Is this by design?
Mmmm... and another thing: once I get the HtmlProcessingInstruction node, how can I get the value of its pseudo-attributes (href and type, in this case)? The attr dict is empty...
Martin
Hi,
Martin Aspeli wrote:
once I get the HtmlProcessingInstruction node, how can I get the value of its pseudo-attributes (href and type, in this case)? The attr dict is empty...
As you say, they are not attributes. The content of a processing instruction is application-specific plain text, according to the XML specification:
http://www.w3.org/TR/REC-xml/#sec-pi
While there is some simple support for the xml-stylesheet processing instruction in plain lxml.etree, it's not currently enabled in lxml.html, and it's not available for any other PI target.
Your best bet is to parse the PI content yourself (using the .target and .text properties).
Stefan
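As a rough illustration of "parse the PI content yourself", here is a minimal sketch that pulls name="value" pairs out of the text of a processing instruction. The helper name parse_pseudo_attrs and the regular expression are assumptions made for this example, not part of lxml. It handles only simple quoted values and does not decode entity references.

```python
import re

# Matches name="value" or name='value' pairs inside PI content.
# (Hypothetical pattern for illustration; not an lxml API.)
_PSEUDO_ATTR = re.compile(r"""([\w.:-]+)\s*=\s*(?:"([^"]*)"|'([^']*)')""")


def parse_pseudo_attrs(pi_text):
    """Return the pseudo-attributes found in a PI's text as a dict."""
    attrs = {}
    for name, double_quoted, single_quoted in _PSEUDO_ATTR.findall(pi_text or ""):
        # Only one of the two quoted groups can match; the other is "".
        attrs[name] = double_quoted or single_quoted
    return attrs


# With lxml, this string would come from the .text property of a node
# whose .target is "xml-stylesheet"; a literal stands in for it here.
sample_text = 'href="style.xsl" type="text/xsl"'
attrs = parse_pseudo_attrs(sample_text)
print(attrs["href"], attrs["type"])
```

In real use one would check the node's .target first and then pass its .text to the helper.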
Hi,
Martin Aspeli wrote:
I'd like to quickly/efficiently get a list of all <?xml-stylesheet ?> processing instructions in a given document.
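    from lxml import etree
    # Preceding siblings arrive in reverse document order; reversed() needs
    # a sequence rather than a generator, hence the list.
    reversed([
        el for el in root.itersiblings(preceding=True)
        if el.tag is etree.ProcessingInstruction
        and el.target == "xml-stylesheet"
    ])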
Also, serialising using lxml.html.tostring() seems to lose the <?xml-stylesheet ?> PI. Is this by design?
You need to wrap the root element in an ElementTree and serialise that.
Stefan
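The difference between serialising the root element and serialising a wrapping ElementTree can be sketched as follows. This example assumes plain lxml.etree with the XML parser and a made-up sample document; the thread itself concerns lxml.html, where the behaviour has not been checked here.

```python
from lxml import etree

# Made-up sample document with a stylesheet PI in front of the root element.
doc = b'<?xml-stylesheet href="style.xsl" type="text/xsl"?><html><body/></html>'
root = etree.fromstring(doc)

# Serialising the element alone covers only the element's own subtree.
element_only = etree.tostring(root)

# Wrapping the root in an ElementTree serialises the whole document,
# including processing instructions that precede the root element.
whole_document = etree.tostring(etree.ElementTree(root))

print(element_only)
print(whole_document)
```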
participants (2)
- Martin Aspeli
- Stefan Behnel