[lxml-dev] should lxml.html have the same API as lxml.etree?
I've noticed in html5lib that they do (in html5lib.treebuilder.etree_lxml):

    try:
        import lxml.html as etree
    except ImportError:
        from lxml import etree

with the expectation that the two work the same way. They don't work the same; specifically, there's no etree.Comment.

Is it a reasonable expectation that they act the same? (I think they haven't tested the code much with lxml 2, so basically they haven't exercised the first case... though looking at the code some, I'm not sure it works with the second case either.)

Ian
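One hedged way to sidestep the mismatch Ian describes is to look each name up explicitly instead of assuming that every fallback module exposes the same API. The helper `first_provider` below is hypothetical (it is not part of html5lib or lxml); it walks a list of module names in order of preference and returns the first one that can be imported and actually defines the requested attribute. The standard-library module at the end of the list is an added assumption so the lookup also works where lxml is not installed.

```python
import importlib


def first_provider(name, module_names):
    """Return (module_name, attribute) from the first module that defines `name`.

    Modules that cannot be imported, or that lack the attribute, are skipped.
    """
    for module_name in module_names:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # module not installed; try the next candidate
        if hasattr(module, name):
            return module_name, getattr(module, name)
    raise AttributeError(f"none of {module_names!r} provides {name!r}")


# Preference order mirrors the fallback quoted in the message; the standard
# library module is a last resort so the lookup works without lxml.
CANDIDATES = ["lxml.html", "lxml.etree", "xml.etree.ElementTree"]

provider, Comment = first_provider("Comment", CANDIDATES)
print(f"Comment comes from {provider}")
```

Which module ends up providing `Comment` depends on the installed lxml version, so the printed result is deliberately not fixed here.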
Hi Ian,

Ian Bicking wrote:
> I've noticed in html5lib that they do (in html5lib.treebuilder.etree_lxml):
>
>     try:
>         import lxml.html as etree
>     except ImportError:
>         from lxml import etree
>
> with the expectation that the two work the same way. They don't work the same, specifically there's no etree.Comment.
>
> Is it a reasonable expectation that they act the same? (I think they haven't tested the code much with lxml 2, so basically they haven't exercised the first case... though looking at the code some I'm not sure it works with the second case either)
I thought about that for lxml.objectify, too. I mean, you could import basically everything from etree into the package/module namespace and be done. The question of having an "__all__" or not is related to this.

The thing that made me think about this was tostring(). There is no tostring() in objectify, so it's unlikely that you will ever be able to use it without also importing etree.

But on the other hand, if you agree to import some names, where do you draw the line? Would you want to provide XSLT and RelaxNG as well? What about all the exceptions? And I'm absolutely sure the day will come when I forget to add an import (or assignment) somewhere when I add a new name to etree.

Currently, lxml.objectify is positioned as an API *on top* of etree, although things that behave differently are duplicated already.

I haven't made up my mind yet. However, I do feel that there may be more things that might want to behave differently in the future, so having them duplicated from the beginning makes it a) easier to grow the different APIs in their respective directions, and b) easier for users to use them consistently inside the module/package API that they are using, without caring about the interaction with etree if they don't use it.

Any more opinions on this?

Stefan
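The worry about forgetting to re-export a newly added name could, as one possible approach, be caught by a small consistency check. The sketch below is illustrative only: `public_names` and `missing_from_mirror` are hypothetical helpers, and `base_api` / `mirror_api` are stand-in modules built with `types.ModuleType`, not the real lxml modules.

```python
import types


def public_names(module):
    """Names a module exports: its __all__ if defined, else non-underscore names."""
    declared = getattr(module, "__all__", None)
    if declared is not None:
        return set(declared)
    return {name for name in vars(module) if not name.startswith("_")}


def missing_from_mirror(base, mirror):
    """Public names of `base` that `mirror` does not expose as attributes."""
    return sorted(name for name in public_names(base) if not hasattr(mirror, name))


# Hypothetical stand-ins for a base module and a package layered on top of it.
base = types.ModuleType("base_api")
base.tostring = lambda element: "<serialized>"
base.fromstring = lambda text: object()
base.Comment = lambda text: object()

mirror = types.ModuleType("mirror_api")
mirror.fromstring = base.fromstring  # re-exported
mirror.Comment = base.Comment        # re-exported; tostring was forgotten

print(missing_from_mirror(base, mirror))
```

Run against real modules, the same call would take the imported modules as arguments; what it reports depends on the installed versions and on whether full mirroring is wanted at all, which is exactly the open question in this message.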
Hi,

> Is it a reasonable expectation that they act the same? (I think they haven't tested the code much with lxml 2, so basically they haven't exercised the first case... though looking at the code some I'm not sure it works with the second case either)

[...]

> Currently, lxml.objectify is positioned as an API *on top* of etree, although things that behave differently are duplicated already. I haven't made up my mind yet. However, I do feel that there may be more things that might want to behave differently in the future, so having them duplicated from the beginning makes it a) easier to grow the different APIs in their respective directions, and b) easier for users to use them consistently inside the module/package API that they are using, without caring about the interaction with etree if they don't use it.
Mirroring the etree API completely in objectify might make the incautious user think that these modules can be used completely interchangeably, while they cannot. And the differences are subtle, e.g. the indexing behaviour (sibling access in objectify, children access in etree), and will not necessarily produce easily detectable errors.

So for an lxml.objectify that exposes a full etree API, it should be stated very prominently that you can't just take your existing etree worker code and start using objectify instead.

Holger
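The indexing difference mentioned here can be sketched as follows. This is a minimal illustration that assumes lxml is installed; `compare_indexing` is a hypothetical helper name, and the sample document is made up for the example.

```python
def compare_indexing(xml_text="<root><a>1</a><a>2</a><b>3</b></root>"):
    """Contrast child indexing (etree) with sibling indexing (objectify)."""
    # Imported inside the function so this file still loads without lxml.
    from lxml import etree, objectify

    plain = etree.fromstring(xml_text)
    obj = objectify.fromstring(xml_text)
    return {
        "etree_len": len(plain),             # children of <root>
        "etree_first": plain[0].tag,         # first child of <root>
        "objectify_len": len(obj.a),         # <a> elements sharing the same parent
        "objectify_second": obj.a[1].text,   # second <a> sibling, not a child of <a>
    }


try:
    print(compare_indexing())
except ImportError:
    print("lxml is not installed; nothing to compare")
```

Code written for etree that iterates with `len(element)` and `element[i]` would therefore walk siblings rather than children when handed an objectify tree, without raising an error.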
Hi,

jholg@gmx.de wrote:
> Mirroring the etree API completely in objectify might make the incautious user think that these modules can be used completely interchangeably, while they cannot.
>
> And the differences are subtle, e.g. the indexing behaviour (sibling access in objectify, children access in etree), and will not necessarily produce easily detectable errors.
>
> So for an lxml.objectify that exposes a full etree API, it should be stated very prominently that you can't just take your existing etree worker code and start using objectify instead.
That's not really something I'm worried about. I think it's clear from the docs (and intuitively from etree and objectify being separate modules) that they are not interchangeable.

I think the real question is: should users need to import etree, when what they actually work with is objectify or lxml.html?

And I think duplicating the module content and having to import only one module/package makes it even clearer that they /have/ separate APIs than the current mix of "partly etree, partly objectify" can. It allows users to stay inside the world of one tool, without even having to think about the differences from another tool that they do not care about (or at least should not have to).

Stefan
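To make the question concrete, here is a hedged sketch of the "partly etree, partly objectify" mix under discussion: the tree is built with objectify, but serializing it goes through `etree.tostring()`, since the thread states that objectify has no `tostring()` of its own. It assumes lxml is installed; `roundtrip` is a hypothetical helper name, and whether objectify still lacks `tostring()` depends on the lxml version.

```python
def roundtrip(xml_text):
    """Parse with lxml.objectify, then serialize with lxml.etree.tostring()."""
    # Imported inside the function so this file still loads without lxml.
    from lxml import etree, objectify

    root = objectify.fromstring(xml_text)  # objectify builds the tree
    return etree.tostring(root)            # etree serializes it (returns bytes)


try:
    print(roundtrip("<root><a>1</a></root>"))
except ImportError:
    print("lxml is not installed; skipping the round trip")
```

Both imports are needed for a task that, from the user's point of view, only involves objectify.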
Hi Ian,

Ian Bicking wrote:
> I think they haven't tested the code much with lxml 2, so basically they haven't exercised the first case... though looking at the code some I'm not sure it works with the second case either
I guess you are referring to fromstring()? There /is/ a fromstring() in etree, which does the same as XML(). The reasoning is that for XML literals it reads well to write

    XML("< ... >")

while, when parsing from a string variable, it's more readable to write

    fromstring(some_content)

Stefan
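A short sketch of the two spellings side by side. The fallback to the standard library's `xml.etree.ElementTree` is an assumption added only so the snippet runs without lxml; that module also provides both `XML()` and `fromstring()`.

```python
try:
    from lxml import etree
except ImportError:
    # Assumption for portability: the standard library offers the same two names.
    import xml.etree.ElementTree as etree

# A literal document reads naturally with XML(...)
literal_root = etree.XML("<root><child/></root>")

# Content held in a variable reads naturally with fromstring(...)
some_content = "<root><child/></root>"
parsed_root = etree.fromstring(some_content)

print(literal_root.tag, parsed_root.tag, len(literal_root), len(parsed_root))
```

Both calls parse the same markup into an equivalent tree; the choice between them is purely about how the call site reads.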
participants (3)
- Ian Bicking
- jholg@gmx.de
- Stefan Behnel