[Python-Dev] Fixing the XML batteries
Stefan Behnel
stefan_ml at behnel.de
Mon Dec 12 10:59:23 CET 2011
"Martin v. Löwis", 11.12.2011 23:03:
> On 09.12.2011 10:09, Xavier Morel wrote:
>> On 2011-12-09, at 09:41 , Martin v. Löwis wrote:
>>>> a) The stdlib documentation should help users to choose the right
>>>> tool right from the start. Instead of using the totally
>>>> misleading wording that it uses now, it should be honest about
>>>> the performance characteristics of MiniDOM and should actively
>>>> suggest that those who don't know what to choose (or even *that*
>>>> they can choose) should not use MiniDOM in the first place.
>>>
> [...]
>>
>> Minidom is inferior in interface flow and pythonicity, in terseness,
>> in speed, in memory consumption (even more so using cElementTree, and
>> that's not something which can be fixed unless minidom gets a C
>> accelerator), etc… Even after fixing minidom (if anybody has the time
>> and drive to commit to it), ET/cET should be preferred over it.
>
> I don't mind pointing people to ElementTree, despite that I disagree
> whether the ET interface is "superior" to DOM.
Yes, that's clearly a point where we agree to disagree, and I understand
that you are as biased towards minidom as I am biased towards ElementTree.
However, I think I made it clear that the implementation of cElementTree
(and lxml.etree as well, for that matter) is largely superior to MiniDOM
in terms of performance, for any sensible meaning of the word performance.
And I'm also convinced that the API is largely superior in terms of
usability. ET certainly matches Python as a language much better than
MiniDOM. But that's just my personal opinion.
> It's Stefan's reasoning
> as to *why* people should be pointed to ET, and what words should be
> used to do that. IOW, I detest bashing some part of the standard
> library, just to urge users to use some other part of the standard library.
I'm all for finding a good way of putting it into words, as long as it
keeps uninformed users from making the wrong decision and getting the wrong
idea of how complicated and slow Python is.
> People are still using PyXML, despite it's not being maintained anymore.
My experience with that is that it's only *new* users that are still
running into PyXML by accident, because they didn't see that it's a dead
project and they find it through ancient web pages that tell them that they
need it because "it's the way to do XML in Python" and "if minidom is not
enough, use PyXML". Maybe we should "misuse" the stdlib documentation to
clear that up as well. "PyXML" is just too attractive a name for a dead
project.
Just look through the xml-sig page: basically all requests regarding PyXML
during the last five years deal with problems in installing it, i.e.
*before* even starting to use it. So you can't use this to claim that
people really *are* still using it.
> Telling them to replace 4DOM with minidom is much more appropriate
Do you actually have any evidence that anyone is still actively using 4DOM?
> than telling them to rewrite in ET.
I usually encourage people to rewrite minidom code for ET. It makes the
code simpler, more readable, more maintainable and much faster.
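To give a rough idea of what such a rewrite tends to look like, here is a minimal sketch. The sample document and the function names are made up for illustration and are not taken from any real code base; both functions only use calls from the stdlib xml.dom.minidom and xml.etree.ElementTree modules, and they say nothing about speed.

```python
import xml.dom.minidom as minidom
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
XML = (
    "<library>"
    "<book id='1'><title>Dive In</title></book>"
    "<book id='2'><title>Learning XML</title></book>"
    "</library>"
)


def titles_minidom(text):
    """Collect (id, title) pairs using the minidom API."""
    doc = minidom.parseString(text)
    result = []
    for book in doc.getElementsByTagName("book"):
        title = book.getElementsByTagName("title")[0]
        # Text lives in child text nodes, which have to be joined by hand.
        value = "".join(
            node.data
            for node in title.childNodes
            if node.nodeType == node.TEXT_NODE
        )
        result.append((book.getAttribute("id"), value))
    return result


def titles_et(text):
    """Collect the same (id, title) pairs using ElementTree."""
    root = ET.fromstring(text)
    return [(book.get("id"), book.findtext("title")) for book in root.iter("book")]
```

Both functions return the same list of pairs for the sample input; the ET version just gets there with less ceremony, which is the kind of simplification I usually see in practice.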
Stefan