[issue11379] Remove "lightweight" from minidom description

New submission from Stefan Behnel <scoder@users.sourceforge.net>:

http://docs.python.org/library/xml.dom.minidom.html presents MiniDOM as a "Lightweight DOM implementation". The word "lightweight" is easily misunderstood as meaning "efficient" or "memory friendly". MiniDOM is well known to be neither of the two.

The first paragraph then continues:

"""
xml.dom.minidom is a light-weight implementation of the Document Object Model interface. It is intended to be simpler than the full DOM and also significantly smaller.
"""

Again, "smaller" can be misread as "low memory footprint", whereas it is actually supposed to refer to an incomplete DOM API implementation. And "simpler" is also clearly exaggerated when compared to the alternative ElementTree package.

I would like to see this changed and combined with a clear and visible comment that MiniDOM has a very high resource profile, e.g.

"""
19.7. xml.dom.minidom — Pure Python DOM implementation

xml.dom.minidom is a pure Python implementation of the Document Object Model interface, as known from other programming languages. It is intended to provide a smaller API than the full DOM. Note, however, that MiniDOM has a very large memory footprint compared to other Python XML libraries. If you need a fast and memory friendly XML tree implementation with a vastly simpler API, use the xml.etree package instead.
"""

----------
assignee: docs@python
components: Documentation
messages: 129914
nosy: docs@python, scoder
priority: normal
severity: normal
status: open
title: Remove "lightweight" from minidom description
versions: Python 2.7, Python 3.2, Python 3.3

_______________________________________
Python tracker <report@bugs.python.org>
<http://bugs.python.org/issue11379>
_______________________________________

Martin v. Löwis <martin@v.loewis.de> added the comment: -1. The description is factually correct - minidom *does* have a lower footprint than other Python DOM implementations (such as 4DOM). ---------- nosy: +loewis

Stefan Behnel <scoder@users.sourceforge.net> added the comment:

Well, I'm not aware of many people who use 4DOM these days, and if that's what it's meant to refer to, maybe that should be made more obvious, because it currently is not at all. Even cDomlette uses only half of the memory according to http://effbot.org/zone/celementtree.htm

When you say that the description is "factually correct", that does by no means imply that the average reader will understand how it's meant. My point is that almost everyone who reads this will draw the wrong conclusions.

Also, when you say "lower footprint", that does not yet make it "light weight" in any way. It still uses something like ten times as much memory as cElementTree or lxml in Python 2 (and likely much more than even that in Python 3), and still something like 4-5 times as much as plain Python ElementTree. That's a huge difference.

What about this phrasing then:

"""
MiniDOM has a smaller memory footprint than some of the other DOM compliant implementations for Python (such as 4DOM), but uses about 10x more memory than the faster and simpler xml.etree.cElementTree module.
"""

Martin v. Löwis <martin@v.loewis.de> added the comment:
What about this phrasing then:
""" MiniDOM has a smaller memory footprint than some of the other DOM compliant implementations for Python (such as 4DOM), but uses about 10x more memory than the faster and simpler xml.etree.cElementTree module. """
But that's not a DOM implementation - so it would be comparing apples and oranges.

Stefan Behnel <scoder@users.sourceforge.net> added the comment: It's the tree based API most Python users are parsing XML with, though. So I do not agree that it's comparing apples and oranges, not at all. It's comparing tree based XML libraries, only one of which is worth being called "light weight", and that's not the one that is currently carrying that name. I think it's worth telling new users what they are committing to when they write code that uses MiniDOM. The documentation should allow them to understand that.

Martin v. Löwis <martin@v.loewis.de> added the comment:
It's the tree based API most python users are parsing XML with, though. So I do not agree that it's comparing apples and oranges, not at all. It's comparing tree based XML libraries, only one of which is worth being called "light weight", and that's not the one that is currently carrying that name.
If that is a real concern, I'd rather reduce the memory footprint of minidom than put actual performance figures into the documentation that will likely become outdated over time. Notice that the documentation doesn't claim that it is a lightweight XML library, only that it's a lightweight DOM implementation. SAX is, of course, even lighter-weight.

Stefan Behnel <scoder@users.sourceforge.net> added the comment:
If that is a real concern, I'd rather reduce the memory footprint of minidom than put actual performance figures into the documentation that will likely outdate over time.
Personally, I do not think it's worth putting much work into MiniDOM. I'd rather deprecate it to prevent new code from being written for it, but that's just my personal opinion, and this is the wrong place to discuss that. Given the current performance characteristics, I wouldn't be surprised if there was quite some room for improvements left in the xml.dom package. If you dislike the "10x", feel free to use "several times". I doubt that MiniDOM will ever get so much closer to cET and lxml to prove that phrasing wrong.
Notice that the documentation doesn't claim that it is a lightweight XML library, only that it's a lightweight DOM implementation.
I imagine that you are as aware as I am that this nuance is easy to miss, especially for a new user. From my experience, it is very common for users, especially those with a Java-ish background, to confuse the terms "DOM" and "XML tree API/library". Hence my push to change the documentation.
SAX is, of course, even lighter-weight.
Not so much more light weight than cET's iterparse(), but that's getting OT here. Stefan

Antoine Pitrou <pitrou@free.fr> added the comment: Agreed with Stefan's concern. ---------- nosy: +pitrou

Stefan Behnel <scoder@users.sourceforge.net> added the comment:

Ok, so, what do we make of this? I proposed improvements to the wording in the documentation, which make it much clearer for users what they are buying into when they start using minidom. I still think that "factually correct" but clearly misleading documentation is not helpful and that it needs fixing.

Here is an updated phrasing that I hope we can settle on:

"""
:mod:`xml.dom.minidom` --- Pure Python DOM implementation

[...]

:mod:`xml.dom.minidom` is a pure Python implementation of the Document Object Model interface, as known from other programming languages. It is intended to provide a smaller and simpler API than the full W3C DOM. Note that MiniDOM has a several times larger memory footprint than :mod:`xml.etree.ElementTree`, the light-weight Python XML library in the standard library. If you do not need a (mostly) compliant W3C DOM implementation, but a fast and memory friendly XML tree implementation with an easy to learn API, use that instead.
"""

Éric Araujo <merwok@netwok.org> added the comment: Is memory footprint something important enough to put in the doc? Ease of use is IMO more important, but then it becomes subjective. ---------- nosy: +eric.araujo

Stefan Behnel <scoder@users.sourceforge.net> added the comment: I find a factor of an order of magnitude worth mentioning, because it prevents certain kinds of usages.

Ezio Melotti <ezio.melotti@gmail.com> added the comment: Usually we don't talk about performance in the doc, and in my personal experience I didn't notice any major difference between the different implementations (but then again I haven't used them much). Talking about the other implementations and their advantages/disadvantages is fine, but things like "MiniDOM has a several times larger memory footprint" seem like FUD to me (see also http://docs.python.org/dev/documenting/style.html#affirmative-tone). ---------- nosy: +ezio.melotti

Fred L. Drake, Jr. <fred@fdrake.net> added the comment: Removing "Lightweight" and changing the first paragraph to (something like) :mod:`xml.dom.minidom` is an implementation of the Document Object Model interface. The API is slightly simpler than the full W3C DOM, but the implementation has a significantly higher memory footprint than :mod:`xml.dom.etree`. would be entirely reasonable. (I don't think it's wrong to discuss relative memory footprints in comparison to other modules in the standard library.) ---------- nosy: +fdrake

Stefan Behnel <scoder@users.sourceforge.net> added the comment:

I don't think "FUD" is a suitable term for the rather minidom-friendly wording in my last proposal. Seriously, minidom is widely known for being extremely slow and extremely memory hungry. And that is backed by basically any benchmark that has ever been done on the subject. If 4DOM, which Martin cites, is really worse in terms of performance (I never used it), it must truly be the only existing species of that kind.

Still, here's a cleaned up version of Fred's proposal that I could live with:

"""
:mod:`xml.dom.minidom` --- Pure Python DOM implementation

:mod:`xml.dom.minidom` is an implementation of the Document Object Model interface. The API is (intentionally) slightly simpler than the full W3C DOM, but the implementation has a significantly higher memory footprint than the XML tree library in :mod:`xml.etree.ElementTree`.
"""

Antoine Pitrou <pitrou@free.fr> added the comment:
I don't think "FUD" is a suitable term for the rather minidom-friendly wording in my last proposal. Seriously, minidom is widely known for being extremely slow and extremely memory hungry. And that is backed by basically any benchmark that has ever been done on the subject.
If it's both slow and memory-hungry, perhaps use the more generic "performance" instead of "memory footprint"?

Ezio Melotti <ezio.melotti@gmail.com> added the comment:
Seriously, minidom is widely known for being extremely slow and extremely memory hungry. And that is backed by basically any benchmark that has ever been done on the subject.
Do you have any link? My point is that if you say things like "significantly/several times higher memory footprint than X" you are basically scaring the users away from the module. If for an average document it takes, say, 30-50MB of memory, it seems perfectly reasonable to me, even if ElementTree takes 3-5MB. I would actually consider 100-200MB still ok too, unless I have to parse a lot of documents or I'm running low on memory for other reasons.

Antoine Pitrou <pitrou@free.fr> added the comment:
My point is that if you say things like "significantly/several times higher memory footprint than X" you are basically scaring the users away from the module.
Only those users who know they'll be processing significantly large documents. I don't think "scaring away people" is a good enough reason *not* to document performance characteristics. For example, we already mention that string joining is faster than repeated concatenation; I haven't heard anyone complain that it scared people away from string concatenation. And while it's true that we shouldn't try to document performance characteristics *too precisely*, it is still a good thing to document the most outstanding facts (for example, C accelerator modules are clearly superior in performance to pure Python modules; should we shy away from documenting that, and instead present it as some kind of neutral choice?). And, of course, if minidom gets some serious performance attention, the claims will have to be revisited. But given the amount of attention minidom gets at all, it sounds rather implausible.
If for an average document it takes, say, 30-50MB of memory, it seems perfectly reasonable to me, even if ElementTree takes 3-5MB. I would actually consider 100-200MB still ok too
Some use cases would not really like a 100-200MB memory consumption, or even 50MB. Think a long-running daemon, for instance.

Stefan Behnel <scoder@users.sourceforge.net> added the comment: Ezio Melotti, 29.11.2011 16:26:
Seriously, minidom is widely known for being extremely slow and extremely memory hungry. And that is backed by basically any benchmark that has ever been done on the subject.
Do you have any link?
I just did a quick Google search for "python minidom benchmark" and found these:
http://www.opensourcetutorials.com/tutorials/Server-Side-Coding/Python/xml-m...
http://effbot.org/zone/celementtree.htm#benchmarks
http://blog.ianbicking.org/2008/03/30/python-html-parser-performance/
Note that all three authors risk being biased, but given how similar the results are, I tend to believe them. Stefan

Antoine Pitrou <pitrou@free.fr> added the comment:
I just did a quick Google search for "python minidom benchmark" and found these:
http://www.opensourcetutorials.com/tutorials/Server-Side-Coding/Python/xml-m...
http://effbot.org/zone/celementtree.htm#benchmarks
http://blog.ianbicking.org/2008/03/30/python-html-parser-performance/
Note that all three authors risk being biased, but given how similar the results are, I tend to believe them.
Thanks for the links. The performance gap looks significant enough to be mentioned, at least generically.

Stefan Behnel <scoder@users.sourceforge.net> added the comment:

Given that the links were generally somewhat dated and used Py2.x instead of the post-PEP393 Py3.3, here is another little benchmark, comparing the parser performance of minidom to lxml.etree (latest), ElementTree and cElementTree (stdlib) in a recent Py3.3 build (e66b7c62eec0), everything properly optimised for my platform (Linux 64bit).

I used os.fork() to start a new process after importing everything and reading the file a couple of times, and before parsing. The memory usage is measured inside of the forked child using the resource module's ru_maxrss value, so it correlates with the growth of CPython's memory heap after parsing, thus giving an estimate of the maximum amount of memory used during parsing and tree building.

Parsing hamlet.xml in English, 274KB:

Memory usage: 7284
xml.etree.ElementTree.parse done in 0.104 seconds
Memory usage: 14240 (+6956)
xml.etree.cElementTree.parse done in 0.022 seconds
Memory usage: 9736 (+2452)
lxml.etree.parse done in 0.014 seconds
Memory usage: 11028 (+3744)
minidom tree read in 0.152 seconds
Memory usage: 30360 (+23076)

Parsing the old testament in English (ot.xml, 3.4MB) into memory:

Memory usage: 20444
xml.etree.ElementTree.parse done in 0.385 seconds
Memory usage: 46088 (+25644)
xml.etree.cElementTree.parse done in 0.056 seconds
Memory usage: 32628 (+12184)
lxml.etree.parse done in 0.041 seconds
Memory usage: 37500 (+17056)
minidom tree read in 0.672 seconds
Memory usage: 110428 (+89984)

A 25MB XML file with Slavic Unicode text content:

Memory usage: 57368
xml.etree.ElementTree.parse done in 3.274 seconds
Memory usage: 223720 (+166352)
xml.etree.cElementTree.parse done in 0.459 seconds
Memory usage: 154012 (+96644)
lxml.etree.parse done in 0.454 seconds
Memory usage: 135720 (+78352)
minidom tree read in 6.193 seconds
Memory usage: 604860 (+547492)

And a contrived 4.5MB XML file with a lot more structure than data:

Memory usage: 13308
xml.etree.ElementTree.parse done in 4.178 seconds
Memory usage: 222088 (+208780)
xml.etree.cElementTree.parse done in 0.478 seconds
Memory usage: 103056 (+89748)
lxml.etree.parse done in 0.199 seconds
Memory usage: 101860 (+88552)
minidom tree read in 8.705 seconds
Memory usage: 810964 (+797656)

Things to note: The factor of 5-10 for the memory overhead compared to cET depends heavily on the data. Also, minidom is consistently slower by more than a factor of 10 compared to the fastest parser (apparently the one in libxml2/lxml.etree, both of which surely can't be said to provide fewer features than the DOM that minidom implements).
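The measurement approach described in this message (fork a child, parse inside it, read ru_maxrss) can be sketched roughly as follows. This is not Stefan's actual script, which is not included in the thread; the `measure` helper and the generated `XML_DATA` document are hypothetical stand-ins, and the sketch assumes a Unix platform where `os.fork()` and the `resource` module are available. Note that ru_maxrss is reported in kilobytes on Linux but in bytes on some other systems.

```python
import os
import resource
import time
import xml.dom.minidom
import xml.etree.ElementTree as ET

# Hypothetical stand-in document; the thread used hamlet.xml, ot.xml and others.
XML_DATA = ("<root>" + "<item id='1'>text</item>" * 20000 + "</root>").encode("utf-8")


def measure(parse):
    """Run parse(XML_DATA) in a forked child.

    Returns (seconds, growth of the child's peak RSS as reported by ru_maxrss).
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: measure, report through the pipe, then exit at once.
        status = 1
        try:
            os.close(read_fd)
            before = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
            start = time.perf_counter()
            tree = parse(XML_DATA)  # keep a reference so the tree stays alive
            elapsed = time.perf_counter() - start
            after = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
            os.write(write_fd, "{} {}".format(elapsed, after - before).encode("ascii"))
            os.close(write_fd)
            status = 0
        finally:
            os._exit(status)
    # Parent process: collect the child's report.
    os.close(write_fd)
    with os.fdopen(read_fd) as pipe:
        elapsed, growth = pipe.read().split()
    os.waitpid(pid, 0)
    return float(elapsed), int(growth)


if __name__ == "__main__":
    parsers = {
        "xml.etree.ElementTree": ET.fromstring,
        "xml.dom.minidom": xml.dom.minidom.parseString,
    }
    for name, parse in parsers.items():
        seconds, growth = measure(parse)
        print("{}: {:.3f} seconds, peak RSS +{}".format(name, seconds, growth))
```

Forking per parser keeps one library's allocations from influencing the next measurement, which is presumably why the original benchmark was structured that way.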

Changes by Florent Xicluna <florent.xicluna@gmail.com>: ---------- nosy: +flox type: -> performance

Changes by Florent Xicluna <florent.xicluna@gmail.com>: ---------- components: +XML

Stefan Behnel <scoder@users.sourceforge.net> added the comment:

Hmm, looks like I messed up the last example. I accidentally left in the formatting whitespace, thus growing the file to 6.2 MB. Removing that, I get this for the (now really) 4.5 MB XML file with lots of structure and very little data:

Memory usage: 11600
xml.etree.ElementTree.parse done in 3.374 seconds
Memory usage: 203420 (+191820)
xml.etree.cElementTree.parse done in 0.192 seconds
Memory usage: 36444 (+24844)
lxml.etree.parse done in 0.131 seconds
Memory usage: 62648 (+51048)
minidom tree read in 5.935 seconds
Memory usage: 527684 (+516084)

It's actually surprising how much of a difference trailing whitespace content makes in minidom (from 2MB on disk to 300MB in memory???), most likely due to the usage of dedicated DOM text nodes in the tree.

PS: I think the "XML/performance" tags on this bug would hint at a separate ticket. This is really meant as a documentation bug.
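The guess about dedicated text nodes can be illustrated with a small sketch. The two sample documents and the counting helpers below are made up for illustration; the sketch only shows that minidom materialises formatting whitespace as extra Text nodes while ElementTree keeps it in the `text` and `tail` string attributes of existing elements. It does not measure memory.

```python
import xml.dom.minidom
import xml.etree.ElementTree as ET

# Same content, with and without formatting whitespace (hypothetical samples).
PRETTY = "<root>\n  <a>1</a>\n  <b>2</b>\n</root>"
COMPACT = "<root><a>1</a><b>2</b></root>"


def count_dom_nodes(node):
    """Count a DOM node plus all of its descendants, Text nodes included."""
    return 1 + sum(count_dom_nodes(child) for child in node.childNodes)


def count_et_elements(elem):
    """Count an ElementTree element plus all of its descendant elements."""
    return 1 + sum(count_et_elements(child) for child in elem)


pretty_dom = count_dom_nodes(xml.dom.minidom.parseString(PRETTY).documentElement)
compact_dom = count_dom_nodes(xml.dom.minidom.parseString(COMPACT).documentElement)

pretty_root = ET.fromstring(PRETTY)
compact_root = ET.fromstring(COMPACT)
pretty_et = count_et_elements(pretty_root)
compact_et = count_et_elements(compact_root)

# minidom: whitespace between elements becomes additional Text node objects.
print("minidom nodes:", compact_dom, "->", pretty_dom)
# ElementTree: the element count is unchanged; whitespace lives in text/tail.
print("ElementTree elements:", compact_et, "->", pretty_et)
print("root.text:", repr(pretty_root.text), "a.tail:", repr(pretty_root[0].tail))
```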

Stefan Behnel <scoder@users.sourceforge.net> added the comment:

I started a mailing list thread on the same topic: http://thread.gmane.org/gmane.comp.python.devel/127963

Especially see http://thread.gmane.org/gmane.comp.python.devel/127963/focus=128162 where I extract a proposal from the discussion. Basically, there should be a note at the top of the xml.dom documentation as follows:

"""
[[Note: The xml.dom.minidom module provides an implementation of the W3C-DOM whose API is similar to that in other programming languages. Users who are unfamiliar with the W3C-DOM interface or who would like to write less code for processing XML files should consider using the xml.etree.ElementTree module instead.]]
"""

I think this should go on the xml.dom.minidom page as well as the xml.dom package page. Hand-wavingly, users who are new to the DOM are more likely to hit the package page first, whereas those who know it already will likely find the MiniDOM page directly.

Note that I'd still encourage the removal of the misleading word "lightweight" until it makes sense to put it back in a meaningful way. I therefore propose the following minimalistic changes to the first paragraph on the minidom page:

"""
xml.dom.minidom is a [-XXX: light-weight] implementation of the Document Object Model interface. It is intended to be simpler than the full DOM and also [+XXX: provide a] significantly smaller [+XXX: API].
"""

Additionally, the documentation on the xml.sax page would benefit from the following paragraph:

"""
[[Note: The xml.sax package provides an implementation of the SAX interface whose API is similar to that in other programming languages. Users who are unfamiliar with the SAX interface or who would like to write less code for efficient stream processing of XML files should consider using the iterparse() function in the xml.etree.ElementTree module instead.]]
"""
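For readers unfamiliar with iterparse(), here is a minimal sketch of the stream-processing style the proposed xml.sax note refers to. The log document and the `error_messages` helper are invented for illustration; only `xml.etree.ElementTree.iterparse` and `Element.clear()` are standard library APIs.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data; a real use case would pass a large file instead.
XML_DATA = (
    b"<log>"
    b"<entry level='info'>started</entry>"
    b"<entry level='error'>disk full</entry>"
    b"<entry level='info'>stopped</entry>"
    b"</log>"
)


def error_messages(source):
    """Yield the text of every <entry level='error'> while parsing incrementally."""
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "entry":
            if elem.get("level") == "error":
                yield elem.text
            # Drop the element's content once handled to keep memory use low.
            elem.clear()


messages = list(error_messages(io.BytesIO(XML_DATA)))
print(messages)
```

The loop sees each element as soon as its end tag has been parsed, so no handler class or explicit state tracking is needed.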

Ezio Melotti <ezio.melotti@gmail.com> added the comment:
xml.dom.minidom is a [-XXX: light-weight] implementation of the Document Object Model interface.
This is ok.
It is intended to be simpler than the full DOM and also [+XXX: provide a] significantly smaller [+XXX: API].
Doesn't "simpler" here refer to the API already? Another option is to add somewhere a section like: "If you have to work with XML, ElementTree is usually the best choice, because it has a simple API and it's efficient [or whatever]. xml.dom.minidom provides a subset of the W3C-DOM API, and xml.sax a SAX interface.", possibly expanding a bit on the differences and showing a minimal example with the 3 different implementations, and then link to it from the other modules' pages.
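A minimal example of the kind suggested here, doing the same task with all three APIs, might look like the sketch below. The sample document and the function and class names (`titles_etree`, `titles_minidom`, `titles_sax`, `TitleHandler`) are hypothetical; this is not text that was proposed for the documentation in this thread.

```python
import xml.dom.minidom
import xml.etree.ElementTree as ET
import xml.sax

# Hypothetical sample document.
XML_DATA = (
    "<library>"
    "<book><title>Hamlet</title></book>"
    "<book><title>Macbeth</title></book>"
    "</library>"
)


def titles_etree(data):
    """ElementTree: elements are iterable and expose their text directly."""
    root = ET.fromstring(data)
    return [title.text for title in root.iter("title")]


def titles_minidom(data):
    """minidom: text lives in child Text nodes that must be collected."""
    doc = xml.dom.minidom.parseString(data)
    return [
        "".join(
            child.data
            for child in node.childNodes
            if child.nodeType == child.TEXT_NODE
        )
        for node in doc.getElementsByTagName("title")
    ]


class TitleHandler(xml.sax.ContentHandler):
    """SAX: no tree is built; the handler tracks parsing state itself."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._buffer = None  # list of text chunks while inside <title>

    def startElement(self, name, attrs):
        if name == "title":
            self._buffer = []

    def characters(self, content):
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buffer))
            self._buffer = None


def titles_sax(data):
    handler = TitleHandler()
    xml.sax.parseString(data.encode("utf-8"), handler)
    return handler.titles


for extract in (titles_etree, titles_minidom, titles_sax):
    print(extract.__name__, extract(XML_DATA))
```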

Martin v. Löwis <martin@v.loewis.de> added the comment:
"If you have to work with XML, ElementTree is usually the best choice, because it has a simple API and it's efficient [or whatever].
I still object to such a wording, for many reasons.

Eli Bendersky <eliben@gmail.com> added the comment: IMHO this wording proposed by Stefan: """ [[Note: The xml.dom.minidom module provides an implementation of the W3C-DOM whose API is similar to that in other programming languages. Users who are unfamiliar with the W3C-DOM interface or who would like to write less code for processing XML files should consider using the xml.etree.ElementTree module instead.]] """ Sounds very reasonable. Perhaps something about a more Pythonic API can also be added there, in addition to "to write less code". Any objections? ---------- nosy: +eli.bendersky

Changes by Tshepang Lekhonkhobe <tshepang@gmail.com>: ---------- nosy: +tshepang

Éric Araujo <merwok@netwok.org> added the comment: +1 to the suggested wording. -1 to talking about a more pythonic API. (Want a nit? s/W3C-DOM/W3C DOM/)

Eli Bendersky <eliben@gmail.com> added the comment: Martin, do you find the wording I quoted (*without* the reference to a more Pythonic API) acceptable?

Changes by Ezio Melotti <ezio.melotti@gmail.com>: ---------- stage: -> needs patch

Martin v. Löwis <martin@v.loewis.de> added the comment: The wording in msg152836 is fine with me, in particular as it doesn't make any performance claims.

Eli Bendersky <eliben@gmail.com> added the comment: I'm attaching a patch for Doc/library/xml.dom.minidom.rst. It adds the note as phrased by Stefan, with a tiny wording change to make the first sentence less ambiguous. ---------- keywords: +patch Added file: http://bugs.python.org/file24686/issue_11379.1.patch

Éric Araujo <merwok@netwok.org> added the comment: I’m not sure I would use note markup, though (cf. Raymond’s aversion to littering the doc with note and warning boxes).

Eli Bendersky <eliben@gmail.com> added the comment:
I’m not sure I would use note markup, though (cf. Raymond’s aversion to littering the doc with note and warning boxes).
I also dislike box littering, but this one seems like a really good fit for a note, since it's completely outside the flow of that documentation page.

Raymond Hettinger <raymond.hettinger@gmail.com> added the comment: This is a reasonable case for a note. ---------- nosy: +rhettinger

Roundup Robot <devnull@psf.upfronthosting.co.za> added the comment: New changeset 81e606862a89 by Eli Bendersky in branch '3.2': Issue #11379: add a note in xml.dom.minidom suggesting to use etree in some cases http://hg.python.org/cpython/rev/81e606862a89 ---------- nosy: +python-dev

Roundup Robot <devnull@psf.upfronthosting.co.za> added the comment: New changeset ccd16ad37544 by Eli Bendersky in branch '2.7': Issue #11379: add a note in xml.dom.minidom suggesting to use etree in some cases http://hg.python.org/cpython/rev/ccd16ad37544

Eli Bendersky <eliben@gmail.com> added the comment: Committed to 2.7, 3.2 and 3.3. I suppose this issue can be closed now?

Stefan Behnel <scoder@users.sourceforge.net> added the comment: Thanks Eli. What about the "Lightweight DOM implementation", though? Following Martin's comment that performance characteristics (like "fast", "memory friendly" or "lightweight") should normally not be documented, I'm still suggesting to replace it with a less easily misinterpreted phrase like "W3C DOM implementation".

Eli Bendersky <eliben@gmail.com> added the comment: Stefan, frankly I'm not familiar enough with either xml.dom or xml.dom.minidom to have a solid opinion at this point.

Éric Araujo <merwok@netwok.org> added the comment: I think I’ve always understood “lightweight” to mean “minimal”. xml.dom provides minidom, a basic implementation, pulldom, a different implementation, and other libraries such as 4Dom are full-fledged implementations. So “lightweight” is not a problem to me (but I acknowledge that it might be misleading for other people), especially given that I think that DOM itself is not elegant or lightweight (as in “conceptually small”).

Antoine Pitrou <pitrou@free.fr> added the comment:
I think I’ve always understood “lightweight” to mean “minimal”.
Then how about saying "minimal" instead of "lightweight"? (also, it seems it really means "incomplete" or "partial", which are of course less positive sounding)

Ezio Melotti <ezio.melotti@gmail.com> added the comment: "Minimal" sounds good to me, it also matches the name of the module. ---------- _______________________________________ Python tracker <report@bugs.python.org> <http://bugs.python.org/issue11379> _______________________________________

Éric Araujo <merwok@netwok.org> added the comment: Right, patch for 3.2. Also edited the module docstring (info taken from the docstring of xml.dom). BTW I really think we could have avoided some verbosity by adding the recommendation to use xml.etree in the first paragraph of Doc/library/xml.dom.minidom.rst. ---------- Added file: http://bugs.python.org/file24707/minidom-desc.diff

Éric Araujo <merwok@netwok.org> added the comment: s/Mininal/Minimal/ in the synopsis

Stefan Behnel <scoder@users.sourceforge.net> added the comment: Yes, I think that's better.

Éric Araujo <merwok@netwok.org> added the comment: This alternate version of my patch (a) merges the first two paragraphs to make the intro less redundant and heavy, and (b) slightly reorganizes the list of modules in Doc/library/markup.rst to have xml.etree first and pyexpat (less interesting for most people) at the end. Tell me if you prefer this version, or if I should commit the first one (possibly with the (b) change).
----------
Added file: http://bugs.python.org/file24732/minidom-desc-2.diff

Roundup Robot <devnull@psf.upfronthosting.co.za> added the comment: New changeset d99c0a4b66f3 by Éric Araujo in branch '3.2': Move xml.etree higher and xml.parsers.expat lower in the markup ToC. http://hg.python.org/cpython/rev/d99c0a4b66f3

Roundup Robot <devnull@psf.upfronthosting.co.za> added the comment: New changeset fc32753feb0a by Éric Araujo in branch '2.7': Move xml.etree higher and xml.parsers.expat lower in the markup ToC. http://hg.python.org/cpython/rev/fc32753feb0a

Éric Araujo <merwok@netwok.org> added the comment: FYI, note that http://wiki.python.org/moin/MiniDom says this about minidom: “slow and very memory hungry DOM implementation”. As you have seen, I have applied my ToC order change. Now, in order to commit my s/lightweight/minimal/ change and close this report, Eli, can you say whether minidom-desc-2 is okay? (I’m asking you because this patch touches text you just added, unlike minidom-desc.)

Martin v. Löwis <martin@v.loewis.de> added the comment:
> FYI, note that http://wiki.python.org/moin/MiniDom says this about minidom: “slow and very memory hungry DOM implementation”.
Thanks for the notice; I have now fixed that wording.

Eli Bendersky <eliben@gmail.com> added the comment: Éric, I'm ok with replacing "lightweight" by "minimal", unless others have objections. Regarding the specifics of the minidom-desc-2.diff patch: in "proficient with the DOM", I'm not sure "the DOM" is semantically correct; "the W3C-DOM interface" is more precise. Also, I still think that a note would be more appropriate, but I don't care enough to argue about it :)

Stefan Behnel <scoder@users.sourceforge.net> added the comment: Oh, right, I missed that part. I also think that a visible note is better. And +1 for "W3C DOM interface".

Eli Bendersky <eliben@gmail.com> added the comment: Éric, what else would you like to do here?

Éric Araujo <merwok@netwok.org> added the comment: I’ll soon have a revised version of my patch to address your feedback.

Ezio Melotti <ezio.melotti@gmail.com> added the comment: Any news on this?

Éric Araujo <merwok@netwok.org> added the comment: I’ve been unresponsive of late, sorry, but I’m still here. Will see if I have time tomorrow.

Changes by Eli Bendersky <eliben@gmail.com>:
----------
nosy: -eli.bendersky

Stefan Behnel added the comment: Any news on this?

Stefan Behnel added the comment: I'm not sure if it's a good idea to keep bikeshedding about this for another two years. Personally, I would prefer having someone with commit rights fix this and be done with it. Éric's last patch looks ok and parts of it went in already, so it's mostly just the heading that remains to be fixed.
----------
versions: +Python 3.4

Antoine Pitrou added the comment: Someone should go ahead and apply this. Éric, perhaps?
----------
stage: needs patch -> commit review

Éric Araujo added the comment: Sure, feel free to commit this.

Roundup Robot added the comment:
New changeset c2ae1ed03853 by Ezio Melotti in branch '2.7': #11379: rephrase minidom documentation to use the term "minimal" instead of "lightweight". Patch by Éric Araujo. http://hg.python.org/cpython/rev/c2ae1ed03853
New changeset b9c0e050c935 by Ezio Melotti in branch '3.2': #11379: rephrase minidom documentation to use the term "minimal" instead of "lightweight". Patch by Éric Araujo. http://hg.python.org/cpython/rev/b9c0e050c935
New changeset 8ff512910338 by Ezio Melotti in branch '3.3': #11379: merge with 3.2. http://hg.python.org/cpython/rev/8ff512910338
New changeset 9a0cd5363c2a by Ezio Melotti in branch 'default': #11379: merge with 3.3. http://hg.python.org/cpython/rev/9a0cd5363c2a

Ezio Melotti added the comment: Fixed, thanks for the patch!
----------
assignee: docs@python -> ezio.melotti
resolution: -> fixed
stage: commit review -> committed/rejected
status: open -> closed
type: performance -> enhancement

Roundup Robot added the comment:
New changeset 39ea24aaf0e7 by Antoine Pitrou in branch '2.7': s/lightweight/minimal/, as per issue #11379. http://hg.python.org/cpython/rev/39ea24aaf0e7
New changeset b63258b6eb4d by Antoine Pitrou in branch '3.3': s/lightweight/minimal/, as per issue #11379. http://hg.python.org/cpython/rev/b63258b6eb4d
New changeset d659e7761d59 by Antoine Pitrou in branch 'default': s/lightweight/minimal/, as per issue #11379. http://hg.python.org/cpython/rev/d659e7761d59
participants (12)
- Antoine Pitrou
- Eli Bendersky
- Ezio Melotti
- Florent Xicluna
- Fred L. Drake, Jr.
- Martin v. Löwis
- Raymond Hettinger
- Roundup Robot
- Senthil Kumaran
- Stefan Behnel
- Tshepang Lekhonkhobe
- Éric Araujo