Problem round-tripping with xml.dom.minidom pretty-printer
ben.butlercole at gmail.com
Fri Feb 29 18:21:05 CET 2008
> The last line of p() calls itself: it is an unconditional recursive call
> so, no matter what it does, it will never stop. And since p() also
> prints something, calling it will print endlessly.
Sorry, I wasn't clear. I realize that this recurses endlessly. The
problem is that it also adds blank lines endlessly.
> By removing this line, you get something like:
> <?xml version="1.0" ?>
> That seems sensible, imo. Was that what you wanted?
Sure. That's fine unless you then re-parse this output and print it
again, in which case you get the behaviour you describe:
> An additional thing to keep in mind is that toprettyxml does not print
> an XML identical to the original DOM tree: it adds newlines and tabs.
> When parsed again these blank characters are inserted in the DOM tree as
> character nodes. If you toprettyxml an XML document twice in a row, then
> the second one will also add newlines and tabs around the newlines and
> tabs added by the first. Since you call toprettyxml an infinite number
> of times, it is expected that lots of blank characters appear.
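For concreteness, here is a minimal sketch of the effect described in the quoted text. It assumes Python 3's xml.dom.minidom; the sample document and the helper name pretty_roundtrip are made up for illustration and are not from the original code under discussion.

```python
from xml.dom import minidom


def pretty_roundtrip(xml_text, rounds):
    """Parse and pretty-print xml_text `rounds` times; return every output."""
    outputs = []
    for _ in range(rounds):
        # Each pass re-parses the previous pretty-printed output, so the
        # whitespace added by toprettyxml() comes back as text nodes.
        xml_text = minidom.parseString(xml_text).toprettyxml()
        outputs.append(xml_text)
    return outputs


# Hypothetical sample input: one element nested in another.
outputs = pretty_roundtrip("<a><b/></a>", 3)
for text in outputs:
    # The length grows on every pass because indentation is added around
    # the whitespace-only text nodes left by the previous pass.
    print(len(text), repr(text))
```

With a bounded number of rounds instead of unconditional recursion, the output still gets longer on every pass, which is the part being asked about.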
Right. That's the behaviour I'm asking about, which I consider to be
problematic. I would expect a module providing a parser and
pretty-printer (not just for XML parsers) to be able to conservatively
round-trip.
As far as I can see (and your comments back this up) minidom doesn't
have this property. Unless anyone knows how to get it to behave that