[Python-Dev] Is XML serialization output guaranteed to be bytewise identical forever?

Raymond Hettinger raymond.hettinger at gmail.com
Wed Mar 20 20:26:18 EDT 2019


> On Mar 19, 2019, at 4:53 AM, Ned Batchelder <ned at nedbatchelder.com> wrote:
> 
> None of this is impossible, but please try not to preach to us maintainers that we are doing it wrong, that it will be easy to fix, etc

There's no preaching and no judgment.  We can't have a conversation though if we can't state the crux of the problem: some existing tests in third-party modules depend on the XML serialization being byte-for-byte identical forever. The various respondents to this thread have indicated that the standard library should only make that guarantee within a single feature release and that it may to vary across feature releases.

For docutils, it may end-up being an easy fix (either with a semantic comparison or with regenerating the target files when point releases differ).  For Coverage, I don't make any presumption that reengineering the tests will be easy or fun.  Several mitigation strategies have been proposed:

* alter to element creation code to create the attributes in the desired order
* use a canonicalization tool to create output that is guarantee not to change
* generate new baseline files when a feature release changes
* apply Stefan's recipe for reordering attributes
* make a semantic level comparison

Will any other these work for you?


Raymond









More information about the Python-Dev mailing list