[Python-Dev] Is XML serialization output guaranteed to be bytewise identical forever?

Tue Mar 19 14:30:07 EDT 2019

On Tue, Mar 19, 2019 at 6:15 AM Serhiy Storchaka <storchaka at gmail.com>
wrote:

> 19.03.19 13:53, Ned Batchelder пише:
> > Option 4 is misleading.  Is anyone here really offering to "fix the
> > tests in third-party modules"?  Option 4 is actually, "do nothing, and
> > let a multitude of projects figure out how to fix their tests, slowing
> > progress in those projects as they try to support Python 3.8."
>
> Any option except option 1 (and option 2 with sorting by default)
> requires changing third-party code. You should either pass additional
> argument to serialization functions, or use special canonization functions.
>
> We should look at the problem from long perspective. Freezing the
> current behavior forever does not look good. If we need to break the
> user code, we should minimize the harm and provide convenient tools for
> reproducing the current behavior. And this is an opportunity to rewrite
> user tests in more appropriate form. In your case textual comparison may
> be the most appropriate form, but this may be not so in other cases.
>

In situations like this I think it's best to bite the bullet sooner rather
than later while acknowledging that folks like Ned are in a bind when they
have support older versions and thus have long-term support costs, too, and
try to make the transition as painless as possible (my guess is Ned's need
to support older versions will drop off faster than us having to support
the xml libraries in the stdlib going forward, hence my viewpoint).

>
> > Now in Python 3.8, because Python doesn't want to add an
> > optional flag to continue doing what it has always done, I need to
> > re-engineer my tests.
>
> Please wait yet some time. I hope to add canonicalization before the
> first beta.
>

For me I think canonicalization/stable pretty-print is the best option,
especially if we can put the canonicalization code up on PyPI for
supporting older versions of Python. Otherwise a function that does
something like an XOR to help diagnose what differs between 2 XML documents
is also seems like a good option to me.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-dev/attachments/20190319/79f682a8/attachment.html>