[Python-ideas] PEP 426, YAML in the stdlib and implementation discovery
Masklinn
masklinn at masklinn.net
Fri May 31 21:14:46 CEST 2013
On 2013-05-31, at 20:43 , Andrew Barnert wrote:
>
> For example, the third-party lxml library provides an implementation of the ElementTree API. For some use cases, it's better than the stdlib one. So, a lot of programs start off with this:
>
> try:
>     from lxml import etree as ET
> except ImportError:
>     from xml.etree import ElementTree as ET
>
> Your registration mechanism would mean they don't have to do this; they just import from the stdlib, and if lxml is present and registered, it would be loaded instead.
That seems rife with potential issues and unintended side effects, e.g.
while lxml does for the most part provide ET's API, it also extends it,
and I do not know if it can run ET's test suite. It also doesn't handle
ET's implementation details, for obvious reasons[0].
So while a developer who tries lxml and falls back on ET will keep in
mind to stay compatible and not use ET implementation details, one
who expects ET and gets lxml on a client machine will likely be slightly
disappointed in the system.
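To make the contrast concrete, here is a minimal sketch of the explicit
fallback, where the developer at least knows which implementation was
loaded. USING_LXML and parse_fragment are names made up for this
illustration; neither library provides them.

```python
# Explicit fallback: the choice of implementation is visible in the code.
try:
    from lxml import etree as ET
    USING_LXML = True   # hypothetical flag, for illustration only
except ImportError:
    from xml.etree import ElementTree as ET
    USING_LXML = False


def parse_fragment(text):
    """Hypothetical helper: parse with whichever implementation was imported."""
    return ET.fromstring(text)


root = parse_fragment('<a><b/></a>')
# Only behaviour shared by both implementations is relied on here.
child_tags = [child.tag for child in root]
```

With a transparent registration mechanism there would be no such flag to
consult, which is the gist of the objection.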
[0] and one of those implementation details can turn out deadly in
unskilled hands: by default ElementTree will *not* remember
namespace aliases and will rename most things from ns0 onwards:
>>> ET.tostring(ET.fromstring('<foo xmlns:bar="noop"><bar:baz/></foo>'))
'<foo><ns0:baz xmlns:ns0="noop" /></foo>'
whereas lxml *will* save parsed namespace aliases in its
internal namespace map(s?) and reuse them:
>>> lxml.etree.tostring(lxml.etree.fromstring('<foo xmlns:bar="noop"><bar:baz/></foo>'))
'<foo xmlns:bar="noop"><bar:baz/></foo>'
If a developer expects re-prefixed output and gets lxml's,
things are going to blow up. Yeah, it's bad to do that, but
I've seen enough supposedly XML-based software which cared
less for the namespace itself than for the alias to know
that it's way too common.
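For what it's worth, a sketch of the safer habit, using only the stdlib
ElementTree: match on the namespace URI instead of the alias. The helper
name find_by_uri is hypothetical; the {uri}local tag form is how
ElementTree exposes qualified names.

```python
import xml.etree.ElementTree as ET

SOURCE = '<foo xmlns:bar="noop"><bar:baz/></foo>'


def find_by_uri(root, uri, local_name):
    """Hypothetical helper: look up a child by namespace URI, not by prefix."""
    return root.find("{%s}%s" % (uri, local_name))


root = ET.fromstring(SOURCE)
# Round-trip through the serializer; stdlib ET does not keep the "bar" alias.
serialized = ET.tostring(root)
round_tripped = ET.fromstring(serialized)

# The lookup succeeds both before and after the alias has been rewritten.
before = find_by_uri(root, "noop", "baz")
after = find_by_uri(round_tripped, "noop", "baz")
```

As far as I know lxml accepts the same {uri}local form in find(), so code
written this way should not care which of the two aliases it is handed.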