[XML-SIG] PyXML, a modest proposal
Mon, 17 Feb 2003 20:26:16 +0100
As a developer who uses some of PyXML's components, and whose own software
is used by others (in a Zope context), I would like to kick off a
discussion on how to improve the PyXML experience for developers and
end users from a distribution/packaging point of view. I'm a relative
outsider and may get things wrong, which is why discussion is needed.
I get *many* questions from users trying to install PyXML. There is a range
of problems:
* People trying to install PyXML for a binary version of Zope on Windows.
Binary Zope distributions include their own Python version. This Python
version is not found by PyXML's installer. I hack around this by
telling people to *unzip the exe file*, and then copy the _xmlplus file
manually to the right place in the Zope distribution.
* The most recent version of PyXML doesn't get distributed for Python 2.1
in binary form, while Zope still requires Python 2.1.
* Hacks to get stuff working by linking _xmlplus or xml to other
places aggravate matters.
Usually the errors result in rather obscure tracebacks that don't make
it very clear PyXML is involved.
I think the current setup, with the 'xml' package in the Python core and
_xmlplus in site-packages, contributes to the problems. It is confusing for
developers, and it's hard to debug for people trying to install it.
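
To make the debugging problem concrete, here is a minimal sketch of a check
a user could run to see which 'xml' package their interpreter actually
picked up. The helper name describe_xml_package is made up for illustration,
and the test for '_xmlplus' in the path or module name is an assumption
about how the substitution shows up, not something PyXML documents.

```python
import xml


def describe_xml_package():
    """Report where the imported 'xml' package actually lives."""
    location = getattr(xml, "__file__", None) or "<unknown>"
    name = xml.__name__
    return {
        "name": name,
        "location": location,
        # Assumption: a substituted package shows '_xmlplus' in its
        # module name or in the path it was loaded from.
        "uses_xmlplus": "_xmlplus" in name or "_xmlplus" in location,
    }


if __name__ == "__main__":
    info = describe_xml_package()
    print("xml package loaded from:", info["location"])
    print("PyXML (_xmlplus) in use:", info["uses_xmlplus"])
```

Asking a user to run this against the Python that Zope bundles would at
least tell us which interpreter and which package we are talking about.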
The motivation for this setup seems to be the ability to upgrade the Python
library's XML support while actually not upgrading Python (or its library).
I believe that this ability is not worth that much by itself and that this
approach should never have been adopted; we may not have known better then
but I think experience teaches us clearly enough that it's not working.
It would be much better to be explicit here, and distribute
PyXML explicitly as a 'pyxml' top level package and let developers decide
on what they want to import.
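
As a rough sketch of what "let developers decide" could look like in
practice: the 'pyxml' package below is the hypothetical top-level name
proposed here and does not exist today, and root_tag is just an example
helper. The point is that the choice is visible in the importing code.

```python
try:
    # Hypothetical: 'pyxml' is the proposed top-level package name.
    from pyxml.dom import minidom
    XML_SOURCE = "pyxml"
except ImportError:
    # Fall back to the Python library's own xml package.
    from xml.dom import minidom
    XML_SOURCE = "stdlib"


def root_tag(text):
    """Return the tag name of the document element of an XML string."""
    return minidom.parseString(text).documentElement.tagName


if __name__ == "__main__":
    print(XML_SOURCE, root_tag("<doc><item/></doc>"))
```

A developer who wants only the core library simply leaves out the try
block; either way, nothing changes behind their back when PyXML gets
installed.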
Sometimes PyXML ships with broken code (I realize this is volunteer work, and
I know that what is needed is a more extensive unit test suite; I'll try to
look into it). If PyXML were a standalone package that didn't try to integrate into
the 'xml' top level package namespace, that wouldn't be as big a problem.
Code written against the Python library's xml package would still continue
to work. Now, however, code sometimes breaks if you install PyXML.
Sometimes this is not even due to PyXML breaking anything; it's because of
PyXML *fixing* something. Still, I believe code using a Python core library
should only break if you upgrade the core library. If the developer had the
explicit ability to determine which package gets imported, this wouldn't
be a problem.
Theoretically PyXML makes a backwards compatibility guarantee. In practice
this is very hard to manage right.
Explicit is better than implicit.
A counterargument could be that since code is planned to eventually move from
pyxml to the core library, users will eventually have to modify their
code in order to start using the core code.
I think that is fine; it's just switching one import and this happens whenever
any library makes it into the core. A more implicit solution that may
sometimes be legitimate is to import core code into the pyxml package
namespace, though I'd be wary of that too (though it's far less risky
than the current situation where the reverse happens).
So, can we still change this? I propose a new 'pyxml' top level package
for PyXML code. An argument could be made that this is too late in the
development cycle because lots of code already depends on it. I think
that solving the confusion and pain would be worth it. Perhaps a
transition strategy can be devised where importing PyXML code through the
'xml' package will issue a warning so that developers can adjust their code.
If Python can change its division operator though, a 0.x package can
certainly shift around its APIs some.
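
The warning-based transition could be sketched roughly as follows. This is
only an illustration using the standard warnings module; the helper name
import_with_warning and the module names passed to it are placeholders, not
a description of how PyXML's import hooks work.

```python
import importlib
import warnings


def import_with_warning(old_name, new_name):
    """Import new_name, warning that old_name is the deprecated spelling."""
    warnings.warn(
        "importing %r is deprecated; import %r instead" % (old_name, new_name),
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller's import site
    )
    return importlib.import_module(new_name)


if __name__ == "__main__":
    warnings.simplefilter("always")
    # 'xml.old_location' is a placeholder; 'xml.dom.minidom' stands in
    # for whatever the new 'pyxml' location would be.
    module = import_with_warning("xml.old_location", "xml.dom.minidom")
    print(module.__name__)
```

Old imports keep working during the transition, and developers get told
exactly which line to change.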
Of course I may be blundering into a non-problem and everybody else is
entirely happy with the current situation and thinks the proposed situation
would make things much worse. If so, I'd be curious to find out why you