[Python-Dev] Python XML Validator

Mike Meyer mwm at mired.org
Mon Mar 3 05:07:08 CET 2008


On Thu, 28 Feb 2008 23:42:49 +0000 (UTC) Medhat Gayed <medhat.gayed at gmail.com> wrote:
> lxml is good but not written in python and difficult to install and didn't work
> on MacOS X.

lxml is built on top of libxml2/libxslt, which are bundled with most
Unix-like OS's (including Mac OS X), or available in their package
systems. Trying to install it from the repository is a PITA, because
it uses both the easyinstall and Pyrex (later Cython) packages - which
aren't bundled with anything. On the other hand, if it's in the
package system (I no longer have macports installed anywhere, but
believe it was there at one time), that solves all those problems. I
believe they've excised the easyinstall source dependencies, though.

Using lxml on OS X Tiger was problematical, because the versions of
python, libxml2 and libxslt provided with Tiger were pretty much all
older than lxml supported; I built python from macports, including
current versions of libxml2, libxslt and lxml, and everything worked
with no problems. (I later stopped working with this on the Mac
because I need cx_Oracle as well, which doesn't exist for intel macs).

On Leopard, Python is up to date, but libxml/libxslt seems a bit
behind for lxml 2.0.x (no schematron support being the obvious
problem). I went back to the 1.3.6 source tarball (which is what I'm
using everywhere anyway), and "python setup.py install" worked like a
charm. (So it looks the easyinstall dependency is gone).

Of course, the real issue here is that, while Python may come with
"batteries included" you only get the common sizes like A, C and D. If
you need a B cell, you're on your own. In XML land, validation is one
such case. Me, I'd like complete xpath support, and xslt as well. But
this happens with other subsystems, like doing client-side SSL support,
but not server-side (at least, not as of 2.4; I haven't checked 2.5).

If you just want an xml module in the standard library that's more
complete, I'd vote for the source distribution of lxml, as that's C +
Python and built on top of commonly available libraries. The real
issue would be making current lxml work with the "outdated" versions
of those libraries found in current OS distributions.

    <mike
-- 
Mike Meyer <mwm at mired.org>		http://www.mired.org/consulting.html
Independent Network/Unix/Perforce consultant, email for more information.


More information about the Python-Dev mailing list