[Python-Dev] Integrating Expat

Fred L. Drake, Jr. fdrake@acm.org
Mon, 1 Oct 2001 10:17:06 -0400


Martin von Loewis writes:
 > [I know I've asked this before, but Fred wanted me to ask it again :-]

  Actually, I think I simply suggested the forum so that others could
comment as well.  ;-)

 > What do you think about an integration of Expat into Python, to be
 > always able to build pyexpat (and with the same version also)?
 > Which version of Expat would you use? Would you put the expat files
 > into a separate directory, or all into Modules?

  I have mixed feelings.  There are really two things that we could
do:  We could add Expat to our CVS repository, which means syncing a
bunch of files every time a new Expat release comes out, or we could
bundle the Expat sources with the Python source distribution when the
distribution is built, but not add them to CVS.  This avoids the extra
files in CVS, but complicates construction of the distribution and
adds a new wrinkle to the configuration management.

 > Here is my proposal: Integrate Expat 1.95.2 for release together with
 > Python 2.2; into an expat subdirectory of Modules (taking only the lib
 > files of expat).
 > 
 > This would affect build procedures on all targets; in particular,
 > pyexpat would not link to a shared expat DLL, but incorporate the
 > object files.

  For the "Parsed XML" Zope product, we included the sources for the
Expat library in our CVS, but added our own configure.in and other
build-control files, which are simpler than those included with Expat
(since it only needs to build the static library).  This seems to work
reasonably well, and does not introduce new wrinkles to the
configuration management.
  So I think we agree on the approach to take.
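
  As a rough illustration of what "always able to build pyexpat (and
with the same version also)" buys the user, here is a minimal sketch
that reports which Expat the pyexpat module was compiled against and
runs a trivial parse.  The helper names expat_build_info and
count_elements are made up for this example; the sketch assumes a
Python in which xml.parsers.expat exposes EXPAT_VERSION and
version_info, and it says nothing about how the build itself is set up.

```python
import xml.parsers.expat as expat


def expat_build_info():
    """Return the Expat version string and tuple pyexpat was compiled against."""
    return expat.EXPAT_VERSION, expat.version_info


def count_elements(data):
    """Parse an XML string with Expat and count start tags per element name."""
    counts = {}

    def start(name, attrs):
        # Called by Expat once for every start tag it encounters.
        counts[name] = counts.get(name, 0) + 1

    parser = expat.ParserCreate()
    parser.StartElementHandler = start
    parser.Parse(data, True)  # True marks this as the final chunk of input
    return counts


if __name__ == "__main__":
    print(expat_build_info())
    print(count_elements("<a><b/><b/></a>"))
```

  With the Expat sources bundled, the version reported here would be
the same on every platform for a given Python release, rather than
depending on whatever shared library happens to be installed.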

M.-A. Lemburg writes:
 > Are you sure that we should choose expat as "native" XML parser ?
 > 
 > There are other candidates which would fit this role just
 > as well (in particular, Fredrik's sgmlop looks like a nice
 > extension since it not only works with XML but also many
 > other meta languages).

  See Martin's comments about this.  I think they preclude including
sgmlop until its problems have been addressed in the implementation.
  I'm not sure what "meta languages" it handles; I thought it only
dealt with XML/XHTML and HTML document markup.

 > If you want a very fast validating XML parser, RXP would also
 > be a good choice -- AFAIK, the RXP folks would allow us to
 > ship RXP under a different license than GPL which is then
 > bound to Python.

  Agreed.  I think it would be really nice to have an interface for
RXP that was easy to build and use.  I haven't looked at PyLTXML in a
long time, so I'm not sure what state it's in.

 > Given the many alternatives, I am not sure whether going with
 > expat is the right path... may be wrong though.

  As Martin said, RXP and Expat together don't really qualify as
"many".  sgmlop just isn't robust enough (yet), and it's not clear
there are other alternatives.
  There is libxml (a.k.a. gnome-xml), which is licensed under the
LGPL; Python bindings for that are described as being in the alpha
stage, but I haven't had time to play with them myself.  


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation