[Python-Dev] Integrating Expat

Greg Stein gstein@lyra.org
Sun, 7 Oct 2001 16:59:39 -0700


On Sun, Sep 30, 2001 at 04:53:06PM +0200, Martin von Loewis wrote:
> [I know I've asked this before, but Fred wanted me to ask it again :-]
> 
> What do you think about an integration of Expat into Python, to be
> always able to build pyexpat (and with the same version also)?
> Which version of Expat would you use? Would you put the expat files
> into a separate directory, or all into modules?
> 
> Here is my proposal: Integrate Expat 1.95.2 for release together with
> Python 2.2; into an expat subdirectory of Modules (taking only the lib
> files of expat).
> 
> This would affect build procedures on all targets; in particular,
> pyexpat would not link to a shared expat DLL, but incorporate the
> object files.


Speaking from the experience of bundling Expat directly into the Apache
binaries (also using a subset of the original source) ...

    I think bundling the sources is fine, but it should *ONLY* be a fallback
    if you do not find the Expat library installed on the system. *ALWAYS*
    link against a system-installed library first.
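
A minimal sketch of that policy in Python, as a build script might express
it. The function name, the returned keys, and the "Modules/expat" path are
assumptions for illustration (the path follows the proposal quoted above);
a real build would also need to probe for the expat.h header, not just the
shared library.

```python
from ctypes.util import find_library


def choose_expat(bundled_dir="Modules/expat", finder=find_library):
    """Pick build settings for pyexpat (hypothetical helper).

    Prefer a system-installed Expat; use the bundled sources only
    when no system library can be found.
    """
    # find_library returns a library name/path, or None if nothing is found.
    system_lib = finder("expat")
    if system_lib is not None:
        # Link dynamically against the system library.
        return {"libraries": ["expat"], "include_dirs": [], "bundled": False}
    # Fallback: compile the bundled sources into the extension.
    return {"libraries": [], "include_dirs": [bundled_dir], "bundled": True}
```

The finder argument exists only so the decision can be exercised without
depending on what happens to be installed on the build machine.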


We ran into a problem that has bothered some Perl users for a long while
now. Specifically: Apache 1.3 would get loaded and export the Expat symbols
to the rest of the process. Any third-party module that was built *against
Apache* (obviously the case since they are Apache modules) and needed Expat
would immediately resolve upon loading and be happy. But! What we ran into
is mod_perl (linked against Apache) running a Perl script which, in turn,
loaded XML::Parser::Expat. That Perl module linked against *Expat*, not
Apache (it is a standard module and has nothing to do with Apache). Well...
when the Perl module was loaded, you now had *two* sets of Expat symbols in
the process space. Segfaults, bugs, and madness ensued.

I just made some changes this past week to Apache 1.3 to improve the
situation somewhat. The basic answer is to always grab a system (.so)
library when possible. When the shared lib is present, both Apache and
XML::Parser::Expat link against the same thing upon loading. And Apache
still has the feature of exposing Expat to its third-party modules.


This situation could easily happen to Python, too. Imagine building Expat
directly into pyexpat. Some Python script loads pyexpat and the Expat
symbols come with it. Now, some *other* module is loaded and dynamically
links against /usr/lib/libexpat.so. Now you have *two* sets of Expat symbols
and crashes are going to start happening.
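
One rough way to spot this in a running process is to list the mapped files
whose names mention Expat. The sketch below is Linux-specific and assumes
the /proc/<pid>/maps text format; the helper name is made up for
illustration. Seeing both a pyexpat extension and libexpat.so mapped does
not prove a conflict by itself, but if pyexpat was built from bundled
sources it means two copies of the Expat code are present.

```python
import os


def expat_objects(maps_text):
    """Return sorted unique mapped file paths whose basename mentions 'expat'.

    maps_text is the content of a Linux /proc/<pid>/maps file (assumed format:
    address, perms, offset, device, inode, optional path).
    """
    paths = set()
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        # Anonymous mappings have only five fields and no path.
        if len(fields) == 6:
            path = fields[5].strip()
            if "expat" in os.path.basename(path).lower():
                paths.add(path)
    return sorted(paths)


# Usage on Linux (not run here):
#     with open("/proc/self/maps") as f:
#         print(expat_objects(f.read()))
```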


-1 on *always* using bundled sources -- they should only be a fallback. +1
on including them as a fallback.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/