[DOC-SIG] XML Extension Module?

Fred L. Drake, Jr. <fdrake@acm.org>
Fri, 19 Dec 1997 11:08:18 -0500

Sean Mc Grath wrote:
> I would like to see basic XML support provided as a portable C extension module.
> I believe XML will take off and that the XML support in Python
> will be sufficiently useful to go into the standard

Paul Prescod writes:
 > First, Timing. We're on the verge of releasing Python 1.5. Can we get
 > something in there? If not, how long do we have to wait for an
 > upgrade/update?

  Expect the final release this year.  Even if the code were written
and tested already, it's basically too late to add modules to 1.5.  A
separate package isn't such a bad thing, though: with the 1.5
Misc/Makefile.pre.in, a package can be built *and installed* into an
existing Python installation very easily.

 > I'm also not sure about whether the idea will be popular on Unix
 > platforms. Are Python extensions usually linked statically or
 > dynamically on those platforms? 

  I think most of the "major" modern UNIXes support dynamic modules,
but I don't know how people actually build Python.  Without uncommenting
the *shared* line in Setup, all of the extensions are built
statically.  The best approach would probably be to move all the most
commonly used C modules above the *shared* line (things like strop
could certainly be there), and keep the less often used modules below
the line (database interfaces, curses, etc.).  Support for SGML/XML
should be dynamic where that's available.
  If a module is installed as an extension, it can be built
dynamically if the architecture supports it, regardless of the Python
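
  To see how a given interpreter was actually built, one can ask it
which modules are compiled in and which are loaded from shared
objects.  The following is only a rough sketch for a present-day
Python 3, not something that would run on 1.5; the function name
linkage() and its return labels are made up for illustration.

```python
import importlib.machinery
import importlib.util
import sys


def linkage(name):
    """Classify a top-level module by how it reaches the interpreter.

    Returns "static" for modules compiled into the interpreter binary,
    "dynamic" for extension modules loaded from a shared object,
    "other" for anything else (Python source, frozen modules, ...),
    and "missing" when no such module can be found.
    """
    # Modules linked into the executable are listed by name here.
    if name in sys.builtin_module_names:
        return "static"
    spec = importlib.util.find_spec(name)
    if spec is None:
        return "missing"
    origin = spec.origin or ""
    # EXTENSION_SUFFIXES holds the platform's shared-object suffixes.
    if origin.endswith(tuple(importlib.machinery.EXTENSION_SUFFIXES)):
        return "dynamic"
    return "other"


if __name__ == "__main__":
    for module_name in ("sys", "math", "json"):
        print(module_name, linkage(module_name))
```

  The answers differ from platform to platform and build to build,
which is really the point: the same module may be static on one
installation and dynamic on another.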

 > In other words, if I'm on a standard site with Python installed, and I
 > use XML in one out of 99 scripts, do the other 99 scripts usually take
 > the binary size hit of having the extension module linked in? I know
 > that theoretically dynamic linking is possible, but I'm not sure about
 > the amount this feature is actually used in the community. 

  The effect of a statically linked module that isn't used is very
system-dependent.  Some systems (Linux, I know) won't even allocate
swap space for the unused pages from the executable, so it's pretty
cheap.  Some platforms probably copy all the pages into the VM space
before starting the process, but I expect this is changing somewhat.
Some require static linking and full loading, and that's just a nasty
can of worms.  (It also suggests that extensions should be avoided if

 > I am very interested in feedback on that score. The nice thing about
 > using Python code modules is that you only incur the hit when you use
 > them. And presumably the same holds for binary extensions on Windows.

  That is correct.
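
  As a small illustration of paying only on use, a script can defer
the import until the one code path that needs it.  This is a sketch
assuming a present-day Python 3, with the standard library's
xml.dom.minidom standing in for whatever XML module is eventually
chosen; root_tag() is a hypothetical helper, not part of any proposal
in this thread.

```python
import sys


def root_tag(xml_text):
    """Return the tag name of the document's root element."""
    # The import happens here, on first call, so scripts that never
    # call root_tag() never load the XML machinery at all.
    import xml.dom.minidom

    document = xml.dom.minidom.parseString(xml_text)
    return document.documentElement.tagName


if __name__ == "__main__":
    print("loaded before call:", "xml.dom.minidom" in sys.modules)
    print("root element:", root_tag("<doc><para/></doc>"))
    print("loaded after call:", "xml.dom.minidom" in sys.modules)
```

  The same pattern works whether the module behind the import is
Python source or a dynamically loaded extension.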

 > Just for reference, James' module is about 100K binary on Windows, does
 > full well formedness checking and handles Unicode. Without well
 > formedness checking, it seems to be about 80K.

  I am presuming we're talking about the material in xmltok.zip?  This
still appears to be pretty preliminary (which might be why I found it
in the test directory at ftp.jclark.com).  Just running a simple make
on Solaris/SPARC gives me an xmltok.o of about 100K; with the
additional .o files in the xmlwf/ directory, it comes to about 116K.
Add the Python extension code to this, and I'll guess it's under 125K.


Fred L. Drake, Jr.
Corporation for National Research Initiatives
1895 Preston White Drive
Reston, VA    20191-5434

DOC-SIG  - SIG for the Python Documentation Project

send messages to: doc-sig@python.org
administrivia to: doc-sig-request@python.org