[XML-SIG] Re: Using Installer with PyXML

Dan Rolander dan.rolander@marriott.com
Sat, 20 Jan 2001 18:25:21 -0500

Thank you for your assistance, Martin.  Although your analysis of the problem
("operator error") is close, I would probably more correctly identify it as
operator ignorance.  I'm still trying to figure out how to effectively use
PyXML and build standalone executables with it.  Since the Installer seems
to play by its own rules when it comes to imports, it is especially

I have not found the xml-sig documentation, or the python library reference,
to be too helpful for someone new to xml processing, so I bought Sean
McGrath's book "XML Processing with Python" and have found that to be *very*
helpful. But his examples, which I was testing, use references to pyexpat.

I tested your suggestion of using "from xml.parsers import expat" vs.
"import pyexpat" and that works fine, but I'm not sure what the benefit of
using that form is.

I haven't quite grok'd all of this yet, but once I do I would have no
problem with writing a mini-howto.

Thanks again,

----- Original Message -----
From: "Martin v. Loewis" <martin@mira.cs.tu-berlin.de>
To: <dan.rolander@marriott.com>
Cc: <python-list@python.org>; <xml-sig@python.org>; <gmcm@hypernet.com>
Sent: Saturday, January 20, 2001 5:25 PM
Subject: Re: [XML-SIG] Re: Using Installer with PyXML

>     from xml.sax import saxexts, saxlib, saxutils
> the packager (Gordon McMillan's Installer) is able to find
> but is not able to find xml.sax.saxexts or xml.sax.saxlib which actually
> reside in _xmlplus.sax.

I was going to claim this to be a bug in the installer, but it now
rather seems like an operator error: The installer has no way of
knowing that it ought to load the _xmlplus.sax.saxexts into the
distribution, since there is no import statement for it.

So announcing the full _xmlplus package to it is the right thing to do.

> I can force builder.py to include the entire
> _xmlplus tree by adding a packages=_xmlplus line to the [APPZLIB] section
> of the .cfg file, but the exe still fails because it is looking for
>     ImportError: cannot import name xml.sax.saxexts

It's not clear what is causing that. It could be a bug in the
installer, or it could be the distribution contains no pyexpat.pyd. In
that case, you'll have to explicitly request inclusion of pyexpat.pyd.

It would be good to check what files are actually included.
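
One way to make that check is sketched below. It is not part of Installer; the function names list_dist_files and missing_files are made up for illustration, and it only assumes that the build leaves its output in a single directory.

```python
import os

def list_dist_files(dist_dir):
    """Return the sorted relative paths of every file under dist_dir."""
    found = []
    for root, _dirs, files in os.walk(dist_dir):
        for name in files:
            full = os.path.join(root, name)
            found.append(os.path.relpath(full, dist_dir))
    return sorted(found)

def missing_files(dist_dir, required):
    """Return the names in `required` with no matching file in dist_dir.

    Files are matched by base name, ignoring case, since Windows file
    names are case-insensitive.
    """
    present = {os.path.basename(p).lower() for p in list_dist_files(dist_dir)}
    return [name for name in required if name.lower() not in present]
```

Calling missing_files on the dist directory with a list such as ["pyexpat.pyd"] would show at a glance whether the extension module made it into the build.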

> When I rename _xmlplus to xml and then run builder again without
> any additional packages, the EXE fails because it can't find an available
> parser:
>   File "c:\program files\python20\_xmlplus\sax\saxexts.py", line 77, in
> make_parser
>     xml.sax._exceptions.SAXReaderNotAvailable: No parsers found

No surprise. The installer is looking at import statements, but there
are no import statements for xml.sax.drivers.*; instead, they are
imported by calling __import__ for a computed string. So again, that
is an operator error: everything imported "by magic" must be announced
explicitly to such a packager.
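
To illustrate the point: the sketch below is not Installer's code. It uses the ast module from current Python versions to collect the modules named in ordinary import statements, and the sample source and the load_driver name are invented for the example. The computed __import__ call leaves no trace in the result.

```python
import ast

def static_imports(source):
    """Return the sorted module names found in plain import statements."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            names.add(node.module)
    return sorted(names)

# Hypothetical application source, parsed but never executed.
SAMPLE = '''
from xml.sax import saxutils
import sys

def load_driver(name):
    # The module name is only known at run time.
    return __import__("xml.sax.drivers." + name)
'''

print(static_imports(SAMPLE))  # ['sys', 'xml.sax']
```

Nothing under xml.sax.drivers shows up, which is why such modules have to be listed by hand.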

> If I manually import the entire PyXML tree (now named 'xml') by adding a
> packages=xml line to the .cfg file, I get a little farther but now the exe
> isn't able to find pyexpat.
>     ImportError: cannot import name xml.parsers.pyexpat
> I then try to manually import pyexpat by adding xml.parsers.pyexpat to the
> misc line in the [MYCOLLECT] section, but finder.py is not able to find
>       File
> line 121, in identify
>       ValueError: xml.parsers.pyexpat.pyd not found

You did not say *how* you specified it - it might be that Installer
mistook your command as trying to import a module named "pyd" from a
package named "pyexpat" - that is not available.

> If I changed the .cfg line in [MYCOLLECT] to misc=pyexpat.pyd then the
> \DLLs version of pyexpat.pyd is found and put into the dist directory. Now
> when the exe is run I get a Windows error stating that the xmlparse.dll
> couldn't be located.
> I add xmlparse.dll to the misc= line and then I get an error stating that
> the xmltok.dll couldn't be found.
> I add xmltok.dll to the misc= line and voila! it works!

When you use the pyexpat from PyXML, the difference should be that
xmlparse.dll and xmltok.dll are not required.

> I then replaced the core version of pyexpat.pyd in \DLLs with the
> PyXML version and found that I could build a good exe without having
> to manually include the xmlparse.dll and xmltok.dll.

Not only do you not need to include them manually - they are not
needed at all. Care to write a small howto document for the XML topic

> I start with the _xmlplus directory renamed to xml, because I know that's
> necessary, and I build a new standalone installation. This time the
> file is imported to the dist directory as xml.parsers.pyexpat.pyd but the
> exe won't import it:
>     ImportError: cannot import name xml.parsers.pyexpat

Do you have a traceback for that? All applications should import
xml.parsers.expat, which should have

from pyexpat import *

so there should be no request to load xml.parsers.pyexpat. Older PyXML
versions had such code, but it should have been wrapped in a handler
catching an ImportError, which should then fall back to loading pyexpat.
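
A minimal sketch of that fallback pattern, written for illustration rather than copied from any PyXML release:

```python
try:
    # Package-qualified location that older code asked for.
    from xml.parsers import pyexpat
except ImportError:
    # Fall back to the top-level extension module.
    import pyexpat

# Either way, the name pyexpat now refers to a usable parser module.
parser = pyexpat.ParserCreate()
```

On an installation where xml.parsers.pyexpat does not exist, the first import raises ImportError and the top-level module is used instead.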

> The only fix I can figure out is to change the import statement to:
>     import pyexpat
> and that works.

As I said, the real solution is to write

  from xml.parsers import expat

or, if you need to keep the pyexpat name,

 from xml.parsers import expat as pyexpat
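
As a small usage sketch of the first form (the element_names helper and the sample document are invented here, and the code is written against a current Python rather than 2.0):

```python
from xml.parsers import expat

def element_names(document):
    """Return element names in the order their start tags appear."""
    names = []
    parser = expat.ParserCreate()
    # expat calls this handler once for every start tag it encounters.
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.Parse(document, True)  # True marks this as the final chunk
    return names

print(element_names("<order><item/><item/></order>"))
# ['order', 'item', 'item']
```

Because the script only ever names xml.parsers.expat, a packager that follows import statements has a real import to trace.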

> 1.  Replace the core xml directory with the _xmlplus directory, by
> renaming _xmlplus to xml.

I'm not entirely sure *why* this is needed, but it certainly can't hurt.

> 2.  Copy the PyXML pyexpat.pyd file from the xml.parsers directory to the
> <python_root>\DLLs directory.

That is a good idea, yes.

> 3.  If pyexpat is needed, either explicitly import it in your script, or
> manually include it in the standalone installation by adding an entry to
> the misc line in the COLLECT section of the builder .cfg file.

pyexpat should always be included in PyXML applications, so that is
also fine.

> 4.  If importing from xml.sax, manually import the entire PyXML tree
> files only) by specifying either packages=xml or directories=xml in the
> section of the builder .cfg file.

I would guess the same applies when importing DOM stuff - the DOM
readers also use make_parser at some point.