[XML-SIG] py2exe and switching from PyXML to 4Suite
Uche Ogbuji
uche.ogbuji@fourthought.com
Mon, 16 Dec 2002 06:55:59 -0700
> On Sun, Dec 15, 2002 at 02:25:04PM -0700, Uche Ogbuji wrote:
> > > > Because you're such a nice guy, I'll ditch the tease for you :-)
> > > You can also do these much more easily using XPath...
> > > from Ft.Xml import XPath
> > > def getElementsByTagName(node, name):
> > >     return XPath.evaluate(node, ".//" + name)
> > I think my generator approach would be faster, but this XPath approach has the
> > advantage of working in Python 2.0 and 2.1. Mine is 2.2+.
>
> Okay, I have now read up on iterators and generators - WOW! That's so
> cool! I immediately rewrote the search part of my program that had
> been giving me problems - with generators it just works perfectly and
> the code is so much simpler!
>
> I have implemented Uche's generators now and the function that loads an
> XML file into my program's internal format has now gone from 6-7 seconds
> load time to 1-2 seconds for the 219Kb XML file I am using for testing on
> my Athlon XP 1800+.
Generators do absolutely rock. I'm glad you got a good opportunity to learn
them as a side effect of trying to figure out Python/XML.
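
For anyone who has not seen the earlier message in this thread, here is a minimal sketch of the generator idea. It is not the exact code posted there: the name iter_elements_by_tag_name is made up for illustration, and the standard library's xml.dom.minidom stands in for Domlette so that the example runs without 4Suite installed. It relies only on childNodes, nodeType and nodeName, which I would expect any DOM-like tree to provide.

```python
from xml.dom import Node
from xml.dom.minidom import parseString


def iter_elements_by_tag_name(node, name):
    """Lazily yield descendant elements of `node` named `name`, in document order.

    Hypothetical helper for illustration; not part of 4Suite or PyXML.
    """
    for child in node.childNodes:
        if child.nodeType == Node.ELEMENT_NODE:
            if child.nodeName == name:
                yield child
            # Recurse into the subtree and pass its matches along one by one.
            for descendant in iter_elements_by_tag_name(child, name):
                yield descendant


# minidom is an assumption here, used only so the sketch is self-contained.
sample = parseString("<a><b id='1'/><c><b id='2'/></c><d/></a>")
matches = list(iter_elements_by_tag_name(sample, "b"))
ids = [element.getAttribute("id") for element in matches]
```

Because nothing is collected into a list until the caller asks for it, a search that stops at the first match never walks the rest of the tree.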
> Now - my program also saves its data to disk by creating a DOM and
> filling it with the right nodes and then writing it to disk with
> PrettyPrint (which takes over 20 seconds on my machine with the 219Kb
> file). I am sure this can be done much faster with 4Suite somehow,
> but how?
Rather than
from xml.dom.ext import PrettyPrint
use
from Ft.Xml.Domlette import PrettyPrint
Domlette's PrettyPrint is written in C (with Python fall-back), and is much
faster.
> Right now I use xml.dom.implementation to create the empty DOM; I am
> betting it is possible to do the same using domlette, I just can't grok
> how without a little helping hand :)
Ouch. So you're using 4DOM, not Domlette. That is probably a bigger reason
for the slow performance than PrettyPrint.
Instead of
xml.dom.implementation
use
from Ft.Xml.Domlette import implementation
Again I must warn you that Domlette approximates DOM and is not a full DOM.
You already found the lack of getElementsByTagName. There are other subtle
differences. That having been said, I have not run into a situation where
Domlette is not a suitable DOM substitute.
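
One way to keep the switch to a single line is to have the tree-building code take the implementation object as a parameter. The sketch below is written under that assumption: build_document and its arguments are hypothetical names, and it uses only the DOM Level 2 factory calls createDocument, createElementNS, setAttributeNS and createTextNode. It is exercised here with the standard library's minidom implementation so that it runs without 4Suite or PyXML installed.

```python
from xml.dom.minidom import getDOMImplementation


def build_document(implementation, root_name, records):
    """Build a small document from (id, text) pairs.

    Hypothetical helper for illustration. It only touches the DOM
    factory methods, so the caller decides which implementation to use.
    """
    # No namespace, no root element yet, no doctype: an empty document.
    doc = implementation.createDocument(None, None, None)
    root = doc.createElementNS(None, root_name)
    doc.appendChild(root)
    for key, text in records:
        item = doc.createElementNS(None, "item")
        item.setAttributeNS(None, "id", key)
        item.appendChild(doc.createTextNode(text))
        root.appendChild(item)
    return doc


# minidom stands in for the real implementation in this sketch.
doc = build_document(getDOMImplementation(), "items",
                     [("1", "alpha"), ("2", "beta")])
items = doc.getElementsByTagName("item")
```

With 4Suite installed, passing the implementation imported from Ft.Xml.Domlette should be the only change needed at the call site, though I have not checked every call above against Domlette here.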
> Also it would be great if there was a way to ditch PrettyPrint as I
> would then be completely rid of PyXML, which seems to be giving py2exe
> some problems.
Well, the above would remove the dependency on PyXML. However, I think we would
all like at least a report of the problems py2exe is having with PyXML, since
it would be nice if that combo worked.
> Once again, thanks for the help so far, you guys are great! :)
I'm glad we've been of help.
We would in turn be grateful at some point if you were to write up some of
your experiences and techniques in order to help others through the same
issues.
--
Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://4Suite.org http://fourthought.com
A Python & XML Companion - http://www.xml.com/pub/a/2002/12/11/py-xml.html
XML class warfare - http://www.adtmag.com/article.asp?id=6965
MusicBrainz metadata - http://www-106.ibm.com/developerworks/xml/library/x-think14.html