[XML-SIG] Reconsidering the DOM API

Fredrik Lundh fredrik@pythonware.com
Thu, 29 Jun 2000 15:43:29 +0200


paul wrote:
> I'm not yet ready to accept that no Python-coded parser could parse
> XPath efficiently. I want to propose that it is impossible to Fredrik
> and see what happens in the next beta of SRE. :)
> 
> http://www.w3.org/TR/xpath

challenge accepted.

david:
> I made a post some months back pointing out a "shallow parsing" regular
> expression that can break an XML document into a list of its markup and
> text items. I thought it was a pretty interesting example of just what you
> can do with a regex! I put some details at 
> 
> http://starship.python.net/crew/dni/REX/index.html

the thing I call SREX is something similar (the pattern isn't as
complete as REX, and it squeezes some extra performance out
of SRE by using something called "template mode").
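
for readers who haven't seen the technique, here is a minimal sketch of
the shallow-parsing idea. it is neither the REX pattern nor SREX; the
pattern and the names SHALLOW and shallow_parse are made up for this
illustration, and it uses only the standard re module. it assumes
well-formed input with no ">" inside attribute values.

```python
import re

# Simplified shallow-parsing pattern (an illustration, NOT the REX pattern).
# Alternatives are tried left to right, so the specific markup forms
# must come before the generic tag alternative.
SHALLOW = re.compile(
    r"<!--.*?-->"             # comment
    r"|<!\[CDATA\[.*?\]\]>"   # CDATA section
    r"|<\?.*?\?>"             # processing instruction
    r"|<[^>]*>"               # start tag, end tag, or declaration (naive)
    r"|[^<]+",                # character data between markup items
    re.DOTALL,
)


def shallow_parse(xml):
    """Split an XML string into a flat list of markup and text items."""
    return SHALLOW.findall(xml)


if __name__ == "__main__":
    for item in shallow_parse('<a x="1">hi<!-- c --><b/></a>'):
        print(repr(item))
```

for well-formed input the items join back into the original string, so
nothing is lost. a stray "<" with no closing ">" is silently skipped by
findall, which is one of the cases a complete pattern has to handle.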

some notes on xmllib/sgmlop/sre performance can be found here:
http://hem.passagen.se/eff/2000_06_01_bot-archive.htm#397730
and:
http://hem.passagen.se/eff/2000_06_01_bot-archive.htm#399596

> Incidentally (for Fredrik), sre included with python 1.6a fails to compile
> the regex mentioned above (although re is able to compile it) -- I was
> hoping to see just how much faster it is! There have been a couple of sre
> patches checked into cvs since I last compiled the source.

I've spent the last three days working on SRE -- the current
snapshot is *much* better.

> The regex is pretty large and might make a good test case.

I'll take a look at it; if it doesn't work, I'll consider that a
critical bug.

cheers /F