[XML-SIG] XPath in Python 2
Uche Ogbuji
uogbuji@fourthought.com
Mon, 10 Jul 2000 12:14:00 -0600
> Python is delayed and we don't know how long it will be so.
Bloody hell! That's what happens when gurus get married.
Note: I hate smileys, but I guess I'd better throw one in, just in case: %^)
> Why XPath? XPath is the W3C-provided mechanism for navigating XML
> documents in a declarative way. That means that rather than specifying
> an exact path to a node, you describe the relationship between the node
> you are on and the node you want to get to. This makes the creation of
> complex applications much easier and allows for more efficiency "under
> the hood" of the XPath implementation.
I agree that XPath is a big nice-to-have. Microsoft's GetByQuery (or
something like that) is a very popular addition to their DOM, and IBM et al.
are being forced to imitate it, even in advance of DOM Level 3, which might
address query.
> 4XPath is cleaner from a user's point of view, but it requires quite a bit
> of C/lex code for parsing the XPaths. I don't know if we would have to
> go back to the BDFL to get permission for that code to go into Python.
Would it be good enough for us just to check in ANSI C code from FLEX/Bison?
> We also have the option of creating a new XPath implementation. The
> primary virtue of doing so would be the opportunity to implement a tiny
> subset of XPath in a much smaller amount of code. The two existing
> implementations probably have more code than the rest of the Python 1.6
> XML package. And in 4XPath's case, a lot of that is C code.
>
> ---
>
> My feeling is that implementing 10% of XPath in 10% of the code would
> get us 80% of the benefit. Those that need the rest can download 4XPath.
OK.
> I also think that 4XPath should be part of the pyxml distribution.
Already in motion.
> The 10% that is most interesting:
>
> * a/b/c
> * a//b
> * ../
>
> Actually, that's probably not even 10% and it can be "parsed" mostly
> with a "string.split" on "/". Things like positional predicates can be
> implemented with Python sequence syntax. Attribute access can use DOM
> syntax. All in all, this looks like an afternoon's work, if we agree
> that it should go into Python.
I don't know. I agree that most people don't need XPath's zoo of axes, but I
think predicates would be sorely missed very quickly.
I should note that other benefits of the mini-xpath -> 4XPath migration would
be indexing and extension functions.
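
To make the proposed subset concrete, here is a rough sketch of what the
"string.split on /" approach could look like, with positional predicates
added. It assumes the standard xml.dom.minidom API rather than any particular
DOM discussed in this thread, and the names mini_xpath, child_elements,
descendant_elements and unique are hypothetical, made up for illustration.
It handles only relative paths built from element names, "*", ".", "..",
"//" and [n], and it treats "a//b[n]" more loosely than real XPath does.

```python
from xml.dom import minidom, Node


def child_elements(node, name):
    """Element children of node whose tag matches name ("*" matches any)."""
    return [c for c in node.childNodes
            if c.nodeType == Node.ELEMENT_NODE and name in ("*", c.tagName)]


def descendant_elements(node, name):
    """Element descendants of node matching name, in document order."""
    found = []
    for c in node.childNodes:
        if c.nodeType == Node.ELEMENT_NODE:
            if name in ("*", c.tagName):
                found.append(c)
            found.extend(descendant_elements(c, name))
    return found


def unique(nodes):
    """Drop repeated nodes (by identity) while keeping their order."""
    seen, out = set(), []
    for n in nodes:
        if id(n) not in seen:
            seen.add(id(n))
            out.append(n)
    return out


def mini_xpath(context, path):
    """Evaluate a tiny, hypothetical XPath subset relative to context."""
    if path.startswith("/"):
        raise ValueError("only relative paths are handled in this sketch")
    nodes = [context]
    descend = False
    for step in path.rstrip("/").split("/"):
        if step == "":
            # An empty step comes from "//": the next step searches descendants.
            descend = True
            continue
        if step == ".":
            pass
        elif step == "..":
            nodes = [n.parentNode for n in nodes if n.parentNode is not None]
        else:
            # Optional positional predicate, e.g. "c[2]" (1-based, as in XPath).
            name, _, rest = step.partition("[")
            index = int(rest.rstrip("]")) if rest else None
            find = descendant_elements if descend else child_elements
            found = []
            for n in nodes:
                matches = find(n, name)
                if index is not None:
                    # Simplification: after "//" this picks the n-th
                    # descendant, which is not what full XPath specifies.
                    matches = matches[index - 1:index]
                found.extend(matches)
            nodes = found
        nodes = unique(nodes)
        descend = False
    return nodes


doc = minidom.parseString(
    "<a><b><c>1</c><c>2</c></b><d><b><c>3</c></b></d></a>")
direct = mini_xpath(doc, "a/b/c")      # the <c> children of a's direct <b>
anywhere = mini_xpath(doc, "a//c")     # every <c> below <a>
second = mini_xpath(doc, "a/b/c[2]")   # positional predicate
parent = mini_xpath(direct[0], "../")  # back up to the enclosing <b>
```

Even this much runs to a fair number of lines once "//" and predicates are
in, and anything beyond positional predicates (attribute tests, functions)
would need a real expression parser rather than a split.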
--
Uche Ogbuji Principal Consultant
uche.ogbuji@fourthought.com +01 303 583 9900 x 101
Fourthought, Inc. http://Fourthought.com
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python