[lxml-dev] Doctest for Namespace class
Hi!
The Namespace class now has a doctest in scoder2/doc/namespace_extensions.txt. I also gave a quick example of how to use XPath extension functions, but I still feel somewhat that the prefixes should not have to be specified on evaluation. My current example looks like this:
.>>> from lxml.etree import Namespace, XML
.>>> def tag_of(context, elem):
...     return elem[0].tag
.>>> namespace = Namespace('myfunctions')
.>>> namespace['tagname'] = tag_of
.>>> element = XML('<test><honk honking="true"/></test>')
.>>> element.xpath('f:tagname(//honk)', {'f' : 'myfunctions'})
'honk'
What bothers me is the {'f' : 'myfunctions'} in there. The user shouldn't have to care about that. What about supporting this:
.>>> element.xpath('{myfunctions}tagname(//honk)')
My XPath class already supports that for elements, but it is not currently done for extension functions (and not in the two original evaluators either, for backward compatibility).
Personally, I'd prefer removing the current extension support completely, since it makes the code rather complicated. Providing two different APIs to the user and then merging the functions from both of them internally is not exactly elegant.
Stefan
Hi,
On Wed, 2005-12-21 at 08:59 +0100, Stefan Behnel wrote:
[...]
.>>> element = XML('<test><honk honking="true"/></test>')
.>>> element.xpath('f:tagname(//honk)', {'f' : 'myfunctions'})
'honk'
What bothers me is the {'f' : 'myfunctions'} in there. The user shouldn't have to care about that. What about supporting this:
.>>> element.xpath('{myfunctions}tagname(//honk)')
My XPath class already supports that for elements, but it is not currently done for extension functions (and not in the two original evaluators either, for backward compatibility).
[...]
XPath's syntax does not support Clark's notation. This was discussed here a few months ago; a QName cannot be expressed this way in XPath.
Additionally, Clark's notation becomes difficult to read when multiple QNames are involved:
'{myfunctions}do-this({myfunctions}do-that({myotherfunctions}do-it(//honk)))'
Regards,
Kasimier
Kasimier Buchcik wrote:
On Wed, 2005-12-21 at 08:59 +0100, Stefan Behnel wrote:
.>>> element = XML('<test><honk honking="true"/></test>')
.>>> element.xpath('f:tagname(//honk)', {'f' : 'myfunctions'})
'honk'
What bothers me is the {'f' : 'myfunctions'} in there. The user shouldn't have to care about that. What about supporting this:
.>>> element.xpath('{myfunctions}tagname(//honk)')
My XPath class already supports that for elements, but it is not currently done for extension functions (and not in the two original evaluators either, for backward compatibility).
[...]
XPath's syntax does not support Clark's notation. This was discussed here a few months ago; a QName cannot be expressed this way in XPath.
Additionally, Clark's notation becomes difficult to read when multiple QNames are involved: '{myfunctions}do-this({myfunctions}do-that({myotherfunctions}do-it(//honk)))'
I guess you're referring to this:
http://codespeak.net/pipermail/lxml-dev/2005-August/000379.html
I do see some arguments in that thread that basically call that "different from XPath", and I guess that's a legitimate counter argument.
Maybe a better way of handling that would then be an extension to the Namespace class, like a property that keeps an associated prefix. That way, we could internally scan(*) the XPath expression for prefixes and then register the corresponding namespaces before evaluation.
.>>> ns = Namespace('myns')
.>>> ns.prefix = 'f'
.>>> ns['tagname'] = my_tagname_function
.>>> element.xpath('f:tagname(//honk)')
Although, I'm not sure if it's a good idea to use globally defined prefixes. It may be OK for functions, but less OK for element classes. Maybe we should call the property 'function_prefix' to restrict its usage to XPath extension functions?
Any comments on that?
Stefan
(*): Note that speed is not an argument against scanning for prefixes. If you care about speed, use the XPath class.
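The prefix-scanning idea can be sketched in plain Python. This is only a minimal sketch under stated assumptions, not lxml code: `register_function_prefix`, `namespaces_for` and the module-level registry are hypothetical names, and the regular expression simply looks for `prefix:name(` patterns, so it would also pick up matching text inside XPath string literals.

```python
import re

# Hypothetical global registry (prefix -> namespace URI). In the proposal it
# would be filled when a namespace object gets its prefix assigned.
_PREFIX_REGISTRY = {}


def register_function_prefix(prefix, uri):
    """Remember which namespace URI a function prefix stands for."""
    _PREFIX_REGISTRY[prefix] = uri


# Matches a prefixed function call such as f:tagname( and captures the prefix.
_PREFIXED_CALL = re.compile(r"([A-Za-z_][\w.-]*):[A-Za-z_][\w.-]*\s*\(")


def namespaces_for(expression):
    """Return {prefix: uri} for every registered prefix used in a call."""
    found = {}
    for match in _PREFIXED_CALL.finditer(expression):
        prefix = match.group(1)
        if prefix in _PREFIX_REGISTRY:
            found[prefix] = _PREFIX_REGISTRY[prefix]
    return found


register_function_prefix('f', 'myfunctions')
print(namespaces_for('f:tagname(//honk)'))  # {'f': 'myfunctions'}
```

The resulting dictionary is what the caller currently has to pass by hand as the second argument to xpath(); unregistered prefixes are left alone so that the evaluator can still report them as errors.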
Stefan Behnel wrote:
Kasimier Buchcik wrote:
On Wed, 2005-12-21 at 08:59 +0100, Stefan Behnel wrote:
.>>> element = XML('<test><honk honking="true"/></test>')
.>>> element.xpath('f:tagname(//honk)', {'f' : 'myfunctions'})
'honk'
What bothers me is the {'f' : 'myfunctions'} in there. The user shouldn't have to care about that. What about supporting this:
.>>> element.xpath('{myfunctions}tagname(//honk)')
My XPath class already supports that for elements, but it is not currently done for extension functions (and not in the two original evaluators either, for backward compatibility).
[...]
XPath's syntax does not support Clark's notation. This was discussed here a few months ago; a QName cannot be expressed this way in XPath.
Additionally, Clark's notation becomes difficult to read when multiple QNames are involved: '{myfunctions}do-this({myfunctions}do-that({myotherfunctions}do-it(//honk)))'
I guess you're referring to this:
http://codespeak.net/pipermail/lxml-dev/2005-August/000379.html
I do see some arguments in that thread that basically call that "different from XPath", and I guess that's a legitimate counter argument.
Yes, I'd like to keep support for XPath as per spec. We could consider an alternative API that supports this pattern too, though, as long as it's clear that this is not (exactly) XPath. Perhaps we can introduce a little helper function called 'clarke()' or something, that transforms Clark notation into an XPath expression + namespace dictionary.
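A helper of the kind suggested here could look roughly like the following. It is a minimal sketch, not an existing lxml function: the generated prefixes `ns0`, `ns1`, ... are an assumption of this example, the sketch does not check whether they collide with prefixes already used in the expression, and it does not skip braces that appear inside XPath string literals.

```python
import re

# Matches a Clark-notation name such as {myfunctions}tagname.
_CLARK_NAME = re.compile(r"\{([^}]*)\}([A-Za-z_][\w.-]*)")


def clarke(expression):
    """Translate Clark notation into (xpath_expression, {prefix: uri})."""
    prefix_by_uri = {}  # namespace URI -> generated prefix

    def replace(match):
        uri, local_name = match.groups()
        # Reuse the prefix if this URI was seen before, else generate one.
        prefix = prefix_by_uri.setdefault(uri, "ns%d" % len(prefix_by_uri))
        return "%s:%s" % (prefix, local_name)

    xpath = _CLARK_NAME.sub(replace, expression)
    namespaces = {prefix: uri for uri, prefix in prefix_by_uri.items()}
    return xpath, namespaces


print(clarke('{myfunctions}tagname(//honk)'))
# ('ns0:tagname(//honk)', {'ns0': 'myfunctions'})
```

The two return values map directly onto the two arguments that xpath() takes today, so the evaluator itself keeps accepting only spec-conforming XPath.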
Maybe a better way of handling that would then be an extension to the Namespace class, like a property that keeps an associated prefix. That way, we could internally scan(*) the XPath expression for prefixes and then register the corresponding namespaces before evaluation.
.>>> ns = Namespace('myns')
.>>> ns.prefix = 'f'
.>>> ns['tagname'] = my_tagname_function
.>>> element.xpath('f:tagname(//honk)')
Although, I'm not sure if it's a good idea to use globally defined prefixes. It may be OK for functions, but less OK for element classes. Maybe we should call the property 'function_prefix' to restrict its usage to XPath extension functions?
Any comments on that?
I don't really like globally defined prefixes, but when namespace classes are used it isn't that bad. I guess functions would typically be in their own namespace, so it might be that we want to define a special namespace class just for this purpose?
Anyway, I will read your namespace class doctest and think it over a bit more around Christmas. I really like the things that become possible with them though, so I'm happy you introduced them and are continuing to innovate with them!
Regards,
Martijn
Martijn Faassen wrote:
Stefan Behnel wrote:
Kasimier Buchcik wrote:
XPath's syntax does not support Clark's notation. This was discussed here a few months ago; a QName cannot be expressed this way in XPath.
Additionally, Clark's notation becomes difficult to read when multiple QNames are involved: '{myfunctions}do-this({myfunctions}do-that({myotherfunctions}do-it(//honk)))'
http://codespeak.net/pipermail/lxml-dev/2005-August/000379.html
I do see some arguments in that thread that basically call that "different from XPath", and I guess that's a legitimate counter argument.
Yes, I'd like to keep support for XPath as per spec. We could consider an alternative API that supports this pattern too, though, as long as it's clear that this is not (exactly) XPath. Perhaps we can introduce a little helper function called 'clarke()' or something, that transforms Clark notation into an XPath expression + namespace dictionary.
I don't really like globally defined prefixes, but when namespace classes are used it isn't that bad. I guess functions would be in their own namespace typically, so it might be we want to define a special namespace class just for this purpose?
Or a subclass of XPath, like ClarkXPath, that accepts Clark's notation. A helper function would then do the trick internally.
What you're proposing would be something like 'FunctionNamespace'. Nothing wrong with that. Actually, it's even (more or less) handled that way internally. Don't know if it's worth a separate API, though. Also, note that sometimes functions do not have a separate namespace, look at XSLT, for example.
Stefan
Responding to myself to continue this discussion from last year... Stefan Behnel wrote:
Martijn Faassen wrote:
Yes, I'd like to keep support for XPath as per spec. We could consider an alternative API that supports this pattern too, though, as long as it's clear that this is not (exactly) XPath. Perhaps we can introduce a little helper function called 'clarke()' or something, that transforms Clark notation into an XPath expression + namespace dictionary.
I don't really like globally defined prefixes, but when namespace classes are used it isn't that bad. I guess functions would be in their own namespace typically, so it might be we want to define a special namespace class just for this purpose?
Or a subclass of XPath, like ClarkXPath, that accepts Clark's notation. A helper function would then do the trick internally.
I think that's a good solution. If people want to use Clark's notation in XPath expressions, they can use a subclass of the XPath class. This also reduces the performance penalty of having to parse the XPath expression for namespaces: it's only done at XPath object creation time.
What you're proposing would be something like 'FunctionNamespace'. Nothing wrong with that. Actually, it's even (more or less) handled that way internally. Don't know if it's worth a separate API, though. Also, note that sometimes functions do not have a separate namespace, look at XSLT, for example.
I started writing a module function called 'ExtensionNamespace' that can be used for extensions. It has a prefix associated with it, so that you can do what I proposed before:
.>>> ns = ExtensionNamespace('myns')
.>>> ns.prefix = 'f'
.>>> ns['func'] = my_function
.>>> element.xpath('f:func(//honk)')
or call an XSLT that uses this function like in
... <xsl:value-of select="f:func(.)"/> ...
It becomes somewhat problematic, however, when different namespaces become associated with the same prefix, like this:
.>>> ns1 = ExtensionNamespace('myns1')
.>>> ns1.prefix = 'f'
.>>> ns2 = ExtensionNamespace('myns2')
.>>> ns2.prefix = 'f'
I don't like putting this in the hands of the user. We could use a dictionary internally, but that would still be a global one, so if some module happens to register a prefix that is already used by a different module - bad luck. And I guess this is much more likely to happen with prefixes than with namespaces.
So, I don't know. Maybe the best solution is still to simply register all namespace functions internally on each XPath/XSLT evaluation. It is rather unlikely that use cases will deploy more than, say, 20 extension functions and those could be registered pretty quickly, without a big performance hit.
Another point of discussion: Should the current support for extension functions be dropped? This would obviously break current code, but it would make both the internal implementation and the API cleaner by removing the 'extensions' argument from the respective method calls and by removing the need for merging dictionaries of functions. I'm especially looking at lxml V1.0, which should have a clean and stable API.
Any comments?
Stefan
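The prefix clash described above can be illustrated with a small stand-alone registry. This is a sketch under assumptions, not the ExtensionNamespace implementation: `PrefixRegistry` and `PrefixConflictError` are hypothetical names, and raising on a conflicting registration is just one possible policy (silently overwriting the earlier entry is the "bad luck" case).

```python
class PrefixConflictError(ValueError):
    """Raised when a prefix is already bound to a different namespace."""


class PrefixRegistry:
    """Hypothetical global prefix table shared by all modules."""

    def __init__(self):
        self._uri_by_prefix = {}

    def register(self, prefix, uri):
        existing = self._uri_by_prefix.get(prefix)
        if existing is not None and existing != uri:
            # Two modules picked the same prefix for different namespaces.
            raise PrefixConflictError(
                "prefix %r is already bound to %r" % (prefix, existing))
        self._uri_by_prefix[prefix] = uri

    def lookup(self, prefix):
        return self._uri_by_prefix[prefix]


registry = PrefixRegistry()
registry.register('f', 'myns1')
try:
    registry.register('f', 'myns2')  # second module reuses the prefix
except PrefixConflictError as error:
    print(error)
```

Detecting the clash does not remove it: one of the two modules still has to change its prefix, which is the reason for preferring per-evaluation registration by namespace.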
Stefan Behnel wrote:
Another point of discussion: Should the current support for extension functions be dropped? This would obviously break current code, but it would make both the internal implementation and the API cleaner by removing the 'extensions' argument from the respective method calls and by removing the need for merging dictionaries of functions. I'm especially looking at lxml V1.0, which should have a clean and stable API.
I would be surprised if many were depending on the current extension function support being stable. I'd say, feel free to do whatever you think best on the way to 1.0.
Today I wanted to try out some of the ideas from your project at Berlios. However, building from scoder2 (which previously worked ok) now gives:
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/distutils/command/build_ext.py", line 442, in build_extension
    sources = self.swig_sources(sources, ext)
TypeError: swig_sources() takes exactly 2 arguments (3 given)
make: *** [inplace] Error 1
--Paul
Paul Everitt wrote:
Today I wanted to try out some of the ideas from your project at Berlios. However, building from scoder2 (which previously worked ok) now gives:
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/distutils/command/build_ext.py", line 442, in build_extension
    sources = self.swig_sources(sources, ext)
TypeError: swig_sources() takes exactly 2 arguments (3 given)
make: *** [inplace] Error 1
Hmm, I remember that I had the same problem a while ago, but I can't remember what I did to solve it. I think it wasn't related to lxml, but was an inheritance problem somewhere between Pyrex and distutils. Something imported something from somewhere :) that was accidentally used as a base class instead of the intended one. That's why swig_sources has the wrong number of arguments: it is defined in two different places.
Have you tried "make clean" and all? Note also that lxml now uses setuptools if available. I tried building with and without setuptools and couldn't reproduce it now.
Stefan
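The kind of mismatch described above can be reproduced with two stand-in classes. This is only an illustration under assumptions: `BuildExt` and `OldSignatureBuildExt` are made-up names, not the actual distutils or Pyrex classes, and the sketch does not claim to show which of the two was at fault in the failing build.

```python
class BuildExt:
    """Stand-in for the build command that drives the build."""

    def build_extension(self, extension, sources):
        # The driver passes two arguments, as in the traceback.
        return self.swig_sources(sources, extension)

    def swig_sources(self, sources, extension):
        return list(sources)


class OldSignatureBuildExt(BuildExt):
    """Stand-in for a subclass written against a one-argument signature."""

    def swig_sources(self, sources):
        return list(sources)


BuildExt().build_extension('ext', ['a.pyx'])  # works
try:
    OldSignatureBuildExt().build_extension('ext', ['a.pyx'])
except TypeError as error:
    print(error)  # the override receives one argument too many
```

Which class ends up in the inheritance chain depends on what gets imported first, which would fit the observation that the failure comes and goes between environments.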
participants (4)
- Kasimier Buchcik
- Martijn Faassen
- Paul Everitt
- Stefan Behnel