I need to make some thread-specific data available to my XSLT function callbacks. I think I've sort of reverse-engineered a way to do it: create my own parser class, stuff the data I need into an instance of that class, use the parser instance to parse the XSL/T filter, and then, in the extension function, follow the context argument's context_node to its root tree, which in turn gets me back to my own parser object (and my thread-specific data). Is there a more elegant way to do it that I just haven't yet stumbled across in reading through the API docs?

Thanks!

--
Bob Kline
http://www.rksystems.com
mailto:bkline@rksystems.com
On Tue, Mar 21, 2017 at 7:02 PM, Bob Kline <bkline@rksystems.com> wrote:
... Is there a more elegant way to do it ...?
Hmm. It appears that I'm in need of more than additional elegance here. When I actually tried to apply the approach I just described, I discovered that I'm not able to get to the context_node member of the context argument at all (not even to print its type), even though I can see that it's there (by looking at the output of print(dir(context))).

    def ext_function(context, arg=None):
        print(dir(context))
        print(type(context.context_node))
        ....

The first print statement shows

    [..., 'context_node', 'eval_context']

but the second triggers an exception:

    ....
        print(type(context.context_node))
      File "src\lxml\extensions.pxi", line 315, in lxml.etree._BaseContext.context_node.__get__ (src\lxml\lxml.etree.c:159971)
      File "src\lxml\lxml.etree.pyx", line 1617, in lxml.etree._elementFactory (src\lxml\lxml.etree.c:59684)
      File "src\lxml\classlookup.pxi", line 407, in lxml.etree._parser_class_lookup (src\lxml\lxml.etree.c:92440)
      File "src\lxml\classlookup.pxi", line 259, in lxml.etree._callLookupFallback (src\lxml\lxml.etree.c:90672)
      File "src\lxml\classlookup.pxi", line 338, in lxml.etree._lookupDefaultElementClass (src\lxml\lxml.etree.c:91744)
    AssertionError: Unknown node type: 9

I can't recall ever failing to print out the type of something which I can see is there. Any idea what's going on?

Thanks,
Bob
On Tue, Mar 21, 2017 at 7:41 PM, Bob Kline <bkline@rksystems.com> wrote:
On Tue, Mar 21, 2017 at 7:02 PM, Bob Kline <bkline@rksystems.com> wrote:
Any idea what's going on?
This is lxml 3.7.3 on Python 3.6.0 (v3.6.0:41df79263a11, Dec 23 2016, 08:06:12) [MSC v.1900 64 bit (AMD64)] on win32 Thanks, Bob
The subject of my post was misleading. I'm aware that I could use a threading.local object to isolate thread-specific data. What I'm really hoping for is stack-based data which carries information about the individual XSL/T filtering call. The way Sablotron (R.I.P.) allowed an opaque pointer to a structure of user data to be passed into SablotRegHandler(). I could implement my own stack (so that if a callback ended up invoking another filtering operation it wouldn't obliterate the data specific to the original XSL/T job), but it would be cleaner if lxml provides a way to register user data for the filtering call. Thanks, Bob
On Tue, Mar 21, 2017 at 7:49 PM, Bob Kline <bkline@rksystems.com> wrote:
On Tue, Mar 21, 2017 at 7:41 PM, Bob Kline <bkline@rksystems.com> wrote:
On Tue, Mar 21, 2017 at 7:02 PM, Bob Kline <bkline@rksystems.com> wrote:
This is lxml 3.7.3 ...
More version info:

    Python           : sys.version_info(major=3, minor=6, micro=0, releaselevel='final', serial=0)
    lxml.etree       : (3, 7, 3, 0)
    libxml used      : (2, 9, 4)
    libxml compiled  : (2, 9, 4)
    libxslt used     : (1, 1, 29)
    libxslt compiled : (1, 1, 29)

Thanks,
Bob
I combed through the mailing list archives and confirmed that I never got a response to this inquiry. To recap, I'm looking for a way to pass in a reference to user data when invoking the XSLT object to transform a tree, and to have that user data object accessible to the resolver callbacks. Is there support somewhere for this that I just haven't found? If not, how hard would it be to implement this? I'm willing to contribute toward such an effort.

Thanks!
Bob

On Wed, Mar 22, 2017 at 8:32 AM, Bob Kline <bkline@rksystems.com> wrote:
The subject of my post was misleading. I'm aware that I could use a threading.local object to isolate thread-specific data. What I'm really hoping for is stack-based data which carries information about the individual XSL/T filtering call. The way Sablotron (R.I.P.) allowed an opaque pointer to a structure of user data to be passed into SablotRegHandler(). I could implement my own stack (so that if a callback ended up invoking another filtering operation it wouldn't obliterate the data specific to the original XSL/T job), but it would be cleaner if lxml provides a way to register user data for the filtering call.
Thanks, Bob
-- Bob Kline http://www.rksystems.com mailto:bkline@rksystems.com
Bob Kline wrote on 22.03.2017 at 00:41:
When I actually tried to apply the approach I just described, I discovered that I'm not able to get to the context_node member of the context argument at all (not even to print its type), even though I can see that it's there (by looking at the output of print(dir(context))).
    def ext_function(context, arg=None):
        print(dir(context))
        print(type(context.context_node))
        ....
The first print statement shows [..., 'context_node', 'eval_context'] but the second triggers an exception:
    ....
        print(type(context.context_node))
      File "src\lxml\extensions.pxi", line 315, in lxml.etree._BaseContext.context_node.__get__ (src\lxml\lxml.etree.c:159971)
      File "src\lxml\lxml.etree.pyx", line 1617, in lxml.etree._elementFactory (src\lxml\lxml.etree.c:59684)
      File "src\lxml\classlookup.pxi", line 407, in lxml.etree._parser_class_lookup (src\lxml\lxml.etree.c:92440)
      File "src\lxml\classlookup.pxi", line 259, in lxml.etree._callLookupFallback (src\lxml\lxml.etree.c:90672)
      File "src\lxml\classlookup.pxi", line 338, in lxml.etree._lookupDefaultElementClass (src\lxml\lxml.etree.c:91744)
    AssertionError: Unknown node type: 9
I can't recall ever failing to print out the type of something which I can see is there. Any idea what's going on?
This fails because the context_node is actually the document node, which does not have a Python representation in lxml. I guess it shouldn't fail to access that, though... Stefan
Hi!

Bob Kline wrote on 12.10.2017 at 15:34:
On Wed, Mar 22, 2017 at 8:32 AM, Bob Kline wrote:
The subject of my post was misleading. I'm aware that I could use a threading.local object to isolate thread-specific data. What I'm really hoping for is stack-based data which carries information about the individual XSL/T filtering call. The way Sablotron (R.I.P.) allowed an opaque pointer to a structure of user data to be passed into SablotRegHandler(). I could implement my own stack (so that if a callback ended up invoking another filtering operation it wouldn't obliterate the data specific to the original XSL/T job), but it would be cleaner if lxml provides a way to register user data for the filtering call.
I combed through the mailing list archives and confirmed that I never got a response to this inquiry. To recap, I'm looking for a way to pass in a reference to user data when invoking the XSLT object to transform a tree, and to have that user data object accessible to the resolver callbacks. Is there support somewhere for this that I just haven't found? If not, how hard would it be to implement this? I'm willing to contribute toward such an effort.
You likely didn't get a response because it's not entirely clear from your description what your use case is. In what way do you want to influence the behaviour of the resolvers based on that XSLT state information? Could you give an example of what that information looks like? PEP 550 would probably solve this, but otherwise, a stackish Python list in thread-local storage doesn't sound all wrong as an approach. Maybe hidden behind a context manager. Stefan
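A minimal sketch of the approach suggested above: a per-thread stack of job data, hidden behind a context manager. The names `xslt_job`, `current_job`, and `_state` are hypothetical and are not part of lxml; the assumption is that the transform call would run inside the `with` block and that an extension function or resolver would call `current_job()` rather than receive the data as an argument.

```python
import threading
from contextlib import contextmanager

# One stack of job data per thread (hypothetical helper, not an lxml API).
_state = threading.local()


def _stack():
    # Create the list lazily, since each thread sees its own attributes.
    if not hasattr(_state, "stack"):
        _state.stack = []
    return _state.stack


@contextmanager
def xslt_job(data):
    """Make `data` the current job data for the duration of the block."""
    stack = _stack()
    stack.append(data)
    try:
        yield data
    finally:
        # Pop even if the transform raises, so the outer job is restored.
        stack.pop()


def current_job():
    """Return the data of the innermost active job, or None outside any job."""
    stack = _stack()
    return stack[-1] if stack else None


# Nested jobs do not clobber each other: the inner block sees its own data,
# and the outer data is visible again once the inner block exits.
with xslt_job({"doc_id": 1}):
    assert current_job() == {"doc_id": 1}
    with xslt_job({"doc_id": 2}):
        assert current_job() == {"doc_id": 2}
    assert current_job() == {"doc_id": 1}
```

Because the stack lives in a `threading.local` object, a job started in one thread is not visible to callbacks running in another.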
On Fri, Oct 13, 2017 at 1:43 PM, Stefan Behnel <stefan_ml@behnel.de> wrote:
You likely didn't get a response because it's not entirely clear from your description what your use case is. In what way do you want to influence the behaviour of the resolvers based on that XSLT state information? Could you give an example of what that information looks like?
Sure. Many of the resolver callbacks are requesting metadata about the document being filtered. By this I mean information about the document which is not found in the XML for the document which is fed to the XSLT engine, such as

* which version of the document are we filtering?
* what is the official title of the document?
* when was the document first published?
* from which server/repository/database did we fetch this copy of the document?
* what is the repository-specific unique ID for the document we're filtering?

There are other examples, but this is a fairly representative list.
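For illustration only, the kind of per-job record listed above could be sketched as an immutable tuple type. The class name, field names, field types, and sample values are all assumptions made for this sketch; none of them come from the actual application or from lxml.

```python
from datetime import date
from typing import NamedTuple, Optional


class DocumentMetadata(NamedTuple):
    """Per-job facts that are not present in the XML being filtered."""
    doc_id: int                       # repository-specific unique ID
    version: int                      # which version is being filtered
    title: str                        # official title of the document
    first_published: Optional[date]   # None if never published
    source: str                       # server/repository/database of origin


# Placeholder values, invented for the example.
meta = DocumentMetadata(
    doc_id=12345,
    version=3,
    title="Example title",
    first_published=date(2017, 3, 21),
    source="example-repository",
)
```

An immutable record suits this use because callbacks only read the data; a nested job would get its own instance rather than modify this one.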
PEP 550 would probably solve this, but otherwise, a stackish Python list in thread-local storage doesn't sound all wrong as an approach.
Yes, I agree that it's possible to use that approach, and in fact that's what I'm doing now. It's just that the technique of using the stack to pass stack-specific information seems like the cleanest approach. The ability to attach user-owned data to the parser which was made accessible to the resolver callbacks was one of the few things we liked about the Sablotron package. But as I say, it's not as if we don't have a workable solution. The primary incentive for my original post was to find out if the support was there and I just wasn't looking in the right place for it. I believe I have my answer to that question. :-) -- Bob Kline http://www.rksystems.com mailto:bkline@rksystems.com
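PEP 550 was later withdrawn in favor of PEP 567, which added the `contextvars` module in Python 3.7. A minimal sketch of the same nesting behavior using that module, assuming Python 3.7 or later; `job_data` and `run_job` are hypothetical names, and the `transform` callable stands in for the actual XSLT call.

```python
import contextvars

# Holds the data for the job currently running in this context
# (hypothetical name; nothing here is an lxml API).
job_data = contextvars.ContextVar("job_data", default=None)


def run_job(data, transform):
    """Run `transform()` with `data` visible through `job_data.get()`."""
    token = job_data.set(data)
    try:
        return transform()
    finally:
        # Restore whatever was set before, which makes nesting safe.
        job_data.reset(token)


def inner():
    # Stands in for a callback that reads the current job's data.
    return job_data.get()


def outer():
    # A callback that starts another job, then reads its own data again.
    nested = run_job("inner-job", inner)
    return nested, job_data.get()


result = run_job("outer-job", outer)
```

The token returned by `set()` plays the role of the explicit stack: resetting it restores the previous value, so a nested job cannot obliterate the data of the job that started it.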
participants (2)
- Bob Kline
- Stefan Behnel