Re: [lxml-dev] Custom resolvers

20 Apr 2006

Brad Clements wrote:
...
On 20 Apr 2006 at 20:30, Stefan Behnel wrote:
...
...
Great, so does this resolver only get called when this one parser is
used, or is it global to the process (like it is with libxml2)?
It's currently local to a parser. I'm also looking for a module-level API,
but I'm not sure yet how to make it look pretty. In any case, the
parser-level API is likely to be the preferred one.
Is the ability to register a resolver by-parser new functionality in libxml2?
No, lxml registers a global resolver and dispatches internally, possibly
falling back to the original default resolver.
...
...
Questions:
* if you parse an XSL document with one set of resolvers and then use
it to transform an XML document with another set of resolvers - which
ones should be used during the transform?
Well hmm.. when does the xsl transform process xsl:include and xsl:import?
I think those two statements should use the resolver assigned to the base xslt 
document.
Includes and imports are handled at compilation time, which happens in
XSLT.__init__(). Libxslt uses a different mechanism than libxml2 here, which
(as usual) complicates things. It allows you to specify an
"xsltDocLoaderFunction" that is expected to operate in the current XSLT context.

Replacing this function would also fix the document('') call as it could
access the in-memory stylesheet structure instead of trying to re-load it from
a possibly unknown source.

However, there doesn't seem to be a way to figure out the default document
loader function to provide the necessary fallback. So, I don't know, maybe
I'll have to see if libxslt can use the libxml2 resolver capabilities instead...
...
During the transform, calls to document() should use the resolver of the base-uri.
That's the main problem I see. I'm not sure we can figure out the document
that a resolver request comes from by means of libxml2. Libxslt provides this
information to the loader function, but as long as we don't have a fallback,
we can't just replace the loader function without re-implementing it completely.
...
So, that could be tricky, the document() call is complicated. I suppose you could 
say that document() always uses the resolver associated with the source xml file 
and just leave it at that.. that'd be easy.
Yeah, but it can't always work. Imagine a stylesheet loaded from a ZIP file
applied to an XML file loaded from the web. You'd then need both resolvers
registered on the XML document. You could possibly imagine using both (e.g.
using the XSLT resolvers as a fallback to the XML resolvers). But that may
yield other race conditions.
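The fallback idea could be sketched roughly as follows. This is only an illustration under stated assumptions: resolve_chained, web_resolver and zip_resolver are hypothetical names, and resolvers are again plain callables that return None to decline a URL.

```python
from typing import Callable, Iterable, Optional

Resolver = Callable[[str], Optional[str]]


def resolve_chained(url: str, *groups: Iterable[Resolver]) -> Optional[str]:
    """Try each group of resolvers in order; the first non-None answer wins."""
    for group in groups:
        for resolver in group:
            result = resolver(url)
            if result is not None:
                return result
    return None


def web_resolver(url: str) -> Optional[str]:
    # Hypothetical resolver registered for the XML document.
    return "<web>" if url.startswith("http://") else None


def zip_resolver(url: str) -> Optional[str]:
    # Hypothetical resolver registered for the stylesheet.
    return "<zip>" if url.startswith("zip:") else None


xml_resolvers = [web_resolver]
xslt_resolvers = [zip_resolver]
```

Calling resolve_chained(url, xml_resolvers, xslt_resolvers) consults the XML resolvers first and the XSLT resolvers second. If both groups claim the same URL, the first group silently wins, which is the kind of ordering conflict that makes this approach fragile.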
...
...
* should the document registries be independent of the parser
registries or should they reflect updates in their original parser?
sorry, I don't understand what you mean.
I just meant: should they be stored by reference or copied? But I assume you'd
want independent copies to allow updating the parser-local registry without
affecting documents that were parsed earlier. So that's a minor problem here.
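The copy semantics could look roughly like this. It is a sketch only: ResolverRegistry, Parser and Document here are hypothetical classes, not the lxml ones, and parse() does no real parsing.

```python
class ResolverRegistry:
    """Hypothetical ordered collection of resolvers."""

    def __init__(self, resolvers=()):
        self._resolvers = list(resolvers)

    def add(self, resolver):
        self._resolvers.append(resolver)

    def copy(self):
        # A new registry with its own list, so later changes do not leak.
        return ResolverRegistry(self._resolvers)

    def __len__(self):
        return len(self._resolvers)


class Document:
    def __init__(self, resolvers):
        self.resolvers = resolvers


class Parser:
    def __init__(self):
        self.resolvers = ResolverRegistry()

    def parse(self):
        # The document gets a snapshot of the registry, not a reference to it.
        return Document(self.resolvers.copy())
```

A document parsed before a resolver is added to the parser keeps the registry it was created with, while documents parsed afterwards see the new resolver.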

Stefan