[XML-SIG] Replacing a Java tool-chain with 4Suite?
Mike C. Fletcher
mcfletch@rogers.com
Thu, 16 Jan 2003 16:47:19 -0500
Well, I think I've got a functional Python-coded transform (attached for
those following along at home). It doesn't yet do the setting of the
"PyOpenGL.version" attribute because I don't see where the xsl version
is getting that from. As I worked on it, however, I noticed a fairly
strange effect which would seem relevant to the multi-hour running time
of the 4xslt version.
My original solution, based loosely on the original xsl, was to do
something along these lines:
    find all "refentry" nodes in source, add to a mapping from
    refentry.id -> node
    for each refentry in dest (the template):
        search for nodes within refentry //*[@condition='replace'] using
        an XPath query
        search for nodes within the corresponding source refentry with
        the same nodeType and id attribute value using another XPath query.
        replace the first with the second
That worked (for the first entry at least), but it was human-time slow
(I'd guess about 5-10 seconds). From what I could tell, creating a
context with a sub-node (not the document root) wasn't restricting the
search to the sub-node, that is:
    Compile( "//*[@condition='replace']" ).evaluate( Context.Context(
        base, GetAllNs(base)))
where base is a sub-node of the document wasn't restricting the search
to base and its children, but was instead searching the whole document.
(Or, for some reason, was unbelievably slow in searching the sub-set).
If this is a general "feature", I can imagine that the original xsl, which
does at least 1 selection query per refentry (there are 325 of those),
was bogging down in that. Don't really know (shrug). Here's the line
from the xsl that makes me think it's doing that query:
    <xsl:variable name="orig"
        select="$original//refmeta[@id=current()/@id]"/>
My "solution" was to exploit a characteristic of the particular
documents in that the replacement IDs are actually globally unique, so I
can just do a straight mapping from id:originalnode instead of touching
the refentry nodes at all.
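
A minimal sketch of that id-mapping idea, written against the standard library's xml.dom.minidom instead of 4Suite so that it runs without 4Suite installed (that swap is an assumption of the sketch, not what the attached script does). The sample documents, the element names in them, and the helper names elements() and merge() are made up for illustration. It relies on the same property as above: the ids of the replaceable nodes are unique across the originals.

```python
from xml.dom import minidom

# Hypothetical sample data standing in for the original and template manuals.
ORIGINAL = """<reference>
  <refentry id="glBegin">
    <refsect1 id="glBegin-parameters"><para>mode: primitive type</para></refsect1>
  </refentry>
</reference>"""

TEMPLATE = """<reference>
  <refentry id="glBegin">
    <refsect1 id="glBegin-parameters" condition="replace"/>
    <refsect1 id="glBegin-python"><para>Python notes</para></refsect1>
  </refentry>
</reference>"""

def elements(node):
    """Yield every element below node, in document order."""
    for child in node.childNodes:
        if child.nodeType == child.ELEMENT_NODE:
            yield child
            for descendant in elements(child):
                yield descendant

def merge(template_doc, original_doc):
    """Swap each condition='replace' element for the original with the same id.

    Returns the list of ids for which no original was found.
    """
    by_id = {}
    for element in elements(original_doc):
        if element.hasAttribute("id"):
            by_id[element.getAttribute("id")] = element
    # Collect the targets first so the tree is not modified while it is walked.
    targets = [e for e in elements(template_doc)
               if e.getAttribute("condition") == "replace"]
    missing = []
    for target in targets:
        source = by_id.get(target.getAttribute("id"))
        if source is None:
            missing.append(target.getAttribute("id"))
            continue
        # minidom nodes belong to a single document, so copy the node across.
        copy = template_doc.importNode(source, True)
        target.parentNode.replaceChild(copy, target)
    return missing

template_doc = minidom.parseString(TEMPLATE)
missing = merge(template_doc, minidom.parseString(ORIGINAL))
merged = template_doc.documentElement.toxml()
```

The whole merge is one pass over the originals to build the dictionary and one pass over the template, with no per-refentry XPath query at all.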
I _think_ that using ".//*[@condition='replace']" as the xpath might do
the restrictions, but haven't found anything to back up the idea other
than the original xsl source.
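
For what it's worth, XPath 1.0 defines a path that begins with "//" as absolute: it is evaluated from the root of the document that contains the context node, so passing a sub-node as the context does not narrow it. A path that begins with ".//" starts at the context node and only reaches its descendants. The sketch below shows the relative form restricting the search. It uses the standard library's xml.etree.ElementTree rather than 4Suite (an assumption, so that it runs anywhere; ElementTree implements only a subset of XPath), and the sample document is made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with replaceable nodes in two different refentries.
DOC = """<reference>
  <refentry id="a"><para id="a-1" condition="replace"/></refentry>
  <refentry id="b"><para id="b-1" condition="replace"/><para id="b-2" condition="replace"/></refentry>
</reference>"""

root = ET.fromstring(DOC)
base = root.find("refentry[@id='b']")

# Relative path evaluated at base: only descendants of base are searched.
within_base = [e.get("id") for e in base.findall(".//*[@condition='replace']")]

# The same relative path evaluated at the root covers the whole document.
whole_doc = [e.get("id") for e in root.findall(".//*[@condition='replace']")]

print(within_base)
print(whole_doc)
```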
BTW, Uche, your tutorials were of great help to me in getting this
working. Thanks.
Still haven't tried to convert to HTML yet, that's the next project.
Enjoy,
Mike
Uche Ogbuji wrote:
...
>>Given the simplicity of the transformation in this case (just a merge by
>>section name!) I may write the darned thing in Python to save time (I
>>started out around 22 hours ago saying to myself "oh, guess I should
>>regenerate the manual before I start working on the web-site" :) ).
>>
>>
>
>Can you post merge.xsl? I'm guessing it simply operates on Docbook
>section/title elements and thus could work with any Docbook source file? Do
>you use recursive templates in merge.xslt? It's really easy to get into an
>infinite loop with XSLT recursive templates.
>
>Based on the simplicity of the transform you're doing, it certainly looks like
>a really easy job using the sorts of generator/iterator tools I outline in
>
>http://www.xml.com/pub/a/2003/01/08/py-xml.html
>
>I like XSLT better than most, but it's not for every task.
--
_______________________________________
Mike C. Fletcher
Designer, VR Plumber, Coder
http://members.rogers.com/mcfletch/
Attachment: testdom2.py
"""FourSuite-specific XML-documentation processing script
There's something wrong with the xsl merge mechanism, so
this module does the merge using direct Python manipulation
of the files via the FourSuite Python XML tools.
"""
import sys, os
from Ft.Xml import InputSource
from Ft.Xml.XPath import Compile, Context
from Ft.Xml.Domlette import GetAllNs, PrettyPrint, NonvalidatingReader

# logging is optional: run silently where the module is not available
try:
    import logging
    log = logging.getLogger( 'xmlmerge' )
    logging.basicConfig()
    log.setLevel( logging.INFO )
except ImportError:
    log = None

def load( source ):
    """Load a document from source as a DOM"""
    uri = 'file:'+ os.path.abspath(source).replace( "\\", "/" )
    if log:
        log.info( "Loading source document %r", uri )
    result = NonvalidatingReader.parseUri(uri)
    if log:
        log.debug( "Finished loading document %r", uri )
    return result

def save( doc, destination ):
    """Pretty-print doc to the file named by destination"""
    if log:
        log.info( "Saving document to %r", destination )
    output = open(destination,'w')
    try:
        PrettyPrint(doc, output)
    finally:
        # close explicitly so the output is flushed to disk
        output.close()
    if log:
        log.debug( "Finished saving document %r", destination )

def finder( pattern ):
    """Create an xpath searcher for the given pattern"""
    return Compile( pattern )

def find( specifier, base ):
    """Find subnodes of base with given XPath specifier"""
    return finder(specifier).evaluate( Context.Context( base, GetAllNs(base)) )

# compiled once; used for the search over the whole template document
REPLACEFINDER = finder( "//*[@condition='replace']")

def main( rootFile, originalDirectory, destination ):
    """Load rootFile, merge with the docs in originalDirectory and write to destination"""
    prefixedDocs = []
    # mapping from id attribute value -> node in one of the original documents
    originals = {}
    for prefix in ['glut','glu','gle','gl']:
        filename = os.path.join(originalDirectory, prefix.upper(), 'reference.xml')
        doc = load(filename)
        for node in find( "//*[@id]", doc ):
            originals[ node.getAttributeNS(None,'id')] = node
        prefixedDocs.append( (prefix, doc))
    doc = load( rootFile )
    for entry in REPLACEFINDER.evaluate( Context.Context( doc, GetAllNs(doc)) ):
        # now, for each replaceable node, there is an "original" node
        # with the same id from which we copy 90% of the data...
        entryId = entry.getAttributeNS(None,'id')
        if log:
            log.debug( "substitution for %r", entryId )
        original = originals.get( entryId )
        # compare against None: a missing id is the only failure case here
        if original is None:
            if log:
                log.warning( "Unable to find substitution source for %r", entryId )
            continue # next entry
        entry.parentNode.replaceChild( original, entry )
    save( doc, destination )

if __name__ == "__main__":
    main(
        rootFile = os.path.abspath(sys.argv[1]),
        originalDirectory = os.path.abspath(os.path.join( os.path.dirname(sys.argv[1]), '..', 'original')),
        destination = os.path.abspath(sys.argv[2]),
    )