[XML-SIG] 0.6.4: another problem with building DOM using validating parser
Martin v. Loewis
martin@loewis.home.cs.tu-berlin.de
Sun, 4 Mar 2001 23:26:42 +0100
> from xml.dom.ext.reader.Sax2 import FromXmlFile
>
> f = open ('test5.xml', 'w')
> f.write ("""<?xml version="1.0"?>
> <!DOCTYPE configuration [
> <!ENTITY testscrap SYSTEM "testscrap">
> <!ELEMENT configuration EMPTY>
> ]>
>
> <configuration/>
> """)
> f.close()
>
> doc = FromXmlFile ('test5.xml', None, 1)
>
> print doc
[...]
> ! def unparsedEntityDecl (self, publicId, systemId, notationName):
> ! new_notation = self._ownerDoc.getFactory().createEntity(self._ownerDoc, publicId, systemId, notationName)
> self._ownerDoc.getDocumentType().getEntities().setNamedItem(new_notation)
> return
I'm glad that others are as confused about the matter as I am. What
you have in your document is not an unparsed entity but an external
one; unparsed entities have an NDATA notation name. xmlproc detected
that properly (by setting ndata to ""), but drv_xmlproc expected None
as the ndata. So I changed it to invoke externalEntityDecl in that
case, which is not handled by Sax2.
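To make the external/unparsed distinction concrete, here is a minimal sketch using the expat binding from the Python standard library rather than xmlproc or drv_xmlproc. The "logo" entity and the "gif" notation are hypothetical additions to the test document, and classify_entities is an illustrative helper, not part of PyXML.

```python
import xml.parsers.expat

# The test document from the report, plus a hypothetical unparsed entity
# ("logo") and notation ("gif") added here for comparison.
DOC = b"""<?xml version="1.0"?>
<!DOCTYPE configuration [
<!NOTATION gif SYSTEM "image/gif">
<!ENTITY testscrap SYSTEM "testscrap">
<!ENTITY logo SYSTEM "logo.gif" NDATA gif>
<!ELEMENT configuration EMPTY>
]>
<configuration/>
"""


def classify_entities(data):
    """Return a dict mapping each declared entity name to its kind."""
    kinds = {}

    def on_entity_decl(name, is_param, value, base, system_id, public_id, notation):
        if notation is not None:
            # An NDATA notation name is what makes an entity unparsed.
            kinds[name] = "unparsed"
        elif value is None:
            # No literal value and no notation: an external parsed entity.
            kinds[name] = "external"
        else:
            kinds[name] = "internal"

    parser = xml.parsers.expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    parser.Parse(data, True)
    return kinds


kinds = classify_entities(DOC)
print(kinds)
```

Note that expat reports a missing notation as None, whereas xmlproc (as described above) reports it as "".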
As you found, *if* this were ever invoked, _ownerDoc would be None
(since the document element has not been seen yet). Instead of
ignoring the unparsed entity, it would be better to put it into
_orphanedNodes; I've changed it thus. In the process, I found that
things are put into _orphanedNodes which are later not processed -
I've fixed that too.
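The queue-then-replay idea can be sketched on its own, outside 4DOM. Everything below is a hypothetical stand-in (the class name, start_document_element, and the dict used as a document are assumptions for illustration); it only shows the pattern of deferring a declaration until the document exists and then replaying it.

```python
class DeferredDeclHandler:
    """Hypothetical stand-in for the reader; not the 4DOM Sax2 class."""

    def __init__(self):
        self.owner_doc = None   # set once the document element is seen
        self.orphaned = []      # events that arrived before that point
        self.entities = {}      # name -> (publicId, systemId, ndata)

    def unparsedEntityDecl(self, name, publicId, systemId, ndata):
        if self.owner_doc is None:
            # No document yet: remember the event instead of dropping it.
            self.orphaned.append(
                ("unparsedentitydecl", name, publicId, systemId, ndata))
            return
        self.entities[name] = (publicId, systemId, ndata)

    def start_document_element(self, tag):
        # Create the document, then replay everything that was queued.
        self.owner_doc = {"root": tag}
        pending, self.orphaned = self.orphaned, []
        for event in pending:
            kind, args = event[0], event[1:]
            if kind == "unparsedentitydecl":
                self.unparsedEntityDecl(*args)
            else:
                # Fail loudly so no queued event is silently skipped.
                raise ValueError("Unknown orphaned node: " + kind)


handler = DeferredDeclHandler()
handler.unparsedEntityDecl("logo", None, "logo.gif", "gif")
queued_before = len(handler.orphaned)
handler.start_document_element("configuration")
print(queued_before, handler.entities)
```

Raising on an unknown event kind mirrors the intent of the patch: anything placed in the queue must either be handled or reported.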
I still think that the unparsedEntityDecl callback is completely
broken. What are getFactory and getEntities? Also, if there is a
feature for creating entities, it is surely part of a 4DOM extension -
probably on the document type. However, that apparently is not capable
of distinguishing between external and unparsed entities; I'm not sure
whether it should.
In any case, I've applied the following patch. I'd appreciate it if
somebody at Fourthought could take a look.
Regards,
Martin
Index: xml/dom/ext/reader/Sax2.py
===================================================================
RCS file: /cvsroot/pyxml/xml/xml/dom/ext/reader/Sax2.py,v
retrieving revision 1.7
diff -u -r1.7 Sax2.py
--- xml/dom/ext/reader/Sax2.py 2001/02/20 01:00:03 1.7
+++ xml/dom/ext/reader/Sax2.py 2001/03/04 22:05:59
@@ -8,7 +8,7 @@
Components for reading XML files from a SAX2 producer.
WWW: http://4suite.com/4DOM e-mail: support@4suite.com
-Copyright (c) 2000 Fourthought Inc, USA. All Rights Reserved.
+Copyright (c) 2000, 2001 Fourthought Inc, USA. All Rights Reserved.
See http://4suite.com/COPYRIGHT for license and copyright information
"""
@@ -148,6 +148,10 @@
self._ownerDoc.appendChild(comment)
elif o_node[0] == 'doctype':
before_doctype = 0
+ elif o_node[0] == 'unparsedentitydecl':
+ apply(self.unparsedEntityDecl, o_node[1:])
+ else:
+ raise "Unknown orphaned node:"+o_node[0]
self._rootNode = self._ownerDoc
self._nodeStack.append(self._rootNode)
return
@@ -222,7 +226,7 @@
def startDTD(self, doctype, publicID, systemID):
if not self._rootNode:
self._dt = implementation.createDocumentType(doctype, publicID, systemID)
- self._orphanedNodes.append(('doctype'))
+ self._orphanedNodes.append(('doctype',))
else:
raise 'Illegal DocType declaration'
return
@@ -255,9 +259,12 @@
self._ownerDoc.getDocumentType().getNotations().setNamedItem(new_notation)
return
- def unparsedEntityDecl (self, publicId, systemId, notationName):
- new_notation = self._ownerDoc.getFactory().createEntity(self._ownerDoc, publicId, systemId, notationName)
- self._ownerDoc.getDocumentType().getEntities().setNamedItem(new_notation)
+ def unparsedEntityDecl (self, name, publicId, systemId, ndata):
+ if self._ownerDoc:
+ new_notation = self._ownerDoc.getFactory().createEntity(self._ownerDoc, publicId, systemId, name)
+ self._ownerDoc.getDocumentType().getEntities().setNamedItem(new_notation)
+ else:
+ self._orphanedNodes.append(('unparsedentitydecl', name, publicId, systemId, ndata))
return
#Overridden ErrorHandler methods
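A note on the startDTD hunk above, which changes ('doctype') to ('doctype',): parentheses alone do not make a tuple in Python, so the old code queued a plain string, and indexing it with [0] yields a single character rather than the tag. A small illustration (the variable names are made up for this sketch):

```python
not_a_tuple = ('doctype')    # just the string 'doctype'
one_tuple = ('doctype',)     # the trailing comma makes a 1-tuple

first_of_string = not_a_tuple[0]   # first character of the string
first_of_tuple = one_tuple[0]      # the whole tag

print(type(not_a_tuple).__name__, repr(first_of_string))
print(type(one_tuple).__name__, repr(first_of_tuple))
```

With the string form, a comparison such as o_node[0] == 'doctype' can never be true, which is consistent with the remark above that some queued items were never processed.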