[XML-SIG] [ pyxml-Bugs-576331 ] problem parsing xml with DOCTYPE decl.

noreply@sourceforge.net noreply@sourceforge.net
Tue, 02 Jul 2002 04:18:07 -0700


Bugs item #576331, was opened at 2002-07-02 06:18
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=106473&aid=576331&group_id=6473

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Christopher J. Prinos (cprinos)
Assigned to: Nobody/Anonymous (nobody)
Summary: problem parsing xml with DOCTYPE decl.

Initial Comment:
This is actually two problems, one with xmlproc and 
one with expat. It originally came up in a 
comp.lang.python thread, where Martin Loewis suggested I 
file a bug report here.

Starting with an XML file that specifies a DTD: if the file is 
parsed with validation, the resulting document has an empty 
root node.

If it's parsed _without_ validation, then the document 
contents are correct, except that the doc will have a 
bogus DOCTYPE declaration that specifies the root 
element name without using SYSTEM or PUBLIC to 
specify the DTD.

-- t1.xml --
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE a SYSTEM "t1.dtd">
<a>
 <b>simple test</b>
</a>

-- t1.dtd --
<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT a (b)>
<!ELEMENT b (#PCDATA)>

-- test code --
>>> import xml.dom.ext.reader.Sax2 as Sax2
>>> ValReader = Sax2.Reader(validate=1)
>>> NonValReader = Sax2.Reader(validate=0)
>>> vd = ValReader.fromStream(open('t1.xml'))
>>> nvd = NonValReader.fromStream(open('t1.xml'))
>>> from xml.dom.ext import PrettyPrint as PPrint
>>>
>>> PPrint(vd)    # this shows vd to have an empty root
<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE a SYSTEM "t1.dtd">
<a/>
>>>
>>> PPrint(nvd)  # this shows nvd to have a non-valid doctype declaration
<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE a>
<a>
  <b>simple test</b>
</a>
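
For comparison, here is a minimal sketch of the result I would expect from either reader: the root keeps its <b> child, and the DOCTYPE keeps its SYSTEM identifier. It uses the standard library's xml.dom.minidom rather than the PyXML Sax2 reader, so it only illustrates the expected DOM and does not exercise the code paths reported above. The summarize() helper and the T1_XML constant are hypothetical names introduced for this example.

```python
from xml.dom import minidom

# Same content as t1.xml above, inlined so the example is self-contained.
T1_XML = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE a SYSTEM "t1.dtd">
<a>
 <b>simple test</b>
</a>
"""


def summarize(xml_text):
    """Return the DOCTYPE and root-element details of a parsed document.

    Hypothetical helper: minidom does not validate and does not load the
    external DTD, so t1.dtd does not need to exist for this to run.
    """
    doc = minidom.parseString(xml_text)
    doctype = doc.doctype
    root = doc.documentElement
    # Skip whitespace-only text nodes; keep only element children.
    children = [node.tagName for node in root.childNodes
                if node.nodeType == node.ELEMENT_NODE]
    return {
        "doctype_name": doctype.name,
        "system_id": doctype.systemId,
        "root": root.tagName,
        "children": children,
    }


if __name__ == "__main__":
    print(summarize(T1_XML))
```

An empty "children" list would correspond to the validating case above, and a missing "system_id" to the non-validating case.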

----------------------------------------------------------------------
