[ expat-Bugs-441449 ] problems with parsing external entities

noreply@sourceforge.net noreply@sourceforge.net
Tue Jun 11 06:50:01 2002


Bugs item #441449, was opened at 2001-07-15 11:02
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=110127&aid=441449&group_id=10127

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Rafael R. Sevilla (didosevilla)
Assigned to: Fred L. Drake, Jr. (fdrake)
Summary: problems with parsing external entities

Initial Comment:
I've tried to use Expat's external entity parsing
module in my project (http://xml-lit.sourceforge.net/)
and have gotten some very strange results.  I used
Expat's XML_ExternalEntityParserCreate within an
external entity reference handler and used the parser
once again.  I had mixed results with this.  For one
particular document referred to by an external entity,
Expat would give the error "no element found", with the
reported line number at the end of the document.  This
doesn't happen with any of the other documents I have.
The document was perfectly legal XML, and Expat can
otherwise parse it directly...just not through the
external entity.  The
document was also quite large, so I tried to work
around it by splitting the document into several more
documents...the problem went away.  Will cruft together
a simpler example document and short program to
illustrate this problem.


----------------------------------------------------------------------

>Comment By: Karl Waclawek (kwaclaw)
Date: 2002-06-11 09:49

Message:
Logged In: YES 
user_id=290026

I agree - it seems for external entities, the processor
must be set to externalEntityContentProcessor.

I have submitted patch # 567400.
Please review and test. I hope it doesn't break
anything else.

Karl

----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2002-04-25 22:54

Message:
Logged In: NO 

In my opinion, the function "cdataSectionProcessor" should not
call the function "contentProcessor", because "contentProcessor"
calls "doContent" with a value of zero for the parameter
"startTagLevel".  But the CDATA section is in an external
entity, so "startTagLevel" should be one.  So I think the key
point of the bug is that "cdataSectionProcessor" can't get the
right "startTagLevel".
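
The reasoning in this comment can be illustrated with a toy model.
This is not Expat's source; it only assumes that the end-of-input
check depends on the start level in the way the comment implies.
All names here (toy_status, toy_end_of_input) are hypothetical.

```c
/* Toy status codes, loosely named after the errors discussed here. */
enum toy_status { TOY_OK, TOY_NO_ELEMENTS, TOY_ASYNC_ENTITY };

/* What a content loop might decide when it runs out of input.
   start_tag_level: 0 for the document entity, 1 for an external entity.
   tag_level:       nesting depth when the input ran out. */
enum toy_status toy_end_of_input(int start_tag_level, int tag_level)
{
    if (start_tag_level > 0) {
        /* External entity: ending at the level we started on is fine,
           ending anywhere else means tags are unbalanced in the entity. */
        return tag_level == start_tag_level ? TOY_OK : TOY_ASYNC_ENTITY;
    }
    /* Document entity: input ended while still waiting for content,
       which is reported as "no element found". */
    return TOY_NO_ELEMENTS;
}
```

Under this model, a processor that resumes an external entity with a
start level of 0 instead of 1 would report "no element found" at the
end of an otherwise well-formed entity, which matches the symptom in
the original report.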

----------------------------------------------------------------------

Comment By: Rafael R. Sevilla (didosevilla)
Date: 2001-09-17 07:32

Message:
Logged In: YES 
user_id=26058

The tarball here contains a sample, minimal program which
consists of a parser that simply exits if no errors are
found when loading an XML document.  I have two pairs of XML
documents, each of which differ in only one byte.  See the
README inside the tarball for an explanation.  By the way,
I've seen this bug happen with Expat 1.95.1 and with the
most recent CVS.  Red Hat 7.1, Linux 2.4.5.

----------------------------------------------------------------------

Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2001-07-25 15:27

Message:
Logged In: YES 
user_id=3066

Can you attach a short sample program and input file?  That would make it a *lot* easier to track this down.

Also, which version were you using?


----------------------------------------------------------------------

Comment By: Rafael R. Sevilla (didosevilla)
Date: 2001-07-16 03:25

Message:
Logged In: YES 
user_id=26058

Further notes on this apparent bug: It seems that it depends
both on the file size and the size of the buffer I use.  For
a buffer that is 8,192 bytes in size, a file of up to 10,775
bytes can be created that can be parsed without error. 
Going to 10,776 or larger file size will cause the parser to
exit with the above error.  Increasing the buffer size made
the problem go away, but apparently it will just take a
bigger file for Expat to produce the same error in that case.

----------------------------------------------------------------------
