[ expat-Bugs-548690 ] Incorrect "undefined entity" error

noreply@sourceforge.net noreply@sourceforge.net
Sat May 18 11:20:02 2002


Bugs item #548690, was opened at 2002-04-25 12:37
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=110127&aid=548690&group_id=10127

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Karl Waclawek (kwaclaw)
Assigned to: Nobody/Anonymous (nobody)
>Summary: Incorrect "undefined entity" error

Initial Comment:
I came across the following behaviour, given this 
document:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE intervalscript SYSTEM 'intervalscript.dtd' [
    <!ENTITY % worldupdtd SYSTEM 'worldup.dtd'>
    %worldupdtd;
    %visiondtd;
]>
<test>
  some text
</test>

Expat will report an "undefined entity" fatal error 
for the reference %visiondtd;.

However, the XML spec says this (look at the
<emphasis> tag):

<spec>
Well-formedness constraint: Entity Declared

In a document without any DTD, a document with only an internal DTD
subset which contains no parameter entity references, or a document
with "standalone='yes'", for an entity reference that does not occur
within the external subset or a parameter entity, the Name given in
the entity reference must match that in an entity declaration that
does not occur within the external subset or a parameter entity,
except that well-formed documents need not declare any of the
following entities: amp, lt, gt, apos, quot. The declaration of a
general entity must precede any reference to it which appears in a
default value in an attribute-list declaration.

<emphasis>
Note that if entities are declared in the external subset or in
external parameter entities, a non-validating processor is not
obligated to read and process their declarations; for such documents,
the rule that an entity must be declared is a well-formedness
constraint only if standalone='yes'.
</emphasis>

Validity constraint: Entity Declared

In a document with an external subset or external parameter entities
with "standalone='no'", the Name given in the entity reference must
match that in an entity declaration. For interoperability, valid
documents should declare the entities amp, lt, gt, apos, quot, in the
form specified in 4.6 Predefined Entities. The declaration of a
parameter entity must precede any reference to it. Similarly, the
declaration of a general entity must precede any attribute-list
declaration containing a default value with a direct or indirect
reference to that general entity.
</spec>

Since Expat is not validating, the emphasized section 
should apply.
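
The rule quoted above can be sketched as a small decision function.
This is only a simplified reading of the spec text, not code from
Expat; the function name and its boolean parameters are hypothetical,
and it ignores the finer point of where the reference itself occurs.

```python
def entity_declared_is_wfc(has_dtd, has_external_subset, has_pe_refs, standalone):
    """Return True when "Entity Declared" is a well-formedness constraint.

    Simplified reading of the quoted spec text (hypothetical helper):
    - standalone='yes'                         -> well-formedness constraint
    - no DTD at all                            -> well-formedness constraint
    - internal subset only, no PE references   -> well-formedness constraint
    - otherwise                                -> validity constraint only
    """
    if standalone:
        return True
    if not has_dtd:
        return True
    return not has_external_subset and not has_pe_refs


# The document from this report: external subset, PE references,
# standalone="no".
print(entity_declared_is_wfc(True, True, True, False))
```

For the document in this report the function returns False, which is
why a non-validating parser should not treat the undeclared
%visiondtd; as a fatal error.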


----------------------------------------------------------------------

>Comment By: Karl Waclawek (kwaclaw)
Date: 2002-05-18 14:19

Message:
Logged In: YES 
user_id=290026

Patch # 551599 contains fixes which make Expat
conform to the spec. Leave this open until
a skippedEntityHandler patch has been submitted,
so that we don't lose sight of the related issues.

Karl

----------------------------------------------------------------------

Comment By: Karl Waclawek (kwaclaw)
Date: 2002-05-04 13:43

Message:
Logged In: YES 
user_id=290026

OK, I have filed a feature request
for a skippedEntity handler (# 552297).
Please review and add your comments.

Karl

----------------------------------------------------------------------

Comment By: Rolf Ade (pointsman)
Date: 2002-05-04 12:58

Message:
Logged In: YES 
user_id=13222

_Please_ don't change Expat's behavior in this area
without giving the programmer a chance to be
notified about such unresolvable entities. A
skippedEntity callback seems to be one way to do that.

The reason is that it is currently possible to do
DTD validation on top of Expat. If Expat silently
skips unresolvable entities, that would no longer be possible.

(Well, almost complete DTD validation. I mentioned a
few remaining problems in
http://sourceforge.net/mailarchive/message.php?msg_id=839092
along with a bit more information on the nested
parameter entities problem, also discovered by Karl
in
http://sourceforge.net/mailarchive/message.php?msg_id=839078)



----------------------------------------------------------------------

Comment By: Karl Waclawek (kwaclaw)
Date: 2002-05-04 00:12

Message:
Logged In: YES 
user_id=290026

I think the priority should be that Expat conforms
to the spec. So, if there is no well-formedness violation,
then Expat should not report one.

However, your concern is valid, but should be dealt with
differently. We were already discussing the introduction
of a skippedEntity callback, like the one in SAX.

Karl

----------------------------------------------------------------------

Comment By: Rolf Ade (pointsman)
Date: 2002-05-03 23:15

Message:
Logged In: YES 
user_id=13222

I'm a bit confused about the subject. While Karl
Waclawek's observation is right on its own, I
don't want to lose Expat's ability to complain
about being unable to resolve an external
parameter entity.


----------------------------------------------------------------------

Comment By: Karl Waclawek (kwaclaw)
Date: 2002-04-27 09:24

Message:
Logged In: YES 
user_id=290026

This bug and bug # 544679 seem to be related to a set
of difficulties Expat has in handling DTDs and PEs.

The best way to detect those problems and test them
is to subject Expat to James Clark's test cases at 
ftp://ftp.jclark.com/pub/xml/xmltest.zip,
specifically the test cases in the subdirectory
/valid/not-sa/ . Expat does not handle most of them
correctly, it seems.
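
A rough sketch of such a test run, using Python's Expat binding rather
than the C API: it parses every .xml file in a directory and resolves
external entities relative to the referencing file. The helper names
(_install_resolver, check_directory) are hypothetical, and the sketch
only checks whether parsing succeeds, not whether the output matches
the expected canonical form.

```python
import os
import xml.parsers.expat as expat


def _install_resolver(parser, base_dir):
    """Make `parser` load external entities from files under `base_dir`."""

    def resolve(context, base, system_id, public_id):
        if system_id is None:
            return 1  # nothing to load
        target = os.path.join(base_dir, system_id)
        with open(target, "rb") as handle:
            data = handle.read()
        # A child parser handles the external entity's content.
        child = parser.ExternalEntityParserCreate(context)
        _install_resolver(child, os.path.dirname(target))
        child.Parse(data, True)
        return 1

    parser.ExternalEntityRefHandler = resolve


def check_directory(directory):
    """Return {file name: None or error text} for each .xml file."""
    results = {}
    for name in sorted(os.listdir(directory)):
        if not name.endswith(".xml"):
            continue
        parser = expat.ParserCreate()
        parser.SetParamEntityParsing(expat.XML_PARAM_ENTITY_PARSING_ALWAYS)
        _install_resolver(parser, directory)
        try:
            with open(os.path.join(directory, name), "rb") as handle:
                parser.ParseFile(handle)
            results[name] = None
        except (expat.ExpatError, OSError) as exc:
            results[name] = str(exc)
    return results
```

Pointing check_directory at an unpacked copy of the valid/not-sa
directory would list the files that fail to parse.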

----------------------------------------------------------------------

Comment By: Karl Waclawek (kwaclaw)
Date: 2002-04-25 23:46

Message:
Logged In: YES 
user_id=290026

To supply more detail:

- the external entityref handler is set
- it is called for the entity that isn't declared,
  as well as the external subset
- it always returns 1
- and XML_SetParamEntityParsing() was called with
  XML_PARAM_ENTITY_PARSING_ALWAYS
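
The setup listed above can be approximated with Python's Expat binding
(xml.parsers.expat) as a minimal sketch. The original test used the C
API; the function name run_reproduction is hypothetical, and the
result depends on the Expat version the binding is linked against.

```python
import xml.parsers.expat as expat

# The document from the initial comment.
DOCUMENT = b"""<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE intervalscript SYSTEM 'intervalscript.dtd' [
    <!ENTITY % worldupdtd SYSTEM 'worldup.dtd'>
    %worldupdtd;
    %visiondtd;
]>
<test>
  some text
</test>
"""


def run_reproduction(document=DOCUMENT):
    """Parse the document with the settings described in this comment.

    Returns (external_refs, error): the system IDs passed to the
    external entity handler, and None or Expat's error message.
    """
    external_refs = []

    def on_external_entity(context, base, system_id, public_id):
        external_refs.append(system_id)
        return 1  # always report success, without reading the entity

    parser = expat.ParserCreate()
    parser.SetParamEntityParsing(expat.XML_PARAM_ENTITY_PARSING_ALWAYS)
    parser.ExternalEntityRefHandler = on_external_entity

    error = None
    try:
        parser.Parse(document, True)
    except expat.ExpatError as exc:
        error = expat.ErrorString(exc.code)
    return external_refs, error


if __name__ == "__main__":
    refs, error = run_reproduction()
    print("external entity handler called for:", refs)
    print("error:", error)
```

An Expat build with the behaviour reported here would return the
"undefined entity" message as the error.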


----------------------------------------------------------------------

Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2002-04-25 21:45

Message:
Logged In: YES 
user_id=3066

I need more information to construct the test case.  In
particular, is a callback set with
XML_SetExternalEntityRefHandler(), how many times is it
called (is it called for the entity that isn't declared?),
and what does it return?  Was XML_SetParamEntityParsing()
called, and with which value for the 'code' argument?

Thanks.

----------------------------------------------------------------------
