[ expat-Bugs-551852 ] BOM causes error with small buffers

noreply@sourceforge.net noreply@sourceforge.net
Sat May 4 14:11:01 2002


Bugs item #551852, was opened at 2002-05-03 09:53
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=110127&aid=551852&group_id=10127

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Karl Waclawek (kwaclaw)
>Assigned to: Karl Waclawek (kwaclaw)
Summary: BOM causes error with small buffers

Initial Comment:

This happens when an external entity that has both a BOM
*and* a text declaration is parsed with a very small
buffer size.

For instance, take these two files:

--- test.xml ---
<!DOCTYPE test [
  <!ELEMENT test (#PCDATA)>
  <!ENTITY e SYSTEM "test.ent">
]>
<test>
  &e;
</test>
--- end of file ---

and this external entity, saved in UTF-16 with BOM:

--- test.ent ---
<?xml version="1.0" encoding="UTF-16"?>some text
--- end of file --- 
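
For anyone who wants to reproduce this without relying on an editor to
save the entity correctly, here is a minimal sketch that builds the
bytes of test.ent by hand. It assumes little-endian UTF-16 and
ASCII-only content. The helper name utf16le_with_bom is made up for
illustration and is not part of expat.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of expat: encode an ASCII string as
   UTF-16LE preceded by a byte order mark (bytes FF FE).
   Returns the number of bytes written to `out`, or 0 if the input is
   not pure ASCII or `out` (capacity `cap`) is too small. */
size_t utf16le_with_bom(const char *ascii, unsigned char *out, size_t cap)
{
    size_t n, need, i;

    assert(ascii != NULL && out != NULL);
    n = strlen(ascii);
    need = 2 + 2 * n;            /* BOM plus two bytes per character */
    if (need > cap)
        return 0;

    out[0] = 0xFF;               /* U+FEFF in little-endian byte order */
    out[1] = 0xFE;
    for (i = 0; i < n; i++) {
        unsigned char c = (unsigned char)ascii[i];
        if (c > 0x7F)
            return 0;            /* this sketch only handles ASCII */
        out[2 + 2 * i] = c;      /* low byte first */
        out[3 + 2 * i] = 0x00;   /* high byte is zero for ASCII */
    }
    return need;
}
```

Writing the resulting buffer to test.ent with fwrite should give a file
equivalent to the one described above.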

When parsing this with a buffer size of 1
(using XML_GetBuffer), you get the error
"xml processing instruction not at start of entity".
This error won't happen if you remove the BOM.

I have traced this to the function
externalEntityInitProcessor2.
I found a fix for this:

original code:
  ...
  switch (tok) {
  case XML_TOK_BOM:
    start = next;
    break;
  ...

fixed code:
  ...
  switch (tok) {
  case XML_TOK_BOM:
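    /* BOM ends exactly at the end of the buffer: report it as
       consumed and stay in this processor until more data arrives */
    if (next == end && endPtr) {
      *endPtr = next;
      return XML_ERROR_NONE;
    }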
    start = next;
    break;
  ...

Explanation for fix:

If we are at the end of the buffer, the original
code would pass control to the next stage, i.e.
externalEntityInitProcessor3, which would detect
XML_TOK_NONE and pass control directly to doContent
without processing any XML text declaration.
However, the XML text declaration will then be
parsed in doContent, and this causes the error
XML_ERROR_MISPLACED_XML_PI, since doContent does
not allow text declarations.

The fix simply prevents control from being passed to
doContent before externalEntityInitProcessor2
can process the XML text declaration.
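
The hand-off described above can be illustrated with a toy model. This
is only a sketch and not expat code: the three stages stand in for
externalEntityInitProcessor2, externalEntityInitProcessor3 and
doContent, and the input alphabet is invented ('B' for a BOM, 'D' for a
text declaration, anything else for content). All names, and the choice
of which toy stage consumes the declaration, are assumptions made for
illustration.

```c
#include <assert.h>
#include <stddef.h>

enum toy_stage  { TOY_INIT2, TOY_INIT3, TOY_CONTENT };
enum toy_result { TOY_OK, TOY_ERR_MISPLACED_DECL };

struct toy_parser {
    enum toy_stage stage;
    int apply_fix;   /* 0 = original behaviour, 1 = with the fix */
    int decl_seen;   /* set when the declaration was accepted */
};

/* Parse one buffer of toy input. */
enum toy_result toy_parse(struct toy_parser *p, const char *s, size_t len)
{
    const char *end = s + len;

    assert(p != NULL);
    if (p->stage == TOY_INIT2) {
        if (s != end && *s == 'B') {
            s++;                       /* skip the BOM */
            if (p->apply_fix && s == end)
                return TOY_OK;         /* stay here, wait for more data */
        }
        p->stage = TOY_INIT3;
    }
    if (p->stage == TOY_INIT3) {
        if (s != end && *s == 'D') {   /* declaration at entity start */
            p->decl_seen = 1;
            s++;
        }
        p->stage = TOY_CONTENT;        /* also taken on empty input */
    }
    for (; s != end; s++) {
        if (*s == 'D')                 /* declaration inside content */
            return TOY_ERR_MISPLACED_DECL;
    }
    return TOY_OK;
}

/* Feed the input one byte per call, like a buffer size of 1. */
enum toy_result toy_feed_bytewise(struct toy_parser *p, const char *input)
{
    for (; *input != '\0'; input++) {
        enum toy_result r = toy_parse(p, input, 1);
        if (r != TOY_OK)
            return r;
    }
    return TOY_OK;
}
```

In this model, feeding "BDtext" one byte at a time fails without the
fix and succeeds with it, while feeding the same input in one buffer,
or feeding "Dtext" (no BOM) bytewise, succeeds either way.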

Karl

----------------------------------------------------------------------

>Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2002-05-04 17:10

Message:
Logged In: YES 
user_id=3066

Karl, I have a test case for this one and can confirm your
suggested fix.  Check it in whenever you're ready, and I'll
follow with the test case.

(Still working on the other tests; I got swamped by work &
kids this week.)

----------------------------------------------------------------------
