[Expat-bugs] [ expat-Bugs-1506891 ] XML_SetCharacterDataHandler callback function not parsing

SourceForge.net noreply at sourceforge.net
Sat Jul 1 19:39:55 CEST 2006


Bugs item #1506891, was opened at 2006-06-15 16:08
Message generated for change (Comment added) made by fdrake
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1506891&group_id=10127

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: Test Required
>Status: Closed
>Resolution: Works For Me
Priority: 5
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: XML_SetCharacterDataHandler callback function not parsing

Initial Comment:
Hello,

I have an XML_SetCharacterDataHandler callback function
that uses the text to build a directory path.  I have
noticed that at times the first or last node will
result in only a partial capture of the text.

Example:

<directory>my-57/actual image/</directory>

will return with "my-57/actual" only.  I've attached my
callback functions, startelement, endelement and
dataelement.  

Thank you,

Satyajit


----------------------------------------------------------------------

>Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2006-07-01 13:39

Message:
Logged In: YES 
user_id=3066

There's no clear indication of a bug here; general Q&A
should occur on the mailing lists, not in the bug tracker.

----------------------------------------------------------------------

Comment By: Karl Waclawek (kwaclaw)
Date: 2006-06-16 14:42

Message:
Logged In: YES 
user_id=290026

Expat comes with a few demo apps, just look at them.
This is taken from the "elements" demo:

int main(int argc, char *argv[])
{
  char buf[BUFSIZ];
  XML_Parser parser = XML_ParserCreate(NULL);
  int done;
  int depth = 0;
  XML_SetUserData(parser, &depth);
  XML_SetElementHandler(parser, startElement, endElement);
  do {
    int len = (int)fread(buf, 1, sizeof(buf), stdin);
    done = len < sizeof(buf);
    if (XML_Parse(parser, buf, len, done) == XML_STATUS_ERROR) {
      fprintf(stderr,
              "%s at line %" XML_FMT_INT_MOD "u\n",
              XML_ErrorString(XML_GetErrorCode(parser)),
              XML_GetCurrentLineNumber(parser));
      return 1;
    }
  } while (!done);
  XML_ParserFree(parser);
  return 0;
}

----------------------------------------------------------------------

Comment By: sssketkar man! (sketkar)
Date: 2006-06-16 14:28

Message:
Logged In: YES 
user_id=944435

Could you point me to an example I can compare against?
I'm probably missing something obvious.

Thanks for all your help.

Satyajit.

----------------------------------------------------------------------

Comment By: Karl Waclawek (kwaclaw)
Date: 2006-06-16 14:25

Message:
Logged In: YES 
user_id=290026

I have no problem with your file.
Maybe your parsing loop treats the last buffer incorrectly?

----------------------------------------------------------------------

Comment By: sssketkar man! (sketkar)
Date: 2006-06-16 14:17

Message:
Logged In: YES 
user_id=944435

Thought it might be useful, here is the XML file.

<XRT><Well>test</Well><Run>500</Run><DownloadTimeELS>1150211188</DownloadTimeELS><DownloadTime>13-Jun-06
15:06:28</DownloadTime><Tools><Tool><ToolAddress>1</ToolAddress><ImageID>0</ImageID><Directory>A/Actual
Image/</Directory></Tool><Tool><ToolAddress>7</ToolAddress><ImageID>0</ImageID><Directory>L/Actual
Image/</Directory></Tool><Tool><ToolAddress>7</ToolAddress><ImageID>1</ImageID><Directory>D/Acoustic
Image/</Directory></Tool><Tool><ToolAddress>1</ToolAddress><ImageID>0</ImageID><Directory>R-5/Actual
Image/</Directory></Tool><Tool><ToolAddress>7</ToolAddress><ImageID>0</ImageID><Directory>F/Actual
Image/</Directory></Tool><Tool><ToolAddress>6</ToolAddress><ImageID>0</ImageID><Directory>P/Actual
Image/</Directory></Tool><Tool><ToolAddress>8</ToolAddress><ImageID>0</ImageID><Directory>P-F/Actual
Image/</Directory></Tool><Tool><ToolAddress>11</ToolAddress><ImageID>0</ImageID><Directory>P-M/Actual
Image/</Directory></Tool></Tools></XRT>

----------------------------------------------------------------------

Comment By: sssketkar man! (sketkar)
Date: 2006-06-16 14:09

Message:
Logged In: YES 
user_id=944435

Also, out of curiosity, is the CharacterHandler called on a
timed basis, i.e. time-sliced?  Maybe the last node directory
isn't getting buffered correctly because of a timing issue
between the StartElementHandler and EndElementHandler.

Satyajit

----------------------------------------------------------------------

Comment By: sssketkar man! (sketkar)
Date: 2006-06-16 14:07

Message:
Logged In: YES 
user_id=944435

Okay, I changed my approach, so that the StartElementHandler
sets a flag that is used by the CharacterHandler to collect
user text until the EndElementHandler is called, at which
point the buffered text is retrieved.  This approach seems
to work except for the last "iteration", i.e. if there are 5
nodes, the first 4 are parsed properly.  The last one is
still not getting buffered all the way or correctly.

Here is a debug output from each Handler...

START: tool
START: tooladdress
END: tooladdress
START: imageid
END: imageid
START: directory
END: directory
XXY-WER/Actual Image/          <-- correct 
END: tool
START: tool
START: tooladdress
END: tooladdress
START: imageid
END: imageid
START: directory
END: directory
XYZ-W3/Actualage//e/        <-- not so correct
END: tool

As you can see, the last tool node directory is incorrect; it
should have been XYZ-W3/Actual Image/

Thank you,

Satyajit

----------------------------------------------------------------------

Comment By: sssketkar man! (sketkar)
Date: 2006-06-15 18:42

Message:
Logged In: YES 
user_id=944435

Thank you very much.  I was expecting that the
CharacterHandler would collect all non-XML data in one shot.
I wasn't using the EndElement as a check.  I'll try that.

Satyajit


----------------------------------------------------------------------

Comment By: Karl Waclawek (kwaclaw)
Date: 2006-06-15 17:52

Message:
Logged In: YES 
user_id=290026

Do not expect Expat to return all character data within an
element in one call-back. You have to accumulate the text in
a buffer until the end-tag is reported. Are you doing this?

----------------------------------------------------------------------
