[Expat-discuss] How to fetch information for the position of each element in a file
Stefano Sabatini
stefano.sabatini-lala at poste.it
Mon Jan 21 14:54:07 CET 2008
On date Monday 2008-01-21 12:31:33 +0100, Stefano Sabatini wrote:
> Hi all, this is my first post here.
>
> I have an application which needs to parse an XML file, and I would
> like to print out the position of *each* element in the parsed file.
>
> Actually I slightly hacked outline.c to this:
Sorry to reply to myself: the code below doesn't compile, but once
properly fixed it seems to do just what I wanted. I think I was
compiling and running another program, so I was getting wrong results.
> /*****************************************************************/
> #include <stdio.h>
> #include <expat.h>
>
> #define BUFFSIZE 8192
>
> char Buff[BUFFSIZE];
>
> int Depth;
>
> typedef struct UserData {
>   char *filename;
>   XML_Parser *p;
Correction: this field should be named "parser", not "p".
> } UserData;
>
> /* start handler, called for the opening tag of each element */
> static void XMLCALL
> start(void *data, const char *el, const char **attr)
> {
>   int i;
>   UserData *user_data = (UserData *)data;
>
>   /* indent according to the current nesting depth */
>   for (i = 0; i < Depth; i++)
>     printf(" ");
>
>   printf("%s:%d:%s", user_data->filename, XML_GetCurrentLineNumber(user_data->parser), el)
Correction: user_data->parser must be dereferenced in the call above
(*user_data->parser), and the line is also missing its final ';'.
>   for (i = 0; attr[i]; i += 2) {
>     printf(" %s='%s'", attr[i], attr[i + 1]);
>   }
>
>   printf("\n");
>   Depth++;
> }
>
> /* end handler */
> static void XMLCALL
> end(void *data, const char *el)
> {
>   Depth--;
> }
>
> int main(int argc, char *argv[])
> {
>   UserData data;
>
>   XML_Parser parser = XML_ParserCreate(NULL);
>   if (!parser) {
>     fprintf(stderr, "Couldn't allocate memory for parser\n");
>     exit(-1);
>   }
>
>   /* set the start and end handler for each element of the document */
>   XML_SetElementHandler(parser, start, end);
>
>   data.filename = "stdin";
>   data.parser = &parser;
>
>   /* set the pointer that is passed to every handler as its first
>    * argument; fill in this struct as needed */
>   XML_SetUserData(parser, &data);
>
>   for (;;) {
>     int done;
>     int len;
>
>     len = fread(Buff, 1, BUFFSIZE, stdin);
>     if (ferror(stdin)) {
>       fprintf(stderr, "Read error\n");
>       exit(-1);
>     }
>     done = feof(stdin);
>
>     if (XML_Parse(parser, Buff, len, done) == XML_STATUS_ERROR) {
>       fprintf(stderr, "Parse error at line %d:\n%s\n",
>               XML_GetCurrentLineNumber(parser),
>               XML_ErrorString(XML_GetErrorCode(parser)));
>       exit(-1);
>     }
>
>     /* quit the loop once EOF has been read */
>     if (done)
>       break;
>   }
>   return 0;
> }
> /*****************************************************************/
>
> The start element handler accesses the parser struct and calls on it the
> XML_GetCurrentLineNumber function:
>
> printf("%s:%d:%s", user_data->filename, XML_GetCurrentLineNumber(user_data->parser), el)
>
> Unfortunately this doesn't work, for example with this sample file:
> <sample>
>
> <foo>it is me, foo</foo>
> <bar>it is you, bar</bar>
>
> </sample>
>
> I get this output:
> $ cat sample.xml | outline-passing-data
> stdin:1:sample
> stdin:1:foo
> stdin:1:bar
>
> I wonder if there is some way to get the actual position of every
> parsed element. This seems a very reasonable request, since such
> information could be used, for example, during the semantic
> analysis of the XML tree to report exactly where a semantic error
> occurred.
>
> Any help will be highly appreciated.
With these simple corrections the program seems to behave correctly,
that is, it prints the correct line number for the beginning of each
element. For example, I get:
$ cat sample.xml | outline2
stdin:1:sample
stdin:3:foo
stdin:4:bar
Sorry for the noise, regards.
--
Stefano Sabatini
Linux user number 337176 (see http://counter.li.org)