[Expat-discuss] How to fetch information for the position of each element in a file

Stefano Sabatini stefano.sabatini-lala at poste.it
Mon Jan 21 14:54:07 CET 2008


On date Monday 2008-01-21 12:31:33 +0100, Stefano Sabatini wrote:
> Hi all, this is my first post here.
> 
> I have an application which needs to parse an XML file, and I would
> like to print out the position of *each* element in the parsed file.
> 
> Actually I slightly hacked outline.c to this:

Sorry to reply to myself. The code below doesn't compile as posted, but
with the fixes marked inline it seems to do exactly what I wanted. I think
I was compiling and running a different program, which is why I was
getting wrong results.

> /*****************************************************************/
> #include <stdio.h>
> #include <stdlib.h>
> #include <expat.h>
> 
> #define BUFFSIZE        8192
> 
> char Buff[BUFFSIZE];
> 
> int Depth;
> 
> typedef struct UserData {
>     char *filename;
>     XML_Parser *p;
                 ^^
The field should be named parser, not p.

> } UserData;
> 
> /* start handler, called at the opening tag of each element */
> static void XMLCALL
> start(void *data, const char *el, const char **attr)
> {
>     int i;
>     UserData *user_data = (UserData *)data;
> 
>     /* indent according to the indentation depth */
>     for (i = 0; i < Depth; i++)
>         printf("  ");
> 
>     printf("%s:%d:%s", user_data->filename, XML_GetCurrentLineNumber(user_data->parser), el)
                                                                       ^^^^^^^^^^^^^^^^^^

The '*' dereference is missing here (user_data->parser is an XML_Parser *),
and so is the ';' at the end of the line.

>     for (i = 0; attr[i]; i += 2) {
>         printf(" %s='%s'", attr[i], attr[i + 1]);
>     }
> 
>     printf("\n");
>     Depth++;
> }
> 
> /* end handler */
> static void XMLCALL
> end(void *data, const char *el)
> {
>     Depth--;
> }
> 
> int main(int argc, char *argv[])
> {
>     UserData data;
> 
>     XML_Parser parser = XML_ParserCreate(NULL);
>     if (!parser) {
>         fprintf(stderr, "Couldn't allocate memory for parser\n");
>         exit(-1);
>     }
> 
>     /* set the start and end handler for each element of the document */
>     XML_SetElementHandler(parser, start, end);
> 
>     data.filename = "stdin";
>     data.parser = &parser;
> 
>     /* this sets the pointer to pass to the various handler function
>      * you need to fill accordingly this struct */
>     XML_SetUserData(parser, &data);
> 
>     for (;;) {
>         int done;
>         int len;
> 
>         len = fread(Buff, 1, BUFFSIZE, stdin);
>         if (ferror(stdin)) {
>             fprintf(stderr, "Read error\n");
>             exit(-1);
>         }
>         done = feof(stdin);
> 
>         if (XML_Parse(parser, Buff, len, done) == XML_STATUS_ERROR) {
>             fprintf(stderr, "Parse error at line %d:\n%s\n",
>                     XML_GetCurrentLineNumber(parser),
>                     XML_ErrorString(XML_GetErrorCode(parser)));
>             exit(-1);
>         }
> 
>         /* when it reads EOF then quit the loop */
>         if (done)
>             break;
>     }
>     return 0;
> }
> /*****************************************************************/
> 
> The start element handler accesses the parser struct and calls on it the
> XML_GetCurrentLineNumber function: 
> 
> printf("%s:%d:%s", user_data->filename, XML_GetCurrentLineNumber(user_data->parser), el)
> 
> Unfortunately this doesn't work, for example with this sample file:
> <sample>
> 
>   <foo>it is me, foo</foo>
>   <bar>it is you, bar</bar>
> 
> </sample>
> 
> I get this output:
> $ cat sample.xml | outline-passing-data
> stdin:1:sample
>   stdin:1:foo
>   stdin:1:bar
> 
> I wonder if there is some way to get the actual position for every
> parsed element, this seems a very reasonable request since such
> information could be used for example when performing the semantical
> analysis of the XML tree to print out where exactly happened a
> semantical error.
> 
> Any help will be highly appreciated.

With these simple corrections the program seems to behave correctly:
it prints the line on which each element begins, so for example I get:

$ cat sample.xml | outline2
stdin:1:sample
  stdin:3:foo
  stdin:4:bar

Sorry for the noise, regards.
-- 
Stefano Sabatini
Linux user number 337176 (see http://counter.li.org)

