[Expat-discuss] empty tags

Sylvain PRAT syprat@yahoo.fr
Tue, 14 Aug 2001 09:29:01 +0200 (CEST)


 --- "Fred L. Drake, Jr." <fdrake@acm.org> wrote:

> Sylvain PRAT writes:
>  > yes, but we should be aware of the charset...
> 
>   Yes, pretty flaky stuff.  Here's another approach: for every start
> tag, get the current source index, then for end tags, you know it was
> an empty tag if the position didn't change.  That can be optimized a
> little bit by maintaining a flag:
> 
> static XML_Parser parser;   /* assumed to be created elsewhere */
> static int  maybe_empty_element_tag;
> static long byte_index;
> 
> static void
> start(void *data, const char *el, const char **attr)
> {
>     /* Remember where this start tag began. */
>     maybe_empty_element_tag = 1;
>     byte_index = XML_GetCurrentByteIndex(parser);
> 
>     ...
> }
> 
> static void
> end(void *data, const char *el)
> {
>     if (maybe_empty_element_tag) {
>         maybe_empty_element_tag = 0;
>         /* Same position as the start event: nothing came between. */
>         if (byte_index == XML_GetCurrentByteIndex(parser)) {
>             /* empty-element tag */
>             return;
>         }
>     }
>     ...
> }
> 
>   All other handlers could optionally clear maybe_empty_element_tag to
> avoid the call back into expat for elements like
> <e>characters only</e>.  Whether that would be a win depends on how
> isolated you want this aspect of the processing, or if the performance
> improvement (very small) is worth the maintenance cost.
> 
>  > I think the parser is aware of the empty tag, so why couldn't this
>  > be a feature (as reading the input context is one too), especially
>  > because start tags are possibly erased before reading the end
>  > tags...
> 
>   The parser has to be aware of it, but I'm not sure what would be the
> best way to expose it.  Perhaps something similar to the
> XML_GetSpecifiedAttributeCount() function, which is only valid during
> the start-element callback?  That's certainly possible, but would
> incur higher overhead than the approach outlined here (because it
> would always result in a function call).  Providing that would not
> invalidate this approach, so they could coexist.
> 

In the end, there is a better solution to this: XML_GetCurrentByteCount()
returns 0 in the end-element handler when the element was written as an
empty tag. It does cost a function call, but I don't mind.

Sylvain  
