[Expat-discuss] Re: Feedback on pre-release snapshot

Karl Waclawek karl at waclawek.net
Sun Jan 30 02:12:36 CET 2005



Reid Spencer wrote:
> Attempt #2 .. looks like the list doesn't like signed messages ...
> 
> On Sat, 2005-01-29 at 00:45, Reid Spencer wrote:
> Hi,
> 
> I've been using expat for a while, but pretty silent about it. You asked
> for some feedback before the 2.0 release. In general, expat is working
> very well for me. I really only have a few small items to report:
> 
> 1. Documentation inconsistency (see patch). The "Overview" section
>    states that the XML_EndNamespaceDeclHandler callback is called
>    *before* the corresponding XML_EndElementHandler is called. This
>    doesn't make sense because the namespace can't go out of scope
>    before an end tag in that namespace is processed. Fortunately, the
>    "Reference" section correctly states the EndNamespaceDeclHandler
>    callback as coming after the EndElementHandler. The patch just
>    corrects the documentation.

Thanks for catching that.

> 2. I would prepare a patch for this, but it's somewhat controversial and
>    requires an API change. So, I thought it would be better to ask 
>    first. The issue has to do with XML_ParserCreate_MM and the 
>    XML_Memory_Handling_Suite structure. The memory handling callback 
>    functions, as defined in the structure, have no way of knowing for 
>    which parser (and thus its user data) memory is being allocated. This
>    makes it *really hard* to write a thread safe pooled memory parsing
>    system because the memory handling functions can't know which pool
>    of memory to act upon. In my particular case, I'm using the Apache
>    Portable Runtime's pooled memory handling functions but there's no
>    way to pass the pool to the expat memory handling functions. I would
>    like to see each of these functions take an additional argument which
>    is the user data of the parser into which the functions were
>    installed. This shouldn't be too hard to implement since all the
>    calls to the memory handling routines are done via macros in expat.

I think there have been similar patches and feature requests already.
We (or at least I) definitely want this issue resolved, but this will
have to wait until after Expat 2.0, as we really want to get a stable
release out that satisfies current expectations and is backwards compatible.

As post-2.0 releases will not be backwards compatible with 2.0, this will
open the door for a few changes and improvements.

> 3. There's no way to avoid copying data with memory mapped files. If you
>    use XML_Parse, it copies the provided input calling XML_GetBuffer
>    itself. Alternatively you can call XML_GetBuffer yourself and call
>    XML_ParseBuffer. That's fine for files read from a file descriptor. 
>    However, it's not much help with an mmap'd file where I have the
>    entire file in memory and I want expat to swallow it whole without
>    any copying at all. I haven't found a way to do this. Is there some
>    reason expat can't parse the provided buffer without copying it
>    first?

If the first buffer passed to XML_Parse() is also the final one,
Expat will not do any copying. If you pass multiple buffers
to Expat, and there are left-overs from a previous buffer, copying
becomes necessary.

> 4. It would be handy if the XML_ExternalEntityRefHandler callback was
>    passed the user data of the XML_Parser invoking it.

Have you looked at XML_SetExternalEntityRefHandlerArg()?

For Fred: I think it's time we start a wish-list of post-2.0
changes and new features.

Karl

