[Expat-discuss] expat parsing destructively?

Nick MacDonald nickmacd at gmail.com
Sun Oct 28 00:01:02 CEST 2007


Well, it may not be exactly what you want, but I think there is a
really easy way to get a similar result:

Simply keep two copies of the same data, i.e. read the data into one
buffer, malloc() another buffer of the same size, and do one big
memcpy() to make a duplicate.  Now let Expat scan one copy.  Because
both buffers have the same layout, a byte offset into one is also
valid in the other, so with some quick and dirty pointer math you can
find the corresponding locations in the "non-Expat" buffer and
destroy it to your heart's content.  When you're all done, simply
free both buffers.  It's a little more memory intensive, but it sounds
fairly elegant to me, and you don't need to worry about doing
anything unsupported by Expat.
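
Here is a rough sketch of what I mean, not tested against a real
parser.  As far as I know Expat hands your handlers its own copies of
the attribute strings rather than pointers into your buffer, so the
"pointer math" has to go through byte offsets.  I'm assuming you would
call XML_GetCurrentByteIndex() and XML_GetCurrentByteCount() inside
your start-element handler to get the offset and length of the tag,
then pass those to the helper below.  The names dup_buffer() and
claim_attr() are made up for this example; they are not part of Expat.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Duplicate the document so Expat can parse one copy while we scribble
 * on the other.  Returns NULL on allocation failure; caller frees. */
char *dup_buffer(const char *doc, size_t len)
{
    char *copy = malloc(len + 1);
    if (copy != NULL) {
        memcpy(copy, doc, len);
        copy[len] = '\0';
    }
    return copy;
}

/* Hypothetical helper.  The start tag occupies
 * copy[tag_off .. tag_off + tag_len); those two numbers are assumed to
 * come from XML_GetCurrentByteIndex() / XML_GetCurrentByteCount().
 * Finds attribute `name`, overwrites the closing quote of its value
 * with NUL, and returns a pointer to the value inside `copy`.
 * Returns NULL if the attribute is not present.  The value is the raw
 * text: entity references such as &amp; are NOT decoded. */
char *claim_attr(char *copy, size_t tag_off, size_t tag_len,
                 const char *name)
{
    assert(copy != NULL && name != NULL);
    char *p = copy + tag_off;
    char *end = p + tag_len;
    size_t nlen = strlen(name);

    /* Skip '<' and the element name. */
    while (p < end && !isspace((unsigned char)*p))
        p++;

    while (p < end) {
        while (p < end && isspace((unsigned char)*p))
            p++;
        char *attr = p;
        while (p < end && *p != '=' && *p != '>' && *p != '/' &&
               !isspace((unsigned char)*p))
            p++;
        size_t alen = (size_t)(p - attr);
        if (alen == 0)
            break;                      /* reached '>' or '/>' */

        while (p < end && isspace((unsigned char)*p))
            p++;
        if (p >= end || *p != '=')
            break;
        p++;
        while (p < end && isspace((unsigned char)*p))
            p++;
        if (p >= end || (*p != '"' && *p != '\''))
            break;

        char quote = *p++;
        char *value = p;
        /* A NUL here means we already claimed this value earlier. */
        while (p < end && *p != quote && *p != '\0')
            p++;
        if (p >= end)
            break;

        if (alen == nlen && memcmp(attr, name, nlen) == 0) {
            *p = '\0';                  /* replace the trailing quote */
            return value;
        }
        p++;                            /* step past the terminator */
    }
    return NULL;
}
```

The pointers you get back live inside the scratch copy, so they can go
straight into the hash table, and a single free() of that copy
releases all of them at once.  The copy Expat parses is never
modified.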

Good luck,
  Nick


On 10/26/07, Mohun Biswas <m_biswas at mailinator.com> wrote:
> I want to do something with expat which seems elegant to me but I need
> to find out if it makes sense, if there's a way to do it, and if anyone
> has prior experience or example code.
>
> Imagine I have a complete XML document in a memory buffer and pass the
> starting address of the buffer to expat for parsing. Some of the
> attributes will need to be added into a hash table or similar in-memory
> data structure. In the way I've seen expat used that would mean calling
> malloc/strdup on the values passed into the callback, putting them in
> the table, then arranging to free them later. But, given that the entire
> document is already stored in a malloc-ed buffer for which I have no
> further use once parsed, I've been thinking how great it would be if
> expat would work "destructively", which in this case means writing a
> null byte at the back of each attribute value. This would be safe as it
> would always replace the trailing quote.
>
> If it would do that I could just put the data in the table as is and
> cleanup would be as simple as freeing the buffer. No debugging of memory
> leaks or bad frees, no heap fragmentation, simpler and faster code. For
> all I know expat already works like this but I can't find any
> documentation which discusses it. Does anyone know or have a pointer to
> related documentation?

