[Expat-discuss] Pull API Status?

Tue Mar 11 09:40:25 EST 2003

> I was browsing the archives and came across a proposal to implement a pull
> based API on top of expat's XML_Parse+callbacks API.  This is something I
> could use for a current project at work, and I'd be willing to tackle it if
> no one is actively working on this.  

No one is currently working on it, as we plan to release Expat 2.0 first.

> API Option 1 in this message
> http://mail.libexpat.org/pipermail/expat-discuss/2002-August/000602.html
> meets my needs, but I wanted to make sure this was still considered to be a
> viable option.

It is not what I have in mind, but it could serve as an interim Pull solution
until we go for a proper implementation. If we can hide the implementation
details and make the API itself robust, then a later change in implementation
might not even break existing code - in theory, at least ;-).

> While I agree with a later message in the thread that a Pull API built
> directly on top of the xmltok tokenizer would be cleaner and more efficient,
> it's not something I personally feel up to tackling at this point due to
> time constraints on my work project.

Yes, that is always our problem!!!

Btw, here is how I see the "proper" Pull implementation
(assuming an API has been established; details may change):

- Expat is already Pull based internally. So a lot of code can be re-used.
  One does not need to completely re-implement the layer on top of xmltok.
- The main things to change are:
  - instead of "pushing" buffers (with XML_ParseBuffer), have the main
    parsing loop pull buffers with an XML_GetNextBuffer callback.
  - add return codes to all the callbacks (like XML_SKIP, XML_USE, XML_ERROR, ...)
  - supply internal callbacks which perform the Pull-API-specific
    data preparation and also do any required filtering
- The Next() function would simply call the main parsing loop which returns
  when an (internal) callback returns XML_USE. The data to be reported would
  be stored in some fields in the Parser structure.
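
To make the control flow above concrete, here is a minimal toy sketch. It
does not use Expat at all, and every name in it (ToyParser, ToyNext,
ToyGetNextBuffer, TOY_SKIP, TOY_USE, TOY_ERROR, ToyChunks, toyChunkSource)
is hypothetical. The "tokenizer" is an assumption made only for illustration:
each input character is one token, '<' and '>' stand in for start and end
events, spaces are filtered out, and '!' is treated as an error. The point is
only the shape of the loop: it pulls buffers through a callback and returns
to the caller when an internal callback says the event should be used.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical return codes for internal callbacks (not part of Expat). */
typedef enum { TOY_SKIP, TOY_USE, TOY_ERROR } ToyStatus;

typedef enum {
    TOY_EV_NONE, TOY_EV_START, TOY_EV_END, TOY_EV_TEXT, TOY_EV_EOF, TOY_EV_ERROR
} ToyEvent;

/* Hypothetical buffer source: returns the next chunk and its length,
   or NULL once the input is exhausted. */
typedef const char *(*ToyGetNextBuffer)(void *userData, size_t *len);

typedef struct {
    ToyGetNextBuffer getNext;  /* pulls input instead of having it pushed */
    void *userData;
    const char *buf;           /* current chunk */
    size_t len, pos;
    ToyEvent event;            /* data reported by the last ToyNext() call */
    char ch;
} ToyParser;

/* Internal callback: prepares the reported data and does the filtering. */
static ToyStatus onToken(ToyParser *p, char c) {
    if (c == ' ') return TOY_SKIP;   /* filtered out, never reported */
    if (c == '!') return TOY_ERROR;  /* toy stand-in for a parse error */
    p->event = (c == '<') ? TOY_EV_START
             : (c == '>') ? TOY_EV_END
             : TOY_EV_TEXT;
    p->ch = c;
    return TOY_USE;
}

/* Runs the main loop until one internal callback returns TOY_USE. */
ToyEvent ToyNext(ToyParser *p) {
    assert(p != NULL && p->getNext != NULL);
    for (;;) {
        if (p->pos == p->len) {      /* chunk exhausted: pull another one */
            p->buf = p->getNext(p->userData, &p->len);
            p->pos = 0;
            if (p->buf == NULL) {
                p->len = 0;
                return p->event = TOY_EV_EOF;
            }
            continue;                /* also handles empty chunks */
        }
        switch (onToken(p, p->buf[p->pos++])) {
        case TOY_USE:   return p->event;
        case TOY_ERROR: return p->event = TOY_EV_ERROR;
        case TOY_SKIP:  break;       /* keep parsing */
        }
    }
}

/* Example buffer source that hands out an array of strings in order. */
typedef struct { const char **chunks; size_t count, next; } ToyChunks;

const char *toyChunkSource(void *userData, size_t *len) {
    ToyChunks *s = userData;
    if (s->next >= s->count) { *len = 0; return NULL; }
    const char *c = s->chunks[s->next++];
    *len = strlen(c);
    return c;
}
```

Fed the chunks "< a", "" and " b>", repeated calls to ToyNext() report a
start event, the text tokens 'a' and 'b', an end event, and then end of
input. The caller never sees chunk boundaries or the skipped spaces.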

In addition, we would also want to improve the API with regard to
complete entity reporting (currently subject to the same restrictions as
SAX2) and namespace reporting (it seems better to return names as separate
localName, prefix and uri parameters).
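
For comparison, a namespace-aware Expat parser currently reports each name
as a single string: the URI and the local name joined by the separator
character passed to XML_ParserCreateNS, with the prefix appended after a
second separator when XML_SetReturnNSTriplet is enabled. The sketch below
splits such a string into the three parts. SplitName and splitNsName are
hypothetical helpers, not Expat functions, and the sketch assumes a non-NUL
separator that does not occur in the URI.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type: each part points into the original string
   (or at "") and carries its own length, so nothing is copied. */
typedef struct {
    const char *uri;       size_t uriLen;
    const char *localName; size_t localLen;
    const char *prefix;    size_t prefixLen;
} SplitName;

/* Splits "uri<sep>local" or "uri<sep>local<sep>prefix"; a name without
   any separator is treated as a plain local name with no namespace. */
void splitNsName(const char *name, char sep, SplitName *out) {
    assert(name != NULL && out != NULL && sep != '\0');
    memset(out, 0, sizeof *out);
    out->uri = out->prefix = "";

    const char *s1 = strchr(name, sep);
    if (s1 == NULL) {                    /* no namespace at all */
        out->localName = name;
        out->localLen = strlen(name);
        return;
    }
    out->uri = name;
    out->uriLen = (size_t)(s1 - name);
    out->localName = s1 + 1;

    const char *s2 = strchr(s1 + 1, sep);
    if (s2 == NULL) {                    /* uri + local name, no prefix */
        out->localLen = strlen(s1 + 1);
    } else {                             /* full triplet */
        out->localLen = (size_t)(s2 - (s1 + 1));
        out->prefix = s2 + 1;
        out->prefixLen = strlen(s2 + 1);
    }
}
```

A Pull API that returned the three parts separately would save every
caller from writing a helper like this.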

And, of course, it should still be possible to use Expat in Push mode.

> After I hear back on the status, I have a couple of specific questions for
> how people would like to handle character text nodes in a pull based API.

I think if we want your API to be reusable, we should put a lot of thought into it.

Karl