nonblocking read of one xml element?

Paul Boddie paul at boddie.org.uk
Tue Feb 5 19:38:25 EST 2008


On 5 Feb, 07:09, m... at pixar.com wrote:
> So, I'm parsing a log file that's being written out in
> real time.
>
> <logfile>
> <entry><timestamp>123</timestamp><details>foo</details>
> </entry>
> <entry><timestamp>456</timestamp><details>bar</details>
> </entry>
>              <--- no </logfile>, coz the file hasn't yet been closed

This kind of "incomplete" XML (strictly speaking, it is not well-formed
until the end tag arrives) is reminiscent of XMPP [1,2], where a
connection is opened with a start tag and closed with an end tag.

> This is part of an event loop, so I want to have some code
> that looks like this:
>
>     when logfile is readable:
>         read one <entry> node, including children
>         but don't try to read past </entry>, so that
>         the read won't block.

I attempt to do this with the XMPP support in libxml2dom [3], although
that work is by no means complete. Generally, I assume that each
"stanza" (similar to an entry here, I think) is complete and can be
read, although the technique I use is dubious: I treat each one as a
separate document.
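
As a rough illustration of the one-document-per-entry idea, here is a
minimal sketch using the standard library's xml.etree.ElementTree rather
than libxml2dom. The name extract_entries and the plain string search
for "</entry>" are assumptions made for this example only. The search
would be fooled by "</entry>" appearing inside a comment or CDATA
section, which is one way such a technique can go wrong.

```python
import xml.etree.ElementTree as ET

END_TAG = "</entry>"  # assumed: entries are never nested inside entries


def extract_entries(buffer):
    """Parse every complete <entry> found in buffer as its own document.

    Returns (entries, remainder), where remainder is the unconsumed tail
    that should be prepended to the next chunk read from the file.
    """
    entries = []
    while True:
        end = buffer.find(END_TAG)
        if end == -1:
            break  # no complete entry left; wait for more data
        end += len(END_TAG)
        chunk, buffer = buffer[:end], buffer[end:]
        start = chunk.find("<entry")
        if start == -1:
            continue  # stray end tag; skip it
        # Everything before <entry (e.g. the <logfile> start tag) is dropped.
        entries.append(ET.fromstring(chunk[start:]))
    return entries, buffer
```

In an event loop, one would read whatever is available without blocking,
append it to the remainder from the previous call, and pass the result
to extract_entries again.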

I imagine that the designers of XMPP intended that you connect up an
event-driven parser to the incoming stream and connect the event
handlers to various pieces of logic, with the initial start tag
causing a client to become active and the final end tag causing the
client to sleep.
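
A minimal sketch of that event-driven arrangement, applied to the log
file from the question, could use the incremental feed() method of the
standard library's xml.sax parser. The class name EntryHandler, the
on_entry callback and the helper make_entry_parser are hypothetical
names for this example, and the handler assumes that each entry contains
only simple text children such as timestamp and details.

```python
import xml.sax


class EntryHandler(xml.sax.ContentHandler):
    """Collect the text children of each <entry> and report it when closed."""

    def __init__(self, on_entry):
        super().__init__()
        self.on_entry = on_entry  # called with a dict for each finished entry
        self.entry = None         # dict being filled in, or None outside an entry
        self.field = None         # name of the child element being read
        self.text = []

    def startElement(self, name, attrs):
        if name == "entry":
            self.entry = {}
        elif self.entry is not None:
            self.field = name
            self.text = []

    def characters(self, content):
        if self.field is not None:
            self.text.append(content)

    def endElement(self, name):
        if name == "entry":
            self.on_entry(self.entry)
            self.entry = None
        elif name == self.field:
            self.entry[name] = "".join(self.text)
            self.field = None


def make_entry_parser(on_entry):
    """Return an incremental parser; call its feed() with each chunk read."""
    parser = xml.sax.make_parser()
    parser.setContentHandler(EntryHandler(on_entry))
    return parser
```

Each time the file becomes readable, the available data is passed to
parser.feed(). The parser keeps any partial element until more data
arrives, and close() is never called because the final end tag has not
been written yet.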

Paul

[1] http://www.xmpp.org/rfcs/rfc3920.html
[2] http://www.xmpp.org/rfcs/rfc3921.html
[3] http://www.python.org/pypi/libxml2dom


