[XML-SIG] Parsing XML data from a stream where several XML elements follow?
Uche Ogbuji
uche.ogbuji@fourthought.com
Wed, 18 Dec 2002 08:03:46 -0700
> On Wed, Dec 18, 2002 at 06:47:49AM -0700, Uche Ogbuji wrote:
> > > On Tue, Nov 26, 2002 at 04:25:38PM +0100,
> > > Stephane Bortzmeyer <bortzmeyer@nic.fr> wrote
> > The real solution is really to hack the code that does the buffered reads so
> > that it returns as soon as it has exhausted the current octets on the channel
> > and perhaps to change the parser so that it determines itself when it's done
> > rather than having the calling code inform it. This is no trivial solution
> > :-(
>
> The XmlTextReader in C# has a call, ResetState(), specifically
> for the case (IMHO broken, but apparently some specifications define this)
> where multiple instances are packed onto a single stream. Only the user
> level can know that the parsing is at its end (i.e. there is no (more) Misc*
> following the root document tag) and has to instruct the parser of this.
Yes. As you say, and as I should have pointed out, this is really a
fundamental problem of XML processing and not of the tools. Because the XML
grammar allows Misc* (comments, processing instructions and white space) after
the document element, a parser has no way to know when the input is truly
complete.
It would be like reading a stream from a socket and passing it to a Python
interpreter for execution. The interpreter doesn't know when it's done. If
you as a user know that the parse is done after the document element closing
tag, then it is *your* job to tell the parser that, which is why I suggested
the kludge of a separate shallow parse.
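For what it's worth, here is a minimal sketch of such a shallow parse in
Python. It assumes the stream uses an ASCII-compatible encoding such as UTF-8,
and it encodes the user-level decision that each document ends at the closing
tag of its document element (any trailing Misc* is left in the remainder). The
names split_first_document, _tag_end and _RootClosed are made up for
illustration; only xml.parsers.expat is a real module.

```python
import xml.parsers.expat


class _RootClosed(Exception):
    """Raised from the end-element handler to abort the shallow parse."""


def _tag_end(buf, start):
    """Return the index just past the '>' closing the tag at `start`,
    ignoring any '>' that sits inside a quoted attribute value."""
    quote = None
    for i in range(start, len(buf)):
        ch = buf[i:i + 1]
        if quote:
            if ch == quote:
                quote = None
        elif ch in (b'"', b"'"):
            quote = ch
        elif ch == b">":
            return i + 1
    raise ValueError("unterminated tag")


def split_first_document(buf):
    """Return (document, rest) once the document element has closed,
    or None if `buf` does not yet hold a complete document element."""
    depth = 0
    root_start = None
    end_index = None
    parser = xml.parsers.expat.ParserCreate()

    def start(name, attrs):
        nonlocal depth, root_start
        if depth == 0:
            root_start = parser.CurrentByteIndex
        depth += 1

    def end(name):
        nonlocal depth, end_index
        depth -= 1
        if depth == 0:
            # Our decision, not the parser's: stop at the root close tag.
            end_index = parser.CurrentByteIndex
            raise _RootClosed

    parser.StartElementHandler = start
    parser.EndElementHandler = end
    try:
        parser.Parse(buf, False)  # not final: more octets may still arrive
    except _RootClosed:
        if buf.startswith(b"</", end_index) or end_index == root_start:
            # Index points at the start of the closing (or empty) tag.
            cut = _tag_end(buf, end_index)
        else:
            # Empty-element root: index is already just past the tag.
            cut = end_index
        return buf[:cut], buf[cut:]
    return None
```

The calling code would append each read from the channel to a buffer and call
split_first_document() on it. None means keep reading; otherwise the first
item can go to the real parser as a complete document and the second becomes
the new buffer. Input that is not well-formed still raises ExpatError.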
--
Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://4Suite.org http://fourthought.com
A Python & XML Companion - http://www.xml.com/pub/a/2002/12/11/py-xml.html
XML class warfare - http://www.adtmag.com/article.asp?id=6965
MusicBrainz metadata - http://www-106.ibm.com/developerworks/xml/library/x-think14.html