[XML-SIG] Parsing XML data from a stream where several XML elements follow?

Uche Ogbuji uche.ogbuji@fourthought.com
Wed, 18 Dec 2002 08:03:46 -0700


> On Wed, Dec 18, 2002 at 06:47:49AM -0700, Uche Ogbuji wrote:
> > > On Tue, Nov 26, 2002 at 04:25:38PM +0100,
> > >  Stephane Bortzmeyer <bortzmeyer@nic.fr> wrote 
> > The real solution is really to hack the code that does the buffered reads so 
> > that it returns as soon as it has exhausted the current octets on the channel 
> > and perhaps to change the parser so that it determines itself when it's done 
> > rather than having the calling code inform it.  This is no trivial solution  
> > :-(
> 
>   The XmlTextReader from C# has a call, ResetState(), specifically for
> the case (IMHO broken, but apparently some specifications define this)
> where multiple instances are packed onto a single stream. Only the user
> level can know that the parsing is at its end (i.e. that no more Misc*
> follows the root document tag) and has to tell the parser so.

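Yes.  As you say, and as I should have pointed out, this is really a 
fundamental problem of XML processing and not of the tools.  The XML grammar 
allows any number of comments, processing instructions and whitespace (Misc*) 
after the document element's closing tag, so a parser cannot tell from the 
input alone when the document is truly complete.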
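
To make the grammar point concrete, here is a small illustration in Python 
using the standard library's ElementTree (my choice for the example, nothing 
specific to the tools discussed in this thread).  Trailing Misc after the 
document element is legal, while a second document element is not, so the 
parser has to keep reading past the closing tag to find out which it has:

```python
import xml.etree.ElementTree as ET

# A comment after the document element is legal trailing Misc, so this parses.
root = ET.fromstring("<a/><!-- trailing comment -->")

# A second document element in the same input is not legal XML.
try:
    ET.fromstring("<a/><b/>")
    second_root_ok = True
except ET.ParseError:
    second_root_ok = False
```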

It would be like reading a stream from a socket and passing it to a Python 
interpreter for execution.  The interpreter doesn't know when it's done.  If 
you as a user know that the parse is done after the document element closing 
tag, then it is *your* job to tell the parser that, which is why I suggested 
the kludge of a separate shallow parse.
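
One possible reading of that shallow-parse idea, as a rough Python sketch 
rather than a definitive implementation: a regular expression tracks only 
element depth, cuts the buffered text wherever the depth returns to zero, and 
each piece is then handed to a real parser.  The names TOKEN and 
split_documents are made up for this example.  It assumes the input has no 
DTD internal subset and that the buffer does not end in the middle of a 
comment, CDATA section or processing instruction.  Any Misc that follows a 
closing tag ends up at the front of the next piece, which is exactly the 
ambiguity described above.

```python
import re
import xml.etree.ElementTree as ET

# Shallow tokens only: enough to count element depth, not to validate.
TOKEN = re.compile(r"""
      <!--.*?-->                          # comment
    | <!\[CDATA\[.*?\]\]>                 # CDATA section
    | <\?.*?\?>                           # processing instruction or XML declaration
    | <(?P<close>/)?(?P<name>[^\s<>/!?]+) # start of a start, end or empty tag
      (?:"[^"]*"|'[^']*'|[^<>"'])*?       # attributes; quoted values may hold '>'
      (?P<empty>/)?>                      # optional '/' of an empty-element tag
""", re.DOTALL | re.VERBOSE)


def split_documents(buffer):
    """Return (complete_documents, unconsumed_rest) for the buffered text."""
    docs = []
    depth = 0
    start = 0
    for m in TOKEN.finditer(buffer):
        if m.group("name") is None:
            continue                      # comment, CDATA or PI: depth unchanged
        if m.group("close"):
            depth -= 1
        elif not m.group("empty"):
            depth += 1
        if depth == 0:                    # document element has just closed
            docs.append(buffer[start:m.end()].lstrip())
            start = m.end()
    return docs, buffer[start:]


stream = ('<?xml version="1.0"?>\n<a x="1>0"><b/><!-- </a> --></a>\n'
          '<c/>\n<d><e>text</e>')
docs, rest = split_documents(stream)

# The caller, not the parser, has decided where each document ends.
tags = [ET.fromstring(doc).tag for doc in docs]
```

Here the third document is still incomplete, so it stays in rest until more 
octets arrive on the channel.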


-- 
Uche Ogbuji                                    Fourthought, Inc.
http://uche.ogbuji.net    http://4Suite.org    http://fourthought.com
A Python & XML Companion - http://www.xml.com/pub/a/2002/12/11/py-xml.html
XML class warfare - http://www.adtmag.com/article.asp?id=6965
MusicBrainz  metadata - http://www-106.ibm.com/developerworks/xml/library/x-think14.html