[XML-SIG] dom building, sax, and namespaces

M.-A. Lemburg mal@lemburg.com
Fri, 25 Jan 2002 14:51:09 +0100

Daniel Veillard wrote:
> On Fri, Jan 25, 2002 at 06:25:52AM -0700, Andrew Dalke wrote:
> > Daniel Veillard:
> > >   Considering that the XML spec very clearly says that an
> > >XML parser MUST stop delivering content as soon as a well
> > >formedness error is found in a document, and that the
> > >probability of corruption creeping into a very large file
> > >becomes non-negligible, using XML to store huge amounts of
> > >data in a single instance is IMHO brain-dead.
> >
> > The data files are themselves machine generated.  The
> > parser I have that converts the flat-file to a marked-up
> > version also does verification of the format.  So it meets
> > the XML criterion.
> >
> > What I'm providing is a migration mechanism for people to
> > keep their existing practice (large flat files) but start
> > taking advantage of the benefits of newer technologies.  Yes,
> > there's a chance the input file is corrupt.  I can detect
> > that.  But it's at least better than what people do now,
> > where there is little detection of corruption.
>   Okay, still one really wonders if XML is really the
> right serialization format for such data. Clearly both
> the XPath data model and the DOM one for such a document
> will grow huge unless you manage to use a database to
> generate the needed parts on the fly.
>   It's still not clear to me that XML/XSLT are really the
> right tools for this kind of work, you really have to push the
> envelope (like restricting yourself to a streamable subset of
> XSLT).

If you don't trust flat-file XML databases, why not use an
XML repository? These are built to handle huge amounts of
data and to protect it against corruption; you can also access
the stored data in various ways, reducing the in-memory
overhead to an absolute minimum.

XML as a technique is still the right choice, though, since
it provides the necessary flexibility to handle changing
input data formats and data layout requirements.
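
As a rough illustration of the two points discussed in this thread
(keeping in-memory overhead low and noticing a corrupt input file),
here is a minimal Python sketch using the standard library's
xml.etree.ElementTree.iterparse. It is only a streaming read of a
flat file, not an XML repository. The element name "record" and the
function count_records are made up for the example.

```python
import io
import xml.etree.ElementTree as ET


def count_records(stream, tag="record"):
    """Stream through an XML document and count <tag> elements.

    Returns (count, error), where error is None for a well-formed
    document or the parser's message if parsing stopped early.
    The tag name "record" is a hypothetical example, not taken
    from any real data format.
    """
    count = 0
    try:
        # Only "end" events are requested, so each element is
        # complete when we see it.
        for _event, elem in ET.iterparse(stream, events=("end",)):
            if elem.tag == tag:
                count += 1
                # Release the element's text and children so that
                # processed records do not pile up in memory.
                elem.clear()
    except ET.ParseError as exc:
        # A well-formedness error ends parsing; report it rather
        # than silently returning partial data.
        return count, str(exc)
    return count, None


# Small in-memory stand-ins for a large file on disk.
good = io.BytesIO(b"<db><record>a</record><record>b</record></db>")
bad = io.BytesIO(b"<db><record>a</record><record>b</db>")

good_result = count_records(good)
bad_result = count_records(bad)
```

Note that the cleared elements are still referenced by the root
element, so for truly huge inputs one would also want to detach them
from their parent; the sketch leaves that out for brevity.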

Marc-Andre Lemburg
CEO eGenix.com Software GmbH
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/