[XML-SIG] SAX prettyprinter V2 and SGMLOP

Lars Marius Garshol larsga@ifi.uio.no
24 Jan 1999 16:00:32 +0100


* Christian Tismer
|
| [ignorableWhitespace] 
|
| Well, I understand. Lars also mentioned that without a DTD and a
| parser which understands it, this event is useless.

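Not useless, just impossible to fire: without a DTD the parser has no
way to tell ignorable whitespace apart from ordinary character data,
so everything arrives as characters events.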
 
* Fredrik Lundh
|
| Our internal xml libraries allow the user to indicate whether a
| resource is "xml text" or "xml data".  the latter doesn't allow
| elements to contain both text and other elements, which means that
| it's easy to figure out what to ignore.

This sounds like a good approach to me. The XML recommendation
(sensibly) requires parsers to report all whitespace to the
application, but nothing stops an application-specific layer on top
of the parser from deciding what to ignore.
 
* Christian Tismer
|
| That sounds good, this is exactly what we need to distinguish,
| too. How do you indicate this without a DTD?  A list of tags which
| are treated as raw data? (kind of a sub-sub-DTD?)

Why not make a simple SAX parser filter that reads in such a list of
element type names and then splits characters events into characters
and ignorableWhitespace events, possibly also doing whitespace
normalization?

Sounds like something that is both simple to develop and eminently
reusable. 
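
A rough sketch of such a filter, to make the idea concrete. It assumes
the XMLFilterBase class from xml.sax.saxutils in the Python standard
library; the names WhitespaceFilter, Recorder and classify are made up
for this example, and whitespace normalization is left out. Inside any
element whose name is on the list, whitespace-only character data is
rerouted to ignorableWhitespace; everything else passes through
unchanged.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.saxutils import XMLFilterBase

XML_WHITESPACE = " \t\r\n"


class WhitespaceFilter(XMLFilterBase):
    """Hypothetical filter: reroute whitespace-only text inside the
    listed element types to ignorableWhitespace events."""

    def __init__(self, data_elements, parent=None):
        super().__init__(parent)
        self._data_elements = frozenset(data_elements)
        self._stack = []  # names of the currently open elements

    def startElement(self, name, attrs):
        self._stack.append(name)
        super().startElement(name, attrs)

    def endElement(self, name):
        self._stack.pop()
        super().endElement(name)

    def characters(self, content):
        # Only the innermost open element decides how text is classified.
        in_data = bool(self._stack) and self._stack[-1] in self._data_elements
        if in_data and not content.strip(XML_WHITESPACE):
            super().ignorableWhitespace(content)
        else:
            super().characters(content)


class Recorder(ContentHandler):
    """Hypothetical handler that just records the text events it sees."""

    def __init__(self):
        super().__init__()
        self.events = []

    def characters(self, content):
        self.events.append(("characters", content))

    def ignorableWhitespace(self, whitespace):
        self.events.append(("ignorableWhitespace", whitespace))


def classify(xml_text, data_elements):
    """Parse xml_text through the filter and return the recorded events."""
    recorder = Recorder()
    reader = WhitespaceFilter(data_elements, xml.sax.make_parser())
    reader.setContentHandler(recorder)
    reader.parse(io.BytesIO(xml_text.encode("utf-8")))
    return recorder.events
```

Calling classify("<rec>\n  <name>Lars M.</name>\n</rec>", {"rec"})
should report the indentation inside rec as ignorableWhitespace and
the text inside name as characters. The parser is free to deliver
text in several chunks, so an application should not rely on getting
exactly one event per run of text.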

--Lars M.