* Christian Tismer
| [ignorableWhitespace] 
| Well, I understand. Lars also mentioned that without a DTD and a
| parser which understands it, this event is useless.

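Not useless, just impossible to fire: without a DTD the parser cannot
tell ignorable whitespace apart from ordinary character data, so all
of it has to be reported through the characters event.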
* Fredrik Lundh
| Our internal xml libraries allow the user to indicate whether a
| resource is "xml text" or "xml data".  the latter doesn't allow
| elements to contain both text and other elements, which means that
| it's easy to figure out what to ignore.

This sounds like a good approach to me. The XML recommendation
(sensibly) requires parsers to report all whitespace to the
application, but nothing stops an application-specific layer on top
of the parser from deciding which of that whitespace to ignore.
* Christian Tismer
| That sounds good; this is exactly what we need to distinguish,
| too. How do you indicate this without a DTD?  A list of tags which
| are treated as raw data? (kind of a sub-sub-DTD?)

Why not make a simple SAX parser filter that reads in such a list of
element type names and then filters characters events into characters
and ignorableWhitespace, possibly also doing whitespace normalization?
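
To make the idea concrete, here is a minimal sketch of such a filter,
assuming Python's xml.sax package and its XMLFilterBase class. The
names WhitespaceFilter, Recorder, classify and data_elements are made
up for illustration, and the sketch does no whitespace normalization.
It treats whitespace-only character data directly inside a listed
element as ignorable and passes everything else through unchanged.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.saxutils import XMLFilterBase

XML_WHITESPACE = " \t\r\n"  # the four whitespace characters XML defines


class WhitespaceFilter(XMLFilterBase):
    """Hypothetical filter: reroutes whitespace-only character data found
    directly inside a listed element to ignorableWhitespace."""

    def __init__(self, parent, data_elements):
        super().__init__(parent)
        self._data_elements = set(data_elements)
        self._stack = []  # names of the currently open elements

    def startElement(self, name, attrs):
        self._stack.append(name)
        super().startElement(name, attrs)

    def endElement(self, name):
        self._stack.pop()
        super().endElement(name)

    def characters(self, content):
        in_data = bool(self._stack) and self._stack[-1] in self._data_elements
        if in_data and not content.strip(XML_WHITESPACE):
            # Whitespace between child elements of a "data" element.
            super().ignorableWhitespace(content)
        else:
            super().characters(content)


class Recorder(ContentHandler):
    """Collects the two kinds of text events so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.events = []

    def characters(self, content):
        self.events.append(("characters", content))

    def ignorableWhitespace(self, whitespace):
        self.events.append(("ignorableWhitespace", whitespace))


def classify(xml_text, data_elements):
    """Parse xml_text through the filter and return the recorded events."""
    recorder = Recorder()
    reader = WhitespaceFilter(xml.sax.make_parser(), data_elements)
    reader.setContentHandler(recorder)
    reader.parse(io.BytesIO(xml_text.encode("utf-8")))
    return recorder.events


if __name__ == "__main__":
    for event in classify("<list>\n  <item>a b</item>\n</list>", ["list"]):
        print(event)
```

The parser is free to deliver character data in several chunks, so the
filter classifies each chunk on its own. That is safe here because a
listed element is assumed never to contain real text of its own.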

Sounds like something that is both simple to develop and eminently

