[XML-SIG] SAX prettyprinter V2 and SGMLOP

Christian Tismer tismer@appliedbiometrics.com
Sun, 24 Jan 1999 19:12:06 +0100

Lars Marius Garshol wrote:
> * Christian Tismer
> |
> | Playing a little more with sgmlop, I realized that it doesn't
> | resolve entities when run under SAX.
> |
> | [...]  Should I try to add this, or forget about SAX and use sgmlop
> | directly?
> If it's possible, I'd very much like either you or me to add it to the
> driver. As far as I can see one must set a handle_entity handler that
> does this somehow. Don't know the exact details, though.

Fredrik handled this differently: he has an extra mode for SAX
in which he does not use his callback for entities. I have no
idea why; I must wait for his answer.

> | If no entityresolver is defined, shouldn't the standard entities
> | < > & be resolved internally?
> Yes. This is part of the XML recommendation. However, EntityResolver
> is only used for external entities, not internal ones.

Aha! And sgmlop didn't do this, so that's the reason why I got
&amp;lt; in my attributes, which contained "<" encoded as &lt;.

So this is funny: if all I do is some reformatting and juggling,
the process is this: the parser gives me characters, tags,
entities and whatnot, strips the encodings off, and I have to
put them back in. What a mess.
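
A minimal sketch of that round trip, for illustration only. It uses
the xml.sax and xml.sax.saxutils modules from the current Python
standard library, not the sgmlop driver discussed here; the Echo
class and the roundtrip helper are hypothetical names. The parser
hands over decoded text, so the writer has to escape it again, and
escaping text that was never decoded yields the doubled &amp;lt;
described above.

```python
import xml.sax
from xml.sax.saxutils import escape, quoteattr


class Echo(xml.sax.ContentHandler):
    """Hypothetical handler: re-serialize SAX events as markup."""

    def __init__(self):
        super().__init__()
        self.out = []

    def startElement(self, name, attrs):
        # Attribute values arrive decoded; quoteattr() re-escapes and quotes them.
        parts = [name] + ["%s=%s" % (k, quoteattr(v)) for k, v in attrs.items()]
        self.out.append("<%s>" % " ".join(parts))

    def endElement(self, name):
        self.out.append("</%s>" % name)

    def characters(self, content):
        # Character data arrives decoded too; escape() restores & < >.
        self.out.append(escape(content))


def roundtrip(xml_text):
    """Parse xml_text and write it back out with entities re-inserted."""
    handler = Echo()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return "".join(handler.out)


# If the parser had NOT resolved the entity, escaping would double it:
double_escaped = escape("&lt;")
```

Calling roundtrip() on a small document with escaped content should
give the same markup back, which is all a prettyprinter wants, at the
cost of decoding and re-encoding every piece of text.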

It appears to me that XML parsers are already doing quite a lot,
even in cases where I don't need it. In my case, I would have
been comfortable with a kind of XML scanner that just recognizes
tokens, makes no attempt to resolve anything or to parse and
reorder the parameters (which is OK, but I hate it), and
gives the plain text to me.
From that point of view, my basic parser building block
would be something that correctly recognizes tags and doesn't
change anything, but just gives me indices into the text.
Marc Lemburg's tagging engine springs to mind...
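
To make the idea concrete, here is a rough sketch of such a scanner
in plain Python with the re module. It is not Marc Lemburg's tagging
engine and not sgmlop; the token names and the scan function are
assumptions made for illustration. It is also deliberately naive: a
">" inside a quoted attribute value, CDATA sections and DOCTYPE
declarations are not handled.

```python
import re

# One alternative per token kind; nothing is decoded or reordered.
TOKEN = re.compile(
    "|".join([
        r"(?P<comment><!--.*?-->)",
        r"(?P<tag></?[A-Za-z_:][^<>]*>)",
        r"(?P<entity>&[#\w]+;)",
        r"(?P<text>[^<&]+)",
    ]),
    re.DOTALL,
)


def scan(source):
    """Yield (kind, start, end) offsets into source, leaving the text untouched."""
    pos = 0
    while pos < len(source):
        m = TOKEN.match(source, pos)
        if m is None:
            raise ValueError("cannot tokenize at offset %d" % pos)
        yield m.lastgroup, m.start(), m.end()
        pos = m.end()
```

Because every token is only a pair of offsets, joining the slices
source[start:end] in order reproduces the input exactly, and a
prettyprinter can insert whitespace between tokens without ever
touching an entity.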

Anyway, if sgmlop doesn't resolve external entities but handles
the standard ones internally, that is OK with me. Again, I need
advice from /F.

ciao - chris

Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.skyport.net
10553 Berlin                 :     PGP key -> http://pgp.ai.mit.edu/
     we're tired of banana software - shipped green, ripens at home