[XML-SIG] Handling of character entity references

Thomas B. Passin tpassin@comcast.net
Sun, 25 May 2003 16:16:24 -0400


[Tamito KAJIYAMA]

> So, I decided to use a special markup to represent Latin-1
> characters in the input XML files, as illustrated below:
>
> <char name="eacute" />

Some people have done the same kind of thing using processing instructions.
Topologi's editor replaces illegal characters with PIs in this way.  The
idea is that you are not actually changing the information content of the
original (because you are not inserting new elements), yet you can still
inform the processor about the situation, since parsers report PIs to the
application.
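
As a rough illustration of the PI approach, here is a minimal Python sketch
using the standard xml.sax module.  The PI form <?char eacute?> and the
function name expand_char_pis are my own assumptions for the example; I am
not claiming this is the exact form Topologi's editor emits.  The handler
just splices the named character back into the text when the parser reports
the PI.

```python
import xml.sax
from html.entities import name2codepoint


class CharPIHandler(xml.sax.ContentHandler):
    """Collects text, expanding hypothetical <?char NAME?> PIs."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def characters(self, content):
        self.parts.append(content)

    def processingInstruction(self, target, data):
        # Only the assumed "char" target is handled; other PIs are ignored.
        if target != "char":
            return
        codepoint = name2codepoint.get(data.strip())
        # Unknown names become U+FFFD rather than being silently dropped.
        self.parts.append(chr(codepoint) if codepoint else "\ufffd")


def expand_char_pis(xml_bytes):
    """Return the text content of xml_bytes with char PIs expanded."""
    handler = CharPIHandler()
    xml.sax.parseString(xml_bytes, handler)
    return "".join(handler.parts)


if __name__ == "__main__":
    print(expand_char_pis(b"<title>Caf<?char eacute?></title>"))
```

The element structure of the input is untouched here: a processor that
does not know about the "char" target simply sees the same elements it
would have seen anyway.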

Encoding issues are always hard!  Just try to capture the titles of web
pages, using a range of browsers, and combine them into one document (I am
referring to trying to combine the bookmarks from different browsers using
XBEL).  You can end up with all kinds of strange results.

Cheers,

Tom P