[XML-SIG] unicode entity refs

Dieter Maurer dieter@handshake.de
Thu, 6 May 1999 19:25:26 +0000 (/etc/localtime)

A.M. Kuchling writes:
 > 	However, this doesn't fix your problem, since the error
 > handler raises a BadHTML exception.  I'd argue for this behaviour,
 > since the HTML character set is ISO-whatever, not Unicode, and
 > therefore this is illegal HTML; if it's got character references >255,
 > it's not HTML but XML that looks like HTML.  (Hmm... I may have
 > written too soon; what's the status of HTML i18n?  Can you declare a
 > Unicode encoding for an HTML document?)
We (Saarbrücker Zeitung) make extensive use of UTF-8 encoded
HTML for European-language documents (e.g. documents
containing Swedish, French, German and Greek passages). Both Netscape
and IE support it. We use the META tag for "encoding"
with value "UTF-8". This becomes an HTTP header telling
the browser that the document is encoded in "UTF-8".
With the appropriate fonts installed, you can see all
European characters on a single page (and we would be able
to include Japanese and Chinese characters as well, but
we have not needed that so far).
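
A minimal Python sketch of the kind of page described above. The
sample text is invented for illustration, and the exact META form
(http-equiv="Content-Type" with a charset parameter) is an
assumption; the description above only says a META tag with value
"UTF-8". The helper names are hypothetical too. It shows that such
a page contains characters above 255, so it cannot be stored as
ISO-8859-1, while UTF-8 carries all of it unchanged.

```python
import html

# Hypothetical sample page. The META form below is an assumption;
# it mirrors an HTTP Content-Type header with a charset parameter.
PAGE = (
    '<html><head>\n'
    '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">\n'
    '</head><body>\n'
    '<p>Swedish: smörgåsbord</p>\n'
    '<p>French: château</p>\n'
    '<p>German: Saarbrücker Zeitung</p>\n'
    '<p>Greek: αβγ</p>\n'
    '</body></html>\n'
)


def roundtrip_utf8(text):
    """Encode text as UTF-8 bytes, then decode those bytes again."""
    return text.encode("utf-8").decode("utf-8")


def fits_latin1(text):
    """Return True if every character of text exists in ISO-8859-1."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


def max_code_point(text):
    """Return the largest Unicode code point that occurs in text."""
    return max(ord(ch) for ch in text)


if __name__ == "__main__":
    print("survives UTF-8 round trip:", roundtrip_utf8(PAGE) == PAGE)
    print("fits ISO-8859-1:", fits_latin1(PAGE))
    print("largest code point:", max_code_point(PAGE))
    # A numeric character reference above 255 names the same character.
    print("&#945; decodes to:", html.unescape("&#945;"))
```

The Greek letters are what push the page past 255; the Swedish,
French and German passages alone would still fit ISO-8859-1.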

 > 	On a side note, the Unicode issue seems to be heading for
 > using /F's Unicode type.  This would seem to be a good argument to
 > drop MvL's Unicode type, which is currently in the XML tree, and
 > replace it with /F's code.  Opinions?
What's the difference between these Unicode types?

- Dieter