[XML-SIG] Re: [I18n-sig] Mixed encodings and XML

Thomas B. Passin tpassin@home.com
Wed, 13 Dec 2000 23:00:00 -0500


Martin v. Loewis chimed in -

> So what you really want is to include binary data in a tag. As you've
> explained yourself when answering to Marc-Andre: That is not supported
> in XML. Of course, if XML had a BDATA type (or section) you could
> include a binary data fragment, and then any presentation tool would
> have to provide visualization (such as opening a hex editor on
> double-click).
>
> In the specific case of cjkv.doc, I guess the best approach would be:
> - use Python string escapes in Python code, e.g.
>   sjisStr = "\x88\xc0\x91\x53\x82\xc9\x8e\x67\x82\xa6\x82\xe9"
>   # Shift-JIS encoded source string
> - use Unicode text data where output is intended to be displayed properly
> - don't cite the output if it will come out as gibberish on any terminal
>   (e.g. when printing both SJIS and UTF-8 on the same terminal). Instead,
>   explain what the user will likely see.
>
How about a good old-fashioned PI (processing instruction)?  The PI could
indicate where to switch to another encoding for the purposes of display or
conversion.  True, this takes a specialized processor, but you are asking for
specialized processing anyway.  This kind of instruction to a processor is
just what a PI is supposed to be for, I always thought.
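
To make the idea concrete, here is a rough sketch of what such a specialized
processor might do. It assumes a made-up PI target named "encoding" (nothing
in the XML spec defines it) and a helper I am calling decode_mixed; both are
hypothetical. It works on the raw bytes, since a mixed-encoding file is not
something a conforming XML parser would accept, and it only holds up for
encodings such as Shift-JIS and UTF-8 in which the bytes "<?" cannot turn up
inside a multi-byte character.

```python
import re

# Hypothetical PI: <?encoding NAME?> switches the codec for the bytes that follow.
PI_PATTERN = re.compile(rb"<\?encoding\s+([A-Za-z0-9_.-]+)\s*\?>")


def decode_mixed(raw: bytes, default: str = "utf-8") -> str:
    """Decode raw bytes, changing codec at each <?encoding NAME?> PI."""
    pieces = []
    codec = default
    pos = 0
    for match in PI_PATTERN.finditer(raw):
        # Decode everything before this PI with the codec currently in force.
        pieces.append(raw[pos:match.start()].decode(codec))
        codec = match.group(1).decode("ascii")
        pos = match.end()
    # Decode the tail after the last PI.
    pieces.append(raw[pos:].decode(codec))
    return "".join(pieces)


# The Shift-JIS bytes from the example quoted above.
sjis = b"\x88\xc0\x91\x53\x82\xc9\x8e\x67\x82\xa6\x82\xe9"
doc = (
    b"<p>UTF-8 text: caf\xc3\xa9 <?encoding shift_jis?>"
    + sjis
    + b"<?encoding utf-8?></p>"
)
text = decode_mixed(doc)
```

The PIs are dropped from the result, so what comes out is a single Unicode
string that can then be displayed or re-encoded consistently.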

Cheers,

Tom P