[XML-SIG] Re: [I18n-sig] Mixed encodings and XML

uche.ogbuji@fourthought.com
Wed, 13 Dec 2000 21:14:47 -0700


> Martin v. Loewis chimed in -
> 
> > So what you really want is to include binary data in a tag. As you've
> > explained yourself when replying to Marc-Andre: That is not supported
> > in XML. Of course, if XML had a BDATA type (or section) you could
> > include a binary data fragment, and then any presentation tool would
> > have to provide visualization (such as opening a hex editor on
> > double-click).
> >
> > In the specific case of cjkv.doc, I guess the best approach would be:
> > - use Python string escapes in Python code, e.g.
> >   sjisStr = "\x88\xc0\x91\x53\x82\xc9\x8e\x67\x82\xa6\x82\xe9"
> >   # Shift-JIS encoded source string
> > - use Unicode text data where output is intended to be displayed properly
> > - don't cite the output if it will come out as gibberish on any terminal
> >   (e.g. when printing both SJIS and UTF-8 on the same terminal). Instead,
> >   explain what the user will likely see.
> >
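A minimal sketch of the string-escape suggestion quoted above, assuming a modern Python 3 interpreter (the thread predates it) and the standard "shift_jis" codec. The variable names are illustrative only. It also shows why the escape must be spelled \x88 rather than \0x88: the latter is a NUL byte followed by the literal characters "x88".

```python
# Shift-JIS bytes spelled with \x escapes, so the source file stays pure ASCII.
sjis_bytes = b"\x88\xc0\x91\x53\x82\xc9\x8e\x67\x82\xa6\x82\xe9"

# Decode to Unicode text wherever the output is meant to be displayed.
text = sjis_bytes.decode("shift_jis")

# Re-encode for a UTF-8 terminal or file instead of mixing encodings.
utf8_bytes = text.encode("utf-8")

# Pitfall: "\0x88" is the octal escape \0 (NUL) followed by "x88".
wrong = b"\0x88"   # four bytes: 00 78 38 38
right = b"\x88"    # one byte:   88
```

Keeping the raw bytes in escaped form and the displayable text in Unicode means the document itself never has to carry two encodings at once.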
> How about a good old-fashioned PI?  The PI could indicate when to switch to
> another encoding for the purposes of display or conversion.  True, this takes
> a specialized processor, but you are asking for specialized processing anyway.
> This kind of instruction to a processor is just what a PI is supposed to be
> for, I always thought.
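For readers unfamiliar with the PI idea quoted above, here is a rough sketch using the standard xml.sax module in Python 3. The PI target "output-encoding" and the handler class are hypothetical, invented for illustration; nothing in the thread defines them. The handler simply re-encodes each text chunk with whatever encoding the most recent PI named.

```python
import xml.sax


class EncodingSwitchHandler(xml.sax.ContentHandler):
    """Re-encode text using the encoding named by the latest
    (hypothetical) <?output-encoding ...?> processing instruction."""

    def __init__(self):
        super().__init__()
        self.encoding = "utf-8"  # default until a PI says otherwise
        self.chunks = []         # list of (encoding, encoded bytes)

    def processingInstruction(self, target, data):
        # Only react to our own illustrative PI target.
        if target == "output-encoding":
            self.encoding = data.strip()

    def characters(self, content):
        # Skip whitespace-only text; encode the rest for output.
        if content.strip():
            self.chunks.append((self.encoding, content.encode(self.encoding)))


doc = (
    "<doc>"
    "<p>plain</p>"
    "<?output-encoding shift_jis?>"
    "<p>\u5b89\u5168</p>"
    "</doc>"
)

handler = EncodingSwitchHandler()
xml.sax.parseString(doc.encode("utf-8"), handler)
```

As the quoted text concedes, this needs a processor that understands the PI; a generic XML tool would just ignore it.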

Very interesting thought.  However, my intention is to try to handle the CJKV 
doc with a minimum of highly specialized processing.  So now that I've come to 
my senses, I think I'll stick to my conclusion.  Besides, it will give me a 
chance to consider XInclude support throughout 4Suite.

Thanks to all for your patience even when I wasn't making much sense.


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python