[I18n-sig] Reading UTF-16 Scripts

Doug Edmunds dae_alt3@juno.com
Tue, 11 Apr 2000 01:10:11 -0700

python ver: 1.6a
os: Win98

Are there any plans to allow
python to be able to read scripts
written entirely in UTF-16 format
(such as those written by
Win98's Wordpad program and saved
as unicode text?)

Since each of these files begins
with the byte-order mark 'FFFE', it would seem
not too difficult for python
to recognize that format and convert
the text outside string literals to 8bit, i.e.,
p r i n t -> print.
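
As a rough illustration of that detection step, here is a minimal
sketch in present-day Python 3 (not the 1.6a interpreter discussed
here). The function name decode_utf16_source is hypothetical, and
falling back to Latin-1 for files without a byte-order mark is only
an assumption made for the example, not anything python itself does.

```python
import codecs


def decode_utf16_source(raw: bytes) -> str:
    """Decode script bytes, honoring a UTF-16 byte-order mark if present."""
    # Little-endian UTF-16 files start with the bytes FF FE.
    if raw.startswith(codecs.BOM_UTF16_LE):
        return raw[len(codecs.BOM_UTF16_LE):].decode("utf-16-le")
    # Big-endian UTF-16 files start with the bytes FE FF.
    if raw.startswith(codecs.BOM_UTF16_BE):
        return raw[len(codecs.BOM_UTF16_BE):].decode("utf-16-be")
    # Assumption for this sketch: no mark means a plain 8-bit script.
    return raw.decode("latin-1")


# Build a small little-endian UTF-16 "script" in memory and decode it.
sample = codecs.BOM_UTF16_LE + "print 'hi'".encode("utf-16-le")
decoded = decode_utf16_source(sample)
```

After decoding, the wide "p r i n t" bytes come back as the ordinary
text "print", which is the conversion described above.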

The advantage is that mixed-language
scripts (e.g., English/Russian) can
be written and saved unambiguously,
without depending on the selection
of a particular 'font script' such as
cp1251 or KOI8-r for Russian.

The motivation for getting away from
these scripts (encodings, whatever)
is to be able to write multiple languages 
in a single string.  

This kind of scripting could be avoided:
a = unicode ('Правда - газета', 'cp1251')
print a.encode('cp1251')

and replaced with a simpler:
print "In Russian, newspaper is ____; in Polish it is ______"
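
To make the single-string point concrete, here is a small sketch in
present-day Python 3 (again, not 1.6a). The Polish word used here,
"wiadomości", is only an illustrative choice picked because it
contains a letter outside cp1251; the variable names are hypothetical.

```python
# One string holding both Cyrillic and Polish letters.
text = "газета / wiadomości"

# cp1251 covers the Cyrillic letters but has no slot for the Polish "ś".
try:
    text.encode("cp1251")
    fits_cp1251 = True
except UnicodeEncodeError:
    fits_cp1251 = False

# UTF-16 can represent every character, so the string survives a round trip.
round_trip = text.encode("utf-16").decode("utf-16")
```

Here fits_cp1251 ends up False while round_trip equals the original
text, which is why a single 8-bit 'script' cannot carry both languages.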

Two related display problems:
1. Cyrillic fonts do not appear in IDLE (US English is base).
2. In PythonWin, even with a Cyrillic 'script' selected,
   such as Courier New (Cyrillic), output appears in English
   -- the 'script' aspect is being ignored.

-- doug edmunds
11 April 2000

