transliteration in Python
Martin von Loewis
loewis at informatik.hu-berlin.de
Fri Jan 4 12:17:08 EST 2002
"Jason Orendorff" <jason at jorendorff.com> writes:
> s1 = "shchi" # start with some ascii bytes
> u = s1.decode('ascii') # decode them into a unicode string
> s2 = u.encode('utf16') # encode it as UTF-16 bytes
> outfile.write(s2) # write them to a binary file, for example
>
> [In this case, I know that s1 is ascii-encoded, because I typed in the
> letters "shchi" and I know that those are all ascii characters, and Python
> and my computer both handle ASCII just fine by default.
This is not what Giorgi wanted. He was asking for translateration,
i.e. where shch is a latin read-alike of some cyrillic
letter. Transliteration is a common technique to display non-latin
languages with latin letters (often even without using accent marks).
> Python only supports a few encodings out of the box. KOIR-8, the one
> you mentioned, apparently isn't one of them.
KOI8-R is supported.
Regards,
Martin
More information about the Python-list
mailing list