Interpreting non-ascii characters.
this at is.invalid
Wed Jul 18 08:46:23 CEST 2007
On Wed, 18 Jul 2007 08:29:58 +1000, John Machin <sjmachin at lexicon.net> wrote:
I have a bunch of directories and files from different systems
(each directory contains files from the same system) which are
encoded differently (though all of them are in Russian), so the
following encodings are present: koi8-r, win-1251, utf-8 etc.,
and I want to transliterate them into plain ASCII so that they
would be readable regardless of the system. Personally I use both
Linux and Windows. So what I do is read each file name using os.listdir,
convert it to a list of characters ('foo.txt' => ['f', 'o', ... , 't'],
except that the file names are in Russian), transliterate it (some
Russian letters have to be transliterated into 2 or even 3 Latin
letters), and then rename the file.
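
The steps described above could be sketched roughly as follows, in
modern Python 3 (where os.listdir already returns decoded text). The
mapping table is an assumption: the post does not say which
romanization scheme was used, and the names TRANSLIT, transliterate
and rename_to_ascii are hypothetical.

```python
import os

# Assumed romanization table; the original post does not specify one.
# Note that some letters map to 2 or more Latin letters.
TRANSLIT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "ё": "yo",
    "ж": "zh", "з": "z", "и": "i", "й": "y", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "kh", "ц": "ts", "ч": "ch", "ш": "sh", "щ": "shch",
    "ъ": "", "ы": "y", "ь": "", "э": "e", "ю": "yu", "я": "ya",
}


def transliterate(name):
    """Replace Russian letters with Latin ones; leave other characters alone."""
    out = []
    for ch in name:
        low = ch.lower()
        if low in TRANSLIT:
            latin = TRANSLIT[low]
            if ch != low:
                # Upper-case source letter: capitalize the Latin replacement.
                latin = latin.capitalize()
            out.append(latin)
        else:
            out.append(ch)
    return "".join(out)


def rename_to_ascii(directory):
    """Rename every entry in `directory` to its transliterated name."""
    for old in os.listdir(directory):
        new = transliterate(old)
        if new == old:
            continue
        target = os.path.join(directory, new)
        if os.path.exists(target):
            # Two names may transliterate to the same result; skip rather
            # than overwrite an existing file.
            continue
        os.rename(os.path.join(directory, old), target)
```

Iterating over the string directly makes the explicit conversion to a
list unnecessary, and this only works once the names have been decoded
with the right encoding.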
It seems, though, that I have solved the problem after all. I thought
that my Windows (2000) used win-1251 and Linux used koi8-r, and
because of that I couldn't understand the strange codes I got
while experimenting with locally created Cyrillic file names.
In fact Linux uses utf-8 and Windows uses cp866, so once I realized
that and read the article you suggested, I solved the problem.
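
To illustrate why the codes looked strange, here is a small sketch
(Python 3 again) showing that the same Russian name becomes different
bytes under each of the encodings mentioned, and that decoding with
the wrong one gives wrong letters rather than an error. The helper
names encoded_forms and decode_name are hypothetical, and the sample
name is only an example.

```python
import sys

# Encodings mentioned in this thread, by their Python codec names.
CANDIDATES = ("utf-8", "cp866", "cp1251", "koi8-r")


def encoded_forms(name, encodings=CANDIDATES):
    """Return the raw bytes of `name` under each encoding."""
    return {enc: name.encode(enc) for enc in encodings}


def decode_name(raw, encoding):
    """Decode raw file-name bytes using the encoding of the source system."""
    return raw.decode(encoding)


if __name__ == "__main__":
    for enc, raw in encoded_forms("файл").items():
        print(enc, raw)
    # The encoding Python itself uses for file names on this system.
    print(sys.getfilesystemencoding())
```

Comparing the printed bytes with the codes seen in a locally created
file name is one way to tell which encoding a given system really uses.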