[Python-3000] Pre-PEP: Easy Text File Decoding

Oleg Broytmann phd at phd.pp.ru
Mon Sep 11 16:23:04 CEST 2006


On Mon, Sep 11, 2006 at 06:58:42AM -0700, Paul Prescod wrote:
> For these purposes, Russia is European, isn't it?

   If the test is "a BOM in UTF-8 text files on Unices" - then no. :)

> Russian text can be subsumed by UTF-8 with relatively minor expansion, right?

   Sorry, what do you mean? That Russian encodings can be converted to
UTF-8? Yes, they can. But the most popular encoding here is cp1251, not
UTF-8. Even on Unices there are people who use cp1251 as their main
encoding (locale, fonts, keyboard mapping) because they often switch
between a number of platforms.
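
   A minimal sketch of the conversion and the size expansion discussed
above. The sample string and the variable names are assumptions made for
illustration and do not come from this thread. Cyrillic letters take one
byte each in cp1251 or KOI8-R and two bytes each in UTF-8, so mostly
Cyrillic text roughly doubles in size.

```python
# Hypothetical sample text (not from the thread): 9 Cyrillic letters,
# one comma and one space.
text = "Привет, мир"

# Legacy single-byte encodings: one byte per character.
cp1251_bytes = text.encode("cp1251")
koi8r_bytes = text.encode("koi8_r")

# UTF-8: two bytes per Cyrillic letter, one byte per ASCII character.
utf8_bytes = text.encode("utf-8")

# Converting a cp1251 byte string to UTF-8 means decoding with the
# encoding it was written in, then encoding again as UTF-8.
converted = cp1251_bytes.decode("cp1251").encode("utf-8")

print(len(cp1251_bytes), len(koi8r_bytes), len(utf8_bytes))
```

   The conversion only works when the source encoding is known. The same
bytes decoded as koi8_r instead of cp1251 would give different, wrong
characters without raising an error.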

> If so, then I
> would guess that UTF-8 would replace KOI8-R and iso8859-? for Russian
> eventually.

   On Unix? Probably yes, but not in the near future. There are some
popular tools (for me the most notable is Midnight Commander) that still
have problems with UTF-8 locales.

> Given these safeguards, I think that the feature is not only safe enough but
> also helpful.

   Ok then.

Oleg.
-- 
     Oleg Broytmann            http://phd.pp.ru/            phd at phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.

