[Python-3000] locale-aware strings ?

Oleg Broytmann phd at mail2.phd.pp.ru
Wed Sep 6 13:16:43 CEST 2006

On Wed, Sep 06, 2006 at 03:55:04AM -0700, Paul Prescod wrote:
> But how would a system-wide default encoding help with any of these
> situations? These situations are IN FACT caused by system-wide default
> encodings used by naive programmers. Python should be part of the
> solution, not part of the problem.
> On 9/6/06, Oleg Broytmann <phd at oper.phd.pp.ru> wrote:
> >    First, there are text files. Really, there are still text files. A user
> > can dump a README file onto his/her personal FTP server, and the file
> > usually is in the local encoding.
> >    MP3 tags. Real nightmare. Nobody follows the standard - tag editors
> > write tags in the local encoding, and mp3 players interpret them in the
> > local encoding.
> >    FTP and other dumb protocols that transfer file names in the encoding
> > local to the server without announcing that encoding in the metadata.

   These situations are caused by the lack of metadata or of clear
encoding-friendly standards. Ogg, for example, is encoding-friendly - it
clearly states that tags (comments) must be in UTF-8, and all Ogg Vorbis
files I have seen really were in UTF-8, and all tag editors and players
write/use UTF-8. XML is encoding-friendly - every file specifies its
encoding. The HTTP protocol is mostly encoding-friendly with its
Content-Type header. HTML is only partially encoding-friendly - if one
saves an HTML page to a file, the file may lack encoding information.
   But text files and the FTP protocol don't have any metadata, and ID3v2
doesn't specify a universal encoding or encoding metadata. In these cases
programs can either guess the encoding based on the file content or use the
system global encoding.
   I fail to see how Python can help here.
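
   For illustration only, the two options just mentioned (guess from the
content, or fall back to the system global encoding) might look roughly like
this. It is a sketch under assumptions: the name decode_untagged and its
parameters are made up for this example, and the "guess" is nothing more
than trying strict UTF-8 first. Neither branch is reliable, which is the
problem described above.

```python
import locale


def decode_untagged(data, candidates=("utf-8",), fallback=None):
    """Decode bytes that carry no encoding metadata (hypothetical helper).

    Returns a (text, encoding_used) tuple.
    """
    # Option 1: guess from the content. Non-ASCII text in a legacy 8-bit
    # encoding is rarely valid UTF-8, so a strict decode is a crude test.
    for encoding in candidates:
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            continue

    # Option 2: fall back to the system-wide (locale) encoding. This is
    # right only if the data was produced under the same locale.
    if fallback is None:
        fallback = locale.getpreferredencoding(False)
    return data.decode(fallback, errors="replace"), fallback
```

   Calling decode_untagged(raw_bytes) on a README in the local 8-bit
encoding will usually end up in the fallback branch, and the result is
correct only when the reader's locale happens to match the writer's.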

     Oleg Broytmann            http://phd.pp.ru/            phd at phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.
