[Python-Dev] File encodings

Gustavo Niemeyer niemeyer at conectiva.com
Tue Nov 30 14:09:34 CET 2004


[...]
> You are mixing things here:
> 
> The source encoding is meant for the
> parser and defines the way Unicode literals are converted
> into Unicode objects.
> 
> The encoding used on the stdout stream doesn't have anything
> to do with the source code encoding and has to be handled
> differently.

Sorry, I probably wasn't clear enough in my message. I understand
the issue, and I'm not discussing source encoding at all. The
only problem I'd like to solve is that Unicode strings cannot be
written directly to output streams.

> The idiom presented by Bob is the right way to go: wrap
> sys.stdout with a StreamEncoder.

I don't see that as a good solution, since every piece of Python
software that is internationalized will have to figure out this
wrapping, introducing extra overhead unnecessarily.

> Using sys.setdefaultencoding() is *not* the right solution
> to the problem.

I understand.

> In general when writing programs that are targeted for
> i18n, you should use Unicode for all text data and
> convert from Unicode to 8-bit only at the IO/UI layer.

That's what I think as well. I would just expect Python to be
kind enough to let me specify which output encoding I want,
instead of wrapping the sys.stdout object with a non-native file
object.

IOW, since internationalization is so widely needed, handling it
without wrapping sys.stdout every time seems like a good step for a
language like Python.

> The various wrappers in the codecs module make this
> rather easy.

Thanks for the suggestion!
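
For reference, the wrapping idiom under discussion can be sketched
roughly as below. This is only an illustration, not code from the
thread: the helper name wrap_stream is hypothetical, and an
io.BytesIO buffer stands in for a byte-oriented stream such as
sys.stdout. Only codecs.getwriter() is the standard library call
the idiom relies on.

```python
import codecs
import io


def wrap_stream(byte_stream, encoding="utf-8", errors="strict"):
    """Return a StreamWriter that encodes text before writing it.

    wrap_stream is a hypothetical helper name used for illustration.
    """
    writer_class = codecs.getwriter(encoding)
    return writer_class(byte_stream, errors)


# Assumption: a BytesIO buffer stands in for a byte-oriented output
# stream (for example sys.stdout as it behaves in Python 2).
raw = io.BytesIO()
out = wrap_stream(raw, "utf-8")

# Unicode text is accepted and encoded on the way out.
out.write(u"Ol\u00e1, mundo\n")
encoded = raw.getvalue()
```

The wrapper returned here is a codecs.StreamWriter rather than a
native file object, which is exactly the drawback raised above.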

-- 
Gustavo Niemeyer
http://niemeyer.net

