[Python-Dev] File encodings

Walter Dörwald walter at livinglogic.de
Tue Nov 30 18:43:37 CET 2004


Gustavo Niemeyer wrote:

> [...]
> 
>>You are mixing things here:
>>
>>The source encoding is meant for the
>>parser and defines the way Unicode literals are converted
>>into Unicode objects.
>>
>>The encoding used on the stdout stream doesn't have anything
>>to do with the source code encoding and has to be handled
>>differently.
> 
> Sorry. I probably wasn't clear enough in my message. I understand
> the issue, and I'm not discussing source encoding at all. The
> only problem I'd like to solve is that of output streams not
> being able to have unicode strings written.
> 
>>The idiom presented by Bob is the right way to go: wrap
>>sys.stdout with a StreamEncoder.
> 
> I don't see that as a good solution, since every piece of Python
> software that is internationalized will have to figure out this
> wrapping, introducing extra overhead unnecessarily.

This wrapping is probably necessary for stateful encodings. If
sys.stdout.encoding were "utf-16", print would probably add a BOM
every time a unicode object is printed. This doesn't happen if
you wrap sys.stdout in a StreamWriter.
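
The BOM difference can be illustrated with a minimal sketch. It assumes an
in-memory io.BytesIO as a stand-in for sys.stdout; the variable names are
made up for illustration. Encoding each string on its own repeats the BOM,
while a StreamWriter from the codecs module writes it only once:

```python
import codecs
import io

chunks = [u"a", u"b"]

# Stateless: every chunk is encoded independently, so every chunk
# gets its own BOM prepended.
stateless = b"".join(chunk.encode("utf-16") for chunk in chunks)

# Stateful: the StreamWriter remembers that the BOM has already been
# written and does not emit it again on later writes.
buffer = io.BytesIO()  # stand-in for a byte-oriented output stream
writer = codecs.getwriter("utf-16")(buffer)
for chunk in chunks:
    writer.write(chunk)
stateful = buffer.getvalue()

bom = codecs.BOM_UTF16  # BOM in the platform's native byte order
print(stateless.count(bom))  # one BOM per chunk
print(stateful.count(bom))   # a single BOM at the start
```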

> [...]
> That's what I think as well. I just would expect that Python was
> kind enough to allow me to tell which output encoding I want,
> instead of wrapping the sys.stdout object with a non-native-file.
> 
> IOW, being widely necessary, handling internationalization without
> wrapping sys.stdout every time seems like a good step for a language
> like Python.

You can't have stateful encodings without something that keeps
state. The only thing that does keep state in Python is a
StreamReader/StreamWriter.
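
As a rough sketch of the wrapping idiom discussed in this thread: the
helper name wrap_output is hypothetical, and io.BytesIO again stands in
for a byte-oriented stream such as sys.stdout. The StreamWriter object is
what holds the encoder state between writes:

```python
import codecs
import io


def wrap_output(byte_stream, encoding):
    """Return a StreamWriter that encodes text written to byte_stream.

    Hypothetical helper for illustration; the returned object is the
    thing that keeps encoder state between calls to write().
    """
    return codecs.getwriter(encoding)(byte_stream)


raw = io.BytesIO()  # stand-in for the real output stream
out = wrap_output(raw, "utf-8")
out.write(u"D\u00f6rwald\n")  # unicode in, encoded bytes out
encoded = raw.getvalue()
```

With a real stream, the wrapped object would be assigned in place of the
original, so that callers write unicode strings and never encode by hand.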

Bye,
    Walter Dörwald



