print() and unicode strings (python 3.1)
apt.shansen at gmail.com
Tue Aug 25 05:37:11 CEST 2009
> > You should be setting the terminal encoding administratively, not
> > programmatically.
> The terminal encoding has always been utf-8. It was not set
> It seems to me that python 3.1's string handling is broken.
> Apparently, in python 3.1 I am unable to explicitly set the encoding
> of a string and print() it out with the result being human readable
> text. On the other hand, if I let python do the encoding implicitly,
> python uses a codec I don't want it to.
This isn't Python's string handling; it's the streams. The string handling
is behaving precisely as it should, and you can explicitly encode any string
to any encoding you want (as long as the data can translate).
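
To make the "explicit encoding" point concrete, here is a minimal sketch. The sample text and variable names are mine, picked only for illustration. In 3.x a str has no encoding of its own; encoding it explicitly gives you a bytes object.

```python
# A str is a sequence of code points with no encoding attached.
text = "na\u00efve caf\u00e9"

# Encoding explicitly produces bytes; the codec decides the byte values.
utf8_bytes = text.encode("utf-8")
latin1_bytes = text.encode("latin-1")

# Decoding with the matching codec recovers the original string.
round_trip = utf8_bytes.decode("utf-8")

# Data that cannot translate into the target codec raises UnicodeEncodeError.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
```

Note that the result of encode() is bytes, not text, so it is meant for a binary stream rather than for print().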
sys.stdout is a text stream, so it has an explicit encoding. You can set your
string to any encoding, but if you try to pass it through a text-stream
which has an incompatible encoding, of course it will error out. That's the
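
A small sketch of that failure, using an in-memory buffer as a stand-in for sys.stdout.buffer so nothing depends on your actual terminal. The ASCII stream here is my assumption for the demonstration, not necessarily what your Python detected.

```python
import io

# BytesIO stands in for the binary buffer underneath sys.stdout.
raw = io.BytesIO()
ascii_stream = io.TextIOWrapper(raw, encoding="ascii")

# U+00E9 has no ASCII representation, so the text stream refuses it.
try:
    print("caf\u00e9", file=ascii_stream)
    failed = False
except UnicodeEncodeError:
    failed = True

# The same string passes through a stream whose encoding can represent it.
utf8_raw = io.BytesIO()
utf8_stream = io.TextIOWrapper(utf8_raw, encoding="utf-8")
print("caf\u00e9", file=utf8_stream)
utf8_stream.flush()
written = utf8_raw.getvalue()
```

The string is identical in both cases; only the stream's encoding differs.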
The problem seems to be that Python appears to be auto-detecting an encoding
for your terminal (and assigning it to sys.stdout) that you think is
incorrect; and it appears that this detection routine has changed (though I
don't know if it's the routine itself or sys.stdout being a "byte string"
based object in 2.6 and a "unicode string" based object in 3.x that has
caused this difference in behavior; this is probably the case, really).
How are you 'setting' your terminal encoding? Is your LC_CTYPE environment
variable set to "en_US.UTF-8"? That's how Python (on *nix systems, at least, I
believe) determines the encoding of the terminal and thus the
If for some reason Python's detecting it wrong, the only way I know to
change it would be to:
import io, sys
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8')
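
If you'd rather not rewrap unconditionally, something along these lines might do. ensure_utf8 is a hypothetical helper name of mine, not anything in the standard library, and it assumes the stream exposes .encoding and .buffer the way sys.stdout normally does.

```python
import codecs
import io


def ensure_utf8(stream):
    """Return stream as-is if it already encodes as UTF-8, else rewrap its buffer."""
    # codecs.lookup() normalizes spellings such as "UTF8" and "utf-8".
    current = codecs.lookup(stream.encoding).name
    if current == "utf-8":
        return stream
    # Flush pending text before handing the buffer to a new wrapper.
    stream.flush()
    return io.TextIOWrapper(stream.buffer, encoding="utf-8")
```

You would use it as sys.stdout = ensure_utf8(sys.stdout) after importing sys. Printing sys.stdout.encoding and locale.getpreferredencoding() beforehand shows what Python actually detected.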