[IPython-dev] ASCII Terminal IPython re-encodes bytes greater than 127

Fri Jul 25 17:04:28 EDT 2014

I'm curious why IPython (Python 2) handles encoding in an ASCII terminal
the way it does.

When I run IPython in an ASCII terminal and enter a byte greater than 127,
it appears to be decoded using latin-1 and re-encoded with UTF-8.

    In [1]: '<meta-a, or byte \xe1>'
    Out[1]: '\xef\xbf\xbd'
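
As a quick check of the bytes in the session above, the sketch below compares two possible round trips for the entered byte. The variable names are illustrative only, and it makes no claim about what IPython does internally; it only shows which standard codec calls yield which bytes.

```python
# The byte typed at the prompt and the bytes echoed back in Out[1].
entered = b"\xe1"
observed = b"\xef\xbf\xbd"

# Round trip 1: decode as latin-1, re-encode as UTF-8.
latin1_roundtrip = entered.decode("latin-1").encode("utf-8")

# Round trip 2: decode as ASCII with the "replace" error handler,
# which substitutes U+FFFD for the undecodable byte, then re-encode as UTF-8.
ascii_replace_roundtrip = entered.decode("ascii", "replace").encode("utf-8")

print(latin1_roundtrip == observed)         # compare against Out[1]
print(ascii_replace_roundtrip == observed)  # compare against Out[1]
```

The observed bytes are the UTF-8 encoding of U+FFFD (the replacement character), so they match the second round trip; a latin-1 decode would instead give '\xc3\xa1'.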

In an ASCII-encoded Python file, this would be an error. In a vanilla
Python interpreter on an ASCII terminal, this would be just the byte
entered, '\xe1'. In terminal IPython, entering the bare byte (without
quotes) gives "ERROR - failed to write data to stream". In vanilla Python
on an ASCII terminal, this would be a syntax error.

I'm interested in why this decision was made. I'm in the process of
choosing a behavior for bpython, and so far can think of:

1) vanilla Python 2 behavior - run source code as bytes when the terminal
is ASCII-encoded
2) vanilla Python 3 behavior - syntax error on finding this character
3) IPython behavior - somehow figure out (guess? what happens in IPython?)
which character this byte represents on the user's terminal, decode it to
unicode, then re-encode it (all assuming it's in a string).
I don't understand the specifics of this.
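
The three options above can be sketched as policies on the raw bytes a REPL receives. This is a minimal illustration under stated assumptions: `interpret_input`, the policy names, and the `guessed_encoding` default are all hypothetical, and none of this is taken from the Python, IPython, or bpython sources.

```python
def interpret_input(raw, policy, guessed_encoding="latin-1"):
    """Return the bytes a REPL would pass on for the input line `raw`.

    Hypothetical helper: policy names mirror options 1-3 in the list above.
    """
    if policy == "bytes":
        # Option 1: pass the bytes through untouched.
        return raw
    if policy == "strict":
        # Option 2: refuse anything that is not valid ASCII.
        try:
            raw.decode("ascii")
        except UnicodeDecodeError as exc:
            raise SyntaxError("non-ASCII byte in input: %s" % exc)
        return raw
    if policy == "guess":
        # Option 3: assume an encoding for the terminal (an assumption,
        # not a detection), decode to unicode, then re-encode as UTF-8.
        return raw.decode(guessed_encoding).encode("utf-8")
    raise ValueError("unknown policy: %r" % (policy,))


line = b"'\xe1'"
print(interpret_input(line, "bytes") == line)
print(interpret_input(line, "guess") == b"'\xc3\xa1'")
```

With the "strict" policy the same line raises SyntaxError. The interesting design question is only in option 3: the result depends entirely on which encoding is guessed.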

I'm particularly interested in whether this is important functionality for
users (maybe for localization? do some people's terminals report ASCII but
actually display important characters they can enter with their keyboards?)

Thanks very much for any thoughts,

-Tom