[Python-Dev] iso-2022 and issue 7472: question for the experts

R. David Murray rdmurray at bitdance.com
Wed Apr 7 03:56:21 CEST 2010


On Wed, 07 Apr 2010 02:18:13 +0200, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> > Can someone (Steve Turnbull?) confirm or refute my analysis? 
> 
> Refute, see http://bugs.python.org/issue804885
> 
> > ISO-2022 input will
> > be 7-bit, and the except will not trigger
> 
> This conclusion is false:
> 
> 1. it is 7-bit
> 
> py> unichr(913).encode("iso-2022-jp")
> '\x1b$B&!\x1b(B'
> 
> 2. the except *will* trigger, anyway.
> 
> py> unichr(913).encode("ascii")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> UnicodeEncodeError: 'ascii' codec can't encode character u'\u0391' in
> position 0: ordinal not in range(128)

My understanding, however, is that what comes out of get_payload is
always a string, not unicode.  That is, it would already have to be
encoded, and the encode('ascii') trick is just there to see whether
there are any 8-bit bytes.

Tracing the code a little further, though, I now see that the *input*
encoding of the payload (which will be re-encoded as iso-2022-xx on
output) can be an 8-bit encoding.
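
As a rough sketch of that check (not the actual email package code):
the helper name has_8bit_bytes is hypothetical, euc-jp is only an
assumed example of an 8-bit input encoding, and an explicit
decode('ascii') stands in for the encode('ascii') trick so that the
snippet also runs on Python 3, where byte strings have no encode method.

```python
def has_8bit_bytes(payload):
    """Return True if the byte string contains any byte >= 0x80.

    Hypothetical helper: decoding as ASCII fails exactly when the
    payload holds a byte outside the 7-bit range.
    """
    try:
        payload.decode("ascii")
    except UnicodeError:
        return True
    return False


# Output side: iso-2022-jp is 7-bit (escape sequences plus ASCII bytes).
seven_bit = u"\u0391".encode("iso-2022-jp")

# Input side (assumed example): the same character in euc-jp uses
# bytes with the high bit set.
eight_bit = u"\u0391".encode("euc-jp")

print(has_8bit_bytes(seven_bit))
print(has_8bit_bytes(eight_bit))
```

Only the euc-jp payload trips the check, which, as I understand it, is
the case the except clause handles.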

So, now I understand the patch, and will fix the spelling mistake.
Thanks.

--
R. David Murray                                      www.bitdance.com

