[pypy-issue] [issue1536] custom encoding/decoding error handlers are not compatible with CPython

Mikhail Korobov tracker at bugs.pypy.org
Fri Jul 12 12:41:20 CEST 2013


New submission from Mikhail Korobov <kmike84 at gmail.com>:

Hi,

I think the way PyPy implements support for custom encoding/decoding error 
handlers is not compatible with CPython. An example script that reproduces 
the problem is attached.
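
For readers without the attachment, here is a minimal sketch of how a custom 
decoding error handler is registered through codecs.register_error. It is not 
the attached script: the handler name "record_and_replace", the calls list, the 
decode_with_handler helper, and the "!" replacement are all assumptions chosen 
for illustration.

```python
import codecs

# (start, end, reason) for every invocation of the handler below.
calls = []


def recording_handler(exc):
    # Hypothetical handler: record where the codec reported the error,
    # then substitute "!" and resume decoding at exc.end.
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    calls.append((exc.start, exc.end, exc.reason))
    return (u"!", exc.end)


# The name "record_and_replace" is arbitrary; it is what gets passed
# as the `errors` argument of decode().
codecs.register_error("record_and_replace", recording_handler)


def decode_with_handler(data):
    # Decode `data` as UTF-8 and return the decoded text together with
    # the error positions the handler saw during this call.
    del calls[:]
    text = data.decode("utf-8", "record_and_replace")
    return text, list(calls)
```

Calling decode_with_handler(b'\x80\xa3') on each interpreter and comparing the 
recorded (start, end, reason) tuples is one way to check whether the handler is 
invoked with the same arguments everywhere.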

CPython 2.6.7 agrees with CPython 3.3.2:

$ python3.3 error_handlers.py 
\x80\xa3
UnicodeDecodeError('utf-8', b'\x80\xa3', 0, 1, 'invalid start byte')
UnicodeDecodeError('utf-8', b'\x80\xa3', 0, 1, 'invalid start byte')
!!
\xe3\xa3
UnicodeDecodeError('utf-8', b'\xe3\xa3', 0, 2, 'unexpected end of data')
UnicodeDecodeError('utf-8', b'\xe3\xa3', 0, 2, 'unexpected end of data')
!!

$ python2.6 error_handlers.py 
\x80\xa3
UnicodeDecodeError('utf8', '\x80\xa3', 0, 1, 'invalid start byte')
UnicodeDecodeError('utf8', '\x80\xa3', 0, 1, 'invalid start byte')
!!
\xe3\xa3
UnicodeDecodeError('utf8', '\xe3\xa3', 0, 2, 'unexpected end of data')
UnicodeDecodeError('utf8', '\xe3\xa3', 0, 2, 'unexpected end of data')
!!


PyPy 2.0.2 produces different results (different exceptions are reported, and 
the output also differs):

$ pypy error_handlers.py 
\x80\xa3
UnicodeDecodeError('utf-8', '\x80\xa3', 0, 1, 'invalid start byte')
UnicodeDecodeError('utf-8', '\x80\xa3', 1, 2, 'invalid start byte')
!!
\xe3\xa3
UnicodeDecodeError('utf-8', '\xe3\xa3', 0, 2, 'unexpected end of data')
!

----------
messages: 5922
nosy: kmike, pypy-issue
priority: bug
status: unread
title: custom encoding/decoding error handlers are not compatible with CPython

________________________________________
PyPy bug tracker <tracker at bugs.pypy.org>
<https://bugs.pypy.org/issue1536>
________________________________________

