[Python-Dev] Issue 3745 backwards incompatibility

M.-A. Lemburg mal at egenix.com
Tue Dec 15 12:40:41 CET 2009


Michael Foord wrote:
> On 15/12/2009 11:23, M.-A. Lemburg wrote:
>> Karen Tracey wrote:
>>   
>>> In testing some existing code with the 2.7 alpha release, I've run into:
>>>
>>>      TypeError: Unicode-objects must be encoded before hashing
>>>
>>> when the existing code tries to pass unicode objects to hashlib.sha1 and
>>> hashlib.md5.  This is, I believe, due to changes made for issue 3745:
>>>
>>> http://bugs.python.org/issue3745
>>>
>>> The issue states the need to reject unencoded strings based on the fact
>>> that one backend implementation (openssl) refused to accept them while
>>> another (_sha256) assumed a utf-8 encoding.  The thing is, I cannot
>>> observe any such difference using Python 2.5 or 2.6.  Instead of what is
>>> shown in the ticket (which was done on a Python 3, I believe) I see, when
>>> I adjust the demo test to use Python 2 syntax for "unencoded strings":
>>>      
>> I think this was a misunderstanding during the issue 3745 processing:
>> the patch should not have been backported to trunk at all.
>>
>> For Python 3.x, the change was correct. For 2.x, a -3 warning
>> would have been a better fit.
>>
>> Note that the non-OpenSSL SHA et al. modules have never defaulted to
>> encoding to UTF-8 in Python 2.x. Python 2.x uses ASCII as default
>> encoding. Only Python 3.x uses UTF-8 as default encoding.
>>    
> 
> Doesn't Python 3 use the *platform* encoding as the default (which
> happens to be UTF-8 on sensible systems but is something truly horrible
> like CP1250 on Windows)? (So *assuming* a default encoding of UTF-8 is
> still incorrect on Python 3 if we are being consistent with other IO
> behaviour.)

Internally, Python 3 uses UTF-8 as the default encoding, i.e. when
implicitly converting unicode to bytes, e.g. as a result of using
"s#" parser markers.
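
As a rough illustration of the internal side on Python 3 (a minimal
sketch, not code from the issue 3745 patch): sys.getdefaultencoding()
reports the internal default, and hashlib refuses str objects, so
code that wants to work everywhere should encode explicitly. The helper
name sha1_hex and its UTF-8 default are assumptions made for this
example only.

```python
import hashlib
import sys


def sha1_hex(text, encoding="utf-8"):
    """Return the SHA-1 hex digest of text (hypothetical helper).

    str input is encoded explicitly before hashing; bytes input is
    hashed as-is. The UTF-8 default is an assumption of this sketch.
    """
    if isinstance(text, str):
        text = text.encode(encoding)
    return hashlib.sha1(text).hexdigest()


def hashing_str_is_rejected():
    """Return True if hashlib refuses an unencoded str (Python 3)."""
    try:
        hashlib.sha1("abc")
    except TypeError:
        return True
    return False


if __name__ == "__main__":
    print(sys.getdefaultencoding())   # internal default encoding
    print(hashing_str_is_rejected())
    print(sha1_hex("abc"))
```

Encoding at the call site keeps the choice of encoding visible instead
of relying on whatever a particular backend happens to assume.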

Externally, Python tries to find the right encoding(s) for the given
platform and then uses these in the IO layer (e.g. sys.stdin.encoding)
and for OS interfacing (sys.getfilesystemencoding()).
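
To see the two layers side by side on a given machine, something like
the following can be run under Python 3. This is only a sketch; the
function name describe_encodings is made up for the example, and the
values reported for the IO layer and the file system depend on the
platform and locale.

```python
import sys


def describe_encodings():
    """Collect the internal default and the platform-facing encodings.

    describe_encodings is a hypothetical helper for this example.
    """
    return {
        # Used for implicit unicode -> bytes conversions.
        "internal default": sys.getdefaultencoding(),
        # IO layer; sys.stdin can be None in some embedded setups.
        "stdin": getattr(sys.stdin, "encoding", None),
        # Used when talking to the OS about file names.
        "filesystem": sys.getfilesystemencoding(),
    }


if __name__ == "__main__":
    for name, value in describe_encodings().items():
        print("%-17s %s" % (name + ":", value))
```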

-- 
Marc-Andre Lemburg
eGenix.com
