[Python-3000] bytes and dicts (was: PEP 3137: Immutable Bytes and Mutable Buffer)

Neil Toronto ntoronto at cs.byu.edu
Fri Sep 28 21:46:30 CEST 2007


Jim Jewett wrote:
> On 9/28/07, Guido van Rossum <guido at python.org> wrote:
>
>   
>> The question is whether it's worth the effort to raise TypeError when
>> the *potential* exists that a certain hash sequence *could* raise this
>> TypeError.
>>     
>
> Bugs depending on the hash sequence are exactly the sort of thing that
> doesn't get found by tests, and can't be easily reproduced.
>   

Not that my opinion counts for much because I mostly just lurk, but I 
have to agree. A one-in-a-million Heisenbug (Mandelbug?) is exactly the 
sort of thing that breaks production systems but nobody can figure out 
how to fix, and causes management to lose faith in a language or in 
their developers.

>> I'm less and less convinced -- after all, we're making the
>> exception only for bytes/str, not for other types that might raise
>> TypeError upon comparison.
>>     
>
> What would those other types be?
>
> As you point out in the "Bytes and the Str Type" section, this
> exception violates the "general rule that comparing objects of
> different types for equality
> should just return False".
>   

So there's a special case comparison that's intended to protect users 
from themselves - to keep them from comparing bytes and strings without 
specifying an encoding. Then there has to be another potentially 
performance-munching special case to save them from an essentially 
random exception that could occur because of this extra protection - and 
this special-casing can only be guaranteed for built-in types, not 
custom ones, whose authors can too easily forget to account for it.
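
To make the failure mode concrete, here is a minimal sketch. It does not use 
the real str and bytes types: StrictKey, TextKey, BytesKey and try_insert are 
hypothetical names, and the hashes are forced by hand so the collision is 
deterministic rather than one-in-a-million. It assumes only that a dict calls 
== on two distinct keys when their hashes are equal.

```python
class StrictKey:
    """Hypothetical key whose == raises on a cross-type comparison."""

    def __init__(self, value, forced_hash):
        self.value = value
        self.forced_hash = forced_hash  # chosen by hand to control collisions

    def __hash__(self):
        return self.forced_hash

    def __eq__(self, other):
        if type(other) is not type(self):
            raise TypeError("cannot compare %s with %s"
                            % (type(self).__name__, type(other).__name__))
        return self.value == other.value


class TextKey(StrictKey):
    pass


class BytesKey(StrictKey):
    pass


def try_insert(d, key):
    """Insert key into d and report whether the dict raised TypeError."""
    try:
        d[key] = True
        return "ok"
    except TypeError:
        return "TypeError"


d = {}
first = try_insert(d, TextKey("abc", 1))
# Different hash: the dict never calls __eq__, so nothing goes wrong.
lucky = try_insert(d, BytesKey(b"abc", 2))
# Same hash as the stored TextKey: the dict must compare, and == raises.
unlucky = try_insert(d, BytesKey(b"xyz", 1))

print(first, lucky, unlucky)
```

Whether the insert succeeds depends only on the hash values, which is what 
makes this so hard to catch in tests.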

Is the 'if <str> == <bytes>' case the only one they need to be saved 
from? Shouldn't it be perfectly fine for a dict to hold both a str and a 
bytes? If I recall correctly, the decision to raise a TypeError on 
str/bytes comparison was made before bytes became immutable and could be 
put into dicts.

Maybe the *extra protection* isn't worth the effort. How about a warning 
instead of a TypeError? Can the bytecode interpreter do something for 
simple '==' cases? Are there other alternatives?
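
As a rough sketch of the warning alternative, again with hypothetical 
stand-in classes (WarnKey, WarnText, WarnBytes) and forced hashes rather than 
the real str and bytes: a cross-type == warns and returns False, so a dict 
can hold both kinds of key even when their hashes collide. The choice of 
RuntimeWarning is an assumption made for illustration only.

```python
import warnings


class WarnKey:
    """Hypothetical key whose == warns on a cross-type comparison."""

    def __init__(self, value, forced_hash):
        self.value = value
        self.forced_hash = forced_hash  # chosen by hand to force a collision

    def __hash__(self):
        return self.forced_hash

    def __eq__(self, other):
        if type(other) is not type(self):
            warnings.warn("comparing %s with %s"
                          % (type(self).__name__, type(other).__name__),
                          RuntimeWarning, stacklevel=2)
            return False
        return self.value == other.value


class WarnText(WarnKey):
    pass


class WarnBytes(WarnKey):
    pass


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    mixed = {WarnText("abc", 1): "text"}
    # Same hash as the stored key: the dict compares, == warns and
    # returns False, and the new key is stored alongside the old one.
    mixed[WarnBytes(b"abc", 1)] = "bytes"

print(len(mixed), [str(w.message) for w in caught])
```

The dict operation itself never fails here, at the cost of a warning showing 
up in code that did nothing wrong besides mixing key types.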

Neil


