[Python-3000] bytes and dicts (was: PEP 3137: Immutable Bytes and Mutable Buffer)
Guido van Rossum
guido at python.org
Sat Sep 29 16:33:01 CEST 2007
On 9/29/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 08:08 PM 9/28/2007 -0700, Guido van Rossum wrote:
> >Likely, programmers will attempt to look up keys
> >that they know are in the dict -- and if they use the wrong type,
> >because of the identical hash values, they will get the TypeError as
> >soon as they compare it to the first object at the hashed location.
>
> I'm coming into this thread a little bit late, but if we don't want
> strings and bytes to be comparable, shouldn't we just make them
> *unequal*? I mean, under normal circumstances, == and != are
> available on all objects without causing errors, and the same
> TypeError would occur for things like list.remove().
Until just before 3.0a1, they were unequal. We decided to raise
TypeError because we noticed many bugs in code that was doing things
like
    data = f.read(4096)
    if data == "": break
where data was bytes and thus the break was never taken. The same
happened with checks for certain magic strings (so it wasn't just
empty strings).
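To make the failure mode concrete, here is a minimal sketch of that
loop. It assumes an interpreter in which comparing bytes with a str
simply evaluates to False (the "unequal" behaviour described above).
io.BytesIO stands in for the file f, and the helper name count_reads
and its read cap are hypothetical, added only so the example
terminates.

```python
import io

def count_reads(stream, sentinel, chunk=4, limit=10):
    """Count read() calls until a chunk equals sentinel, or limit is hit."""
    reads = 0
    while reads < limit:
        data = stream.read(chunk)
        reads += 1
        if data == sentinel:  # bytes compared with str is never equal
            break
    return reads

# With a bytes sentinel the loop stops at end of stream.
print(count_reads(io.BytesIO(b"abcdefgh"), b""))
# With a str sentinel the break is never taken; only the cap stops it.
print(count_reads(io.BytesIO(b"abcdefgh"), ""))
```

Without the cap, the second call would spin forever, which is the
kind of silent bug that raising TypeError is meant to surface.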
It is also in line with the policy to refuse things like
b"abc".replace("a", "A") or "abc".replace(b"b", b"B").
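A short sketch of that policy, assuming a Python 3 interpreter where
the mixed-type replace() calls are refused. The wrapper try_replace
is a hypothetical helper that just reports the exception type instead
of letting it propagate.

```python
def try_replace(obj, old, new):
    """Return obj.replace(old, new), or the exception class name on TypeError."""
    try:
        return obj.replace(old, new)
    except TypeError as exc:
        return type(exc).__name__

# Mixed bytes/str arguments are refused.
print(try_replace(b"abc", "a", "A"))
print(try_replace("abc", b"b", b"B"))
# Matching types work as usual.
print(try_replace(b"abc", b"a", b"A"))
print(try_replace("abc", "b", "B"))
```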
> This seems a lot like Oleg's question on Python-Dev the other day,
> about raising a TypeError from __nonzero__: i.e., changing a
> significant expectation about all "normal" objects.
>
> While it's true that it would be good to know when you've
> unintentionally mixed bytes and strings, surely there could be less
> fatal ways to find this, like perhaps a command-line option that
> causes byte/string comparisons to output a warning?
I thought about using a warning too, but since nobody wants warnings,
that would be pretty much the same as raising TypeError except for the
most dedicated individuals (and if I were really dedicated I'd just
write my own eq() function anyway). And the warning would do nothing
about the issue brought up by Jim Jewett, the unpredictable behavior
of a dict with both bytes and strings as keys.
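The dict behaviour discussed in this thread can be imitated with a
stand-in class. This is only a sketch under stated assumptions:
StrictBytes is hypothetical, not the real bytes type. It hashes like
the equivalent text and raises TypeError when compared with a str,
which is the combination being described. The lookup helper is also
hypothetical. The sketch relies on the dict comparing keys only when
their hashes match, as CPython's dict does.

```python
class StrictBytes:
    """Hypothetical stand-in: hashes like its text, refuses == with str."""

    def __init__(self, raw):
        self.raw = raw

    def __hash__(self):
        # Same hash as the equivalent (ASCII) string, by construction.
        return hash(self.raw.decode("ascii"))

    def __eq__(self, other):
        if isinstance(other, str):
            raise TypeError("cannot compare bytes-like key with str")
        return isinstance(other, StrictBytes) and self.raw == other.raw

def lookup(d, key):
    """Return d[key], or the name of the exception the lookup raised."""
    try:
        return d[key]
    except (KeyError, TypeError) as exc:
        return type(exc).__name__

table = {StrictBytes(b"abc"): 1}
print(lookup(table, "abc"))  # equal hashes force a comparison: TypeError
print(lookup(table, "xyz"))  # different hash, no comparison: KeyError
```

So the wrong-typed lookup of a key that "is" in the dict fails loudly,
while an unrelated key of the wrong type fails with a plain KeyError.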
--
--Guido van Rossum (home page: http://www.python.org/~guido/)