[Python-Dev] Counting collisions for the win

Guido van Rossum guido at python.org
Fri Jan 20 19:15:21 CET 2012


On Fri, Jan 20, 2012 at 5:10 AM, Barry Warsaw <barry at python.org> wrote:

> On Jan 20, 2012, at 01:50 PM, Victor Stinner wrote:
>
> >Counting collisions doesn't solve this case, but it doesn't make the
> >situation worse than before. Quickly raising an exception is better
> >than stalling for minutes, even if I agree that it is not the best
> >behaviour.
>
> ISTM that adding the possibility of raising a new exception on dictionary
> insertion is *more* backward incompatible than changing dictionary order,
> which for a very long time has been known to not be guaranteed.  You're
> running some application, you upgrade Python because you apply all security
> fixes, and suddenly you're starting to get exceptions in places you can't
> really do anything about.  Yet those exceptions are now part of the
> documented public API for dictionaries.  This is asking for trouble.  Bugs
> will suddenly start appearing in that application's tracker and they will
> seem to the application developer like Python just added a new public API
> in a security release.
>

Dict insertion can already raise an exception: MemoryError. I think we
should be safe if the new exception also derives from BaseException. We
should actually seriously consider just raising MemoryError, since
introducing a new built-in exception in a bugfix release is also very
questionable: code explicitly catching or raising it would not work on
previous bugfix releases of the same feature release.
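
To make the compatibility concern concrete, here is a minimal sketch, not a proposed implementation. The name TooManyCollisionsError is hypothetical (no such built-in exists), and the helper names are illustrative only. It uses the Python 3 `builtins` module to look the exception up by name, so the same code still runs on a release that lacks it:

```python
import builtins

# Hypothetical name: no such built-in exists; it stands in for whatever
# new exception a bugfix release might introduce.
NEW_EXCEPTION_NAME = "TooManyCollisionsError"


def resolve_exception(name, fallback=MemoryError):
    """Return the built-in exception called `name`, or `fallback` if this
    interpreter does not define it."""
    return getattr(builtins, name, fallback)


# Writing `except TooManyCollisionsError:` directly would raise NameError on
# any release that predates the new exception, but only when an exception
# actually reaches that clause, so the failure would show up late.
CollisionError = resolve_exception(NEW_EXCEPTION_NAME)


def guarded_insert(d, key, value):
    """Insert into `d`, reporting failure instead of propagating the error."""
    try:
        d[key] = value
    except CollisionError:
        return False
    return True
```

Reusing MemoryError avoids this dance entirely, because every existing release already defines it.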

> OTOH, if you change dictionary order and *that* breaks the application, then
> the bugs submitted to the application's tracker will be legitimate bugs that
> have to be fixed even if nothing else changed.
>

There are lots of things that are undefined according to the language spec
(and quite possibly known to vary between versions or platforms or
implementations like PyPy or Jython) but which we would never change in a
bugfix release.

> So I still think we should ditch the paranoia about dictionary order
> changing, and fix this without counting.  A little bit of paranoia could
> creep back in by disabling the hash fix by default in stable releases, but
> I think it would be fine to make that a compile-time option.


I'm sorry, but I don't want to break a user's app with a bugfix release and
say "haha your code was already broken you just didn't know it".

Sure, the dict order already varies across Python implementations, possibly
across 32/64 bits or operating systems. But many organizations (I know a
few :-) have a very large installed software base, created over many years
by many people with varying skills, that is kept working in part by very
carefully keeping the environment as constant as possible. This means that
the target environment is much more predictable than it is for the typical
piece of open source software.

Sure, a good Python developer doesn't write apps or tests that depend on
dict order. But time and again we see that not everybody writes perfect
code every time. Especially users writing "in-house" apps (as opposed to
frameworks shared as open source) are less likely to always use the most
robust, portable algorithms in existence, because they may know with much
more certainty that their code will never be used on certain combinations
of platforms. For example, I rarely think about whether code I write might
not work on IronPython or Jython, or even CPython on Windows. And if
something I wrote suddenly needs to be ported to one of those, well, that's
considered a port and I'll just accept that it might mean changing a few
things.
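
As an illustration of the kind of in-house code at stake, here is a small hypothetical sketch (the function names and data are invented for the example). The first function produces output that follows whatever order the dict happens to iterate in. The second sorts first, so its output does not depend on that order:

```python
def render_fragile(options):
    # Output follows the mapping's iteration order, which this code
    # silently relies on.
    return ",".join("%s=%s" % (k, v) for k, v in options.items())


def render_robust(options):
    # Sorting the items makes the output independent of iteration order.
    return ",".join("%s=%s" % (k, v) for k, v in sorted(options.items()))


a = {"host": "example", "port": 80}
b = {"port": 80, "host": "example"}

# The robust version gives the same string for equal dicts.
assert render_robust(a) == render_robust(b)
```

A test that compares the result of render_fragile against a fixed string is exactly the sort of thing that passes for years in a constant environment and then breaks when the hash function changes.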

The time to break a dependency on dict order is not with a bugfix release
but with a feature release: those are more likely to break other things as
well anyway, and users are well aware that they have to test everything and
anticipate having to fix some fraction of their code for each feature
release. OTOH we have established a long and successful track record of
conservative bugfix releases that don't break anything. (I am aware of
exactly one thing that was broken by a bugfix release in application code I
am familiar with.)

-- 
--Guido van Rossum (python.org/~guido)