[Python-Dev] Repeatability of looping over dicts

Fri Jan 4 23:54:49 CET 2008

On Jan 4, 2008 11:50 AM, A.M. Kuchling <amk at amk.ca> wrote:
> This post describes work aimed at getting Django to run on Jython:
> http://zyasoft.com/pythoneering/2008/01/django-on-jython-minding-gap.html
>
> One outstanding issue is whether to use Java's ConcurrentHashMap type
> to underlie Jython's dict type.  See
> <http://java.sun.com/j2se/1.5.0/docs/api/java/util/concurrent/ConcurrentHashMap.html>.
>
> ConcurrentHashMap scales better in the face of threading because it
> doesn't lock the whole table when updating it, but iterating over the
> map can return elements in a different order each time.  This would
> mean that list(dict_var) doesn't return keys in the same order as a
> later call to list(dict_var), even if dict_var hasn't been modified.
>
> Why?  Under the hood, there are 32 different locks, each guarding a
> subset of the hash buckets, so if there are multiple threads iterating
> over the dictionary, they may not go through the buckets in order.
> <http://www.ibm.com/developerworks/java/library/j-jtp08223/> discusses
> the implementation, at least in 2003.
>
> So, do Python implementations need to guarantee that list(dict_var) ==
> a later result from list(dict_var)?

What code would break if we loosened this restriction? I guess
defining d.items() as zip(d.keys(), d.values()) would no longer fly,
but does anyone actually depend on this? Just like we changed how we
think about auto-closing files once Jython came along, I think this is
at least worth considering.
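
To make the question concrete, here is a minimal sketch of the two
assumptions at stake. The helper names items_via_zip and is_repeatable
are made up for illustration and are not part of any library; the
checks are expected to hold on CPython for an unmodified dict, but
nothing here says other implementations must pass them.

```python
def items_via_zip(d):
    # Rebuild items() by pairing keys() with values().
    # Only correct if both calls walk the dict in the same order.
    return list(zip(d.keys(), d.values()))


def is_repeatable(d, passes=3):
    # Compare several successive iterations of an unmodified dict
    # against the first one.
    first = list(d)
    return all(list(d) == first for _ in range(passes))


d = {"spam": 1, "eggs": 2, "ham": 3}

# Assumption 1: keys() and values() line up pairwise with items().
pairs_match = items_via_zip(d) == list(d.items())

# Assumption 2: iterating twice without modification gives the same order.
repeatable = is_repeatable(d)
```

On an implementation whose iteration order may differ between passes,
either check could fail even though the dict was never modified, and
code relying on them would need to call items() directly or take a
snapshot with list(d) once and reuse it.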

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)