[Python-Dev] Repeatability of looping over dicts
Gregory P. Smith
greg at krypto.org
Sat Jan 5 07:17:35 CET 2008
On 1/4/08, A.M. Kuchling <amk at amk.ca> wrote:
> This post describes work aimed at getting Django to run on Jython:
> http://zyasoft.com/pythoneering/2008/01/django-on-jython-minding-gap.html
>
> One outstanding issue is whether to use Java's ConcurrentHashMap type
> to underly Jython's dict type. See
> <http://java.sun.com/j2se/1.5.0/docs/api/java/util/concurrent/ConcurrentHashMap.html>.
>
As a side note, you may also want to consider using a
NonBlockingHashMap as implemented by Cliff Click instead of
ConcurrentHashMap:
http://blogs.azulsystems.com/cliff/2007/03/a_nonblocking_h.html
http://blogs.azulsystems.com/cliff/2007/03/talking_with_go.html
http://video.google.com/videoplay?docid=2139967204534450862
http://sourceforge.net/projects/high-scale-lib
I haven't looked to see if it has the same issues repeatability issues
but it does scale even better.
Anyways, could you not use something with repeatability issues with a
wrapper that caches lists of keys to satisfy the repeatability?
(don't lock and maintain the list, just regenerate the list on demand
when a caller needs it and discard the cached list anytime a key is
added or deleted)
Also, should we drop this guarantee in py3k to give future
implementations more flexibility?
-gps
> ConcurrentHashMap scales better in the face of threading because it
> doesn't lock the whole table when updating it, but iterating over the
> map can return elements in a different order each time. This would
> mean that list(dict_var) doesn't return values in the same order as a
> later call to list(dict_var), even if dict_var hasn't been modified.
>
> Why? Under the hood, there are 32 different locks, each guarding a
> subset of the hash buckets, so if there are multiple threads iterating
> over the dictionary, they may not go through the buckets in order.
> <http://www.ibm.com/developerworks/java/library/j-jtp08223/> discusses
> the implementation, at least in 2003.
>
> So, do Python implementations need to guarantee that list(dict_var) ==
> a later result from list(dict_var)?
>
> --amk
More information about the Python-Dev
mailing list