[pypy-dev] 1.4.1 threading
p.giarrusso at gmail.com
Mon Dec 27 18:25:22 CET 2010
On Mon, Dec 27, 2010 at 09:31, Carl Friedrich Bolz <cfbolz at gmx.de> wrote:
> On 12/27/2010 07:00 AM, Dima Tisnek wrote:
>> (dan) random binary trees: O(log2(n)) is 7.2 for builtins, though the
>> constant factor for locks, etc, might make it worthwhile
>> (paolo) non blocking hash maps: memory barriers can be quite costly too
>> different options will have to be implemented and tested when we get there;
>> speaking of which, is there a test dict load?
>> does anyone have some profiling data, what portion of runtime is spent
>> reading and writing attributes and globals anyways?
>> while we are on the subject, is there a plan to provide different
>> levels of sparseness for different levels of name lookup?
>> for example globals vs builtins: the first needs to be quite sparse so
>> that builtins show through well, the second hardly at all, because there's
>> nowhere else to look if builtins don't have the name.
>> then the tradeoff could be more dynamic, that is, frequently accessed
>> dicts could be e.g. more sparse and rarely accessed ones more compact?
>> (obviously more sparse is not always better, e.g. in terms of cpu cache)
>> of course "frequently accessed" is not as simple as frequently run code;
>> e.g. here "class A: ...; while 1: A().func()", func lookup occurs in
>> different objects, yet it is the same logical operation.
>> come to think of it, is there any point in polymorphic dicts? e.g.
>> attribute access could be implemented as a near-perfect compact hash
>> map if attribute names change rarely, while regular dict()s are
>> expected to change keys often.
> All these thoughts go into the wrong direction, imo. The JIT removes
> nearly all dictionary accesses to global dicts, instance and class
> dicts. Even without the JIT, purely interpreting things, there are
> caches that bypass most dict lookups.
That's very interesting. However, aren't such caches also hash maps in
the end (unless you do inline caching in your interpreter, like in
Ruby 1.9)? I remember reading so in PyPy's docs; moreover, that's a
standard optimization for method lookup.
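To make the kind of cache under discussion concrete: a common non-inline variant is a small fixed-size table keyed by (type, name) and invalidated by a per-type version tag, in the spirit of CPython's method cache. The sketch below is illustrative only; all names are assumptions, not PyPy's actual code:

```python
# A fixed-size lookup cache: hits avoid walking the MRO dicts entirely;
# mutating a type bumps its version tag, which invalidates stale entries
# lazily (no need to scan the table on writes).

CACHE_SIZE = 64
cache = [None] * CACHE_SIZE   # entries: (version, type, name, value)
versions = {}                 # type -> version tag, bumped on mutation

def bump_version(tp):
    versions[tp] = versions.get(tp, 0) + 1

def cached_lookup(tp, name):
    version = versions.setdefault(tp, 0)
    slot = (hash(tp) ^ hash(name)) % CACHE_SIZE
    entry = cache[slot]
    if entry is not None:
        v, t, n, value = entry
        if v == version and t is tp and n == name:
            return value      # hit: no dict lookups along the MRO
    # miss: do the full lookup and fill the slot
    for klass in tp.__mro__:
        if name in klass.__dict__:
            value = klass.__dict__[name]
            cache[slot] = (version, tp, name, value)
            return value
    raise AttributeError(name)

class A:
    def f(self):
        return 42

assert cached_lookup(A, "f")(A()) == 42
assert cached_lookup(A, "f")(A()) == 42  # second call hits the cache
```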
Sharing such caches between different interpreter threads would be
potentially useful, provided it implied no expensive synchronization
for readers - which is possible, for instance, on x86 (where read
barriers are free) by using persistent data structures. Writes to such
caches should (hopefully) be rare enough to make the extra
synchronization affordable, but this needs benchmarking.
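As a rough sketch of the copy-on-write scheme I have in mind (illustrative only, not a proposed implementation): readers follow a single reference to an immutable snapshot with no lock, while the rare writer builds a new table and swaps the reference. In CPython terms this is safe because reference assignment is atomic; on x86 the reader side needs no explicit barrier.

```python
import threading

class PersistentCache:
    """Lock-free reads via an immutable snapshot; writers copy-on-write."""

    def __init__(self):
        self._snapshot = {}            # treated as immutable once published
        self._write_lock = threading.Lock()

    def get(self, key, default=None):
        # Lock-free read: a single load of the current snapshot reference.
        return self._snapshot.get(key, default)

    def put(self, key, value):
        # Writers serialize among themselves and publish a fresh copy;
        # readers never observe a half-updated table.
        with self._write_lock:
            new = dict(self._snapshot)
            new[key] = value
            self._snapshot = new       # atomic reference swap

cache = PersistentCache()
cache.put("len", len)
assert cache.get("len") is len
assert cache.get("missing", "fallback") == "fallback"
```

The design trades write cost (a full copy per update) for zero reader overhead, which is exactly the bet that cache writes are rare.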
Paolo Giarrusso - Ph.D. Student