[Python-Dev] PEP 509: Add a private version to dict

Brett Cannon brett at python.org
Wed Jan 20 21:04:58 EST 2016


On Wed, 20 Jan 2016, 17:54 Andrew Barnert <abarnert at yahoo.com> wrote:

> On Wednesday, January 20, 2016 4:10 PM, Brett Cannon <brett at python.org>
> wrote:
>
>
> >I think Glenn was assuming we had a single, global version # that all
> dicts shared without having a per-dict version ID. The key thing here is
> that we have a global counter that tracks the number of mutations for all
> dictionaries but whose value we store as a per-dictionary value. That
> ends up making the version ID inherently both a token representing the
> state of a given dict and a marker of that dict's uniqueness, since no
> two dictionaries will ever have the same version ID.
>
> This idea worries me. I'm not sure why, but I think because of threading.
> After all, it's pretty rare for two threads to both want to work on the
> same dict, but very, very common for two threads to both want to work on
> _any_ dict. So, imagine someone manages to remove the GIL from CPython by
> using STM: now most transactions are bumping that global counter, meaning
> most transactions fail and have to be retried, so you end up with 8 cores
> each running at 1/64th the speed of a single core but burning 100% CPU.
> Obviously a real-life implementation wouldn't be _that_ stupid; you'd
> special-case the version-bumping (maybe unconditionally bump it N times
> before starting the transaction, and then as long as you don't bump more
> than N times during the transaction, you can commit without touching it),
> but there's still going to be a lot of contention.
>

This is all being regarded as an implementation detail of CPython, so in
this hypothetical STM world we can drop all of this (or lock it).
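
To make that concrete, here's a rough pure-Python model of the idea
(purely illustrative -- PEP 509 itself stores the version in C on the
dict object, and VersionedDict/_version are names I just made up):

_global_version = 0  # bumped on every mutation of *any* dict

class VersionedDict(dict):
    """Toy model: stamp the dict with a globally increasing version."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._bump()

    def _bump(self):
        # No locking here -- this is exactly the global contention point
        # raised above; CPython gets away with it thanks to the GIL.
        global _global_version
        _global_version += 1
        self._version = _global_version

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self._bump()

    def __delitem__(self, key):
        super().__delitem__(key)
        self._bump()

    # update()/pop()/clear()/etc. omitted for brevity.

# Because every mutation of every dict draws from the same counter, a
# stored version identifies both a dict's state and the dict itself:
d1, d2 = VersionedDict(a=1), VersionedDict(a=1)
assert d1 == d2 and d1._version != d2._version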


> And that also affects something like PyPy being able to use
> FAT-Python-style AoT optimizations via cpyext. At first glance that sounds
> like a stupid idea--why would you want to run an optimizer through a slow
> emulator? But the optimizer only runs once and transforms the function
> code, which runs a zillion times, so who cares how slow the optimizer is?
> Of course it may still be true that many of the AoT optimizations that FAT
> makes don't apply very well to PyPy, in which case it doesn't matter. But I
> don't think we can assume that a priori.
>
> Is there a way to define this loosely enough so that the implementation
> _can_ be a single global counter, if that turns out to be most efficient,
> but can also be a counter per dictionary and a globally-unique ID per
> dictionary?
>

There's no need to if this is all under the hood and in no way affects
anyone but the eval loop and those who choose to use it. We can make sure
to prefix all of this with underscores so it's obvious it's private and
used at your own peril.
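
For anyone curious what a consumer of such a private version would look
like, here's a hypothetical sketch building on the VersionedDict toy
above (FAT Python's real guards would check the C-level field instead):

class CachedLookup:
    """Cache namespace[name]; revalidate only when the version moves."""

    def __init__(self, namespace, name):
        self.namespace = namespace        # e.g. a module's globals
        self.name = name
        self.cached_version = None        # nothing cached yet
        self.cached_value = None

    def get(self):
        version = self.namespace._version   # private: use at your own peril
        if version != self.cached_version:  # dict mutated since last call
            self.cached_value = self.namespace[self.name]
            self.cached_version = version
        return self.cached_value

ns = VersionedDict(x=1)
lookup = CachedLookup(ns, "x")
assert lookup.get() == 1    # first call fills the cache
ns["x"] = 2                 # mutation bumps the version
assert lookup.get() == 2    # stale cache detected and refreshed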
