[Python-Dev] PEP 509: Add a private version to dict

Yury Selivanov yselivanov.ml at gmail.com
Wed Jan 20 13:45:51 EST 2016


Brett,

On 2016-01-20 1:22 PM, Brett Cannon wrote:
>
>
> On Wed, 20 Jan 2016 at 10:11 Yury Selivanov <yselivanov.ml at gmail.com> wrote:
>
>     On 2016-01-18 5:43 PM, Victor Stinner wrote:
>     > Is someone opposed to this PEP 509?
>     >
>     > The main complaint was the change to the public Python API, but
>     > the PEP doesn't change the Python API anymore.
>     >
>     > I'm not aware of any remaining issue on this PEP.
>
>     Victor,
>
>     I've been experimenting with the PEP to implement a per-opcode
>     cache in the ceval loop (I'll share my progress on that in a few
>     days).  This makes it possible to significantly speed up the
>     LOAD_GLOBAL and LOAD_METHOD opcodes, to the point where they
>     don't require any dict lookups at all.  Some macro-benchmarks
>     (such as chameleon_v2) show an impressive ~10% performance boost.
>
>
> Ooh, now my brain is trying to figure out the design of the cache. :)

Yeah, it's tricky.  I'll need some time to draft a comprehensible
overview.  And I want to implement a couple more optimizations and
benchmark it better.

BTW, I have some updates for the benchmark suite (an html5lib
benchmark for py3, new benchmarks for calling C methods, and I
want to port some PyPy benchmarks).  Should I just commit them,
or should I use bugs.python.org?

>
>     I rely on your dict->ma_version to implement cache invalidation.
>
>     However, besides guarding against version change, I also want
>     to guard against the dict being swapped for another dict, to
>     avoid situations like this:
>
>
>          def foo():
>              print(bar)
>
>          exec(foo.__code__, {'bar': 1}, {})
>          exec(foo.__code__, {'bar': 2}, {})
>
>
>     What I propose is to add a pointer "ma_extra" (same 64 bits),
>     which will be set to NULL for most dict instances (instead of
>     ma_version).  "ma_extra" can then point to a struct that has a
>     globally unique dict ID (uint64) and a version tag (uint64).
>     Macros like PyDict_GET_ID and PyDict_GET_VERSION could then
>     efficiently fetch the version/unique ID of the dict for guards.
>
>     "ma_extra" would also make it easier for us to extend dicts
>     in the future.
>
>
> Why can't you simply use the id of the dict object as the globally 
> unique dict ID? It's already globally unique amongst all Python 
> objects which makes it inherently unique amongst dicts.

We have a freelist for dicts -- so if the dict dies, a new dict
could be created in its place (at the same address, and therefore
with the same id), with the same ma_version.

While the probability of such hiccups is low, we still have
to account for it.
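
As a toy illustration of why the guard needs both a unique dict ID and
a version, here is a rough pure-Python model.  It is only a sketch under
stated assumptions: TrackedDict, GlobalCache, uid and version are
hypothetical names made up for this example, the ID comes from a
counter that is never reused (unlike a memory address), and none of
this is the actual C implementation in ceval.

```python
import itertools

# Hypothetical global counter: IDs are handed out once and never reused,
# unlike id(), which is a memory address that a later dict may occupy.
_next_dict_id = itertools.count(1)


class TrackedDict:
    """Toy stand-in for a dict carrying a unique ID and a version tag."""

    def __init__(self, data):
        self._data = dict(data)
        self.uid = next(_next_dict_id)  # models the globally unique dict ID
        self.version = 0                # models the version tag

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        self._data[key] = value
        self.version += 1               # every mutation bumps the version


class GlobalCache:
    """Toy cache entry for a single LOAD_GLOBAL-like lookup."""

    def __init__(self, name):
        self.name = name
        self.guard = None               # (uid, version) seen when value was cached
        self.value = None
        self.hits = 0

    def load(self, namespace):
        guard = (namespace.uid, namespace.version)
        if guard == self.guard:
            self.hits += 1
            return self.value           # fast path: no dict lookup
        self.value = namespace[self.name]  # slow path: real lookup
        self.guard = guard
        return self.value


cache = GlobalCache("bar")
g1 = TrackedDict({"bar": 1})
g2 = TrackedDict({"bar": 2})

# Same "code object", two different globals dicts, both at version 0.
results = [cache.load(g1), cache.load(g1), cache.load(g2)]

g2["bar"] = 3                           # mutation invalidates the cached entry
results.append(cache.load(g2))
print(results, cache.hits)
```

Both g1 and g2 start at version 0, so a guard that compared only the
version would return the stale value cached for g1 when handed g2.
Including the never-reused ID in the guard is what forces the slow path
in that case.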

Yury


