On Wed, 20 Jan 2016 at 10:46 Yury Selivanov <yselivanov.ml@gmail.com> wrote:
Brett,
On 2016-01-20 1:22 PM, Brett Cannon wrote:
On Wed, 20 Jan 2016 at 10:11 Yury Selivanov <yselivanov.ml@gmail.com> wrote:
On 2016-01-18 5:43 PM, Victor Stinner wrote:
> Is someone opposed to this PEP 509?
>
> The main complaint was the change on the public Python API, but the PEP
> doesn't change the Python API anymore.
>
> I'm not aware of any remaining issue on this PEP.
Victor,
I've been experimenting with the PEP to implement a per-opcode cache in the ceval loop (I'll share my progress on that in a few days). This makes it possible to significantly speed up the LOAD_GLOBAL and LOAD_METHOD opcodes, to the point where they don't require any dict lookups at all. Some macro-benchmarks (such as chameleon_v2) demonstrate an impressive ~10% performance boost.
Ooh, now my brain is trying to figure out the design of the cache. :)
Yeah, it's tricky. I'll need some time to draft a comprehensible overview. And I want to implement a couple more optimizations and benchmark it better.
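To give a very rough idea of the direction (just a Python-level model; the actual cache is per-opcode C data inside the eval loop, and the names below are made up for illustration): each LOAD_GLOBAL site remembers the value it resolved last time together with the globals dict's version tag, and only does a real lookup when the tag has changed.

# Purely a Python-level model of the idea -- not the actual implementation.
class LoadGlobalCache:
    def __init__(self):
        self.version = None   # ma_version observed at the last real lookup
        self.value = None     # value resolved by that lookup

    def lookup(self, globals_dict, current_version, name):
        if self.version == current_version:
            return self.value           # fast path: no dict lookup at all
        value = globals_dict[name]      # slow path: ordinary dict lookup
        self.version = current_version
        self.value = value
        return value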
BTW, I have some updates (html5lib benchmark for py3, new benchmarks for calling C methods, and I want to port some PyPy benchmarks) to the benchmarks suite. Should I just commit them, or should I use bugs.python.org?
I actually emailed speed@ to see if people were interested in finally sitting down with all the various VM implementations at PyCon and trying to come up with a reasonable base set of benchmarks that better reflect modern Python usage, but I never heard back. Anyway, issues on bugs.python.org are probably best to talk about new benchmarks before adding them (fixes and updates to pre-existing benchmarks can just go in).
I rely on your dict->ma_version to implement cache invalidation.
However, besides guarding against version change, I also want to guard against the dict being swapped for another dict, to avoid situations like this:
def foo(): print(bar)
exec(foo.__code__, {'bar': 1}, {})
exec(foo.__code__, {'bar': 2}, {})
What I propose is to add a pointer "ma_extra" (same 64 bits), which will be set to NULL for most dict instances (instead of ma_version). "ma_extra" can then point to a struct that has a globally unique dict ID (uint64) and a version tag (uint64). A macro like PyDict_GET_ID and PyDict_GET_VERSION could then efficiently fetch the version/unique ID of the dict for guards.
"ma_extra" would also make it easier for us to extend dicts in the future.
Why can't you simply use the id of the dict object as the globally unique dict ID? It's already globally unique amongst all Python objects, which makes it inherently unique amongst dicts.
We have a freelist for dicts -- so if the dict dies, there could be a new dict in its place, with the same ma_version.
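For example (CPython-specific, and the collision isn't guaranteed, but it happens easily in practice):

d1 = {'bar': 1}
old_id = id(d1)
del d1                     # the dict goes back to the freelist
d2 = {'bar': 2}            # the freelist may hand back the same memory block
print(id(d2) == old_id)    # often True on CPython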
Ah, I figured it would be too simple to use something we already had.
While the probability of such hiccups is low, we still have to account for it.
Yep.