[Python-Dev] PEP 550 leak-in vs leak-out, why not just a ChainMap

Jim J. Jewett jimjjewett at gmail.com
Thu Aug 24 10:05:05 EDT 2017


On Thu, Aug 24, 2017 at 1:12 AM, Yury Selivanov > On Thu, Aug 24, 2017
at 12:32 AM, Jim J. Jewett <jimjjewett at gmail.com> wrote:

> The key requirement for using immutable datastructures is to make
> "get_execution_context" operation fast.

Do you really need the whole execution context, or do you just need
the current value of a specific key?  (Or, sometimes, the writable
portion of the context.)


>  Currently, the PEP doesn't do
> a good job at explaining why we need that operation and why it will be
> used by asyncio.Task and call_soon, so I understand the confusion.

OK, the schedulers need the whole context, but if implemented as a
ChainMap (instead of per-key), isn't that just a single constant?  As
in, don't they always schedule to the same thread?  And when they need
another map, isn't that because the required context is already
available from whichever code requested the scheduling?

>> (A)  How many values do you expect a typical generator to use?  The
>> django survey suggested mostly 0, sometimes 1, occasionally 2.  So
>> caching the values of all possible keys probably won't pay off.

> Not many, but caching is still as important, because some API users
> want the "get()" operation to be as fast as possible under all
> conditions.

Sure, but only because they view it as a hot path; if the cost of that
speedup is slowing down another hot path, like scheduling the
generator in the first place, it may not be worth it.

According to the PEP timings, HAMT doesn't beat a copy-on-write dict
until over 100 items, and never beats a regular dict.    That suggests
to me that it won't actually help the overall speed for a typical (as
opposed to worst-case) process.

>> And, of course, using a ChainMap means that the keys do NOT have to be
>> predefined ... so the Key class really can be skipped.

> The first version of the PEP had no ContextKey object and the most
> popular complaint about it was that the key names will clash.

That is true of any global registry.  Thus the use of keys with
prefixes like com.sun.

The only thing pre-declaring a ContextKey buys in terms of clashes is
that a sophisticated scheduler would have less difficulty figuring out
which clashes will cause thrashing in the cache.

Or are you suggesting that the key can only be declared once (as
opposed to once per piece of code), so that the second framework to
use the same name will see a RuntimeError?

-jJ


More information about the Python-Dev mailing list