[Python-Dev] Temporary Constantification

Eric Sumner kd5bjo at gmail.com
Mon Jun 26 00:12:52 CEST 2006


It seems to me that the main reason to provide constantification (in a
switch statement or anywhere else) is to be able to optimize execution
by caching the result locally for later use.  The problem comes from
trying to determine exactly when and how a value gets calculated, as
most values in Python are subject to change.

If, however, there was a mechanism by which a cache could be
invalidated when its value changes, the only remaining difference
between cached and non-cached values becomes their execution speed and
memory usage (and possibly impact on the execution speed of other
code).  Thus, I propose a 'cached' keyword much like the static and
const proposed keywords.

In general, a cached value can be used (rather than re-evaluating the
expression) if:
  - The expression has no side effects,
  - The result of all operations is deterministic, and
  - None of the expression parameters have changed since the cached
value was generated

The first two can be determined at compile-time without too much
difficulty (except possibly function calls, I'll get to those in a
minute).  The hard issue here is knowing when parameters have changed.
 These fall into two different categories: literals and name lookups.
Immutable literals don't cause a problem, and mutable literals always
have a side-effect of generating a new object.  There are two ways to
determine if name lookups have changed:
  1) Cache all the parameters, and check them against the current values, or
  2) Be notified whenever one of the parameters changes.

The first option requires a bunch of name lookups whenever the cached
value is considered, which is exactly the problem that caching is
supposed to fix.  To implement the second, each name in each namespace
needs a list of caches that depend on the name, and all name binding
operations need to check the list and mark all dependent caches
invalid.  This is a small performance penalty whenever any name is
rebound, and a large penalty whenever a watched name is rebound.

Function calls can safely be considered volatile, but this would
invalidate many of the uses for caching.  Instead, each function has
an immutable property of being either volatile or deterministic.  Each
deterministic function call maintains its own cache which is
invalidated if  the name to which the associated function (or any of
its parameters) is rebound.  Thus, if a function is rebound to
something volatile, it does not force continual re-evaluation of other
sub-expressions.  Functions should be assumed to be volatile unless
specified otherwise (probably via a decorator).

I'm not particularly familiar with the internals of Python, so I'm not
able to actually assess the feasability or performance implications of
this proposal, but I think it basically covers the problem.

  -- Eric Sumner


More information about the Python-Dev mailing list