On Fri, Jan 18, 2013 at 5:06 PM, Haoyi Li wrote:
Compiler-enforced immutability is one of those really hard problems which, if you manage to do it flexibly and correctly, would be an academically publishable result, not something you hack into the interpreter over a weekend.
If you go the dumb-and-easy route, you end up with a simple "substitute this variable with a constant" pass, which isn't very useful (what about calculated constants?).
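To see why the dumb route falls short, here is a minimal sketch of such a substitution pass (the names and values are made up for illustration, not from any real proposal):

```python
# A minimal sketch (hypothetical names throughout) of the
# "substitute this variable with a constant" approach: a
# source-to-source pass that swaps a known name for its literal text.
import re

CONSTANTS = {"MAX_USERS": "100"}  # only plain literals can live here

def substitute(source):
    # Blindly replace each constant name with its literal spelling.
    for name, literal in CONSTANTS.items():
        source = re.sub(r"\b%s\b" % name, literal, source)
    return source

print(substitute("limit = MAX_USERS"))  # prints: limit = 100
# But a *calculated* constant, e.g. MAX_BYTES = MAX_USERS * 4096,
# has no literal to substitute -- the pass either gives up or grows
# its own evaluator, which is exactly the mini-language trap.
```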
If you go the slightly-less-dumb route, you end up with some mini-language for working with these `const` values, one with some operations but not the full power of Python. This basically describes C macros, which I don't think you'd want to include in Python!
If you go the "full python" route, you basically branch into two possibilities.
- enforcement of `const` as part of the main program. If you do it hackily, you end up with C++'s `const` or Java's `final` declaration. Neither of these really makes the object (and all of its contents!) immutable. If you want to do it properly, it involves some sort of effect-tracking system. This is really hard.
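  The shallowness of such declarations is easy to demonstrate with what Python already has: a tuple is an immutable collection of references, but it does nothing to freeze the objects behind those references (the data below is purely illustrative):

```python
# Shallow "constness" in today's Python: the tuple itself can't be
# modified, but the objects it refers to stay mutable -- the same
# loophole that Java's final and a top-level C++ const leave open.
CONFIG = ("prod", ["db1", "db2"])  # illustrative data

try:
    CONFIG[0] = "test"             # replacing an element is blocked...
except TypeError as exc:
    print("blocked:", exc)

CONFIG[1].append("db3")            # ...but mutating contents is not
print(CONFIG[1])                   # the "constant" changed underneath us
```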
- multi-stage computation, so the program is partially evaluated at "compile" time and the `const` sections computed then. This is also really hard. Furthermore, if you want to be able to use bits of the standard library in the early stages (you probably do, e.g. for things like min, max and len), either you'd need to manually annotate huge chunks of the standard library as available at "compile" time (a huge undertaking), or you'd need an effect-tracking system to do it for you.
In any case, either you get a crappy implementation that nobody wants (C macros), something that doesn't really give the guarantees you'd hope for (Java final / C++ const), or a publishable result w.r.t. either effect tracking (!) or multi-stage computation (!!!).
Even though it is very easy to describe the idea (it just stops the value from changing, duh!) and how it would work in a few trivial cases, doing it properly will likely require some substantial theoretical breakthroughs before it can actually happen.
As James noted, lack of a good answer to this problem is part of the reason Python doesn't have a switch/case statement [1,2] (only part, though).

We already have three interesting points in time where evaluation can happen in Python code:

- compile time (evaluation of literals, including tuples of literals)
- function definition time (evaluation of decorator expressions, annotations and default arguments, along with decorator invocation)
- execution time (normal execution - in the case of functions, function definition time occurs during the execution time of the containing scope)

We know from experience with default arguments that people find evaluation at function definition time *incredibly* confusing, because it means a data value is shared across calls to the function. You can try to limit this by saying "immutable values only", but then you run into the problem that dynamic name lookups mean only literals can be considered truly constant, and those are *already* evaluated (and sometimes folded together) at compile time:
>>> def f():
...     return 2 * 3
...
>>> dis.dis(f)
  2           0 LOAD_CONST               3 (6)
              3 RETURN_VALUE
(The constant folding in CPython isn't especially clever, but that's an implementation issue - the language spec already *allows* such folding, we just don't always detect when it's possible.)

So, once you allow name lookups, the question then becomes what namespace they run in. If you say "the containing namespace", then you get a few interesting consequences:

1. We're in the same, already known to be confusing, territory as function default arguments.
2. The behaviour of the new construct at module and class level will necessarily be different to that at function level.
3. Quality of error messages and tracebacks will be a potential issue for debugging.
4. When two of these constructs exist in the same scope, is the later one allowed to refer to the earlier one?

Now we get to the meat of James's suggestion, and while I think it's a pretty decent take on the "multi-stage evaluation" proposal, it still runs afoul of many of the same problems past proposals [3] have struggled with:

1. Name binding operations other than assignment (e.g. import, function and class definitions)
2. Handling of name binding in nested functions
3. Handling of references to previous early evaluation operations
4. Breaking expectations regarding dynamic modification of module globals
5. Finding a good keyword is hard - suitable terms are either widely used as variable names, or have too much misleading baggage from other languages

I can alleviate the concerns about making other components available at compile time, though - if this construct was defined appropriately, Python would be able to happily import, compile and execute other modules during a suitable "pre-execution" phase.

The real kicker, though, is that after all that work, you'll have to ask two questions:

1. Does this change help Python users write more readable code?
2. Does this change help JIT-compiled Python code (e.g. in PyPy) run faster?
(PyPy's JIT can often identify near-constants and move their calculation out of any frequently executed code paths.)

If the answer to that turns out to be "No to both, but it will help CPython, which has no JIT, run some manually annotated code faster", then it's a bad idea (it's not an *obviously* bad idea - just one that is a lot trickier than it may first appear).

Cheers,
Nick.

[1] http://www.python.org/dev/peps/pep-0275/
[2] http://www.python.org/dev/peps/pep-3103/
[3] https://encrypted.google.com/search?q=site%3Amail.python.org%20inurl%3Apytho...

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia