[Python-Dev] The Need for a Declarative Syntax Element (aka Getting PEP 318 back on track)

Mike Pall mikepy-0404 at mike.de
Mon Apr 5 02:50:55 EDT 2004


I wanted to share my thoughts about the issue and wrote the following essay.

Please note: all of this is IMHO of course. Nothing is cast in stone.
I'm very willing to change it to whatever decisions the discussion leads to.

Sorry about the length of this posting. :)

Any and all comments are appreciated. Please send your comments to
the list only, to avoid duplicate copies to my mailbox. Thank you!


The Need for a Declarative Syntax Element in Python

(aka "Getting PEP 318 back on track")

Author: Mike Pall
Release: 2004-04-05

1. Introduction
1.1 Clean and Lean
1.2 Sugar is Mean
1.3 Pep up Your Life

2. Getting it Straight
2.1 Terminology (n); cf. lack of ~
2.2 I Hereby Declare ...
2.3 Definitions
2.4 Now, what?

3. Foreign Territories
3.1 Big Brother in Action: C# Attributes
3.1.1 Syntax
3.1.2 Semantics
3.1.3 Use Cases
3.2 Catch Up, Baby: Java Annotations
3.2.1 Syntax
3.2.2 Semantics
3.2.3 Use Cases With JSR-175 Annotations
3.2.4 Use Cases With javadoc Annotations

4. Pythonic (Ab)use Cases
4.1 Assorted Attributes
4.2 Roaring Registries
4.3 Witty Wrappers
4.4 Proper Properties
4.5 To Sync or Not to Sync
4.6 Lexical Liberty

5. Semantic Wonderland
5.1 What?
5.2 How?
5.2.1 Calling the DO
5.2.2 Using __declare__
5.2.3 Passing The Context
5.3 When?
5.4 What else?

6. Syntax, Syntax on the Wall
6.1 Wishful Thinking
6.2 Round Up The Candidates
6.3 Narrowing the Candidate Set
6.4 And The Winner is ...
6.5 ASCII Art
6.6 Finally

1. Introduction

1.1 Clean and Lean

Python is an imperative language and has very few declarative elements
embedded down in the language specification. And most of them are in the
form of explicit statements (like "def", "class" or "global"). Only a few
are implicit (that's why some people have mixed feelings about "yield").
There are only two declarative delimiters ("*" and "**" in parameter lists)
and there are no non-statement declarative keywords (owing to the fact
that Python has pure dynamic typing).

This is in fact good. That's why we all love Python so much: the syntax
is clean and lean!

Just compare what other languages have embedded deep down in their
lexical analyzer and their grammar: type definitions in C, the massive
lexical and grammatical overloading of "*" and "&" in C++, scope-sensitive
keywords like "static", "volatile" or "private" in C, C++ or Java.
Some of that is pure necessity in languages with static typing, to save
some typing (no pun). But most of it can be described with just one
word: yuck!

Python has (so far) successfully avoided the syntax inflation. Even
classic OO stuff like "classmethod" or "staticmethod" are just builtin
types. You have to invoke (instantiate) them yourself in the body of
a class definition and modify the binding to the method name by assigning
to the class dictionary. Alas that works only *after* the method definition
is done, i.e. it has to be after the method body.
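
To make the rebinding concrete, here is a minimal sketch of the idiom as
it stands today. The class and method names (Counter, bump, add) are made
up for illustration only.

```python
class Counter(object):
    count = 0

    def bump(cls):
        # Receives the class (not an instance) once it is wrapped below.
        cls.count += 1
        return cls.count
    # The wrapping can only be written after the method body is complete.
    bump = classmethod(bump)

    def add(a, b):
        return a + b
    add = staticmethod(add)
```

Note that the reader learns what kind of method "bump" is only after
having read its whole body.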

1.2 Sugar is Mean

Unfortunately simplicity comes at a price: a severe shortage of syntactic
sugar.

This is not necessarily a bad thing as the instance reference dilemma shows:
Some languages use implicit scoping (C++, Java) and provide overrides
("this"). Others use explicit delimiters ("@" in Ruby).

Python does neither and leaves the name of the identifier for the instance
reference up to the programmer (though "self" is a well established
convention). In this case most of us are happy about it, because "Explicit
is better than Implicit" holds here (you may object that "@" is explicit
enough, but well ... then it's implicit in the method definition).

However other issues leave something to be desired. The pressure to
get a nicer syntax for "classmethod" and "staticmethod" has been
traditionally rather low. Probably because neither is *that* common
(though this may depend on your programming style). And this is where
PEP 318 comes into play ...

1.3 Pep up Your Life

The proposal of PEP 318 has resulted in a flurry of activity on the
python-dev mailing list, culminating during February and March 2004.
Even shouting and mutual accusations have been reported. Most everyone
would be happy to get back to real work, as the subject has been
creating an enormous amount of traffic but little consensus.

PEP 318 goes a bit further than simply proposing a nicer syntax for
"classmethod" and "staticmethod". It proposes a generic 'decorator'
syntax that allows you to write your own decorator functions. And you
get to see the decorator definition *before* the method body.

Most of the discussion has centered around what the most desirable syntax
would be. However everyone's definition of 'desirable' differs remarkably.
And the lack of convincing use cases hasn't helped to keep the discussion
on track either.

The early syntax proposals have some similarity with the C# language
feature "[Attribute] definition". The feature is called C# attributes but
cannot be compared directly to either decorators or Python attributes.

That of course sparked some discussion about using decorators for defining
attributes, but this is an entirely different matter.

A few regularly scheduled syntactic duels later, we are still where we
started: a solution in search of its problem. Knowing we want 'something',
but not spelling it out.

My humble hope is that this document helps to improve the quality of
the ongoing discussion.

2. Getting it Straight

2.1 Terminology (n); cf. lack of ~

All this talk about decorators and attributes is really missing the point.

Well, the word 'attribute' suffers severe proliferation in computer science.
Everyone seems to have a different opinion on its meaning. The Pythonic
definition is at best misleading when it comes to our discussion. And
a 'decorator' is a software design pattern -- this is not a language issue.

But if we want to extend syntax we _have_ to talk about language. Computer
language design, that is. So let me rephrase our desire in proper terms:

==> We want to define a new kind of DECLARATIVE SYNTAX ELEMENT for Python.

Now that it's out of the hat, you can gain a new appreciation of the
preceding discussion. Part of the problem: naming conveys meaning.
Not having a proper name for something will lead you astray.

2.2 I Hereby Declare ...

I sure don't want to patronize the language cracks out there. You all know
very well what declarative vs. imperative means.

However giving some examples in context is always useful. So here are
a few of the declarative notions we would like to express, phrased
in natural language statements:

  "This is a classmethod."
  "This function needs to be run at exit."
  "This is an abstract class."
  "This is the documentation for a class attribute."
  "This method has been written by John Doe."
  "This class implements the Foo interface."
  "This method implements the Visitor pattern for this class."
  "This parameter must be an integral type."
  "This compound block shall be synchronized by a lock."
  "This statement should be suppressed when DEBUG is not defined at runtime."

All of these statements _declare_ that some other language element
(class, method, function, class attribute ...) has a specific property.

Usually they tell you something about the _effect_ of that declaration,
but most of this knowledge is implied (you ought to know what happens
when you use such a statement).

However in general they do NOT tell you _how_ this effect is accomplished
(i.e. implemented). Nor should they. Declarative syntax owes much of its
power to the fact that most of the work goes on behind the scenes. Someone
else has of course written up in glorious detail how that is to happen.
But you (the user of such a declaration) just get to say what you _want_.
Not more, but no less.

BTW: the word 'statement' in this context means 'natural language statement'.
     In computer languages we have more degrees of freedom, like keywords
     or delimiters. That's why I called it a declarative syntax _element_.
     The discussion about syntactic alternatives deserves an entire section
     of its own (see below).

Ok ... back to our high tech adventure.

2.3 Definitions

Declarative Syntax Element (DSE)
  The syntax element includes the Declaration itself and any lexical tokens
  that are required to make it recognizable as such by the parser. It can
  be either a grammatical production or a compound lexical token. This may
  be a minor issue from a visual point of view, but has subtle implications
  when it comes to applicability of the syntax element.

Declaration
  The content of the DSE. As far as the grammar goes it could be anything
  from an identifier up to a suite. It could even be in a language of its
  own (don't do that). However looking ahead to the section on semantics
  it makes most sense to use an expression.
  The term 'Declaration' is a bit too generic and would apply to other
  syntactic elements, too (e.g. a method definition includes a declaration).
  Thus it is suggested that the more specific term be used everywhere.

Declarative Expression (DE)
  The most likely syntactical choice for a Declaration. Any standard Python
  expression is possible here.

Declarative Object (DO)
  The result of evaluating a Declarative Expression.

Target of the DSE
  The grammatical element that is the target of the DSE. This could be
  anything from a method definition down to a statement or even an
  expression. The exact range of grammatical variety that could or should
  be permissible is discussed in the sections on syntax and semantics.

Processing of the DSE
  Processing of the DSE involves several steps: parsing the DSE,
  Compilation of the DE, Binding to the Target, Evaluating the DE and
  Applying the DO. These steps may or may not occur in that order.
  Each step may be assigned to a different processing phase.
  The important decision to make is the exact point in time _when_ each
  step is done.

Binding to the Target
  The process of binding either the DE or the DO to the Target.

Evaluation of the DE
  The process of evaluating a DE gives us a DO. This however does not (yet)
  affect the Target.

Applying the DO
  The process of applying the DO to the Target of the DSE.
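
The steps defined above can be walked through in plain Python. This is
only a sketch under two assumptions that the definitions do not fix:
that the DE arrives as source text, and that "applying" a DO means
calling it with the Target. The names process_dse and tag are
hypothetical.

```python
def process_dse(de_source, target, namespace):
    """Walk one DSE through the processing steps (sketch only)."""
    code = compile(de_source, "<dse>", "eval")  # Compilation of the DE
    do = eval(code, namespace)                  # Evaluation of the DE gives the DO
    return do(target)                           # Applying the DO to the Target


def tag(**attrs):
    """Hypothetical DO factory: the DO sets attributes on its target."""
    def apply(target):
        for name, value in attrs.items():
            setattr(target, name, value)
        return target
    return apply


def greet():
    return "hello"

# Evaluating the DE alone does not touch the target; only the
# application step does.
greet = process_dse('tag(author="John Doe")', greet, {"tag": tag})
```

Each of the three commented steps could be moved to a different
processing phase, which is exactly the "when" question raised above.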

2.4 Now, what?

Ok, now that we have set things straight it's time to get our hands dirty.

I suggest we first take a peek over the fence, have a closer look at
use cases, take a refreshing detour to semantic wonderland and
then visit syntax hell again.

Gee, like all good things, the fun part is at the end.

3. Foreign Territories

3.1 Big Brother in Action: C# Attributes

Ignoring C# just because it comes from the company that brought you EDLIN
and BlueScreens is NOT a wise move.

You gotta keep up with the news or you lose track. What the heck, the
language spec (ECMA 334 -- freely available for download) goes so far as
to explicitly spell out that you should NOT use Hungarian notation for
identifiers. Whoa! Maybe they made their mind up?!?

And for our little discussion C# has quite something to offer. Not only
does it have an extensible declarative element. The same company brings
you to new heights of enjoyment with their .NET and Longhorn initiatives.
And part of that all-new lifestyle is using C# in every imaginable corner.
So we get some desperately needed use cases to fuel our discussion.
Even overly contorted ones -- promise!

So let's explain the way C# attributes work in our terms.

3.1.1 Syntax

An attribute section consists of one or more C# DSEs prepended to
the target.

A C# DSE is an optional attribute-target plus a colon and a list of
attributes enclosed in brackets:

DSE ::= "[" [attribute-target ":" ] attribute ("," attribute)* [","] "]"

A C# attribute cannot be an arbitrary expression. It looks like a call
to a constructor, but without "new":

attribute ::= type [ "(" arguments ")" ]

However the arguments look like any regular call and allow for arbitrary
constant expressions. Both positional and named arguments are allowed.
I'll save you the gory details of the syntax.

Most of the time the target of the DSE is implicitly derived from context.
The attribute-target allows you to specify the target explicitly.
The permissible targets are:

- An assembly or module.
- A type declaration.
- A method or operator declaration.
- A parameter or return value declaration.
- A field declaration.
- An accessor declaration.
- An event declaration.
- An enum declaration.

Defining an attribute is simple: write a class that derives from the
abstract class System.Attribute or any subclass of it.

3.1.2 Semantics

No big surprises so far. The semantics however are a bit tricky:

During compilation the argument expressions are evaluated. They must
resolve to constant arguments. An instance constructor is derived from
the type and the type of the arguments. The type, the constructor and
the arguments are bound to the target in the image for the module.

During runtime when an attribute is accessed its constructor is called
and returns the attribute instance. This is available with the
GetCustomAttributes() reflection method.
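
The split between binding at compile time and construction at access
time can be mimicked in Python. This is a rough analogy, not a
description of the .NET implementation; bind_attribute,
get_custom_attributes and the Help class are hypothetical names, and the
URL is a placeholder.

```python
class Help(object):
    """Stand-in for an attribute class; counts constructor calls."""
    constructed = 0

    def __init__(self, url, topic=None):
        Help.constructed += 1
        self.url = url
        self.topic = topic


def bind_attribute(target, attr_type, *args, **kwargs):
    # "Compile time": only the type and the constant arguments are recorded.
    bound = target.__dict__.setdefault("_bound_attributes", [])
    bound.append((attr_type, args, kwargs))
    return target


def get_custom_attributes(target):
    # "Runtime": the constructor runs only when the attributes are accessed.
    bound = getattr(target, "_bound_attributes", [])
    return [attr_type(*args, **kwargs) for attr_type, args, kwargs in bound]


def method1():
    pass

bind_attribute(method1, Help, "http://example.invalid/Class1.html",
               topic="Method1")
```

Until get_custom_attributes() is called, no Help instance exists.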

C# has static typing so compilation of one module requires access to
all referenced modules. Several 'magic' attributes assigned to members
of the referenced modules are evaluated at compile time to achieve
interesting effects:

- [AttributeUsage] specifies the scope of an attribute definition.
  I.e. which target(s) it applies to, whether it is inherited and if
  multiple applications to the same target are allowed.

- [Conditional] allows conditional compilation. This allows you to omit
  the call to conditional methods defined in other modules depending on
  defines set while compiling the module containing the call. So basically
  it is sufficient to tag the target once and get the effect everywhere
  else whenever it is used.
  Confused? Please read section 24.4.2 of the spec.

- [Obsolete] marks some elements as obsolete. When the compiler encounters
  it on a referenced element it emits a warning.

- The standard mentions a [Flags] attribute that applies to enums and
  redefines it to automatically assign bitfield masks instead of values to
  the enum members. However I have not found enough information to guess
  the implementation details.

There are a couple of other attributes that directly influence the
runtime. E.g. the JIT compiler takes hints from the [StructLayout]
or the [MarshalAs] attributes. And automatic permission elevation
(including permission checks) is performed when code security attributes
are present.

C# has another interesting feature: it allows you to omit the 'Attribute'
suffix from say [FooAttribute]. So when the compiler encounters [Foo]
and finds a type named Foo that is not derived from the Attribute base
class it searches again for FooAttribute. This is useful for interfaces
where you can get both declarative and imperative behaviour using the
same name.

Ok, so far so good. But you cannot achieve the same kind of effects with
user defined attributes in C# since that would either require extending
the compiler or force compile time evaluation of attributes.

This is where XC# comes in: this extension to the C# compiler opens up
the compilation entity tree and allows you to operate on that. It comes
packaged with some basic attributes: declarative assertions (contracts),
code coverage analysis, design rule verification, spell checking and
(oh well) code obfuscation. You can write your own custom attributes
and you can inspect and modify every aspect of the grammar on the fly.

3.1.3 Use Cases

The following use cases have been culled from various documents at MSDN
and other sources. I've stripped the implementations and just left the
bare declarations and definitions to avoid clutter.

*** An attribute definition uses attribute declarations, too:

public class HelpAttribute: Attribute {
    public HelpAttribute(string url) {
        this.url = url;
    }
    public string Topic = null;
    private string url;
    public string Url {
        get { return url; }
    }
}

The attribute we just defined can be used like this:

public class Class1 {
    [Help("http://.../Class1.html", Topic = "Method1")]
    public void Method1() {}
}

*** A DEBUG Conditional:

public static void DebugMessage(string msg)

// Somewhere else, even in a different module
    // The *call* to the method is omitted if DEBUG is not defined while
    // compiling *this* module.

*** The enum magic:

public enum FileModes {

This one gets you 1, 2, 4, ... instead of 0, 1, 2, ...
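
For comparison only: the same numbering can be had in Python with the
enum module that later versions of the standard library provide. This
says nothing about how C# does it, and the member names are made up.

```python
from enum import Flag, auto


class FileModes(Flag):
    # auto() hands out successive powers of two for Flag members.
    READ = auto()
    WRITE = auto()
    APPEND = auto()
```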

*** Custom serialization (marshaling, pickling) made easy:

public class PartiallySerializableClass
 public string serializeMe;
 public int serializeMeToo;
 [NonSerialized()] public string leaveMeAlone;

Not only is this easier to specify than imperative serialization support.
It is also less error prone because the serializer and deserializer are
autogenerated from the type information.
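
A Python sketch of the same idea follows. The declaration only names
the fields to skip; one generic serializer does the rest. NonSerialized
and serialize are hypothetical names, and the class is rebound by hand
because no DSE syntax exists yet.

```python
class NonSerialized(object):
    """Declaration naming the fields a generic serializer should skip."""

    def __init__(self, *names):
        self.names = frozenset(names)

    def __call__(self, cls):
        # Applying the declaration only records metadata on the class.
        cls._non_serialized = self.names
        return cls


def serialize(obj):
    # Generic serializer driven by the recorded declaration.
    skip = getattr(type(obj), "_non_serialized", frozenset())
    return dict((k, v) for k, v in vars(obj).items() if k not in skip)


class PartiallySerializable(object):
    def __init__(self):
        self.serialize_me = "yes"
        self.serialize_me_too = 42
        self.leave_me_alone = "secret"

PartiallySerializable = NonSerialized("leave_me_alone")(PartiallySerializable)
```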

*** An XC# example of a declarative assertion (contract) that dynamically
    adds assertion checking code at compile time:

void WriteHashCode([NotNull] object o)

... generates the following code with an imperative assertion:

void WriteHashCode(object o)
    Debug.Assert(o != null);

But the really interesting consequences of a declarative assertion are:
- It allows for runtime introspection.
- If you include it in the interface declaration, it automatically
  applies to any implementation. This allows for inheritance of
  declarative aspects.
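
Both the generated check and the introspection aspect can be sketched in
Python. This is an illustration only and is not how XC# works
internally; not_null is a hypothetical name, and the check happens at
call time rather than through generated code.

```python
import functools
import inspect


def not_null(*names):
    """Hypothetical declaration: the named parameters must not be None."""
    def apply(func):
        sig = inspect.signature(func)

        def checked(*args, **kwargs):
            bound = sig.bind(*args, **kwargs)
            bound.apply_defaults()
            for name in names:
                if bound.arguments.get(name) is None:
                    raise ValueError("%s must not be None" % name)
            return func(*args, **kwargs)

        checked = functools.wraps(func)(checked)
        checked.not_null = names  # kept around for runtime introspection
        return checked
    return apply


def write_hash_code(o):
    return hash(o)

write_hash_code = not_null("o")(write_hash_code)
```

The declaration stays visible as write_hash_code.not_null, which an
imperative assert buried in the body would not be.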

*** Calls to unmanaged code:

class NativeMethod
    [DllImport("msvcrt.dll", EntryPoint="_getch")]
    public static extern int GetCh();

*** Defining security behaviour:

[SecurityPermission(SecurityAction.Deny, Flags =
private static void CallUnmanagedCodeWithoutPermission()

internal static extern int puts(string str);

*** WinFS extensions:

public class Drive
    public string DriveName
        [Probe] get { return drive; }
        set { ; }
    public static Drive GetLogicalDrive(string name)
    [Probe(Uri="GetAll", ResultType=typeof(Drive))]
    public static Drive[] GetLogicalDrives()

*** Indigo (Longhorn remote messaging framework):

URL: http://msdn.microsoft.com/longhorn/default.aspx?pull=/library/en-us/dnlong/html/indigoattrprog.asp

public interface IHelloChannel :
System.MessageBus.Services.IDatagramPortTypeChannel {

    [return: System.MessageBus.Services.WrappedMessageAttribute
    string Greeting(string name);

[DatagramPortType (Name = "Hello", Namespace = "http://tempuri.org/")]
public class IndigoService
    public string Greeting (string name)

3.2 Catch Up, Baby: Java Annotations

Java has plenty of declarative syntax elements. Some people argue that
Java is overly declarative. Some like it. Some don't.

But up until recently it had no support for an extensible DSE.
Ok, ok ... Java always had javadoc tags embedded in comments:

/**
 * An abstract class with foo-like behaviour.
 * @author  John Q. Doe
 * @see     Bar
 */
public abstract class Foo { ... }

These are processed by an extra tool (javadoc) to autogenerate the docs
from the sources. Calling this a DSE is ok, but it is not part of the
language itself and as such has no access to its structure (you cannot
pass constant expressions).

Some say it was a quick hack initially. But it stuck. And it got abused.
A lot. Examples below.

It seems now we have a plethora of tools that scan for javadoc tags and
then do some pre- or post-processing of your source and class files.

But then came JSR-175 (public review ended November 2003) ...

3.2.1 Syntax

JSR-175 introduces a new syntax for 'Annotations'.

An annotation definition starts with the "@interface" keyword. A severely
restricted variant of the interface definition syntax is used. It has
implicit and final inheritance from java.lang.annotation.Annotation.

Well ... to me it looks just like a glorified struct declaration anyway.

An annotation can be applied to any declaration: classes, interfaces,
fields, methods, parameters, constructors, enums, local variables and
enum constants (implied field declarations). However the same annotation
type can be applied at most once to any entity. Application of an
annotation uses one of three syntactic variants:

"@" type "(" member-name "=" member-val ["," member-name "=" member-val]* ")"
"@" type
"@" type "(" single-member-value ")"

The latter two are just shorthands for variants of the first one with
an empty list or for a single member type. This muddies the distinction
between types and constructors quite a bit though. The values must be
constant expressions of course.

3.2.2 Semantics

During compilation the constant expression arguments for annotations
are evaluated and combined into a set of initializers for the structure
defined by the annotation interface.

There are three retention policies for this set of initializers:

- Source-only annotations are dropped at compile time. This is supposed
  to replace/augment javadoc tags in source comments. Local-variable
  annotations are always source-only.

- Class-only annotations are stored in the class file but not loaded
  into the runtime.

- Runtime annotations are stored in the class file and are loaded
  by the runtime.

Reflection support via java.lang.reflect.AnnotatedElement works only
for runtime annotations. Reflection is easy to use because only a single
annotation for each type may be present for each element. The classes
used to represent annotations are created at runtime using dynamic proxies.

Owing to static typing the Java compiler has to read referenced class files
and understands a few special annotations (excerpt):

- @Target specifies the allowable targets for an annotation definition.

- @Retention() specifies the retention policy for an annotation.

- @Inherited allows for inheritance of annotations from superclasses
  (this is NOT inheritance for annotation definitions themselves).

To summarize: nothing spectacular and not very dynamic. It shows that
the feature has been added to the language as an afterthought. The key
improvement over javadoc tags is that external tools do not have to
parse Java source files (unless pre-processing is required) and that
annotations can be stored in the class file. This however does not
obviate the need for some post-processors either.

3.2.3 Use Cases With JSR-175 Annotations

Since the spec is pretty young you won't find many use cases right now:

*** Giving a new tune to javadoc stuff:

public @interface RequestForEnhancement {
    int    id();
    String synopsis();
    String engineer();
    String date();

    id       = 2868724,
    synopsis = "Provide time-travel functionality",
    engineer = "Mr. Peabody",
    date     = "4/1/2004"
public static void travelThroughTime(Date destination) { ... }

*** This is the definition for the @Retention meta-annotation:

public enum RetentionPolicy {

@Documented @Retention(RUNTIME) @Target(ANNOTATION_TYPE)
public @interface Retention {
    RetentionPolicy value();

3.2.4 Use Cases With javadoc Annotations

*** Some stuff for BEA WebLogic using Java web services annotations:

/**
 * @jws:location http-url="http://localhost:7001/webapp/Bank.jws"
 * @jws:protocol http-soap="true"
 */
public interface BankControl extends ServiceControl

*** Security annotations and conversational support for web services:

/**
 * @common:security single-principal="false"
 */
public class PurchaseSupplies implements com.bea.jws.WebService
    /**
     * @common:operation
     * @jws:conversation phase="start"
     */
    public void requestPurchase()

    /**
     * @common:operation
     * @jws:conversation phase="continue"
     */
    public void approvePurchase()

    /**
     * @common:operation
     * @jws:conversation phase="finish"
     */
    public void executePurchase()

*** Embedded SQL for Java:

/**
 * @jc:sql statement::
 *     SELECT name
 *     FROM employees
 *     WHERE name LIKE {partialName}
 * ::
 */
public String[] partialNameSearch(String partialName);

4. Pythonic (Ab)use Cases

Leaving the exact choice for the syntax aside I used my personal favourite
everywhere. I suggest you substitute it with your favourite choice just
to see the visual effect.

I'm sorry, but I gave up trying to identify the original contributor
for each of these examples. Let's say it's a true community effort.
A big thank you to everyone for your input!

4.1 Assorted Attributes

Attribute setters modify the target attributes but not the target itself.

*** The first one is generic, the second one could inherit from it:

<| funcattrs(foo = 1, bar = "baz") |>
def foobar(x):

<| rstattrs(
       arguments = (1, 0, 1),
       options   = {'class': directives.class_option},
       content   = 1) |>
def admonition(*args):
    return make_admonition(nodes.admonition, *args)

<| rstattrs(content = 1) |>
def attention(*args):
    return make_admonition(nodes.attention, *args)
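
For reference, a generic attribute setter such as funcattrs needs only a
few lines. This is a sketch of one possible implementation, applied by
hand because the DSE syntax does not exist yet.

```python
def funcattrs(**attrs):
    """Return a DO that sets attributes on its target and returns it unchanged."""
    def apply(func):
        func.__dict__.update(attrs)
        return func  # the very same object, not a wrapper
    return apply


def foobar(x):
    return x

original = foobar
foobar = funcattrs(foo=1, bar="baz")(foobar)
```

The target is still the original function object; only its attributes
have changed.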

*** Here is a nice example to show how neatly the vertical bars line up:

    <| Author("Joe Q. Random") |>
    <| Copyright("ACME Inc.")  |>
    <| Release("2004-04-01")   |>
    <| Version(0, 0, 1)        |>

    def industrialStrengthMethod(self):
        raise BlueScreenError(random.Random())

*** Hints for mad metaclasses with magic mangling motives:

    def overrideMe(self, foo):

*** With clever definition of the DO application process (see semantics)
    we get docstrings *before* the corresponding definition. I guess
    nobody will like the idea, but anyway:

<|"""This is a new style docstring for a class.

     Bla bla bla ...
     Bla bla bla ...
     Bla bla bla ...
"""|>
class HeavilyDocumented(object):

    <|"Class attribute docstrings now have the proper position."|>
    BUFSIZE = 8192

    <|"This is a new style docstring for a method."|>
    def method1():

    <|"Dynamic %s [%s]." % ("docstring", time.ctime()) |>
    def thisReallyWorks():

4.2 Roaring Registries

Registry functions add a reference to the target in some other object.
Often combined with attribute setters.

*** This one shows where it would be convenient to have a declarative
    type with the same name as an inheritable imperative type:

<| WebService("soap://localhost:8888/mywebservice") |>
class MyWebService(WebService):

    <| ServiceMethod(None, str) |>
    def setName(self, name):
        self.name = name

    <| ServiceMethod(str) |>
    def sayHello(self):
        return "Hello %s!" % self.name
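
A registry of this kind could be sketched as follows. The real
ServiceMethod would of course talk to some web service machinery; here
it only records the function and its declared signature in a class-level
dict, and the declarations are applied by hand. Treat every detail as an
assumption.

```python
class ServiceMethod(object):
    """Hypothetical registry DO: records the target, returns it unchanged."""
    registry = {}

    def __init__(self, *signature):
        self.signature = signature

    def __call__(self, func):
        # Registry part: add a reference to the target elsewhere.
        ServiceMethod.registry[func.__name__] = (func, self.signature)
        # Attribute setter part: tag the target itself.
        func.signature = self.signature
        return func


class MyWebService(object):

    def setName(self, name):
        self.name = name
    setName = ServiceMethod(None, str)(setName)

    def sayHello(self):
        return "Hello %s!" % self.name
    sayHello = ServiceMethod(str)(sayHello)
```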

*** For more complicated interfaces to environments with static typing
    it would be nice to have declarations that apply to parameters
    and/or the return value:

    <| ServiceMethod |>
    def <|str|> method(self, <|str|> foo, <|int|> bar):

4.3 Witty Wrappers

Wrappers generally wrap functions or methods with their own functions.
The latter is then stored in the current dictionary to replace the
original function.

Oh BTW: the 'synchronized' wrapper has a section of its own. See below.

*** We know these two well enough by now:

    <| classmethod |>
    def AClassMethod(cls, foo):

    <| staticmethod |>
    def AStaticMethod(foo):

*** Using multiple decorators:

    <| Decorator1(), Decorator2("Just a random string") |>
    def someMethod(...):

    # which is equivalent to:

    <| Decorator1() |>
    <| Decorator2("Just a random string") |>
    def someMethod(...):

*** Generics for Python? Well, if you must:

def f(x):
    print x, 'is an int'

def f(x):
    print x, 'is a string'

*** A variation of the theme: runtime type checking:

<|CheckSignature(int, int)|>
def f(x, y):
    return x+y+1
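
One way CheckSignature might be written is shown below. This is a sketch
that handles positional arguments only and is applied by hand; unlike an
attribute setter it replaces the target with a wrapper.

```python
import functools


class CheckSignature(object):
    """Hypothetical wrapper DO: checks positional argument types at call time."""

    def __init__(self, *types):
        self.types = types

    def __call__(self, func):
        types = self.types

        def checked(*args):
            if len(args) != len(types):
                raise TypeError("expected %d arguments" % len(types))
            for arg, expected in zip(args, types):
                if not isinstance(arg, expected):
                    raise TypeError("%r is not of type %s"
                                    % (arg, expected.__name__))
            return func(*args)

        # The wrapper replaces the original function under the same name.
        return functools.wraps(func)(checked)


def f(x, y):
    return x + y + 1

f = CheckSignature(int, int)(f)
```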

*** Interface evolution does not need to be messy:

    <|Context("version", 1)|>
    def movePlayerMessage(self, arg1):

    <|Context("version", 2)|>
    def movePlayerMessage(self, arg1, arg2):

Yes, the second declaration needs to be able to get at the dict entry for
the first one (see semantics).

*** Defining a native method interface:

# The first one should be applicable to the module. TODO: But how?
class GtkScrolledWindow(GtkWindow):

    # Yow, the great renaming ... the GNOME people would love it ... :)

    def setPolicy(self,
                  <|GtkPolicyType|> hScrollbarPolicy,
                  <|GtkPolicyType|> vScrollbarPolicy): pass

    def <|GtkShadowType|> getShadowType(self): pass

4.4 Proper Properties

The following has been proposed to make property definition easier:

class Foo(object):

    def x(self): return self.__x

    def x(self, newx): self.__x = newx

    def x(self): del self.__x

The problem with this is that propxxx needs to go through some hoops to
update the previous definition of x. This includes using sys._getframe()
and then searching the local and global dicts of the frame. Oh dear!

This is a good argument why the DO should get the target object _and_
the dict where the TO is to be stored.
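
A sketch of what that buys us: if the DO is handed the name, the target
and the dict, it can find the earlier definition by a plain lookup. The
helper define() stands in for whatever the compiler would do, and
propget/propset are hypothetical spellings of "propxxx". The class body
is simulated with an explicit dict because a real "def" would already
have overwritten the previous binding.

```python
def define(namespace, name, target, do):
    # The DO sees the target *and* the dict it will be stored in,
    # so no sys._getframe() tricks are needed.
    namespace[name] = do(name, target, namespace)


def propget(name, func, namespace):
    old = namespace.get(name)
    return property(func, getattr(old, "fset", None),
                    getattr(old, "fdel", None))


def propset(name, func, namespace):
    old = namespace.get(name)
    return property(getattr(old, "fget", None), func,
                    getattr(old, "fdel", None))


def get_x(self):
    return self._x


def set_x(self, newx):
    self._x = newx

body = {}
define(body, "x", get_x, propget)
define(body, "x", set_x, propset)  # updates the property stored by propget
Foo = type("Foo", (object,), body)
```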

A different idea would be to follow the C# precedent which provides an
extra nesting level for properties:

class Foo
    private int x_;

    public int x {           // <-- !!
        get { return x_; }
        set { x_ = value; }  // value is an implicit identifier

Foo foo = new Foo()
foo.x = 1
int y = foo.x

However I don't know how to do this in Python. Declaring a method
as a property and using nested functions for the getter/setter does
not work because the locals (i.e. the nested functions) are lost:

class Foo(object):

    def x(self):                          # Does not work!
        def get(): return self._x
        def set(value): self._x = value

Doing the same thing with an inner class might work somehow, but I'm not
sure. We could get inheritance for properties, too!?

TODO: I haven't explored this any further. More input is welcome.

4.5 To Sync or Not to Sync

Java has special declarative syntax for synchronized methods and blocks:

public abstract class FancyControlStream extends FancyStream {

    public synchronized void putMsg(byte[] data)

    public int kickBack()
        synchronized (obj) {
            obj += len;

This would be a good use case for our DSE, too. But ... being only
applicable to a method definition just doesn't cut it:

    def put(self, item):

This is way too tedious for some stuff. So, what about:

class ThreadSafeQueue(Queue):
    pass                          # Yes, this is all that's needed.

On the other hand you really want to use fine grained locking for more
ambitious endeavors. And this is where you need to use locks at the
compound statement level (called suites in Python and blocks elsewhere):

    def stopStream(self):

        <| Synchronized(self.recv_lock) |>
        if self.recv_len<=0:
            if self.recv_thread:


        <| Synchronized(self.send_lock) |>
        while self.send_len>=0 and self.send_close:

Getting it at the statement level would be nice, too. But you could work
around this with 'if 1: SUITE'.

How you would _implement_ this is another story. And this is where the
bad reputation of Java's 'synchronized' comes from: there have been some
initial design flaws. However just because you have one bad precedent
does not mean we can't do it better. IMHO declaring some parts of a program
to be regions with synchronized access is inherently useful.
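
At the method level at least, the implementation is not hard. Here is a
minimal sketch using the standard threading module; the names
synchronized and Tally are made up, and the suite-level variant is left
open because it needs support from the grammar.

```python
import functools
import threading


def synchronized(lock):
    """Hypothetical DO: run the target only while holding the given lock."""
    def apply(func):
        def locked(*args, **kwargs):
            lock.acquire()
            try:
                return func(*args, **kwargs)
            finally:
                lock.release()  # released even if func raises
        return functools.wraps(func)(locked)
    return apply


class Tally(object):
    lock = threading.Lock()

    def __init__(self):
        self.total = 0

    def add(self, n):
        # Read-modify-write that would race without the lock.
        current = self.total
        self.total = current + n
    add = synchronized(lock)(add)
```

A single class-wide lock is the crudest possible choice; fine grained
locking would pass a different lock to each declaration.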

4.6 Lexical Liberty

To get at the lexical level you may need a different delimiter which is
evaluated at compile time. I have opted not to replace it with a different
one in the following examples.

Please don't flame me. I'm just documenting some strange ideas:

class Whatever(object):

  # Uh, imports and assignments in a declarative compile-time context?
  <| from declarative.lexical import * |>
  <| T = Conditional("TRACE") |>

  # These statements are optimized away at module load time.
  def run(self):
      <|T|> self.tracePrint("Startup")
      <|T|> self.tracePrint("Shutdown")

  # Combining the power of Python with the syntax of SQL. Or vice versa?

  def query1(self, start, end):
      return <|SQL|> "SELECT * FROM table WHERE name BETWEEN :start AND :end"

  def query2(self, start, end):
      return <|SQLPy|> SELECT * FROM table WHERE name BETWEEN :start AND :end

  def query3(self, table, a, b):
      return <|PySQL|> SELECT "*" FROM table WHERE "f1" == a*2 AND "f2" >= a+b

  # C has inline assembler. Python has inline C.

  def fast1(self):
      <|C|> "{ for (int i=0; i<100; i++) somefunc(i) }"

  def fast2(self):
      <|C|> { for (int i=0; i<100; i++) somefunc(i) }

  def fast3(self, a, b):
      x = self.foo + <|C|>
      int dummy(int a, int b)
          for (int i=0; i<1000; i++) { a=a+3*i+(b&15); b+=a; }
          return b;
      return x

  int fast4(int a, int b)
      for (int i=0; i<1000; i++) { a=a+3*i+(b&15); b+=a; }
      return b;

# Sssshhh! Don't mention the r-word!

<|LexicalSubstitute("@", ("self", "."))|>
<|Grammar("parameter_list").afterParsing( lambda p: p.insert(0, "self") )|>
class Wherever(object):
    def __init__():

    def setFlag(flag):

5. Semantic Wonderland

5.1 What?

First we need to specify what kind of grammatical element a DSE may
contain. I could go on and reason about all the possibilities. But
nobody ever mentioned anything other than an expression. It seems to be
the most straightforward approach.

This document is already far too long. A DSE contains a DE. So be it. :)

Expressions evaluate to a single object. This is the DO in our case.

For multiple declarations in a single DSE there are two possibilities:
- Allow a plain expression and add the commas to the DSE grammar.
- Allow an expression_list and add tuple handling.

Multiple DOs from multiple adjacent DSEs result in a DO list (in the
given order).

The target the DSE applies to is to be specified by the syntax.
To be able to bind the DSE to the target it needs to be bundled into
an object. We call this the Target Object (TO).

What the TO needs to contain depends on the target. The issues with
arbitrary grammatical elements are discussed below (see 'What Else?').

So let's restrict the target to functions, methods and classes for now.
Each one of them already bundles its contents into an object.

But _defining_ one of them involves adding its name to a dictionary, too:

- The Target Dictionary (TD) is required by the DO for some use cases.

- The name to be defined is accessible from the TO for all three types.
  But it is either read-only (for functions) and/or modifying it may not
  get you the desired effect. It is unclear whether we have any use case
  that requires storing a different name in the TD instead of the one used
  in the target. Being able to suppress the store might be desirable, too.
  TODO: Ideas?

5.2 How?

[My bet is that this section will be the most controversial.]

We need to specify how each of the required steps is to be performed.
First the straightforward stuff:

- Parsing the DSE: since this is a plain Python expression it is to
  be performed by the compiler.

- Compilation of the DE: ditto.

- Binding to the target: the Pythonic way to do this is to emit code
  for the module or class initialization code objects. The code gets
  to mangle the compiled DE, the target, the target name and the TD.

  It is an implementation detail whether ...
  - a call to each DO is explicitly emitted, or
  - we call a builtin function or emit a new bytecode that
    gets passed the TO and all DOs.
  The latter may have some advantages for future extensibility.

  TODO: Code generation needs to be specified in detail.

- Evaluating the DE: this is done by the code compiled from the DE.

Now the troublesome part: applying the DO.

5.2.1 Calling the DO

The original proposal was derived from the way stuff like classmethod
and staticmethod is applied today:

*** TO is a function:

def func(x):

*** TO is a method:

class Foo(object):
    def method(args):

So we require the DO to be a callable and apply it by calling it,
passing the TO. The return value is then used as the new TO and stored
under the target name in the TD. I.e.: TO = DO(TO)
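
A minimal sketch of this rule in present-day Python 3, assuming a helper
that the compiler-emitted code might call. The names apply_declarations
and tag are hypothetical and a plain dict stands in for the TD:

```python
def apply_declarations(td, name, to, dos):
    """Apply each DO in order (TO = DO(TO)), then store the final TO in the TD."""
    for do in dos:
        if not callable(do):
            raise TypeError("declarative object %r is not callable" % (do,))
        to = do(to)
    td[name] = to          # the store happens only after all DOs have run
    return to

def tag(func):
    """A trivial DO: mark the function and hand it back unchanged."""
    func.tagged = True
    return func

def func(x):
    return x * 2

td = {}                    # stands in for a module or class dictionary
apply_declarations(td, "func", func, [tag])
```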

Subtle side effect: The TD may contain a previous element named like
the target before the target definition is performed. This is available
to the DO because the TO is not stored until all DOs have been applied.

This cannot be achieved without some renaming games by doing it the
classic way.

There are three common cases for DEs:

*** The DE is a function call that evaluates to an inner function. The DO
    is the inner function. It may be bound to derivatives of some arguments
    of the DE. Applying the DO means calling the inner function.

def funcattrs(attrs):
    def inner(func):
        for name, value in attrs.iteritems():
            setattr(func, name, value)
        return func
    return inner

*** The DE is a class constructor and evaluates to an instance of the
    class. The DO is an instance. It may hold derivatives of some arguments
    of the DE. Applying the DO means calling the instance (which only works
    if you define a __call__ method).

class funcattrs(object):
    def __init__(self, **kwds):
        self.attrs = kwds
    def __call__(self, func):
        for name, value in self.attrs.iteritems():
            setattr(func, name, value)
        return func

*** The DE is a class and evaluates to a class. The DO is a class.
    Applying the DO means to call the constructor of the class.
    A side effect of this is that the resulting TO is an instance of
    the class.

    The primary example for this are classmethod and staticmethod which
    are builtin types.
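
    A short sketch of this third case, using the classic way to apply
    classmethod (the class Foo and the method make are made up):

```python
class Foo(object):
    def make(cls):
        return cls()
    # TO = DO(TO): the DO is the builtin type classmethod itself,
    # so the new TO is an instance of that type.
    make = classmethod(make)

# Look in the class dictionary to see the stored TO without binding it.
stored = Foo.__dict__["make"]
instance = Foo.make()
```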

5.2.2 Using __declare__

Reusing the callable attribute of objects for declarator application
has a few drawbacks:

- Any callable may be used as a DO. There is no error checking. Mistaking
  imperative classes for declarative classes, dropping empty braces,
  scoping mismatches or just plain carelessness will NOT be caught.

  This may be very hard to track down since the resulting TO is not
  checked for validity. Errors may pop up very late or never.
  Any error message you might get will probably be misleading.

- A class cannot have both declarative and imperative behaviour.
  However as the precedent given by C# shows, this might be quite useful.

- Inheritance is a useful concept for declarative behaviour, too.
  Thus it is most likely that DE is a class constructor. This requires
  the use of __call__ which is neither self-documenting nor exactly
  easy to explain to newbies.

I propose to use a new kind of slot with the name "__declare__".

The DO must define this slot and it must be bound to a callable.
Applying the DO means calling the slot and passing the TO. The return
value is used as the new TO and stored under the target name in the TD.
I.e.: TO = DO.__declare__(TO)
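
A rough sketch of the check this implies, in present-day Python 3. There
is no __declare__ slot in Python, so the sketch emulates it with an
ordinary method lookup. The helper apply_do is hypothetical:

```python
class funcattrs(object):
    """Declarative class: sets the given attributes on the target."""
    def __init__(self, **kwds):
        self.attrs = kwds

    def __declare__(self, func):
        for name, value in self.attrs.items():
            setattr(func, name, value)
        return func

def apply_do(do, to):
    """Emulate TO = DO.__declare__(TO), rejecting objects without the slot."""
    declare = getattr(do, "__declare__", None)
    if not callable(declare):
        raise TypeError("%r cannot be used declaratively: no __declare__" % (do,))
    return declare(to)

def func(x):
    return x

func = apply_do(funcattrs(author="me"), func)

try:
    apply_do(len, func)    # an ordinary callable is rejected up front
except TypeError as exc:
    rejected = str(exc)
```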

This has the following advantages:

- Existing objects cannot be accidentally used as declarative expressions.

- Error checking is done at module load time and misuse will abort
  the load with a unique error message exactly pointing out the problem.

- Defining a method with the name __declare__ makes your intention
  crystal clear. Newbies can find out about the meaning easily by
  searching for __declare__ in the documentation. Compare this with
  the value the documentation for __call__ would have to a newbie.

- Whenever you see a declaration you know exactly where to look in the
  class definition to find out what it does.

- Requiring a __declare__ slot strongly encourages the use of classes
  instead of functions for declarative behaviour. Since classes are inheritable,
  this will over time improve the quality of the code base.

  Using functions is still possible but one has to set the __declare__
  slot explicitly in the function dictionary. You can even avoid defining
  an inner function if you don't need bound DE arguments (otherwise
  a class would be the cleaner solution).

- We can easily support the old and the new way to apply classmethod
  and staticmethod. We also can adapt the new calling sequence to
  our needs without breaking compatibility with existing uses of
  classmethod and staticmethod (see the next section).

- We may set __declare__ slots on internal types if we want to give
  them declarative behaviour without necessarily making them callables.
  E.g. you can have strings behave like docstrings if used declaratively.
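
For the function case mentioned above, a small sketch of setting
__declare__ in the function dictionary without an inner function. The
names exported and run are made up, and the last line spells out by hand
what the emitted code would do:

```python
def exported(func):
    """Declarative behaviour without bound DE arguments: no inner function needed."""
    func.exported = True
    return func

# Opt in explicitly: store the callable under __declare__ in the function dictionary.
exported.__declare__ = exported

def run():
    return "running"

# Manual spelling of TO = DO.__declare__(TO).
run = exported.__declare__(run)
```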

I know very well that adding a new slot requires changes in several
places. But I think the benefit outweighs the required work by far.

TODO: Write up what needs to be done to add a new slot.

5.2.3 Passing The Context

As you can see from the use cases, some of them need access to the TD.
The current workaround involves sys._getframe() and is ugly beyond belief.

We can avoid this problem by passing the TO *and* the TD to the DO.
I.e.: TO = DO.__declare__(TD, TO)

This may also be required for some future extensions that involve targets
other than functions, methods and classes. In general it is not a bad
idea to pass the context where the target is defined.

Since applying the DOs is usually done at module initialization there
is no performance penalty for passing two arguments instead of one.
Code generation would be pretty easy, too.

Summary: we gain flexibility with minimal cost.
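
A sketch of a registry-style DO that uses the TD it gets passed, so no
sys._getframe() is needed. This is present-day Python 3 with made-up names
(command, apply_declarations) and a plain dict standing in for the TD:

```python
class command(object):
    """Hypothetical DO that records the target in a registry kept in the TD."""
    def __init__(self, label):
        self.label = label

    def __declare__(self, td, to):
        td.setdefault("commands", {})[self.label] = to
        return to

def apply_declarations(td, name, to, dos):
    """Emulate TO = DO.__declare__(TD, TO) for each DO, then store the TO."""
    for do in dos:
        to = do.__declare__(td, to)
    td[name] = to
    return to

def start():
    return "started"

td = {}                    # stands in for a module or class dictionary
apply_declarations(td, "start", start, [command("go")])
```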

TODO: Check the issue mentioned above about modifying the target name
      before it gets stored in the TD. We could pass a modifiable
      reference to the name somehow but that would complicate the code
      generation for this case quite a bit. Ideas?

5.3 When?

There are basically two choices for the temporal behaviour of the DSE:

- A DSE with access to the lexical level needs compile time evaluated DEs.
  As indicated in other sections this is not within the current scope
  of this document.

- Any other imaginable DSE is only useful if the DE is evaluated and
  applied at module initialization time. Any runtime-only behaviour
  can be implemented on top of it.

This implies that all other processing steps need to be done at compilation
time (parsing the DSE, compilation of the DE and binding to the target).

Phew. That was easy.

A subtle issue that has been discussed previously is the order of
application of the DEs. Previously you had to write things backwards
to get the correct order of application (d first, then e):

def f(x):

Thus the question arose what the order might be with the new syntax:

def f(x):

The natural interpretation depends a little bit on the syntax chosen
(whether the DSE is before or after the function name). But with my
favourite syntax there is no ambiguity: d is applied before e.
Also <|d,e|> is equivalent to <|d|> <|e|>.

It helps to remember that Python's existing declarative statements
(such as "def") are implemented in an imperative way. And this is
clearly top-to-bottom and left-to-right.

If the order of application is important, then it is the duty of the user
to get the order of the DEs right. It is however deemed unlikely that
this is a problem in reality.
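
A small sketch to illustrate the ordering, in present-day Python 3. The
loop stands in for what <|d, e|> would do under my reading, and the
classic spelling below it has the same effect but reads backwards:

```python
calls = []

def d(func):
    calls.append("d")
    return func

def e(func):
    calls.append("e")
    return func

def f(x):
    return x

# Proposed reading: apply the DOs left to right (d first, then e).
for do in (d, e):
    f = do(f)

def g(x):
    return x

# Classic spelling with the same order of application: e is written first.
g = e(d(g))
```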

TODO: I cannot find a way for the DO class author to specify the
      preferred order of application (other than through documentation).
      Priorities would not help since they would need to be attached
      to the DSE, which is pretty useless (remember: the compiler does not
      see the DO class). Comments?

5.4 What else?

The 'synchronized' issue highlights the need for having declarative
syntax that is applicable to a wider range of syntactic constructs.

But there is some trouble ahead with Python: methods and classes are first
class objects. Compound statements are not.

Passing a method or a class to a function to modify them is easy.
Inserting the returned object (possibly a wrapper) into the module or
class dictionary is easy, too. Getting the same level of support for
compound statements or even individual statements is way more difficult.

You basically have two choices for this:

a) Make statements first class objects and somehow optimize this artifact
   away by using late code combining.

b) Allow compile time evaluation of a declarative element giving it access
   to the lexical level.

Ok, so a) is more akin to the way statements inside a class definition
incrementally build up the class. You'd have to plan ahead a bit and
think how this could be extended to declarations for parameters and
other stuff, too.

But b) is really a much more powerful tool. Forget about CPP macros.
Forget about SQL preprocessors. You can do all of this yourself now just
by writing some Python functions. C has inline assembler, Python could
have inline C! Embedding foreign languages, extending the grammar, even
redefining your favourite lexical tokens: you get it, if you ask for it.

Just a thought: you could do b) 'the language way' and open up the
compiler or 'the text processing way' and open up the input stream.
The latter would be pretty easy to do, I think. But then you may
not get to see or modify all the compiler context you need.

Both issues need A LOT MORE discussion and are certainly NOT something
for the 2.4 time frame or for PEP 318.

However we should take a mental note about two things:

- A new declarative element SHOULD have the potential to be applied
  to almost any other syntactic element. This must be reflected in
  the initial syntactic _specification_ but not necessarily in the
  initial _implementation_.

- Having a declarative element evaluated at compile time holds big
  potential for the future. Although this could emulate any other more
  traditional kind of declarative element, the framework for this
  is just not ready today.
  So in fact I do NOT propose to specify such an element right now.
  However I propose to either
  a) allow for a later extension of the declarative element that we
     are defining now
  b) define a second declarative element later that shares most
     of the syntax but has different temporal evaluation behaviour.

6. Syntax, Syntax on the Wall

Hey, the first Python beauty contest in history. Quick, place your bets.

6.1 Wishful Thinking

The Syntax ...

Requirement  Requirement Definition:
Mnemonic     The Syntax ...
|LEXICAL|    MUST fit into the existing lexical structure.

|PARSER|     SHOULD put no extra burden on the existing parser.

|GRAMMAR|    MUST fit into the existing grammar.

|GENERIC|    SHOULD allow for the declarative element to be applied to
             almost any other syntactic element.

|EXPR|       SHOULD allow for a general expression as the content of the
             declarative element.

|SPLIT|      SHOULD allow for lengthy element content that may need to
             be split across several lines.

|SEQ|        SHOULD allow for a convenient notation for a sequence of
             declarative elements.

|CONCISE|    MUST be concise.

|VISUAL|     MUST stand out visually.

|DISTINCT|   SHOULD be immediately recognizable as a distinct language

|MIMIC|      SHOULD NOT mimic completely unrelated features of other
             common programming languages.

|NEWBIE|     SHOULD NOT overly confuse newbies. :)

There are a lot more issues, but I think I covered the most important ones
from a language perspective. Feedback is appreciated of course.

From a language perspective not much attention has been given to |GENERIC|.
This is what worried me most and got me to write this essay.

6.2 Round Up The Candidates

The existing lexical and grammatical structure allows ...

{STATEMENT} a statement with a (possibly new) keyword:
            as DECL

{COMPOUND}  a compound statement with a (possibly new) keyword:

{EXTENSION} to extend an existing grammatical element:
            def f(x) DECL:

{CONTEXT}   an existing lexical element in a different context:
            def [DECL] f(x):

{REDEFINE}  redefining the meaning of an existing grammatical construct:
            def f(x):

{UNARY}     a new introducing delimiter:

{ASYM}      a new asymmetric paired delimiter:

{SYM}       a new symmetric paired delimiter:

I guess there are some more variations on this theme, but none too relevant.

6.3 Narrowing the Candidate Set

Well our BDFL has already spoken out and narrowed the set considerably.
But I think it's still worthwhile to discuss this in detail:

{STATEMENT} has not been received well, because it would almost certainly
necessitate a new keyword. And neither has a good one been proposed nor
should the introduction of a new keyword be taken lightly. Getting
|GENERIC| straight might prove to be difficult, too. And don't forget
about |EXPR| and |VISUAL| while you are at it.

Even if we manage to find a good name, it's doubtful that everyone seeing
such a construct knows that it is evaluated in a different context
than almost any other statement (violating |NEWBIE|).

{COMPOUND} looks strange to me because it does not fit well into the
lexical structure (putting aside the variants that violate |LEXICAL|).
Compound blocks usually contain statements and not expressions. And of
course it only works if the target of the declarative element is a block
(violating |GENERIC|).

Prepending a new kind of block to an existing target block introducer is
confusing because one may expect another level of nesting here (which
would violate |CONCISE|).

Adding a new kind of block just _after_ the target block introducer destroys
the visual link between the introducer and the body of the block.

The compact variant (putting a single declarative element right after
the colon) shares the concerns about |NEWBIE| with {STATEMENT}.

In short I cannot find a compelling reason to add a new kind of compound
statement to the language. And a compound declarative statement is ...
well ... awkward (considering the lack of precedent set by other languages).

{EXTENSION} has been thrown out early because it violates |CONCISE|,
|VISUAL| and |MIMIC|. But I think far more important are the violations
of |SEQ| and |EXPR|.

{CONTEXT} has been discussed to death but this is just because the initial
discussion focused on variants of this syntax. Other than that it has
few merits. Since it is an existing lexical construct (list displays)
it violates |GENERIC|, |DISTINCT| and |NEWBIE|. Depending on the position
it may also violate |PARSER|, |GRAMMAR| and |VISUAL|.

But I think the most compelling reason against it is that it hurts the
eyes with |SPLIT|. The existing use cases for C# indicate that this would
be pretty common.

{REDEFINE} is problematic because it redefines an existing (though useless)
grammatical construct and as such violates |GRAMMAR| and |DISTINCT|.

The grammatical meaning could be redefined, but it would only work just
before a block introducer (violating |GENERIC|). Just because this syntax
mimics C# somewhat is not a good enough reason to introduce the same syntax
to Python. It cannot mimic it completely anyway due to lexical differences
(break-on-white-space vs. break-on-lines-or-sequence-delimiters).

It also violates the principle of least surprise: running a program using
this construct in an older version of Python won't get you a syntax error.
And the desired effect is dropped silently -- which can be a very dangerous
thing to do.

{UNARY} fails on |VISUAL| for |SPLIT| because the end may be hard to match
to the start. It may be hard for |PARSER| in intra-line contexts because
of |EXPR|. Restricting the construct to be used only on extra lines
would violate |GENERIC| and maybe |CONCISE|, too.

I do not think we should waste one of the three remaining unused
non-alphanumeric ASCII characters (@, $, ?) for this purpose. It would
miserably fail on |MIMIC| anyway (except for the Java precedent).

Prepending a dot is not an option because it would not conform to |GENERIC|,
|VISUAL|, |DISTINCT|, |MIMIC| and |NEWBIE|. Also GvR indicated that he
wanted to reserve this construct for a future "with" statement.

{ASYM} doesn't look bad if the construct is used on extra lines. It may be
a bit harder to spot in an intra-line context.

{SYM} looks good in any context. It would be natural to define it like
any other paired sequence delimiter ((), [] and {}). This allows for
flexible line breaks and broad grammatical applicability.

6.4 And The Winner is ...

Beware: everything I wrote is IMHO of course, but this applies especially
to this section!

{ASYM} saves a single character to type over {SYM} (which is not much of
a gain). OTOH the asymmetry spoils the visual effect quite a bit.

So my personal winner is: {SYM}

Your mileage may vary though. Even if you disagree with my reasoning, the
best the preceding section buys you is a systematic way to discuss all
alternatives now. Go ahead!

6.5 ASCII Art

If (BIG IF!) there is a concluding discussion that indeed supports my
reasoning about the lexical alternatives we should decide/vote/pronounce
on a delimiter pair.

Since @, $ and ? usually do not come in pairs, the shortest possible
delimiter pair is two characters long each.

We should include one of the four visual pairs that ASCII has proudly
brought to your home for the past decades ((), [], {} or <>). For best
visual effect the pair should be used for the outer characters. The inner
characters may be a single non-paired character or another set of paired

Of course neither the opening nor the closing delimiter should collide
with an existing delimiter. Nor should the inner character be mistaken
for the start of an expression or (less important) for the end.

That leaves us with:

(< DECL >)   [< DECL >]   {< DECL >}

($ DECL $)   (% DECL %)   (& DECL &)   (/ DECL /)   (: DECL :)
(= DECL =)   (? DECL ?)   (@ DECL @)   (^ DECL ^)   (| DECL |)

[$ DECL $]   [% DECL %]   [& DECL &]   [/ DECL /]   [: DECL :]
[= DECL =]   [? DECL ?]   [@ DECL @]   [^ DECL ^]   [| DECL |]

{$ DECL $}   {% DECL %}   {& DECL &}   {/ DECL /}   {: DECL :}
{= DECL =}   {? DECL ?}   {@ DECL @}   {^ DECL ^}   {| DECL |}

<$ DECL $>   <% DECL %>   <& DECL &>   </ DECL />   <: DECL :>
<= DECL =>   <? DECL ?>   <@ DECL @>   <^ DECL ^>   <| DECL |>

Now, before you choose, consider that you can use it without any
spaces between the delimiter and the declarative expression. I guess
this would be common for e.g. classmethod. That's why my personal
favourites (in descending order of preference) are:

<|DECL|>   <== I like this one most.

The vertical bars line up neatly if you use them on successive lines.
And it visually separates the content well, even without spaces.

But back to a more scientific analysis: conveying meaning with delimiters
is hard. Only precedent may help us here:

So <? ?> is used in XML for processing instructions (PIs). But it's up to
the language used for the PI whether the content of the PI has declarative
or imperative meaning. We'd fall prey to |MIMIC| I guess.

Choosing </ /> would not help to clear up matters, either.

Most other delimiters have no common precedent (that I know of). But the
inner character has. That makes $, @ and = seem awkward choices.

Oh and when picking one, we should think about reserving another one for
a future compile time evaluated variant, too (IMHO <:DECL:> has some merit

BTW: choosing smileys as lexical tokens would sure help to entertain
     slashdot for weeks! :)

6.6 Finally

Still with me? Good, because here is the syntax I propose:

Lexical definitions:


The tokens should be defined so they work like the existing paired
delimiters (i.e. allowing NEWLINE).

Grammar definitions:

decl_element ::= "<|" [expression] ("," expression)* [","] "|>"
decl_element ::= "<|" [expression_list] "|>"

The expressions should yield declarative objects of course (see the section
on semantics).

The DSE should be placed immediately *before* the lexical element to which
it applies. Multiple declarative elements may stack up and all apply to
the following element.

The use cases indicate the following possible applications in descending
order of importance:

- Method and function definitions ("def").
- Class definitions ("class").
- Related to the module (there is no good place to put it though ...).
- Class attributes, i.e. assignments in a class definition.
- All compound statements ("if", "while", "for", "try").
- Parameter definitions.
- Related to the return value of a method or function (either by applying
  it to "return" or by putting it between "def" and the function name).
- All statements, i.e. assignments, too.

The sheer variety indicates that we should not restrict the DSE to be
usable only on a line of its own.

We might as well make the DSE applicable to *any* grammatical element.
From a language perspective that would certainly be pretty orthogonal.
From an implementation perspective we may disallow it or ignore it for
some cases. The exact way to pass the target of the DSE to the
declarative object(s) needs to be figured out, though.

A more radical thought would be to treat the DSE as a compound lexical
token. This could then be used before ANY token. It would work like
a C comment block. Yes, this smells a bit like a macro, but it doesn't
behave like one (unless we do compile time evaluation). And no, I wouldn't
place my bets on this one.

TODO: More input is required. Please go ahead.

-------- End of Document --------
