Proposal for adding symbols within Python

Mike Meyer mwm at mired.org
Sat Nov 12 16:52:12 EST 2005


Pierre Barbier de Reuille <pierre.barbier at cirad.fr> writes:
> Please, note that I am entirely open for every points on this proposal
> (which I do not dare yet to call PEP).
>
> Abstract
> ========
>
> This proposal suggests to add symbols into Python.

You're also proposing adding a syntax to generate symbols. If so, it's
an important distinction, as simply adding symbols is a lot more
straightforward than adding new syntax.

> Symbols are objects whose representation within the code is more
> important than their actual value. Two symbols needs only to be
> equally-comparable. Also, symbols need to be hashable to use as keys of
> dictionary (symbols are immutable objects).

The values returned by object() meet these criteria. You could write
LISP's gensym as:

      gensym = object

As you've indicated, there are a number of ways to get such
objects. If all you want is symbols, all that really needs to happen
is that one of those ways be blessed by including an implementation in
the distribution.
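
As a minimal sketch of what that already buys you (the names OPEN,
CLOSE and state_names are made up for illustration, they are not part
of any proposal):

```python
# gensym as an alias for object: every call returns a fresh, unique instance.
gensym = object

OPEN = gensym()    # hypothetical symbol names, for illustration only
CLOSE = gensym()

# Each symbol compares equal only to itself (default identity-based equality).
assert OPEN == OPEN
assert OPEN != CLOSE

# Instances of object are hashable, so they work as dictionary keys.
state_names = {OPEN: "open", CLOSE: "close"}
assert state_names[OPEN] == "open"
```

What these objects lack is a readable name, which is where a real
symbol type would add something.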

> In LISP : Symbols are introduced by "'". "'open" is a symbol.

No, they're not. "'(a b c)" is *not* a symbol, it's a list. Symbols in
LISP are just names. "open" is a symbol, but it's normally evaluated.
The "'" is syntax that keeps the next expression from being evaluated,
so that "'open" gets you the symbol rather than its value. Since
you're trying to introduce syntax, I think it's important to get
existing practice in other languages right.

> Proposal
> ========
>
> First, I think it would be best to have a syntax to represent symbols.

That's half the proposal.

> Adding some special char before the name is probably a good way to
> achieve that : $open, $close, ... are $ymbols.

$ has bad associations for me - and for others who came from an
earlier P-language. Also, using a magic character to introduce type
information doesn't feel very Pythonic.

While you don't make it clear, it seems obvious that you intend that
if $open occurs twice in the same scope, it should refer to the same
symbol. So you're using the syntax for a dual purpose. $name checks to
see if the symbol name exists, and references that if so. If not, it
creates a new symbol with that name. Having something that looks
like a variable but instantiates upon reference instead of raising
an exception seems like a bad idea.

> On the range of symbols, I think they should be local to name space
> (this point should be discussed as I see advantages and drawbacks for
> both local and global symbols).

Agreed. Having one type that has different scoping rules than
everything else is definitely a bad idea.

> There should be a way to go from strings to symbols and the other way
> around. For that purpose, I propose:
>
> >>> assert symbol("opened") == $opened
> >>> assert str($opened) == "opened"

So the heart of your proposal seems to be twofold: The addition of
"symbol" as a type, and the syntax that has the lookup/create behavior
I described above.

> Implementation
> ==============
>
> One possible way to implement symbols is simply with integers resolved
> as much as possible at compile time.

What exactly are you proposing be "resolved" at compile time? How is
this better than using object, as illustrated above?

Suggested changes:

Provide a solid definition for the proposed builtin type "symbol".
Something like:

          symbol objects support two operations: is and equality
          comparison. Two symbol objects compare equal if and only if
          they are the same object, and symbol objects never compare
          equal to any other type of object. The result of other
          operations on a symbol object is undefined, and should raise
          a TypeError exception.

          symbol([value]) - creates a symbol object. Two distinct
          calls to symbol will return two different symbol objects
          unless the values passed to them as arguments are equal, in
          which case they return the same symbol object. If symbol is
          called without an argument, it returns a unique symbol.

I left the type of the value argument unspecified on purpose. Strings
are the obvious type, but I think it should be as unrestricted as
possible. The test on value is equality, not identity, because two
strings can be equal without being the same string, and we want that
case to give us the same symbol. I also added gensym-like behavior,
because it seemed useful. You could do without equality comparison,
but it seems like a nice thing to have.
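
For what it's worth, that definition can be sketched in pure Python
along the following lines. This is only one possible reading, not a
reference implementation: the _interned table is my own assumption,
and using a dictionary for it means values must be hashable, which is
more restrictive than the definition above requires.

```python
class symbol(object):
    """Sketch of the suggested type; the interning table is an assumption."""

    _interned = {}   # value -> symbol; dict lookup compares by equality

    def __new__(cls, *args):
        if len(args) > 1:
            raise TypeError("symbol() takes at most one argument")
        if not args:
            return object.__new__(cls)     # gensym-like: always unique
        value = args[0]
        try:
            return cls._interned[value]    # equal value: same symbol
        except KeyError:
            self = cls._interned[value] = object.__new__(cls)
            return self

    # Equality and hashing are inherited from object, so two symbols
    # compare equal if and only if they are the same object.

    def __lt__(self, other):
        raise TypeError("symbol objects support only 'is', '==' and '!='")

    __le__ = __gt__ = __ge__ = __lt__


assert symbol("opened") is symbol("open" + "ed")   # equal values, same symbol
assert symbol() is not symbol()                    # no argument: unique
assert symbol("opened") != "opened"                # never equal to other types
```

Only the ordering comparisons are made to raise TypeError here; a
fuller version would have to do the same for the remaining operations.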

Now propose a new syntax that "means" symbol, a la {} "meaning" dict
and [] "meaning" list. Don't use "$name" (& and ^ are also probably
bad, but not as bad; pretty much everything else but ? is already in
use). Python does seem to be moving away from this kind of thing,
though.

Personally, I think that the LISP quote mechanism would be a better
addition as a new syntax, as it would handle needs that have caused a
number of different proposals to be raised.  It would require that
symbol know about the internals of the implementation so that ?name
and symbol("name") return the same object, and possibly that said
object be exposed to the programmer. And this is why the distinction
about how LISP acts is important.

      <mike
-- 
Mike Meyer <mwm at mired.org>			http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.


