[Python-ideas] Add new `Symbol` type

Thu Jul 5 20:26:36 EDT 2018

On Thu, Jul 5, 2018 at 1:25 PM, Flavio Curella <flavio.curella at gmail.com> wrote:
>
>> What functionality does such a thing actually need?
>
> I think the requirements should be:
> * The resulting symbol behaves exactly like None, i.e. the symbol should not
> be an instance of object, but an instance of its own class
> * A symbol can optionally be globally unique.
> * Two symbols created by the same key must not be equal, i.e. they have equal
> keys, but different values
>    * if we're trying to create global symbols with the same key, an
> exception is thrown
>
> This is mostly based on the JavaScript spec.

I think the name "symbol" here is pretty confusing. It comes
originally from Lisp, where it's used to refer to an interned-string
data type. It's a common source of confusion even there. Then it
sounds like JS took that name, and it ended up drifting to mean
something that's almost exactly the opposite of a Lisp symbol. In
Lisp, symbols are always "global"; the whole point is that if two
different pieces of code use the same name for the same symbol then
they end up with the same object. So this is *super* confusing. I
think I see how JS ended up here [1], but the rationale really doesn't
translate to other languages.
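
For comparison, the closest thing Python has to a Lisp-style symbol is
probably an interned string. A minimal sketch, using the standard
library's sys.intern as a rough stand-in for Lisp's symbol table (the
variable names are just for illustration):

```python
import sys

# Build two equal strings at runtime so they start out as separate values
# produced by different pieces of code.
a = "".join(["fo", "o"])
b = "".join(["f", "oo"])

# Interning maps every equal string to one canonical object, which is the
# Lisp-symbol behaviour: same name, same object.
sym_a = sys.intern(a)
sym_b = sys.intern(b)

print(a == b)          # True: equal by value
print(sym_a is sym_b)  # True: interning yields the same object
```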

The thing you're talking about is what Python devs call a "sentinel"
object. If your proposal is to add a sentinel type to the stdlib, then
your chance of success will be *much* higher if you use the word
"sentinel" instead of "symbol". People don't read mailing list threads
carefully, so if you keep calling it "symbol" then you'll likely spend
infinite time responding to people rushing in to critique your
proposal based on some misconception about what you're trying to do,
which is no fun at all. Honestly I'd probably start a new thread with
a new subject, ideally with an initial straw-man proposal for the
semantics of these objects.
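
As a rough illustration of the sentinel idiom, here is a minimal sketch.
The Sentinel class, the MISSING name and the lookup() helper are all
hypothetical, made up for this example; nothing like this is in the
stdlib, and a real proposal would need to pin down the semantics:

```python
class Sentinel:
    """Hypothetical sentinel type; the name and API are illustrative only."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return f"<Sentinel {self.name}>"

    # No __eq__/__hash__ overrides: instances compare and hash by identity,
    # so two sentinels created with the same name are still different.


MISSING = Sentinel("MISSING")


def lookup(table, key, default=MISSING):
    """Return table[key]; a stored None is a real value, not 'missing'."""
    value = table.get(key, MISSING)
    if value is MISSING:
        if default is MISSING:
            raise KeyError(key)
        return default
    return value
```

The point of the sentinel is that lookup({"a": None}, "a") can return
None as a legitimate value, because "no value supplied" is signalled by
an object that no caller can produce by accident.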

-n

[1] What was JS thinking? Well, I'm not sure I have all the details
right, but AFAICT it's all very logical... JS objects, like Python
objects, have attributes, e.g. 'console.log' is the 'log' attribute of
the 'console' object. There's a table inside the 'console' object
mapping keys like 'log' to their corresponding values, much like a
Python object's __dict__. But a Python dict can use arbitrary objects
as keys. JS attribute tables are different: the keys are required to
be Lisp-style symbol objects, i.e. arbitrary strings (and only
strings) that are then interned for speed. This kind of table lookup
is exactly why Lisp invented symbols in the first place; a Lisp scope
is also a table mapping symbols to values. BUT THEN, they decided to
enhance JS to add the equivalent of special methods like Python's
__add__. Now how do you tell which attributes are ordinary attributes,
and which ones are supposed to be special? In Python of course we use
a naming convention, which is simple and works well. But in JS, by the
time they decided to do this, it was too late: people might already be
using names like "__add__" for regular attributes, and making them
special would break compatibility. In fact, *all* possible strings
were potentially already in use for ordinary attributes; there were no
names left for special attributes. SO, they decided, they needed to
expand the set of symbol objects (i.e., attribute names) to include
new values that were different from all possible strings. So now the
set of valid JS attribute keys is effectively the union of {strings,
compared as values} + {sentinels, compared by identity}. And for string
attributes, you can mostly ignore all this and pretend they're
ordinary strings and the JS interpreter will paper over the details.
So the main kind of symbol that JS devs actually have to *know* about
is the new sentinel values. And that's how the name "symbol" flipped
to mean the opposite of what it used to. See? I told you it was all
very logical.
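
The resulting key space can be sketched in Python terms with a plain
dict. This is only an analogy: SpecialKey is a hypothetical class
invented for the example, not a real JS or Python API.

```python
class SpecialKey:
    """Hypothetical stand-in for a JS-style sentinel key (not a real API)."""

    def __init__(self, description):
        self.description = description

    def __repr__(self):
        return f"SpecialKey({self.description!r})"

    # Default object semantics: equality and hashing are by identity.


# A sentinel key that happens to share its description with a string key.
ADD = SpecialKey("add")

attributes = {
    "add": "ordinary attribute",  # string key, compared by value
    ADD: "special behaviour",     # sentinel key, compared by identity
}

# Any equal string finds the ordinary attribute...
ordinary = attributes["".join(["a", "dd"])]

# ...but only the original sentinel object finds the special one.
special = attributes[ADD]
lookalike_found = SpecialKey("add") in attributes
```

Here lookalike_found is False: a second key built from the same
description never collides with either existing entry, which is the
property that left room for new special attributes.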

-- 
Nathaniel J. Smith -- https://vorpus.org