[Python-3000] Set literals - another try

Talin talin at acm.org
Tue Aug 8 18:49:08 CEST 2006


Part 1: The concrete proposal part.

I noticed that a lot of folks seemed to like the idea of making the 
empty set resemble the greek letter Phi, using a combination of 
parentheses and the vertical bar or forward slash character.

So let's expand on this: slice Phi in half and say that (| and |) are 
delimiters for a set literal, as follows:

    (|)     # Empty set

    (|a|)   # Set with 1 item

    (|a,b|) # Set with 2 items

The advantage of this proposal is that it maintains visual consistency 
between the 0, 1, and N element cases.
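
For comparison, here is a rough sketch of how the same three cases are spelled with the existing set() built-in. The values bound to a and b are placeholders chosen only for illustration.

```python
# Placeholder values standing in for the a and b of the examples above.
a, b = 1, 2

empty = set()        # proposed spelling: (|)
one = set([a])       # proposed spelling: (|a|)
two = set([a, b])    # proposed spelling: (|a,b|)

print(len(empty), len(one), len(two))
```

Note that the 0-element case here looks nothing like the 1 and N element cases, which is the inconsistency the proposed delimiters are meant to avoid.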


Part 2: The idle speculation part, not to be considered as an actual 
proposal.

I've often said that "whenever a programmer has the urge to invent a new 
programming language, they should lie down on the couch until the 
feeling passes".

One of the reasons for this is that many times, a programmer's 
motivation in creating a new language is not that they actually need a 
new language, but that they want a means of *criticising* an existing 
language. Inventing their own language gives them the opportunity to 
show how they would have done it.

I think that kind of criticism can be valid, and that languages invented 
for this purpose can be useful, as long as you don't actually sit down 
and try to implement the thing.

As a thought experiment, I decided to apply this idea to the Python set 
literal case - i.e. if we were going to do a massive "do over" of 
Python, how would we approach the problem of set literals?

The syntax that comes to mind is something like this:

    a = b|c

Where the vertical bar character means "forms a set with". Larger sets 
could be made using the same syntax:

    a = b|c|c|d

You can also wrap parens around the set if you want:

    a = (b|c)

Like tuples, a set with a single member still requires at least one 
delimiter:

    a = (b|)

And for the empty set, we're back to phi again:

    a = (|)

However, the parens aren't generally required - the rules are pretty 
much the same as for tuples and the comma operator. Thus, passing sets 
as arguments:

    index = s.find_first_of( 'a'|'b'|'c'|'d' )

Of course, by doing this, we're re-assigning the meaning of the '|' 
operator from 'bitwise or' to 'set construction'. This only makes sense 
if you assume that either (a) set construction is more common than 
bitwise-or operations or (b) you provide some reasonable alternative way 
to express bitwise-or operations. Let's assume that we create some 
reasonable replacement and move on.
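
To make the conflict concrete, here is a minimal sketch in today's Python. On integers, '|' is bitwise or, so the proposed reading can only be imitated by wrapping each value. The Member class and the _as_set helper are hypothetical names invented for this illustration; they are not part of any proposal or library.

```python
class Member:
    """Hypothetical wrapper: Member(x) | Member(y) forms a set of x and y."""

    def __init__(self, value):
        self.value = value

    def __or__(self, other):
        # Member | Member, or Member | existing set
        return frozenset([self.value]) | _as_set(other)

    def __ror__(self, other):
        # existing set | Member (reached after frozenset declines the operand)
        return _as_set(other) | frozenset([self.value])


def _as_set(obj):
    """Turn a Member or an existing set into a frozenset."""
    if isinstance(obj, Member):
        return frozenset([obj.value])
    return frozenset(obj)


# Current meaning on integers: bitwise or.
bitwise = 1 | 2

# Imitation of the speculative meaning: "forms a set with".
b, c, d = Member('b'), Member('c'), Member('d')
a = b | c | c | d
```

Here bitwise is the integer 3, while a is a frozenset holding 'b', 'c' and 'd'; the repeated c collapses, as it would in any set.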

Another thing to note is that the set construction operator resembles in 
some ways the "alternative" operator of BNF notation. In the previous 
example, 'find_first_of' looks for the first of the given alternatives.
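
As a sketch of what such a call might do, here is a stand-alone find_first_of written as a plain function that takes the alternatives as an ordinary set. Python strings have no such method, so the function and its return convention of -1 when nothing matches are assumptions made for this example.

```python
def find_first_of(s, alternatives):
    """Return the index of the first character of s that is one of the
    given alternatives, or -1 if none of them occurs (assumed convention)."""
    for index, char in enumerate(s):
        if char in alternatives:
            return index
    return -1


# With the speculative syntax this would read: 'a'|'b'|'c'|'d'
index = find_first_of("xyzcab", set(['a', 'b', 'c', 'd']))
missing = find_first_of("xyz", set(['a', 'b', 'c', 'd']))
```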

Since dictionaries are similar to sets, we can represent a dictionary as 
a set of keys and associated values. Dictionary literals already use the 
':' operator to indicate a key - we can continue that with:

    a = ('Monday':1 | 'Tuesday':2 | 'Wednesday':3)

Unlike the current language, however, you can omit the parens:

    a = 'Monday':1 | 'Tuesday':2 | 'Wednesday':3

(This creates a syntax ambiguity with colon, but let's move on :)
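
The "set of keys and associated values" view can be roughly approximated in existing Python by building a set of (key, value) tuples and handing it to dict(). The tuples stand in for the speculative 'key':value items; this is only an illustration of the idea, not the proposed syntax.

```python
# Each tuple plays the role of one 'key':value item.
pairs = set([('Monday', 1), ('Tuesday', 2), ('Wednesday', 3)])

# dict() accepts any iterable of (key, value) pairs, including a set.
a = dict(pairs)

print(a['Tuesday'])
```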

One of the fun things about this line of speculation is watching how 
such a tiny change ripples outward, affecting the entire language 
definition. In this case, the change to set construction has much 
farther-reaching effects than what I have described here, assuming that 
you take each effect to its logical conclusion. I find it an enjoyable 
mental exercise :)

-- Talin

