[Python-ideas] Dict literal use for custom dict classes

Sat Dec 12 22:24:16 EST 2015

On Sun, Dec 13, 2015 at 01:13:49AM +0100, Jelte Fennema wrote:
> I really like the OrderedDict class. But there is one thing that has always
> bothered me about it. Quite often I want to initialize a small ordered
> dict. When the keys are all strings this is pretty easy, since you can just
> use the keyword arguments. But when some or all of the keys are other
> things this is an issue. In that case there are two options (as far as I
> know). If you want an ordered dict of this form for instance: {1: 'a', 4:
> int, 2: (3, 3)}, you would either have to use:
> OrderedDict([(1, 'a'), (4, int), (2, (3, 3))])
> 
> or you could use:
> d = OrderedDict()
> d[1] = 'a'
> d[4] = int
> d[2] = (3, 3)
> 
> In my opinion both are quite verbose and the first is pretty unreadable
> because of all the nested tuples. 

You have a rather strict view of "unreadable" :-)

Some alternatives if you dislike the look of the above:

# Option 3:
d = OrderedDict()
for key, value in zip([1, 4, 2], ['a', int, (3, 3)]):
    d[key] = value

# Option 4:
d = OrderedDict([(1, 'a'), (4, int)])  # The pretty values.
d[2] = (3, 3)  # The ugly nested tuple at the end.

So there's no shortage of work-arounds for the lack of nice syntax for 
creating ordered dicts.

And besides, why single out OrderedDict? There are surely people out 
there using ordered mappings like RedBlackTree that have to deal with 
this same issue. Perhaps what we need is to stop focusing on a specific 
dictionary type, and think about the lowest common denominator for any 
mapping. And that, I believe, is a list of (key, value) tuples. See 
below.
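
To make the "lowest common denominator" point concrete, here is a
minimal sketch. TinyMapping is a hypothetical class invented purely for
illustration. The only assumption is that a mapping's constructor
accepts an iterable of (key, value) pairs, as dict and OrderedDict
already do.

```python
from collections import OrderedDict
from collections.abc import MutableMapping


class TinyMapping(MutableMapping):
    """Hypothetical insertion-ordered mapping backed by a list of pairs."""

    def __init__(self, pairs=()):
        self._items = []
        for key, value in pairs:
            self[key] = value

    def __getitem__(self, key):
        for k, v in self._items:
            if k == key:
                return v
        raise KeyError(key)

    def __setitem__(self, key, value):
        # Overwrite in place if the key exists, otherwise append.
        for i, (k, _) in enumerate(self._items):
            if k == key:
                self._items[i] = (key, value)
                return
        self._items.append((key, value))

    def __delitem__(self, key):
        for i, (k, _) in enumerate(self._items):
            if k == key:
                del self._items[i]
                return
        raise KeyError(key)

    def __iter__(self):
        return (k for k, _ in self._items)

    def __len__(self):
        return len(self._items)


# One list of pairs feeds three unrelated mapping types.
pairs = [(1, 'a'), (4, int), (2, (3, 3))]
plain = dict(pairs)
ordered = OrderedDict(pairs)
custom = TinyMapping(pairs)
```

None of the three constructors needs to know about the others; the
list of pairs is the only thing they share.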

> That is why I have two suggestions for
> language additions that fix that.
> The first one is the normal dict literal syntax available to custom dict
> classes like this:
> OrderedDict{1: 'a', 4: int, 2: (3, 3)}

I don't understand what that syntax is supposed to do.

Obviously it creates an OrderedDict, but you haven't explained the 
details. Is the prefix "OrderedDict" hard-coded in the parser/lexer, 
like the b prefix for byte-strings and r prefix for raw strings? In that 
case, I think that's pretty verbose, and would prefer to see something 
shorter:

o{1: 'a', 4: int, 2: (3, 3)}

perhaps. If OrderedDict is important enough to get its own syntax, it's 
important enough to get its own *short* syntax. That was my preferred 
solution, but it no longer is.

Or is the prefix "OrderedDict" somehow looked-up at run-time? So we 
could write something like:

spam = random.choice(list_of_callables)
result = spam{1: 'a', 4: int, 2: (3, 3)}

and spam would be called, whatever it happens to be, with a single 
list argument:

[(1, 'a'), (4, int), (2, (3, 3))]

What puts me off this solution is that it is syntactic sugar for not one 
but two distinct operations:

- sugar for creating a list of tuples;
- and sugar for a function call.

But if we had the first, we wouldn't need the second, and we wouldn't 
need to treat OrderedDict as a special case. We could use any mapping:

MyDict(sugar)
OrderedDict(sugar)
BinaryTree(sugar)

and functions that aren't mappings at all, but expect lists of (a, b) 
tuples:

covariance(sugar)
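
As a rough sketch of such a non-mapping consumer: the covariance
function below is my own illustrative implementation (sample
covariance, dividing by n - 1), not an existing library function. It
takes the same kind of list of (a, b) tuples that a mapping
constructor would.

```python
def covariance(pairs):
    """Sample covariance of a list of (x, y) tuples (illustrative only)."""
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two (x, y) pairs")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    # Sum of products of deviations, divided by n - 1.
    return sum((x - mean_x) * (y - mean_y) for x, y in pairs) / (n - 1)


data = [(1, 2), (2, 4), (3, 6)]
result = covariance(data)
```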

> This looks much cleaner in my opinion. As far as I can tell it could simply
> be implemented as if either of the two above options was used. This
> would make it available to all custom dict types that implement the two
> options above.
> 
> A second very similar option, which might be cleaner and more useful, is to
> make this syntax available (only) after initialization. So it could be used
> like this:
> d = OrderedDict(){1: 'a', 4: int, 2: (3, 3)}
> d{3: 4, 'a': 'c'}
> >>> OrderedDict(){1: 'a', 4: int, 2: (3, 3), 3: 4, 'a': 'c'}

What does that actually do, in detail? Does it call d.__setitem__(key, 
value) repeatedly? So I could do something like this:

L = [None]*10
L{1: 'a', 3: 'b', 5: 'c', 7: 'd', 9: 'e'}
assert L == [None, 'a', None, 'b', None, 'c', None, 'd', None, 'e']
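
If the answer is "repeated item assignment", the behaviour can be
sketched in today's Python. The helper name set_items is hypothetical,
and this is only my guess at the intended semantics, not something the
proposal spells out.

```python
def set_items(target, pairs):
    """Hypothetical stand-in for target{k: v, ...}.

    Performs one item assignment (target[key] = value) per pair, in order.
    """
    for key, value in pairs:
        target[key] = value
    return target


L = [None] * 10
set_items(L, [(1, 'a'), (3, 'b'), (5, 'c'), (7, 'd'), (9, 'e')])
```

Under that reading, anything supporting item assignment would accept
the syntax, lists included.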

If we had nice syntax for creating ordered dict literals, would we want 
this feature? I don't think so. It must be pretty rare to want something 
like that (at least, I can't remember the last time I did) and when we 
do, we can often do it with slicing:

py> L = [None]*10
py> L[1::2] = 'abcde'
py> L
[None, 'a', None, 'b', None, 'c', None, 'd', None, 'e']

> This would allow arguments to the __init__ method as well.

How? You said that this option was only available after
initialization.

> And this way it could simply be a shorthand for setting multiple attributes. 

How does the reader (or the interpreter) tell when 

d{key: value}

means "call __setitem__" and when it means "call __setattr__"?
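
To make the ambiguity concrete, here is a small sketch. Record is a
hypothetical dict subclass; like any ordinary subclass instance it can
hold both items and attributes, and the two operations leave it in
different states.

```python
class Record(dict):
    """Hypothetical dict subclass; instances accept items and attributes."""


by_item = Record()
by_item['name'] = 'spam'          # item assignment, via __setitem__

by_attr = Record()
setattr(by_attr, 'name', 'spam')  # attribute assignment, via __setattr__

item_has_key = 'name' in by_item
item_has_attr = hasattr(by_item, 'name')
attr_has_key = 'name' in by_attr
attr_has_attr = hasattr(by_attr, 'name')
```

The first object gains a key and no attribute, the second an attribute
and no key, so a single spelling cannot sensibly mean both.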

> It might even
> be used to change multiple values in a list if that is a feature that is
> wanted.
> 
> Lastly I think either of the two suggested options could be used to allow
> dict comprehensions for custom dict types. But this might require a bit
> more work (although not much I think).
> 
> I'm interested to hear what you guys think.

I think that there is a kernel of a good idea in this. Let's go back to 
the idea of syntactic sugar for a list of tuples. The user can then call 
the function or class of their choice, they aren't limited to just one 
mapping type.

I'm going to suggest [key:value] as syntax. Now your original example 
becomes:

d = OrderedDict([1: 'a', 4: int, 2: (3, 3)])

which breaks up the triple ))) at the end, so hopefully you will not 
think it's ugly. Also, we're not limited to just calling the 
constructor; it could be any method:

d.update([2: None, 1: 'b', 5: 99.9])

or anywhere at all:

x = [2: None, 1: 'b', 5: 99.9, 1: 'a', 4: int, 2: (3, 3)] + items

# shorter and less error-prone than:
x = (
    [(2, None), (1, 'b'), (5, 99.9), (1, 'a'), (4, int), (2, (3, 3))]
    + items
    )

There could be a comprehension form:

[key: value for x in seq if condition]

similar to the dict comprehension form.
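
For comparison, here is how the comprehension form has to be spelled
today: an ordinary list comprehension that builds the tuples
explicitly. The data in words is made up for illustration.

```python
from collections import OrderedDict

words = ['spam', 'eggs', 'ham', 'cheese']

# Today's spelling of what would be written
# [word: len(word) for word in words if len(word) > 3]
pairs = [(word, len(word)) for word in words if len(word) > 3]

d = OrderedDict(pairs)
```

The proposed syntax would only drop the inner parentheses; the result
would still be a plain list of tuples that any function can consume.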

-- 
Steve