[Python-ideas] Dict literal use for custom dict classes

Tue Dec 15 13:36:45 EST 2015

One more point I don't think anyone's brought up yet about the [k: v] syntax: just as there's no empty set literal because it would be ambiguous with the empty dict, there would be no empty OrderedDict literal because it would be ambiguous with the empty list. The fact that the exact same ambiguity is resolved in opposite directions (which only makes sense if you know the history--dict preceded set in the language, but list preceded OrderedDict) makes it doubly irregular.
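For concreteness, here's a quick sketch of how the existing half of that asymmetry looks in current Python. The variable names are just for illustration, and the [k: v] form itself is only a proposal, so the OrderedDict line uses the constructor that would still be needed for the empty case:

```python
from collections import OrderedDict

empty_braces = {}               # empty braces mean dict, since dict came first
empty_set = set()               # so the empty set needs a constructor call
empty_brackets = []             # empty brackets mean list
empty_ordered = OrderedDict()   # so an empty OrderedDict would need a call too

for value in (empty_braces, empty_set, empty_brackets, empty_ordered):
    print(type(value).__name__)
```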

Of course we could introduce [:] as an empty OrderedDict literal, and {:} as an empty dict. But unless you actually wanted to deprecate {} for empty dict and eventually make it mean empty set (which I doubt anyone would argue for), that just gives us two ways to do it instead of solving the problem.

Meanwhile:

On Tuesday, December 15, 2015 5:09 AM, Jelte Fennema <me at jeltef.nl> wrote:

>After thinking some more, I think you are right in saying that it would make more sense to let it represent an OrderedDict directly. Mostly because of the mutability suggested by the square brackets. And also a bit because I'm not sure when a mapping that maps multiple values to the same key is actually useful.

Well, multidicts are actually useful pretty often--but in Python, they're usually spelled defaultdict(set) or defaultdict(list). After all, you need some syntax to look up (and modify and delete) values. In a dict that directly has multiple values per key, there's no way to specify which one you want, but in a dict that explicitly stores those multiple values as a set or list, it's just d[key] to get or delete that set or list, d[key].add (or d[key].append for a list) to add a value, and so on.
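A minimal sketch of that spelling, with made-up keys and values:

```python
from collections import defaultdict

multi = defaultdict(set)

# Adding values: the per-key set is created on first access.
multi["spam"].add(1)
multi["spam"].add(2)
multi["eggs"].add(3)

# Looking up a key gives you the whole set of values for it.
spam_values = multi["spam"]

# Removing one value goes through the set; removing the key goes through the dict.
multi["spam"].discard(1)
del multi["eggs"]

print(dict(multi))
```

With defaultdict(list) the same pattern uses append and remove instead of add and discard.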

I think Franklin's point was that a list of pairs is _most often_ used as a mapping initializer, but can mean other things as well, some of which might have a need for duplicate keys. For example, a stats package might take a mapping or iterable-of-pairs (the same type the dict constructor takes) for a collection of timestamped data points, and it's perfectly reasonable for two measurements to have the same timestamp in some datasets, but not in others. If the syntax defines an OrderedDict, it can't be used for the first kind of dataset.
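To make the difference concrete, here's a small sketch with invented readings (the timestamps and values are hypothetical, not from any real stats package):

```python
from collections import OrderedDict

# Two measurements share the timestamp 1000.
readings = [(1000, 20.5), (1000, 20.7), (1001, 21.0)]

# As a list of pairs, every data point survives.
print(len(readings))

# As a mapping, the later pair silently replaces the earlier one.
as_mapping = OrderedDict(readings)
print(len(as_mapping))
print(as_mapping[1000])
```

A consumer that accepts an iterable of pairs can handle both kinds of dataset; one that is handed an OrderedDict never sees the first 1000 reading at all.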

As for your mutability point: there's no reason it couldn't be a list of 2-lists instead of a list of 2-tuples. Sure, that will take a little more space in most implementations, but that rarely matters for literals--an object that's big enough in memory that you start to worry about compactness is probably way too big to put in a source file. (And if it _does_ matter, there's no reason CPython couldn't have special code that initializes the lists constructed by [:] literals with capacity 2 instead of the normal minimum capacity, on the expectation that you're not likely to go appending to all of the elements of a list constructed that way, and if you really want to, it's probably clearer to write it with [[]] syntax.)

>Secondly, I think your idea for namedtuple literals is great. This would be really useful in the namedtuple use case where you want to return multiple values from a function, but you want to be clear in what these values actually are. I think this would need to generate some kind of anonymous named tuple class though, since it would make no sense to have to create a new class when using a literal like this.

First, why would it make no sense to create a new class? This isn't a prototype language; if the attributes are part of the object's type, then that type has to exist, and be accessible as a class. (You could cache the types, so that any two object literals with the same attributes have the same type, but that doesn't really change anything.) If you really want to avoid generating a new type, the type has to have normal-Python dynamic-per-object attributes, like SimpleNamespace, not class-specified attributes.
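Here's a rough sketch of both options. The record_type helper and its caching are my own illustration of "cache the types", not anything that exists in the stdlib:

```python
from collections import namedtuple
from functools import lru_cache
from types import SimpleNamespace


@lru_cache(maxsize=None)
def record_type(*field_names):
    """Hypothetical helper: one namedtuple class per distinct tuple of field names."""
    return namedtuple("Record", field_names)


a = record_type("x", "y")(1, 2)
b = record_type("x", "y")(3, 4)
c = record_type("x", "z")(5, 6)

# Same attributes share a cached class; different attributes still need a new class.
print(type(a) is type(b))
print(type(a) is type(c))

# SimpleNamespace keeps attributes on each object, so one type covers every instance.
ns1 = SimpleNamespace(x=1, y=2)
ns2 = SimpleNamespace(name="spam")
ns1.z = 3  # attributes can be added per object
print(type(ns1) is type(ns2))
```

Either way a class exists; caching only changes how many of them get created.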

I don't think that's an argument for (:) literals creating new classes, so much as an argument against them producing anything remotely like a namedtuple.

None of the existing collection literals produce anything with attributes; namedtuple values can't be looked up by key (as in a mapping), only by attribute (as in SimpleNamespace) or index (as in a sequence); namedtuples don't iterate their keys (like a mapping) but their values (like a sequence)... So a namedtuple literal would be a pretty bad analogy with the dict and OrderedDict literals in almost every way.
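A short sketch of those differences, using a throwaway Point type:

```python
from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])
p = Point(x=1, y=2)
d = {"x": 1, "y": 2}

# Iteration: a namedtuple yields its values, a dict yields its keys.
print(list(p))
print(list(d))

# Lookup: attribute or index works on the namedtuple, key lookup does not.
print(p.x, p[0])
try:
    p["x"]
except TypeError:
    key_lookup_failed = True
else:
    key_lookup_failed = False
print(key_lookup_failed)
```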

Also, the syntax looks enough like general object literals in other languages like JavaScript that it will probably mislead people. (Or, worse, they'd actually be right--you could use (:) and lambda together to create some very unpythonic code, and people coming from JS would be very tempted to do so. The fact that you have to explicitly call type to do that today is enough to prevent that from being an attractive nuisance.)
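For anyone who hasn't seen it, this is roughly what the explicit-type version looks like today (the class and attribute names are invented for the example):

```python
# Build a one-off "object literal" JS-style: a dict of data and lambdas passed to type().
JsStyleCounter = type(
    "JsStyleCounter",
    (),
    {
        "count": 0,
        "increment": lambda self: setattr(self, "count", self.count + 1),
    },
)

counter = JsStyleCounter()
counter.increment()
counter.increment()
print(counter.count)
```

It works, but the call to type is clunky enough that nobody reaches for it by accident, which is the point.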

If you look at all the other collection literals (including the proposed [:] for OrderedDict) and try to guess what (:) does by analogy, the obvious answer would be a FrozenOrderedDict. Since frozen dicts in general aren't even useful enough to be in the stdlib, much less to be builtins, and ordered frozen dicts even less so, I can't imagine they need literals. But a literal that looks like it should mean that, and instead means something completely different, is at best a new and clunky thing that has to be memorized separately from the rest of the syntax.