[Python-ideas] Let's be more orderly!
Andrew Barnert
abarnert at yahoo.com
Wed May 15 23:27:36 CEST 2013
From: Don Spaulding <donspauldingii at gmail.com>
Sent: Wednesday, May 15, 2013 12:35 PM
>
>On Wed, May 15, 2013 at 1:01 PM, Andrew Barnert <abarnert at yahoo.com> wrote:
>
>From: Don Spaulding <donspauldingii at gmail.com>
>>Sent: Tuesday, May 14, 2013 6:57 PM
>>
>>>In the interest of moving the discussion forward, I've had a few use cases along these lines. Let's say I want to create simple HTML elements by hand:
>>
>>>
>>> def create_element(tag, text='', **attributes):
>>>     attrs = ['{}="{}"'.format(k, v) for k, v in attributes.items()]
>>>     return "<{0} {1}>{2}</{0}>".format(tag, ' '.join(attrs), text)
>>>
>>> print(create_element('img', alt="Some cool stuff.", src="coolstuff.jpg"))
>>> <img src="coolstuff.jpg" alt="Some cool stuff."></img>
>>
>>Well, HTML explicitly assigns no meaning to the order of attributes. And I think this is a symptom of a larger problem. Every month, half a dozen people come to StackOverflow asking how to get an ordered dictionary. Most of them are asking because they want to preserve the order of JSON objects—which, again, is explicitly defined as unordered. If code relies on the order of HTML attributes, or JSON object members, it's wrong, and it's going to break, and it's better to find that out early.
>>
>Yes, I'm aware that HTML and JSON are explicit about the fact that order should not matter to parsers. But just because I know that, and you know that, doesn't mean that the person in charge of developing the XML-based or JSON-based web service I'm trying to write a wrapper for knows that. Twice now I've encountered poorly-written web services that have choked on something like:
I suppose when the other side is poorly-written and out of your control, that's also a legitimate use for ordering, along with human readability for debugging. (Or maybe it's the same case—the human brain cares about order even when you tell it not to, and that's out of your control…)
But I think it's another case where maybe it _shouldn't_ be on by default. Explicitly asking for an OrderedDict is a great way of signaling that someone cares about order, whether or not they should, right?
>The first thing you think of is, "Oh, I just need to use an OrderedDict.". Well, technically yes, except there's no convenient way to instantiate an OrderedDict with more than one element at a time. So now you're back to rewriting calling sites into order-preserving lists-of-tuples again. Which is why I think the OrderedDict.__init__ case is in and of itself compelling. ;-)
But if the OrderedDict.__init__ case were the only good case, coming up with some other way to create OrderedDict objects might be a better solution than changing kwargs. And if the OrderedDict solution automatically solved all of the other cases, that would _also_ mean that solving OrderedDict is what matters, not solving kwargs.
You've already given cases that you could solve with Python as it is today, if only you had a good OrderedDict constructor.
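As a rough sketch of what "solvable today" means for the create_element case, assuming you're willing to pass the attributes as a list of pairs instead of as keywords (the `attributes` parameter and this variant of the function are made up for illustration, not something from the earlier message):

```python
from collections import OrderedDict

def create_element(tag, text='', attributes=()):
    # `attributes` is a hypothetical parameter: any iterable of
    # (name, value) pairs, or an existing mapping.
    attrs = OrderedDict(attributes)
    rendered = ' '.join('{}="{}"'.format(k, v) for k, v in attrs.items())
    return "<{0} {1}>{2}</{0}>".format(tag, rendered, text)

html = create_element(
    'img',
    attributes=[('alt', 'Some cool stuff.'), ('src', 'coolstuff.jpg')],
)
print(html)  # <img alt="Some cool stuff." src="coolstuff.jpg"></img>
```

The order is preserved, but every calling site has to be rewritten into lists of tuples, which is exactly the inconvenience being complained about.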
And, even for the cases that you _can't_ solve today, most of the obvious potential solutions will only work if OrderedDict is a solved problem, because they rely on OrderedDict.
odict literals are an obvious example of that.
So is my mapping_constructor idea. If everyone uses @kwargs(OrderedDict), then OrderedDict has to use @kwargs(_HackyOrderedDictBuilder), which is presumably some class that abuses the mapping protocol by wrapping custom __getitem__ and __setitem__ calls around list or something.
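To make "wrapping custom __getitem__ and __setitem__ calls around list" concrete, here is one guess at what such a builder might look like. Only the class name comes from the paragraph above; the body, the `_pairs` attribute, and the `items` method are assumptions for the sake of the sketch:

```python
class _HackyOrderedDictBuilder:
    """Hypothetical builder: records insertions in a plain list of pairs."""

    def __init__(self):
        self._pairs = []  # (key, value) tuples in insertion order

    def __setitem__(self, key, value):
        # Overwriting an existing key keeps its original position.
        for i, (k, _) in enumerate(self._pairs):
            if k == key:
                self._pairs[i] = (key, value)
                return
        self._pairs.append((key, value))

    def __getitem__(self, key):
        # Linear scan; fine for a handful of keyword arguments.
        for k, v in self._pairs:
            if k == key:
                return v
        raise KeyError(key)

    def items(self):
        return list(self._pairs)


builder = _HackyOrderedDictBuilder()
builder['alt'] = 'Some cool stuff.'
builder['src'] = 'coolstuff.jpg'
```

OrderedDict.__init__ could then read `builder.items()` in order, without needing an OrderedDict to exist first.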
Or consider this small change to the rules for passing **kwargs. Currently, Python guarantees to build a new dict-like object out of anything you pass, then update it. What if Python instead guaranteed to build a new mapping of the same type (e.g., via copy.copy), then update it in order? Then you could just do this:
create_element('img', alt="Some cool stuff.", src="coolstuff.jpg", **OrderedDict())
Or take that last change, and also change the syntax to allow specifying default values for *args and **kwargs. Then:
def create_element(tag, text='', **attributes=OrderedDict()):
And so on. There are tons of possible designs out there that cannot possibly be used for OrderedDict.__init__, but which are trivial for every other use case assuming that OrderedDict.__init__ has already been solved.
That's why giving OrderedDict.__init__ as the primary use case is a mistake.
>>The syntax seems pretty obvious:
>>
>> def kwargs(mapping_constructor):
>>     def deco(fn):
>>         fn.kwargs_mapping_constructor = mapping_constructor
>>         return fn
>>     return deco
>>
>> @kwargs(OrderedDict)
>> def foo(a, b, *args, **kwargs):
>>     pass
>
>That's an interesting concept. It would certainly address the most common need I see for better OrderedDict support in the language.
>
>>Handling this at the calling site is a bit harder, but still not that hard.
>
>I don't see how this would require changes to the calling site. Can you elaborate?
Sorry, I think I wasn't clear enough here. For you, as a Python coder, the only change is in defining functions, not calling them. But for the interpreter, there's obviously a change in CALL_FUNCTION (and friends) or somewhere nearby—wherever it builds a dict out of the keyword arguments that don't match named parameters, it instead has to look up and use the mapping constructor. I meant to talk about the interpreter level, but it ended up sounding like I was talking about the user level.
Anyway, it looks like the simplest implementation in CPython is about five one-line changes in ext_do_call (http://hg.python.org/cpython/file/3.3/Python/ceval.c#l4294) and update_keyword_args (http://hg.python.org/cpython/file/3.3/Python/ceval.c#l4171). In PyPy, if I remember correctly, it would be a one-line change in the standard argument factory function. I don't know about other implementations, but I doubt they'd be much worse.
Thinking about the implementation raises some points about the interface. CPython (with the simplest changes) will always call your constructor with no parameters, and then set the items one by one. So, maybe don't require any more than empty-construction, __setitem__, and __getitem__, instead of a fancy constructor and the full MutableMapping protocol. Alternatively, PyPy's argument factory is already more flexible; maybe require that as part of the language?
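A pure-Python approximation of that "construct empty, then set the items one by one" behavior might look like the following. This is only a sketch: `simulate_call` is a made-up stand-in for the interpreter's call machinery, and the function receives the mapping as an ordinary positional argument, because real `**` passing can't be intercepted from Python code:

```python
from collections import OrderedDict

def kwargs(mapping_constructor):
    def deco(fn):
        fn.kwargs_mapping_constructor = mapping_constructor
        return fn
    return deco

def simulate_call(fn, keyword_pairs):
    # Hypothetical stand-in for the interpreter: look up the constructor,
    # call it with no parameters, then set the items one by one, in order.
    constructor = getattr(fn, 'kwargs_mapping_constructor', dict)
    mapping = constructor()
    for name, value in keyword_pairs:
        mapping[name] = value
    return fn(mapping)

@kwargs(OrderedDict)
def show_attributes(attributes):
    return ' '.join('{}="{}"'.format(k, v) for k, v in attributes.items())

rendered = simulate_call(
    show_attributes,
    [('alt', 'Some cool stuff.'), ('src', 'coolstuff.jpg')],
)
print(rendered)  # alt="Some cool stuff." src="coolstuff.jpg"
```

Note that `simulate_call` only ever uses empty construction and __setitem__ on the mapping, which is why the minimal interface would be enough on the interpreter side.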