[Python-ideas] namedtuple literals [Was: RE a new namedtuple]

Wed Jul 26 13:47:31 EDT 2017

On 2017-07-26 01:10 PM, Steven D'Aprano wrote:
> On Thu, Jul 27, 2017 at 02:05:47AM +1000, Nick Coghlan wrote:
>> On 26 July 2017 at 11:05, Steven D'Aprano <steve at pearwood.info> wrote:
>>> I don't see any way that this proposal can be anything but a subtle
>>> source of bugs. We have two *incompatible* requirements:
>>>
>>> - we want to define the order of the fields according to the
>>>   order we give keyword arguments;
>>>
>>> - we want to give keyword arguments in any order without
>>>   caring about the field order.
>>>
>>> We can't have both, and we can't give up either without being a
>>> surprising source of annoyance and bugs.
>> I think the second stated requirement isn't a genuine requirement, as
>> that *isn't* a general expectation.
>>
>> After all, one of the reasons we got ordered-by-default keyword
>> arguments is because people were confused by the fact that you
>> couldn't reliably do:
>>
>>     mydict = collections.OrderedDict(x=1, y=2)
>
> Indeed. But the reason we got *keyword arguments* in the first place was 
> so you didn't need to care about the order of parameters. As is often 
> the case, toy examples with arguments x and y don't really demonstrate 
> the problem in real code. We need more realistic, non-trivial examples. 
> Most folks can remember the first two arguments to open:
>
> open(name, 'w')
>
> but for anything more complex, we not only want to skip arguments and 
> rely on their defaults, but we don't necessarily remember the order of 
> definition:
>
> open(name, 'w', newline='\r', encoding='macroman', errors='replace')
>
> Without checking the documentation, how many people could tell you 
> whether that order matches the positional order? I know I couldn't.
>
>
> You say
>
>>     ntuple(x=1, y=2) == ntuple(y=1, x=2) == tuple(1, 2)
>>     ntuple(x=2, y=1) == ntuple(y=2, x=1) == tuple(2, 1)
>>
>> Putting the y-coordinate first would be *weird* though
> Certainly, if you're used to the usual mathematics convention that the 
> horizontal coordinate x comes first. But if you are used to the curses 
> convention that the vertical coordinate y comes first, putting y first 
> is completely natural.
>
> And how about ... ?
>
>     ntuple(flavour='strange', spin='1/2', mass=95.0, charge='-1/3',
>            isospin='-1/2', hypercharge='1/3')
>
> versus:
>
>     ntuple(flavour='strange', mass=95.0, spin='1/2', charge='-1/3',
>            hypercharge='1/3', isospin='-1/2')
>
> Which one is "weird"?
>
> This discussion has been taking place for many days, and it is only now 
> (thanks to MRAB) that we've noticed this problem. I think it is 
> dangerous to assume that the average Python coder will either:
>
> - always consistently specify the fields in the same order;
>
> - or recognise ahead of time (during the design phase of the program) 
>   that they should pre-declare a class with the fields in a particular 
>   order.
>
>
> Some people will, of course. But many won't. Instead, they'll happily 
> start instantiating ntuples with keyword arguments in inconsistent 
> order, and if they are lucky they'll get unexpected, tricky to debug 
> exceptions. If they're unlucky, their program will silently do the wrong 
> thing, and nobody will notice that their results are garbage.
>
> SimpleNamespace doesn't have this problem: the fields in SimpleNamespace 
> aren't ordered, and cannot be packed or unpacked by position.
>
> namedtuple doesn't have this problem: you have to predeclare the fields 
> in a certain order, after which you can instantiate them by keyword in 
> any order, and unpacking the tuple will always honour that order.
>
>
>> Now, though, that's fully supported and does exactly what you'd expect:
>>
>>     >>> from collections import OrderedDict
>>     >>> OrderedDict(x=1, y=2)
>>     OrderedDict([('x', 1), ('y', 2)])
>>     >>> OrderedDict(y=2, x=1)
>>     OrderedDict([('y', 2), ('x', 1)])
>>
>> In this case, the "order matters" expectation is informed by the
>> nature of the constructor being called: it's an *ordered* dict, so the
>> constructor argument order matters.
> I don't think that's a great analogy. There's no real equivalent of 
> packing/unpacking OrderedDicts by position to trip us up here. It is 
> better to think of OrderedDicts as "order-preserving dicts" rather than 
> "dicts where the order matters". Yes, it does matter, in a weak sense. 
> But not in the important sense of binding values to keys:
>
> py> from collections import OrderedDict
> py> a = OrderedDict([('spam', 1), ('eggs', 2)])
> py> b = OrderedDict([('eggs', -1), ('spam', 99)])
> py> a.update(b)
> py> a
> OrderedDict([('spam', 99), ('eggs', -1)])
>
> update() has correctly bound 99 to key 'spam', even though the keys are 
> in the wrong order. The same applies to dict unpacking:
>
> a.update(**b)
>
> In contrast, named tuples aren't just order-preserving. The field order 
> is part of their definition, and tuple unpacking honours the field 
> order, not the field names. While we can't update tuples in place, we 
> can and often do unpack them into variables. When we do, we need to know 
> the field order:
>
>
> flavour, charge, mass, spin, isospin, hypercharge = mytuple
>
> but the risk is that the field order may not be what we expect unless we 
> are scrupulously careful to *always* call ntuple(...) with the arguments 
> in the same order.
>
>
>
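For reference, the collections.namedtuple behaviour described in the quoted
text above (fields predeclared in one order, instances created by keyword in
any order) can be checked with a short sketch. The type name "Quark" and the
three-field subset are made up for illustration; only collections.namedtuple
itself comes from the discussion.

```python
from collections import namedtuple

# Field names borrowed from the quark example above; "Quark" is a made-up name.
Quark = namedtuple("Quark", ["flavour", "spin", "mass"])

q1 = Quark(flavour="strange", spin="1/2", mass=95.0)
q2 = Quark(mass=95.0, flavour="strange", spin="1/2")  # keywords in another order

print(q1 == q2)              # True: both follow the declared field order
flavour, spin, mass = q2     # unpacking honours the declared order
print(flavour, spin, mass)   # strange 1/2 95.0
```
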
The main use case for ntuple literals, imo, would be to replace
functions like this:
>>> def spam(...):
...   [...]
...   return eggs, ham

with this, which is more convenient for the caller:
>>> def spam(...):
...   [...]
...   return (eggs=eggs, ham=ham)

Ntuple literals don't introduce a new field-ordering problem, because
the problem already exists with the bare tuple literal they replace. If
several functions need to return compatible ntuples,
collections.namedtuple is still available to predefine the named tuple
type. Or you can use a one-liner helper function like this:
>>> def _gai(family, type, proto, canonname, sockaddr):
...   return (family=family, type=type, proto=proto,
...           canonname=canonname, sockaddr=sockaddr)
>>> type(_gai(family=1, type=2, [...])) is type(_gai(type=2, family=1, [...]))
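
Since the literal syntax is only a proposal, the helper above can't be run
as written. A rough emulation, assuming a hypothetical ntuple() function
that derives the field order from the keyword order and caches one type
per ordered set of field names (both are assumptions of this sketch, not
part of any proposal text), shows the difference between bare calls and a
helper that pins the order. The argument values passed to _gai are
placeholders.

```python
from collections import namedtuple
from functools import lru_cache


@lru_cache(maxsize=None)
def _ntuple_type(fields):
    # One cached type per ordered tuple of field names (assumption of this sketch).
    return namedtuple("ntuple", fields)


def ntuple(**kwargs):
    # Hypothetical stand-in for the proposed literal (x=1, y=2).
    # Keyword order is preserved in kwargs since Python 3.6 (PEP 468).
    return _ntuple_type(tuple(kwargs))(**kwargs)


def _gai(family, type, proto, canonname, sockaddr):
    # The keywords are always written in this one order, so the field order is fixed.
    return ntuple(family=family, type=type, proto=proto,
                  canonname=canonname, sockaddr=sockaddr)


a = ntuple(x=1, y=2)
b = ntuple(y=2, x=1)
print(type(a) is type(b))    # False: different keyword order, different type
print(tuple(a), tuple(b))    # (1, 2) (2, 1)

p = _gai(family=1, type=2, proto=3, canonname="", sockaddr=None)
q = _gai(type=2, family=1, sockaddr=None, canonname="", proto=3)
print(type(p) is type(q), p == q)   # True True: the helper pins the order
```
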

Alex Brault