On Mon, Dec 28, 2020 at 12:33 PM Guido van Rossum <guido@python.org> wrote:
On Mon, Dec 28, 2020 at 12:15 PM Christopher Barker <pythonchb@gmail.com> wrote:
Though frankly, I would rather have had it use .items() -- that seems more efficient to me, since you do need both the keys and the values, and items() is just as much part of the Mapping API as keys().

There may be a (small) performance issue with that -- items() requires creating a tuple object for each key/value pair.

It does look like items() is a tad faster (dict with 1000 items), but not enough to matter:

In [61]: %timeit {k: d[k] for k in d.keys()}
112 µs ± 1.02 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [62]: %timeit {k: v for k, v in d.items()}
92.6 µs ± 1.9 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
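
For anyone who wants to repeat the comparison outside IPython, here is a minimal sketch using the standard timeit module. The dict contents (1000 string keys mapped to ints) and the helper names via_keys and via_items are my assumptions, not what was actually timed above, and the numbers will vary by machine and Python version.

```python
import timeit

# Assumed test data: 1000 items, as in the timings above; the exact keys
# and values used there are not known.
d = {f"key{i}": i for i in range(1000)}


def via_keys():
    # Iterate over the keys and look each value up again.
    return {k: d[k] for k in d.keys()}


def via_items():
    # Iterate over (key, value) pairs directly.
    return {k: v for k, v in d.items()}


# Both approaches must build the same dict.
assert via_keys() == via_items() == d

for func in (via_keys, via_items):
    # Best of 5 repeats of 1000 calls each, reported per call in microseconds.
    best = min(timeit.repeat(func, number=1000, repeat=5))
    print(f"{func.__name__}: {best / 1000 * 1e6:.1f} µs per loop")
```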

Anyway, of course it's too late to change. And there are probably other "protocols" that check for the presence of keys and __getitem__(). Also, in a sense keys() is more fundamental -- deriving keys() from items() would be backwards (throwing away the values -- imagine a data type that stores the values on disk).
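
To make the keys()/__getitem__() protocol concrete, here is a small sketch of an object that is not a Mapping subclass and has no items(), yet works with dict() and with both forms of ** unpacking. The names KeysOnly and show are made up for illustration.

```python
class KeysOnly:
    """Hypothetical minimal class: only keys() and __getitem__()."""

    def __init__(self, **data):
        self._data = data

    def keys(self):
        return self._data.keys()

    def __getitem__(self, key):
        return self._data[key]


def show(**kwargs):
    # Return the keyword arguments exactly as they were received.
    return kwargs


obj = KeysOnly(a=1, b=2)

from_dict = dict(obj)      # dict() looks for a keys() method
from_unpack = {**obj}      # unpacking inside a dict display
from_call = show(**obj)    # unpacking into keyword arguments
```

All three results compare equal to {'a': 1, 'b': 2}, even though items() is never defined.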

Does there need to be a single defined "protocol" for a mapping (other than the ABC)? That is, could ** unpacking use .items() while keys() is used in other contexts?

And why does ** unpacking need to check at all (LBYL)? Couldn't it simply do something like:

{k: d[k] for k in d}

Sure, there could occasionally be a Sequence for which that would happen to work (a range object, for instance), but it would be unlikely to produce the expected result anyway -- just like many other uses of duck typing. Or it might, and the result could still be correct.
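
A quick sketch of what the unchecked version would do with sequences (the helper name naive_unpack is mine, purely for illustration):

```python
def naive_unpack(d):
    # The unchecked version: iterate, then index with whatever iteration yields.
    return {k: d[k] for k in d}


# A real mapping behaves as expected.
assert naive_unpack({"x": 1, "y": 2}) == {"x": 1, "y": 2}

# range(3) happens to "work": it yields 0, 1, 2, which are also valid indexes.
assert naive_unpack(range(3)) == {0: 0, 1: 1, 2: 2}

# A list of small ints also "works", with a result nobody would expect.
assert naive_unpack([2, 0, 1]) == {2: 1, 0: 2, 1: 0}

# Most sequences simply fail, because their elements are not valid indexes.
try:
    naive_unpack(["a", "b"])
except TypeError as exc:
    error = exc
```

So the accidental successes are limited to sequences whose elements are themselves valid indexes into the same sequence.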

But as you say -- too late to change now anyway.

To the OP: you suggested that you had, I think, four ways to make a dataclass "unpackable", but none were satisfactory. How about this decorator:

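from dataclasses import dataclass

def make_mapping(cls):
    """Class decorator: add the keys() and __getitem__() that ** unpacking needs."""
    def __getitem__(self, key):
        # Only dataclass fields count as keys.
        if key in self.__dataclass_fields__:
            return self.__dict__[key]
        raise KeyError(key)

    def keys(self):
        return self.__dataclass_fields__.keys()

    cls.__getitem__ = __getitem__
    cls.keys = keys

    return cls

@make_mapping
@dataclass
class Point:
    x: int
    y: int

p = Point(1, 2)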

Side question: when should one use __dict__ vs vars() vs getattr()? All three work in this case, but I'm never quite sure which is preferred, and why.
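
One place where the three differ, as a sketch rather than a full answer: instances of a class that defines __slots__ have no __dict__, so both obj.__dict__ and vars(obj) fail, while getattr() goes through normal attribute lookup and still works. The class names Plain and Slotted are made up for illustration.

```python
class Plain:
    def __init__(self):
        self.x = 1


class Slotted:
    __slots__ = ("x",)

    def __init__(self):
        self.x = 1


plain, slotted = Plain(), Slotted()

# For an ordinary instance, vars(obj) returns obj.__dict__ itself.
assert vars(plain) is plain.__dict__
assert plain.__dict__["x"] == getattr(plain, "x") == 1

# With __slots__ there is no per-instance __dict__ at all.
assert not hasattr(slotted, "__dict__")
try:
    vars(slotted)
except TypeError:
    vars_failed = True
else:
    vars_failed = False

# getattr() does not care where the attribute is stored.
assert getattr(slotted, "x") == 1
```

On that basis getattr(self, key) is probably the most general spelling for a helper like the decorator above, since it keeps working if the class ever uses slots or properties.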

But there is an argument that the ** operator should be supportable with dunder methods alone -- which would be possible if it used the iterator protocol to get the keys, rather than the keys() method; that does not appear to work now. Though to be fair, all you need to do to get that is add a __len__ and derive from Mapping.
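
A minimal sketch of that dunder-only route, assuming a dataclass that derives from collections.abc.Mapping. The Point class here is just an illustration. Mapping supplies keys(), items(), values() and friends once __getitem__, __iter__ and __len__ are defined:

```python
from collections.abc import Mapping
from dataclasses import dataclass


@dataclass
class Point(Mapping):
    x: int
    y: int

    def __getitem__(self, key):
        # Only dataclass fields count as keys.
        if key in self.__dataclass_fields__:
            return getattr(self, key)
        raise KeyError(key)

    def __iter__(self):
        # Iterating a mapping yields its keys, here the field names.
        return iter(self.__dataclass_fields__)

    def __len__(self):
        return len(self.__dataclass_fields__)


p = Point(1, 2)

unpacked = {**p}          # works through the keys() inherited from Mapping
as_items = list(p.items())
```

No keys() method is written by hand here; the ** unpacking still finds one because the Mapping ABC provides it as a mixin method.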

If we had to do it all over from scratch we would probably design mappings and sequences to be differentiable using dunders only. But it's about 31 years too late for that. And looking at the mess JavaScript made of this (sequences are mappings with string keys "0", "1" and so on), I'm pretty happy with how Python did this.

--Guido van Rossum (python.org/~guido)

Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython