On Mon, Dec 28, 2020 at 11:36 PM Christopher Barker <pythonchb@gmail.com> wrote:
On Mon, Dec 28, 2020 at 12:33 PM Guido van Rossum <guido@python.org> wrote:
On Mon, Dec 28, 2020 at 12:15 PM Christopher Barker <pythonchb@gmail.com> wrote:
Though frankly, I would rather have had it use .items() -- seems more efficient to me, and you do need both the keys and the values, and items() is just as much part of the Mapping API as keys().

There may be a (small) performance issue with that -- items() requires creating a tuple object for each key/value pair.

It does look like items() is a tad faster (dict with 1000 items), but not enough to matter:

In [61]: %timeit {k: d[k] for k in d.keys()}
112 µs ± 1.02 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [62]: %timeit {k: v for k, v in d.items()}
92.6 µs ± 1.9 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
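
For anyone who wants to reproduce the comparison outside IPython, here is a minimal sketch using the standard-library timeit module. The 1000-item dict and the helper names via_keys and via_items are assumptions for illustration, and the absolute numbers will differ by machine and Python version:

```python
import timeit

# Assumed test data: a dict with 1000 items, as in the timings above.
d = {f"key{i}": i for i in range(1000)}


def via_keys(mapping):
    # Look each value up by key: one extra hash lookup per item.
    return {k: mapping[k] for k in mapping.keys()}


def via_items(mapping):
    # Iterate over key/value pairs directly.
    return {k: v for k, v in mapping.items()}


if __name__ == "__main__":
    loops = 1_000
    for func in (via_keys, via_items):
        seconds = timeit.timeit(lambda: func(d), number=loops)
        print(f"{func.__name__}: {seconds / loops * 1e6:.1f} us per loop")
```

Both helpers build the same dict; only the way the values are fetched differs.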

Interesting -- thanks for taking up the challenge. I still suspect that if we ran the corresponding benchmark at the C level, the first form would win, but it's a matter of hashing twice vs. creating a tuple -- both of which have the wazoo optimized out of them so it could go either way. There are many surprises possible here (e.g. long ago someone found that `s.startswith('x')` is slower than `s[:1] == 'x'` and the reason is the name lookup for `startswith`!).
 
Anyway, of course it's too late to change. And there are probably other "protocols" that check for the presence of keys and __getitem__(). Also, in a sense keys() is more fundamental -- deriving keys() from items() would be backwards (throwing away the values -- imagine a data type that stores the values on disk).
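
To make the keys() plus __getitem__() "protocol" concrete, here is a minimal sketch. The class DiskBacked is hypothetical: it only pretends that fetching a value is expensive, and it deliberately defines neither items() nor __iter__(). In CPython, ** unpacking and dict() both accept it:

```python
class DiskBacked:
    """Hypothetical mapping-like type: cheap keys, expensive values."""

    def __init__(self, data):
        self._data = dict(data)
        self.loads = 0  # counts simulated value fetches

    def keys(self):
        # Listing the keys does not touch the values.
        return list(self._data)

    def __getitem__(self, key):
        # Stand-in for an expensive read, e.g. from disk.
        self.loads += 1
        return self._data[key]


def show(**kwargs):
    return kwargs


store = DiskBacked({"a": 1, "b": 2})
print({**store})      # unpacking in a dict display
print(show(**store))  # unpacking at a call site
print(dict(store))    # dict() also looks for keys()
```

Each of the three calls fetches every value exactly once through __getitem__(); nothing ever asks the object for items().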

Does there need to be a single defined "protocol" for a mapping (other than the ABC)? That is, could ** unpacking use .items() while keys() is used in other contexts?

And why does ** unpacking need to check at all (LBYL)? Couldn't it simply do something like:

{k: d[k] for k in d}

Sure, there could occasionally be a Sequence for which that would happen to work (a range object, for instance), but then it would be unlikely to produce the expected result anyway -- just like many other uses of duck typing. Or not, and it could still be correct.
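
The range case is easy to check. In this sketch the helper name duck_copy is an assumption; it is just the comprehension above wrapped in a function:

```python
def duck_copy(d):
    # No up-front check: iterate and index, whatever d is.
    return {k: d[k] for k in d}


# A real mapping works as expected.
print(duck_copy({"a": 1}))

# range(3) yields 0, 1, 2, which are also valid indexes into itself,
# so the comprehension "happens to work".
print(duck_copy(range(3)))

# range(1, 4) yields 1, 2, 3, but index 3 is out of range.
try:
    duck_copy(range(1, 4))
except IndexError as exc:
    print("IndexError:", exc)
```

So whether a sequence slips through depends entirely on whether its elements happen to be valid indexes into itself.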

I don't understand why LBYL is considered such an anti-pattern. It helps produce much clearer error messages in this case for users who are exploring this feature, and distinguishing *early* between sequences and mappings is important for that. Long ago we decided that the distinctive feature is that mappings have a `keys()` method whereas sequences don't (and users who add a `keys()` method to a sequence are just asking for trouble). So that's what we use.
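
As a rough illustration of the difference in error messages, the sketch below contrasts an up-front keys() check with the unchecked comprehension. The function names checked_unpack and unchecked_unpack are hypothetical, and this is not how CPython implements ** unpacking; it only mirrors the idea of testing for keys() first:

```python
def checked_unpack(obj):
    # LBYL: reject anything without keys() before touching its contents.
    if not hasattr(obj, "keys"):
        raise TypeError(f"{type(obj).__name__!r} object is not a mapping")
    return {k: obj[k] for k in obj.keys()}


def unchecked_unpack(obj):
    # No check: errors, if any, surface from inside the loop.
    return {k: obj[k] for k in obj}


seq = [10, 20]

for func in (checked_unpack, unchecked_unpack):
    try:
        func(seq)
    except Exception as exc:
        print(f"{func.__name__}: {type(exc).__name__}: {exc}")
```

The checked version reports that a list is not a mapping, while the unchecked one fails with an IndexError that says nothing about the real mistake. A list such as [1, 0] is worse still: the unchecked version returns {1: 0, 0: 1} without complaint.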

--
--Guido van Rossum (python.org/~guido)