On 2020-02-18 7:03 p.m., Guido van Rossum wrote:
On Tue, Feb 18, 2020 at 1:45 PM Soni L. <fakedme+py@gmail.com> wrote:
[...]
Iteration is done thusly:

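import collections.abc

def _pairs(obj):
    # Present any supported collection as an iterator of (key, value) pairs.
    if isinstance(obj, collections.abc.Mapping):
        return iter(obj.items())
    elif isinstance(obj, collections.abc.Sequence):
        # Sequences use the index as the key.
        return iter(enumerate(obj, 0))
    elif isinstance(obj, collections.abc.Set):
        # Sets have no keys, so each element is paired with itself.
        return iter(((e, e) for e in obj))
    else:
        # maybe there's more stuff I can implement later
        raise TypeError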

I think having a proper protocol for this behaviour would be beneficial,
as no (monkey)patching would be needed to add support for additional
collections and data types. The comment saying there may be additional
data types I've missed would then become irrelevant, as they'd be
covered by __valid_getitem_requests__.
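
To make the idea concrete, here is a minimal sketch of how such a protocol could replace the isinstance chain. The semantics of __valid_getitem_requests__ are not spelled out in this message, so the sketch assumes it returns an iterable of keys for which obj[key] succeeds. The names pairs and Grid are hypothetical and exist only for illustration.

```python
from collections.abc import Mapping, Sequence


def pairs(obj):
    """Return an iterator of (key, value) pairs for a subscriptable object."""
    # Assumption: the hook returns an iterable of keys valid for obj[key].
    hook = getattr(type(obj), "__valid_getitem_requests__", None)
    if hook is not None:
        return ((key, obj[key]) for key in hook(obj))
    # Fallbacks for built-in collections that do not define the hook.
    if isinstance(obj, Mapping):
        return iter(obj.items())
    if isinstance(obj, Sequence):
        return iter(enumerate(obj))
    raise TypeError(f"cannot iterate pairs of {type(obj).__name__}")


class Grid:
    """Toy container that is neither a Mapping nor a Sequence."""

    def __init__(self):
        self._cells = {(0, 0): "a", (0, 1): "b"}

    def __getitem__(self, key):
        return self._cells[key]

    def __valid_getitem_requests__(self):
        # Hypothetical protocol method: advertise the valid subscripts.
        return iter(self._cells)
```

With this shape, a new collection type opts in by defining the method, and nothing needs to be patched into pairs itself.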

Sometimes you just want to treat everything as a (read-only) dict.
Mostly when dealing with user input (e.g. configs), but not only.

But what will you do with the pairs?

For mappings and sequences, the pairs represent (key, value) or (index, value) pairs, which makes sense, since (as you write in your first post) one can write obj[key] to obtain values from a mapping or sequence (key ~~ index).

But then you go on and propose returning pairs (value, value) for sets, and there's nothing corresponding to obj[key]. So I'm not sure what the intent of your proposal is. Why do you want pairs for everything? Suppose obj is a number x; perhaps you would want to get a single pair (x, x) from the iteration? (I know that's probably bending the metaphor too far. :-)
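
The asymmetry can be checked directly. The following is a small sketch, with a hypothetical helper getitem_roundtrips and made-up sample data, that tests whether obj[key] gives back the value for every pair. It holds for mappings and sequences, while a set does not support subscripting at all.

```python
def getitem_roundtrips(obj, pairs):
    """Return True if obj[key] == value for every (key, value) pair."""
    return all(obj[key] == value for key, value in pairs)


config = {"host": "localhost", "port": 8080}
servers = ["alpha", "beta"]
tags = {"x", "y"}

mapping_ok = getitem_roundtrips(config, config.items())
sequence_ok = getitem_roundtrips(servers, enumerate(servers))

try:
    getitem_roundtrips(tags, ((e, e) for e in tags))
    set_ok = True
except TypeError:
    # Sets define no __getitem__, so tags[e] raises TypeError.
    set_ok = False
```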

Yep, I'll bring you back to the previous post:

"(Friendly warning, it does handle sets in a rather unusual way. I've never used that feature outside unit tests, either. Pay no attention to that however, as it's not relevant to this proposal.)"

I kinda just really wish sets did easy interning with a very low chance of accidentally mapping two different things to the same (or otherwise wrong) objects (e.g. intern_cache["a"]=None), and so I decided to encode that into my _pairs. But that is unrelated to this proposal.


Even if you were to answer this question, consider that perhaps your application is rather unique in this insistence on pairs. Or perhaps you should've stopped at proposing an API that unifies mappings and sequences (which is more sensible since both support __getitem__) -- isn't that what your original post was about?

I should've stopped myself from adding this weird behaviour with sets. I've never even used it yet. Sequences and dicts, I do use, but not sets.


However, even for mappings and sequences the analogy only goes so far, since *iterating* over them behaves very differently: for a mapping, iteration yields keys, while for a sequence, it yields values. A legitimate question that comes up from time to time is actually how to *distinguish* between mappings and sequences -- see the other python-ideas thread that I mentioned previously.
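
A short illustration of the difference, and of one common way to tell the two apart with the abstract base classes in collections.abc. The helper values_of is hypothetical. Treating str and bytes as non-containers is an assumption of this sketch, not something stated in the thread.

```python
from collections.abc import Mapping, Sequence

settings = {"a": 1, "b": 2}
items = [1, 2]

# Iterating a mapping yields its keys; iterating a sequence yields its values.
mapping_iteration = list(settings)
sequence_iteration = list(items)


def values_of(obj):
    """Return the values of a mapping or sequence as a list."""
    if isinstance(obj, Mapping):
        return list(obj.values())
    # Assumption: str and bytes are Sequences but are rejected here.
    if isinstance(obj, Sequence) and not isinstance(obj, (str, bytes)):
        return list(obj)
    raise TypeError(f"not a mapping or sequence: {type(obj).__name__}")
```

Checking Mapping before Sequence is only a matter of ordering here: dict is not registered as a Sequence, and list is not registered as a Mapping.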

Yeah, I can't say I don't have some "I like the way Lua does this" feelings. I'll be honest tho: if I were to implement fully separate behaviour for lists and dicts, the fact that dicts iterate by key and lists iterate by value means I would have to go with one of two options. The first is a slower implementation, such as an operator that basically just converts an iterator of keys into one of values (I'm pretty sure Python doesn't optimize "indexing a dict with the key you're iterating", even if that may or may not be a CPython implementation detail). The second is one with loads of duplicated code: different code for lists and for dicts, and different arrow operators to select between them. I opted for this instead. It seemed like an acceptable tradeoff, altho I'm not sure the Python community likes it.


--
--Guido van Rossum (python.org/~guido)