Add __keys__ or __items__ protocol
It is hard to distinguish a sequence from a mapping in Python 3. Both have __getitem__, __len__ and __iter__ methods, but with different semantics.
In Python code you can use isinstance() and issubclass() checks against collections.abc.Mapping, but this works only for types which were explicitly registered with Mapping (or which inherit from it). Autodetection does not work as it does for Iterable or Hashable. Also, importing collections.abc.Mapping and calling PyObject_IsInstance() or PyObject_IsSubclass() is cumbersome and inefficient in C code, so it is rarely used.
It is more common to check for the existence of the keys() method. The dict constructor and update() method do this, as does other code which emulates the behavior of dict. This is the only non-dunder method used by the core operations. There are problems with this:
* Since it is not a dunder method, it is not reserved, and may conflict with a user attribute.
* Since it is not a dunder method, it is looked up on the instance, not on the type. Instances of the same type can look like a mapping or not like a mapping to the same code. Checking the attribute of the class does not give you information about the attribute of the instance.
* There is no corresponding slot in the type object. So checking for the existence of the keys attribute is slow, non-atomic, can execute arbitrary code and can raise an exception. It can't be used in PyMapping_Check() and it slows down the dict constructor.
* The keys() method is not called in the dict constructor. Just the existence of the keys attribute is checked, its value is not used.
* It is a special case. All other implicitly called methods which determine the behavior of builtin types have dunder names.
I propose to add support for the special method `__keys__` or `__items__` and a corresponding type slot (`tp_as_mapping->mp_keys` or `tp_as_mapping->mp_items`). In the first stage, code which checks for "keys" should also check for the special method and use it if defined. In the second stage, it should emit a warning if "keys" is defined but the special method is not. In the third stage, it should use the result of the special method and ignore "keys". PyMapping_Check() can then be made to check the corresponding slot.
I am not sure about some details:
1. What special method should be added, `__keys__` or `__items__`? The former returns keys, it needs to call `__getitem__` repeatedly to get values. The latter returns key-value pairs, it does not need to call `__getitem__`, but you should always pay the cost of creating a tuple even if you do not need values.
2. What type should they return?
* An iterator.
* An iterable which can be iterated only once. Easy to implement in Python as generator methods.
* An iterable which can be iterated multiple times.
* More complex view object which may support other protocols (for example support `__or__` and `__and__`).
What do you think about this?
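For readers who want to see the current behaviour, here is a minimal sketch. The class names `DuckMapping` and `PairSource` are hypothetical and exist only for illustration. The sketch relies on how current CPython behaves: dict() tests for a keys attribute on the instance, and the later merge step then calls keys() and `__getitem__`. Treat it as an observation of the implementation, not a specification.

```python
from collections.abc import Iterable, Mapping


class DuckMapping:
    """Hypothetical class: mapping-like, but not registered with Mapping."""

    def __init__(self, data):
        self._data = dict(data)

    def keys(self):
        return self._data.keys()

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


duck = DuckMapping({"a": 1, "b": 2})

# Iterable is detected structurally through __iter__ ...
is_iterable = isinstance(duck, Iterable)
# ... but Mapping needs inheritance or explicit registration.
is_mapping = isinstance(duck, Mapping)
# dict() still treats the object as a mapping because it has keys().
as_dict = dict(duck)


class PairSource:
    """Hypothetical class that iterates over key-value pairs."""

    def __init__(self, pairs):
        self._pairs = list(pairs)

    def __iter__(self):
        return iter(self._pairs)

    def __getitem__(self, key):
        return "via __getitem__"


first = PairSource([("x", 1)])
second = PairSource([("x", 1)])
# An instance attribute named "keys" changes how dict() sees this object.
second.keys = lambda: ["x"]

from_first = dict(first)    # consumed as an iterable of pairs
from_second = dict(second)  # consumed as a mapping
```

Two instances of the same class are handled differently here, which is the instance-versus-type problem from the list above.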
On Tue, Feb 18, 2020 at 7:25 PM Serhiy Storchaka <storchaka@gmail.com> wrote:
* The keys() method is not called in the dict constructor. Just the existence of the keys attribute is checked, its value is not used.
Given that that's already been the case, my preferred colour for the bike shed is...
1. What special method should be added, `__keys__` or `__items__`? The former returns keys, it needs to call `__getitem__` repeatedly to get values. The latter returns key-value pairs, it does not need to call `__getitem__`, but you should always pay the cost of creating a tuple even if you do not need values.
... __items__, because it'd be more efficient for the dict constructor; and if anything needs the opposite behaviour, it can check for the presence of __items__ and then simply iterate over the object, to get the keys. (Document that calling __items__ should return tuples in the same order as iteration returns keys, just like a dict does.)
2. What type should they return?
* An iterator.
* An iterable which can be iterated only once. Easy to implement in Python as generator methods.
* An iterable which can be iterated multiple times.
* More complex view object which may support other protocols (for example support `__or__` and `__and__`).
I'd be inclined to mandate as little as possible; to be precise: it must return an iterable, but it's okay if that iterable be single-use, and it's okay either way whether it's a snapshot or a view. So any of the above would be compliant.
+1 for (eventually) removing the special-case of using keys() as a signal.
ChrisA
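A rough sketch of how a consumer might behave in the first stage of the proposal, under the "mandate as little as possible" reading. Note that `__items__` is not an existing Python special method, and `iter_pairs` and `Config` are hypothetical names invented for this illustration; a real implementation would live in C and use a type slot.

```python
def iter_pairs(obj):
    """Hypothetical helper: yield key-value pairs from a dict-like argument."""
    # Assumption: the proposed method is looked up on the type, like other dunders.
    items = getattr(type(obj), "__items__", None)
    if items is not None:
        # Any iterable is accepted, including a single-use generator.
        yield from items(obj)
    elif hasattr(obj, "keys"):
        # Existing fallback: keys() plus repeated __getitem__ calls.
        for key in obj.keys():
            yield key, obj[key]
    else:
        # Otherwise treat the argument as an iterable of pairs.
        yield from obj


class Config:
    """Hypothetical mapping-like class implementing only __items__."""

    def __init__(self, **options):
        self._options = options

    def __items__(self):
        # Generator method: a single-use iterable of (key, value) tuples.
        for name, value in self._options.items():
            yield name, value


from_dunder = dict(iter_pairs(Config(a=1, b=2)))
from_keys = dict(iter_pairs({"k": "v"}))
from_pairs = dict(iter_pairs([(1, 2)]))
```

The helper never needs to iterate the result twice, which is why a generator method is enough for this use.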
Sounds reasonable to me
On Tue, Feb 18, 2020 at 12:32 PM Serhiy Storchaka <storchaka@gmail.com> wrote:
18.02.20 10:34, Chris Angelico wrote:
Given that that's already been the case, my preferred colour for the bike shed is...
Thank you for your feedback. This is my preferred color too, for both parts.
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-leave@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/5YF4BS...
Code of Conduct: http://python.org/psf/codeofconduct/
-- Thanks, Andrew Svetlov
+1
On Tue, Feb 18, 2020 at 5:01 AM Andrew Svetlov <andrew.svetlov@gmail.com> wrote:
Sounds reasonable to me
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
On 2020-02-18 5:34 a.m., Chris Angelico wrote:
On Tue, Feb 18, 2020 at 7:25 PM Serhiy Storchaka <storchaka@gmail.com> wrote:
* The keys() method is not called in the dict constructor. Just the existence of the keys attribute is checked, its value is not used.
Given that that's already been the case, my preferred colour for the bike shed is...
1. What special method should be added, `__keys__` or `__items__`? The former returns keys, it needs to call `__getitem__` repeatedly to get values. The latter returns key-value pairs, it does not need to call `__getitem__`, but you should always pay the cost of creating a tuple even if you do not need values.
... __items__, because it'd be more efficient for the dict constructor; and if anything needs the opposite behaviour, it can check for the presence of __items__ and then simply iterate over the object, to get the keys. (Document that calling __items__ should return tuples in the same order as iteration returns keys, just like a dict does.)
I like __items__. Additionally, lists should implement it to return enumerate(self), and sets should implement it to return ((v, v) for v in self), and as such there should be no requirement with regards to __items__ and __iter__, and whether __iter__ returns keys or values. We can then call it __pairs__ instead of __items__. __items__ feels a lot like "outputs from __getitem__" rather than "pairs representing valid inputs and outputs of __getitem__".
18.02.20 19:00, Soni L. wrote:
I like __items__. Additionally, lists should implement it to return enumerate(self), and sets should implement it to return ((v, v) for v in self), and as such there should be no requirement with regards to __items__ and __iter__, and whether __iter__ returns keys or values.
The point is that it **should not** be implemented by list and set. dict([(1, 2)]) should return {1: 2}, not {0: (1, 2)}. dict({(1, 2)}) should return {1: 2}, not {(1, 2): (1, 2)}. If you want to propose a new special method common for dict, list and set, please open a separate thread for this.
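The difference can be checked directly. This short sketch compares what dict() does today with a list or set of pairs against what it would produce if those types supplied enumerate-style or identity pairs; the variable names are only illustrative.

```python
pairs_list = [(1, 2)]
pairs_set = {(1, 2)}

# Today: a list or set of pairs is consumed as key-value pairs.
from_list = dict(pairs_list)
from_set = dict(pairs_set)

# What the suggested list and set implementations would feed to dict().
from_enumerate = dict(enumerate(pairs_list))
from_identity = dict((v, v) for v in pairs_set)
```

The first two results are the existing behaviour that the objection wants to preserve; the last two are the results it warns against.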
On 02/18/2020 12:24 AM, Serhiy Storchaka wrote:
1. What special method should be added, `__keys__` or `__items__`? The former returns keys, it needs to call `__getitem__` repeatedly to get values. The latter returns key-value pairs, it does not need to call `__getitem__`, but you should always pay the cost of creating a tuple even if you do not need values.
`__items__`
2. What type should they return?
* An iterator.
* An iterable which can be iterated only once. Easy to implement in Python as generator methods.
* An iterable which can be iterated multiple times.
* More complex view object which may support other protocols (for example support `__or__` and `__and__`).
Whatever `dict.items()` returns, both for consistency and because `dict.items()` can become a simple alias for `dict.__items__`.
What do you think about this?
+1
-- ~Ethan~
+1 on the proposed addition. I think that `__items__` would be a better choice for the method name.
On Tue, Feb 18, 2020 at 11:47:16AM -0800, Ethan Furman wrote:
Whatever `dict.items()` returns, both for consistency and because `dict.items()` can become a simple alias for `dict.__items__`.
That means that every object wanting to make use of this protocol needs to create a complex set-like view object, which is neither easy nor obvious, and may not even be appropriate for some classes.
-- Steven
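To make the cost concrete, this sketch shows some of what the object returned by `dict.items()` supports, next to a plain generator of pairs. The helper name `pairs` is hypothetical; the dict behaviour shown is standard.

```python
first = {"a": 1, "b": 2}
second = {"b": 2, "c": 3}

view = first.items()

# Items views support set-like operations (when the values are hashable).
common = view & second.items()
combined = view | second.items()

# The view is live: it reflects later changes to the dict.
first["z"] = 26
sees_new_pair = ("z", 26) in view


def pairs(mapping):
    """Hypothetical generator-based alternative: easy to write, far less capable."""
    for key in mapping:
        yield key, mapping[key]


gen = pairs(second)
view_has_and = hasattr(view, "__and__")
gen_has_and = hasattr(gen, "__and__")
```

A class that had to match the view would need to provide all of this itself, while the generator version is three lines.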
Serhiy Storchaka wrote:
1. What special method should be added, `__keys__` or `__items__`? The former returns keys, it needs to call `__getitem__` repeatedly to get values. The latter returns key-value pairs, it does not need to call `__getitem__`, but you should always pay the cost of creating a tuple even if you do not need values.
Between __keys__ and __items__, I find that __keys__ is more specific to mapping types, whereas __items__ can reasonably be applied to any type of container. Also, I personally find that I more frequently need to access all of the keys and only some or none of the values, rather than all of the key-value pairs. That being said, if the consensus ends up being to use __items__ instead, I would be okay with that.
Serhiy Storchaka wrote:
2. What type should they return?
* An iterator.
+0.
* An iterable which can be iterated only once. Easy to implement in Python as generator methods.
-1.
* An iterable which can be iterated multiple times.
+1.
* More complex view object which may support other protocols (for example support `__or__` and `__and__`).
Strong -1.
I find that requiring an iterable that can be iterated multiple times would be the most reasonable option, as this would be compatible with the existing dict.keys(). A simple iterator seems a bit too generic and similar to __iter__, and requiring an iterable that can be iterated over only once would be incompatible with dict.keys() (which seems rather strange). As for requiring a view object...
Steven D'Aprano wrote:
That means that every object wanting to make use of this protocol needs to create a complex set-like view object, which is neither easy nor obvious, and may not even be appropriate for some classes.
Mainly for the above, I'm -1 on requiring a view to be returned; that seems needlessly restrictive and would significantly limit the dunder method's usefulness for any user-created mappings which don't inherit from dict. If it's intended to be used for any mapping object, it should be as (reasonably) simple to implement as possible, IMO.
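The single-use versus multi-use distinction discussed in this message can be sketched as follows. `__keys__` is the proposed method, not an existing one, and the classes `OneShot` and `Reusable` are hypothetical.

```python
class OneShot:
    """Hypothetical mapping whose __keys__ is a generator method (single use)."""

    def __init__(self, data):
        self._data = dict(data)

    def __keys__(self):
        yield from self._data


class Reusable:
    """Hypothetical mapping whose __keys__ returns a re-iterable keys view."""

    def __init__(self, data):
        self._data = dict(data)

    def __keys__(self):
        return self._data.keys()


one_shot_keys = OneShot({"a": 1, "b": 2}).__keys__()
reusable_keys = Reusable({"a": 1, "b": 2}).__keys__()

# A generator is exhausted after the first pass.
one_shot_first = list(one_shot_keys)
one_shot_second = list(one_shot_keys)

# A keys view can be iterated any number of times, like dict.keys().
reusable_first = list(reusable_keys)
reusable_second = list(reusable_keys)
```

Code that iterates the result only once, such as a dict constructor, works with either; code that iterates twice silently gets nothing on the second pass with the generator.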
participants (9)
- Andrew Svetlov
- Chris Angelico
- Ethan Furman
- Guido van Rossum
- Kyle Stanley
- Serhiy Storchaka
- Soni L.
- Steven D'Aprano
- waszka23@gmail.com