Intended Usage of collections.abc for Custom Collections
Dear Python-Dev,
I am the author of bidict, a bidirectional map implementation for Python. A user recently filed a bug that bidict should be a subclass of dict, so that isinstance(mybidict, dict) would return True. I replied that the user should instead use isinstance(mybidict, collections.abc.Mapping), which does already return True, and is more polymorphic to boot.
But before I put the issue to bed, I want to make sure I'm correctly understanding the intended usage of collections.abc, as well as any relevant interfaces I'm not currently using (collections.UserDict? __subclasshook__?), since the documentation leaves me with some doubt. Could any collections experts on this list please confirm whether bidict is implemented as the language intends it should be?
Some quick references:
https://bidict.readthedocs.org/en/latest/other-bidict-types.html#bidict-type...
https://github.com/jab/bidict/blob/master/bidict/_bidict.py
I would be happy to try to capture what I learn from this thread and write up a guide for collections library authors in the future, or otherwise pay your help forward however I can.
Thanks and best wishes.
-jab
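To make the question concrete, here is a minimal sketch of the situation described above. It is not bidict's actual implementation: `TinyBidict` and its `key_for` method are hypothetical names invented for illustration. The class subclasses collections.abc.Mapping rather than dict, so the Mapping check passes while the dict check does not.

```python
from collections.abc import Mapping


class TinyBidict(Mapping):
    """Hypothetical read-only two-way mapping; not bidict's real code."""

    def __init__(self, items=()):
        self._fwd = dict(items)
        # Inverse direction, assuming the values are unique and hashable.
        self._inv = {value: key for key, value in self._fwd.items()}

    # The three abstract methods that Mapping requires.
    def __getitem__(self, key):
        return self._fwd[key]

    def __iter__(self):
        return iter(self._fwd)

    def __len__(self):
        return len(self._fwd)

    def key_for(self, value):
        """Look up the key that maps to the given value."""
        return self._inv[value]


b = TinyBidict({"H": "hydrogen"})
print(isinstance(b, Mapping))  # True
print(isinstance(b, dict))     # False
print(b.get("He", "unknown"))  # unknown (get() is inherited from Mapping)
print(b.key_for("hydrogen"))   # H
```

Code that tests against dict would reject `b`, even though it supports the whole read-only mapping interface.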
On Wed, 28 Oct 2015 at 08:47 <jab@math.brown.edu> wrote:
> Dear Python-Dev,
> I am the author of bidict, a bidirectional map implementation for Python. A user recently filed a bug that bidict should be a subclass of dict, so that isinstance(mybidict, dict) would return True. I replied that the user should instead use isinstance(mybidict, collections.abc.Mapping), which does already return True, and is more polymorphic to boot.
I would argue that chances are they don't need isinstance() in either case. :)
> But before I put the issue to bed, I want to make sure I'm correctly understanding the intended usage of collections.abc, as well as any relevant interfaces I'm not currently using (collections.UserDict? __subclasshook__?), since the documentation leaves me with some doubt. Could any collections experts on this list please confirm whether bidict is implemented as the language intends it should be?
ABCs are meant to make sure you implement key methods for an interface/protocol. So in the case of collections.abc.Mapping, it's to make sure you implement __getitem__, __iter__, and __len__. In exchange for subclassing the ABC you also gain some methods for free, like get(). So you subclass an ABC because you want your object to be acceptable in any code that expects an object that implements that interface/protocol, and you want the help ABCs provide in making sure you don't accidentally miss some key method.
Subclassing a concrete implementation of the Mapping ABC -- which is what dict is -- should be done if it is beneficial to you, but not simply to satisfy an isinstance() check. I think the ABC registration is the right thing to do, and the user requesting the dict subclass should actually be doing what you suggested: testing for the interface/protocol and not the concrete implementation.
And if you want another way to hit this point home, with type hints people should only be expecting abstract types like typing.Mapping as input: https://docs.python.org/3/library/typing.html#typing.Mapping . Restricting yourself to only a dict locks out other completely viable types that implement the mapping interface/protocol. Much like working with data, you should be as flexible as possible on your inputs (e.g., specifying typing.Mapping as the parameter type), but as strict as possible on the return type (e.g., specifying dict/typing.Dict as the return type).
Honestly, I would want to know why the user cares about an isinstance() check to begin with. Thanks to duck typing, they could instead use the object the way they need to inside a try/except, and error out if they get passed an object that doesn't quack like a dict.
-Brett
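The points in the reply above can be sketched in a few lines. This is an illustrative sketch only: `Incomplete`, `Registered`, and `invert` are hypothetical names, not anything from bidict or the standard library. It shows the ABC refusing to instantiate a subclass that is missing abstract methods, a class registered as a virtual subclass with Mapping.register(), and a function that takes typing.Mapping as input while returning a concrete dict.

```python
import collections.abc
from types import MappingProxyType
from typing import Dict, Mapping, TypeVar

K = TypeVar("K")
V = TypeVar("V")


class Incomplete(collections.abc.Mapping):
    """Hypothetical class that forgets __iter__ and __len__."""

    def __getitem__(self, key):
        raise KeyError(key)


try:
    Incomplete()
    rejected = False
except TypeError:
    # The ABC refuses to instantiate a class with missing abstract methods.
    rejected = True


class Registered:
    """Hypothetical class that implements the protocol without inheriting."""

    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


# Registration satisfies isinstance(), but unlike subclassing it does not
# provide the mixin methods (get(), items(), ...) and performs no checks.
collections.abc.Mapping.register(Registered)


def invert(mapping: Mapping[K, V]) -> Dict[V, K]:
    """Accept any mapping as input, return a concrete dict."""
    return {value: key for key, value in mapping.items()}


print(rejected)                                                   # True
print(isinstance(Registered({"a": 1}), collections.abc.Mapping))  # True
print(invert(MappingProxyType({"a": 1})))                         # {1: 'a'}
```

Here `invert` works on a MappingProxyType, which is a mapping but not a dict. A parameter annotated as dict would rule that input out for no benefit.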
On Wed, Oct 28, 2015 at 1:16 PM, Brett Cannon <brett@python.org> wrote:
> [...]
Thanks very much for the thorough and thoughtful reply. I'll take this as authoritative approval of the current design, barring any further recommendations to the contrary.
As for the isinstance check, it turned out that this wasn't actually in the user's code; the offending code is actually in the pandas library, which he was using. I just submitted a PR there in case anyone is interested: https://github.com/pydata/pandas/pull/11461
Thanks again for making my first experience on this list so positive.
-jab
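For library code that receives mappings from callers, the alternatives discussed in this thread look roughly as follows. This is a hedged sketch, not the pandas code or the linked PR: the three `pairs_*` functions are hypothetical and exist only to contrast a concrete-type check, an ABC check, and plain duck typing.

```python
from collections.abc import Mapping
from types import MappingProxyType


def pairs_strict(obj):
    """Checks the concrete type, so non-dict mappings are rejected."""
    if isinstance(obj, dict):
        return sorted(obj.items())
    raise TypeError("expected a dict")


def pairs_abc(obj):
    """Checks the interface, so any Mapping is accepted."""
    if isinstance(obj, Mapping):
        return sorted(obj.items())
    raise TypeError("expected a mapping")


def pairs_duck(obj):
    """No isinstance() at all: use the object and report failure."""
    try:
        return sorted(obj.items())
    except AttributeError:
        raise TypeError(
            f"{type(obj).__name__} does not behave like a mapping"
        ) from None


proxy = MappingProxyType({"b": 2, "a": 1})  # a mapping that is not a dict

print(pairs_abc(proxy))   # [('a', 1), ('b', 2)]
print(pairs_duck(proxy))  # [('a', 1), ('b', 2)]
try:
    pairs_strict(proxy)
    strict_accepted = True
except TypeError:
    strict_accepted = False
print(strict_accepted)    # False
```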