I keep finding myself needing to test for objects that support subscripting. This is one case where EAFP is *not* actually easier:

    try:
        obj[0]
    except TypeError:
        subscriptable = False
    except (IndexError, KeyError):
        subscriptable = True
    else:
        subscriptable = True
    if subscriptable:
        ...

But I don't like manually testing for it like this:

    if getattr(obj, '__getitem__', None) is not None:
        ...

because it is wrong. (It's wrong because an object with __getitem__ defined as an instance attribute isn't subscriptable; it has to be in the class, or a superclass.)

But doing it correctly is too painful:

    if any(getattr(T, '__getitem__', None) is not None for T in type(obj).mro())

and besides I've probably still got it wrong in some subtle way. What I'd really like to do is use the collections.abc module to do the check:

    if isinstance(obj, collections.abc.Subscriptable):
        ...

in the same way we can check for Sized, Hashable etc.

Alternatively, if we had a getclassattr that skipped the instance attributes, I could say:

    if getclassattr(obj, '__getitem__', None) is not None:
        ...

(1) Am I doing it wrong? Perhaps I've missed some already existing solution to this.

(2) If not, is there any reason why we shouldn't add Subscriptable to the collections.abc module? I think I have the implementation:

    class Subscriptable(metaclass=ABCMeta):
        __slots__ = ()

        @abstractmethod
        def __getitem__(self, idx):
            return None

        @classmethod
        def __subclasshook__(cls, C):
            if cls is Subscriptable:
                return _check_methods(C, "__getitem__")
            return NotImplemented

Comments, questions, flames?

-- Steven
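A self-contained sketch of the proposal above, for anyone who wants to experiment with it: the inline MRO loop stands in for the private _check_methods helper discussed later in the thread, and the Squares class is only an illustrative example, not part of the proposal.

    from abc import ABCMeta, abstractmethod

    class Subscriptable(metaclass=ABCMeta):
        __slots__ = ()

        @abstractmethod
        def __getitem__(self, idx):
            return None

        @classmethod
        def __subclasshook__(cls, C):
            if cls is Subscriptable:
                # Look for __getitem__ on the class or a base class, honouring
                # the "__getitem__ = None" opt-out convention used by collections.abc.
                for B in C.__mro__:
                    if "__getitem__" in B.__dict__:
                        if B.__dict__["__getitem__"] is not None:
                            return True
                        break
            return NotImplemented

    class Squares:
        def __getitem__(self, idx):
            return idx ** 2

    print(isinstance(Squares(), Subscriptable))  # True: detected structurally
    print(isinstance(42, Subscriptable))         # False: int defines no __getitem__
    print(isinstance("abc", Subscriptable))      # True: str defines __getitem__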
Just FYI there is a closely related issue on b.p.o. https://bugs.python.org/issue25988. FWIW I am in favor of the idea, but some people are against it (see e.g. the issue).

-- Ivan
On Fri, Sep 27, 2019 at 9:14 AM Steven D'Aprano <steve@pearwood.info> wrote:
But doing it correctly is too painful:
if any(getattr(T, '__getitem__', None) is not None for T in type(obj).mro())
For what it's worth, walking the MRO isn't necessary, and the None trick is only necessary if you want to support people that reuse the __getitem__ name for something else. (And it's often reasonable to say "nope, I don't support that.")
    >>> class A(object):
    ...     def __getitem__(self, i): return i
    ...
    >>> class B(A): pass
    ...
    >>> hasattr(B, '__getitem__')
    True
So for a given object, one could check hasattr(type(obj), '__getitem__'). Or do the None trick with it, but without checking the MRO. (Or replace None with a sentinel object()...) -- Devin
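A quick illustration of Devin's point and of the parenthetical in the original post: an instance attribute named __getitem__ does not make the object subscriptable, because special methods are looked up on the type. This is a minimal sketch, not code from the thread:

    class Plain:
        pass

    obj = Plain()
    obj.__getitem__ = lambda i: i   # instance attribute only, invisible to the subscript machinery

    try:
        obj[0]
    except TypeError as exc:
        print(exc)                             # 'Plain' object is not subscriptable

    print(hasattr(obj, '__getitem__'))         # True  -- checking the instance is misleading
    print(hasattr(type(obj), '__getitem__'))   # False -- checking the type gives the right answer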
On Sep 27, 2019, at 09:05, Steven D'Aprano <steve@pearwood.info> wrote:
What I'd really like to do is use the collections.abc module to do the check:
if isinstance(obj, collections.abc.Subscriptable): ...
What about isinstance(obj, (Sequence, Mapping))?

That isn't quite the same thing, since you can have types that are subscriptable, but whose subscripts don't mean either index or key. _GenericAlias is probably not the best example here, but it is an example. But it is close to the same thing. I can't think of too many cases where you want to work with sequences and mappings and generic static types the same way.

Plus, it seems like it's actually a more direct LBYL translation of what you're EAFPing with your except (IndexError, KeyError) test.

I think the only real issue is that it's not too uncommon to create a type that acts like a sequence but doesn't register with Sequence. Maybe that's enough to make it unusable for some important uses? Your Indexable class could be implicit (like Sized, Iterable, etc.), which would solve that problem if it's a problem (at the cost of possibly unintentionally including things like _GenericAlias, but that may not be much cost, and may not even be unintentional?).

Or I may be missing something that makes (Sequence, Mapping) inappropriate, of course.
On Fri, Sep 27, 2019 at 10:11:15AM -0700, Andrew Barnert wrote:
On Sep 27, 2019, at 09:05, Steven D'Aprano <steve@pearwood.info> wrote:
What I'd really like to do is use the collections.abc module to do the check:
if isinstance(obj, collections.abc.Subscriptable): ...
What about isinstance(obj, (Sequence, Mapping))?
No, they require more than just being subscriptable.

    py> class Test(object):
    ...     def __getitem__(self, idx):
    ...         return idx**2
    ...
    py> squares = Test()
    py> squares[5]
    25
    py> from collections.abc import Sequence, Mapping
    py> isinstance(squares, (Sequence, Mapping))
    False

One of the tests I want is for something which is subscriptable but not sized, like squares above. To be a (virtual) subclass of Sequence, the object has to provide __getitem__ and __len__, and to be a Mapping it also has to provide __iter__. Neither __len__ nor __iter__ are necessary for my purposes, and in fact __len__ may be undesirable.
That isn't quite the same thing, since you can have types that are subscriptable, but whose subscripts don't mean either index or key.
It's not even close to the same thing :-(
_GenericAlias is probably not the best example here, but it is an example.
Where is that from?
Plus, it seems like it’s actually a more direct LBYL translation of what you’re EAFPing with your except (IndexError, KeyError) test.
Sorry, that doesn't fly. There's nothing in my except test which requires the existence of __len__. All I need is an object that supports subscripting, but the current state of the ABCs requires that I either roll my own test which will probably be wrong, or test for methods that I don't need or want. -- Steven
On 2019-09-27 17:05, Steven D'Aprano wrote:
I keep finding myself needing to test for objects that support subscripting. This is one case where EAFP is *not* actually easier:
    try:
        obj[0]
    except TypeError:
        subscriptable = False
    except (IndexError, KeyError):
        subscriptable = True
    else:
        subscriptable = True
    if subscriptable:
        ...
But I don't like manually testing for it like this:
if getattr(obj, '__getitem__', None) is not None: ...
because it is wrong. (It's wrong because an object with __getitem__ defined as an instance attribute isn't subscriptable; it has to be in the class, or a superclass.)
But doing it correctly is too painful:
if any(getattr(T, '__getitem__', None) is not None for T in type(obj).mro())
[snip]

Is there a reason why you're using 'getattr' instead of 'hasattr'?
On Sep 27, 2019, at 10:15, MRAB <python@mrabarnett.plus.com> wrote:
On 2019-09-27 17:05, Steven D'Aprano wrote:
But doing it correctly is too painful:

    if any(getattr(T, '__getitem__', None) is not None for T in type(obj).mro())

[snip]
Is there a reason why you're using 'getattr' instead of 'hasattr'?
There is a difference here: if some class defines __getitem__ = None, that blocks it from looking like a subscriptable type. That idiom is used for other things like __hash__, even if it's not used 100% consistently for all protocols.
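To make the idiom concrete, here is a small sketch (not from the thread) using __hash__, where the None convention is well established:

    from collections.abc import Hashable

    class Unhashable:
        __hash__ = None   # the conventional way to opt out of the hashing protocol

    obj = Unhashable()

    print(hasattr(obj, '__hash__'))                          # True  -- hasattr is fooled by the None
    print(getattr(type(obj), '__hash__', None) is not None)  # False -- the explicit None check catches it
    print(isinstance(obj, Hashable))                         # False -- the ABC honours the convention
    # hash(obj) would raise TypeError: unhashable type: 'Unhashable'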
On Fri, Sep 27, 2019 at 12:03:52PM -0700, Andrew Barnert via Python-ideas wrote:
On Sep 27, 2019, at 10:15, MRAB <python@mrabarnett.plus.com> wrote:
On 2019-09-27 17:05, Steven D'Aprano wrote:
But doing it correctly is too painful:

    if any(getattr(T, '__getitem__', None) is not None for T in type(obj).mro())

[snip]
Is there a reason why you're using 'getattr' instead of 'hasattr'?
There is a difference here: if some class defines __getitem__ = None, that blocks it from looking like a subscriptable type.
I wish I had been clever enough to have thought of that, but I didn't. I used getattr because I completely forgot about the existence of hasattr. And I walked the MRO because oops.

On the other hand, the collections.abc module walks the MRO explicitly checking for None, so perhaps I might have accidentally been less wrong than had I just tried hasattr(type(obj)). See the _check_methods function here:

https://github.com/python/cpython/blob/3.7/Lib/_collections_abc.py

This supports my point that this ought to be handled once, correctly, in the standard library, like the other ABCs, instead of leaving it up to people like me to get it wrong.
That idiom is used for other things like __hash__, even if it’s not used 100% consistently for all protocols.
If you look at the 3.7 collections.abc, _check_methods is now used consistently by all of the __subclasshook__ methods. -- Steven
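For reference, the helper being discussed looks roughly like this in CPython 3.7's Lib/_collections_abc.py (paraphrased from the linked source):

    def _check_methods(C, *methods):
        # Walk the class's MRO looking for each method; a method explicitly set
        # to None anywhere along the way means "protocol deliberately not supported".
        mro = C.__mro__
        for method in methods:
            for B in mro:
                if method in B.__dict__:
                    if B.__dict__[method] is None:
                        return NotImplemented
                    break
            else:
                return NotImplemented
        return True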
This supports my point that this ought to be handled once, correctly, in the standard library, like the other ABCs, instead of leaving it up to people like me to get it wrong.
It would be pretty reasonable to assume that most users would get this wrong. My first attempt at this problem would've been something like ``hasattr(T, "__getitem__")``, but based on the current comments, it seems like that would be inadequate for this scenario. I'm not aware of any solution that would be remotely intuitive to users.
(2) If not, is there any reason why we shouldn't add Subscriptable to the collections.abc module? I think I have the implementation:
Another important question might be "would users utilize the addition of Subscriptable in collections.abc enough for it to justify its addition to the standard library?". I think it looks interesting, but I think this should be implemented as a PyPI package (if something similar hasn't been already) so that we can evaluate its usage rate before implementing it. Raymond also brought up a strong point to consider in https://bugs.python.org/issue25988:
The OP has a sense that Mapping and Sequence are "too heavy" but I think the reality is that useful classes almost never use __getitem__ in isolation; rather, it is part of a small constellation of methods that are typically used together.
On Sun, 29 Sep 2019 at 08:28, Kyle Stanley <aeros167@gmail.com> wrote:
Raymond also brought up a strong point to consider in https://bugs.python.org/issue25988:
The OP has a sense that Mapping and Sequence are "too heavy" but I think the reality is that useful classes almost never use __getitem__ in isolation; rather, it is part of a small constellation of methods that are typically used together.
That's the point that I would make as well. What can you do with an object that is only known to be Subscriptable?

With a Mapping I know that `keys` or `__iter__` give me the subscripts that can be used with it. With a Sequence `__len__` tells me the valid subscripts. Either of those makes it possible for me to build a sensible algorithm using subscripts. Without either of those what can I do in practice in a situation like this:

    def do_subscriptable_things(obj):
        if isinstance(obj, Subscriptable):
            # Now what?

-- Oscar
That's the point that I would make as well. What can you do with an
object that is only known to be Subscriptable?
    def do_subscriptable_things(obj):
        if isinstance(obj, Subscriptable):
            # Now what?
Maybe if you want to use/abuse it as an alternative function calling syntax-- square brackets rather than parentheses. The language itself already contains one example of this: generic typing syntax.
On Sun, Sep 29, 2019 at 5:04 AM Ricky Teachey <ricky@teachey.org> wrote:
That's the point that I would make as well. What can you do with an
object that is only known to be Subscriptable?
    def do_subscriptable_things(obj):
        if isinstance(obj, Subscriptable):
            # Now what?
I'm going to echo this one -- the OP has shown that it's awkward to know whether an arbitrary object is "subscriptable" -- but not why you'd want to do that check at all.

In fact, you do not use "EAFP" to do a type check -- the whole point is that you use it like you expect to use it, and if you get an error, you know it can't be used that way. So the OP's code:

    try:
        obj[0]
    except TypeError:
        subscriptable = False
    except (IndexError, KeyError):
        subscriptable = True
    else:
        subscriptable = True
    if subscriptable:
        ...

is exactly NOT EAFP -- it's LBYL, but using exception catching to do the type check. And after this code is run, now what? You don't know anything about HOW you can subscript that object -- can you use an index, can you use a key, can you do something really weird and arbitrary?

Maybe if you want to use/abuse it as an alternative function calling syntax-- square brackets rather than parentheses.
Like the example given a bit earlier:

    class Test(object):
        def __getitem__(self, idx):
            return idx**2

    py> squares = Test()
    py> squares[5]
    25

So if you think you *may* have a "callable_with_an_integer" -- the thing to do is:

    try:
        result = squares[5]
    except TypeError:
        print("Oops, couldn't do that")

Though what you'd probably really want to do is have your oddball class raise appropriate exceptions if the wrong kind of "index" was passed in. If you do want to do this, maybe a "callable_with_brackets" ABC ??? (but please don't -- there is already a way to make objects callable...)

Now that I think about it a bit more, that example *could* make sense, but only if what you are trying to do is make an indexable lazy-evaluated infinite sequence of all the squares -- in which case, you'd want to put a bit more in there. And now that I think about it, there doesn't appear to be an ABC for something that can be indexed and iterated, but does not have a length.

However, we need to think more about what ABCs are *for* anyway -- given Python's "magic method" system and Duck Typing, you can essentially create types that have any arbitrary combination of functionality -- do we need an ABC for every one? Obviously not.

When I first learned about ABCs, I thought some about what the point was -- they are NOT the same as, say, ABCs in C++ -- Python doesn't need those. I came to the conclusion that the point of ABCs was to allow users to write code that expects standard built-ins that will also work with custom types that match that interface. But it's only useful for standard Base Classes -- making a custom ABC that matches the spec of a particular use case isn't very helpful -- unless various third parties are going to be using that spec, there's no point in an ABC.

Modifying my point -- ABCs are useful for built-ins or large projects with a wide user base that may have third parties writing extensions, plug-ins, etc. However, if a spec gets complicated, the ABC system breaks down anyway -- go look at what numpy is trying to do with "duck arrays" -- ABCs are not very helpful there.

In short: in Python, __getitem__ can be [ab]used for virtually ANYTHING -- so if you have an object that isn't a mapping (KeyError), or a Sequence (IndexError), then you would need to know something more about what it is anyway in order to use it -- I can't see how simply knowing that you can toss some unknown type(s) of objects into the square brackets is useful.

So back to the OP: What do you want to be able to do here? What types do you want to be able to support, and how do you want to use them?

-CHB

-- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
On Sun, Sep 29, 2019 at 09:05:19AM -0700, Christopher Barker wrote:
And after this code is run, now what? you don't know anything about HOW you can subscript that object -- can you use an index, can you use a key, can you do something really wierd and arbitrary:
That applies to many interface checks. Both the built-in callable() and the Callable ABC don't tell you what arguments the object accepts, or what it does; the Sequence ABC check doesn't tell you what valid indexes are accepted. People may assume that valid indexes go from 0 through len-1 but we can make sequences that work with 1-based indexes, or like Fortran, with any bounds at all. Including virtual "sequences" that accept indexes from -infinity to infinity, with no length.

"Consenting adults" applies. The ABC checks only test for the presence of an interface, they don't necessarily make guarantees about what the interface will do.
So if you think you *may* have a "callable_with_an_integer" -- the thing to do is:
    try:
        result = squares[5]
    except TypeError:
        print("Oops, couldn't do that")
I know that's meant in good faith, but please be careful of telling me what I should do on the basis of a wild guess of what my needs are. I've been programming in Python since 1.5 was the latest and greatest, and I think I'm capable of judging when I want to use a try...except EAFP check rather than a LBYL check for an interface.

And even if I personally am not, this is not about *me*: the community as a whole is capable of making that judgement, which is why Python supports ABCs and isinstance. If you want to argue against that, that ship has not only sailed but it's already docked at the destination and the passengers disembarked :-)
Now that I think about it a bit more, that example *could* make sense, but only if what you are trying to do is make an indexable lazy-evaluated infinite sequence of all the squares --
Well spotted :-)
in which case, you'd want to put a bit more in there.
I'm listening.
And now that I think about it, there doesn't appear to be an ABC for something that can be indexed and iterated, but does not have a length.
Exactly!

Don't forget that you don't need __iter__ or length to be iterable:

    py> class Example:
    ...     def __getitem__(self, idx):
    ...         if idx in range(0, 4): return "spam"*idx
    ...         raise IndexError
    ...
    py> list(Example())
    ['', 'spam', 'spamspam', 'spamspamspam']
However, we need to think more about what ABCs are *for* anyway -- given Python's "magic method" system and Duck Typing, you can essentially create types that have any arbitrary combination of functionality -- do we need an ABC for every one? Obviously not.
I don't think it is so obvious.

Clearly we can't have an ABC for every imaginable combination of magic methods. There would be hundreds. What would we name them?

We should (and do!) have ABCs for the most common combinations, like Sequence, MutableMapping, Container etc.

And we should (so I am arguing) have ABCs for the underlying building blocks, the individual magic methods themselves. Developers can then combine those building blocks to make their own application-specific combinations.

But without those composable building blocks, we're reduced to testing for dunders by hand, which is problematic because:

- people get it wrong, using ``hasattr(instance, dunder)``

- ``getattr(type(instance), dunder, None) is not None`` is apparently still wrong, since that's not what collections.abc does, and presumably it has a good reason for not doing so.

- being part of the implementation, rather than public interface, in principle dunders may change in the future.

Obvious well-known dunders like __len__ will probably never go away. But the obvious and well-known dunder __cmp__ went away, and maybe one day the Sized protocol will change too. If you want to future-proof your code, you should test for the Sized interface rather than directly testing for the __len__ magic method.

-- Steven
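Composing the building blocks that already exist works exactly this way; here is a sketch (not from the thread) of an application-specific combination built only from existing one-method ABCs, with __getitem__ being the missing block:

    from collections.abc import Container, Hashable, Sized

    def is_small_hashable_bag(obj):
        # Application-specific combination: must support `in`, `len()` and `hash()`;
        # nothing else is assumed about the object.
        return (isinstance(obj, Container)
                and isinstance(obj, Sized)
                and isinstance(obj, Hashable))

    print(is_small_hashable_bag(frozenset({1, 2})))  # True
    print(is_small_hashable_bag([1, 2]))             # False: lists are not hashable
    print(is_small_hashable_bag(42))                 # False: ints are not containers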
On 2019-09-29 21:15, Steven D'Aprano wrote:
Clearly we can't have an ABC for every imaginable combination of magic methods. There would be hundreds. What would we name them?
We should (and do!) have ABCs for the most common combinations, like Sequence, MutableMapping, Container etc.
And we should (so I am arguing) have ABCs for the underlying building blocks, the individual magic methods themselves. Developers can then combine those building blocks to make their own application-specific combinations.
But without those composable building blocks, we're reduced to testing for dunders by hand, which is problematic because:
- people get it wrong, using ``hasattr(instance, dunder)``
- ``getattr(type(instance), dunder, None) is not None`` is apparently still wrong, since that's not what collections.abc does, and presumably it has a good reason for not doing so.
- being part of the implementation, rather than public interface, in principle dunders may change in the future.
Obvious well-known dunders like __len__ will probably never go away. But the obvious and well-known dunder __cmp__ went away, and maybe one day the Sized protocol will change too. If you want to future-proof your code, you should test for the Sized interface rather than directly testing for the __len__ magic method.
I agree with this reasoning. I tend to be more of a "special cases aren't special enough to break the rules" kind of guy than a "practicality beats purity" kind of guy.

From my perspective, it doesn't really matter what you would or wouldn't use this ABC for. The ABC module is desirable at a conceptual level for providing just the type of implementation-agnostic facility you describe for querying the potential interface capabilities of objects. Getting items is a basic potential interface capability of objects and that alone is enough justification for adding an ABC for it.

-- Brendan Barnwell
"Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
That applies to many interface checks. Both the built-in callable() and the Callable ABC don't tell you what arguments the object accepts, or what it does; the Sequence ABC check doesn't tell you what valid indexes are accepted.
Sure -- and Callable is probably the closest ABC to this proposal there is. But while I'm having trouble putting my finger on it, Callable seems to have a special place in the language that makes it important. While __getitem__ could do just about anything, I'd bet most folks would consider it bad form to use it for something that did not "feel" like indexing.

People may assume that valid indexes go from 0 through len-1 but we can make sequences that work with 1-based indexes, or like Fortran, with any bounds at all.
Sure, but they should all raise IndexError if you violate the rules.

Including virtual "sequences" that accept indexes from -infinity to infinity, with no length.
"Consenting adults" applies. The ABC checks only test for the presence of an interface, they don't necessarily make guarantees about what the interface will do.
Sure, but again, what’s the point here? What’s the use case for knowing that the object at hand will allow you to pass something into the square brackets, but nothing else?
So if you think you *may* have a "callable_with_an_integer" -- the thing to do is:

    try:
        result = squares[5]
    except TypeError:
        print("Oops, couldn't do that")
I know that's meant in good faith, but please be careful of telling me what I should do on the basis of a wild guess of what my needs are. I've been programming in Python since 1.5 was the latest and greatest, and I think I'm capable of judging when I want to use a try...except EAFP check rather than a LBYL check for an interface.
I was using "you" rhetorically -- but more importantly, I was suggesting that that was how to do EAFP, not that you should do EAFP. The OP has posted a more convoluted form.

why Python supports ABCs and isinstance. If you want to argue against that, that ship has not only sailed but it's already docked at the destination and the passengers disembarked :-)
Interestingly, use of ABCs seems to be really growing with static type checking, which is relatively new.
Now that I think about it a bit more, that example *could* make sense, but
only if what you are trying to do is make an indexable lazy-evaluated infinite sequence of all the squares --
Well spotted :-)
So then you want an "IndexableIterable" ABC or some such. That might be a useful ABC. Why do I think it would be more useful? Because it would specify more than just "you can put any old thing into the square brackets".

you can essentially create types that have any arbitrary combination of functionality -- do we need an ABC for every one? Obviously not.
I don't think it is so obvious.
Clearly we can't have an ABC for every imaginable combination of magic methods. There would be hundreds. What would we name them?
That’s why I said it was obvious :-)
And we should (so I am arguing) have ABCs for the underlying building blocks, the individual magic methods themselves. Developers can then combine those building blocks to make their own application-specific combinations.
But this is a very different proposal! If I have it right, you are now suggesting that there be an ABC for every magic method (there are a LOT).

Why? So we can use isinstance() to check for a particular protocol, rather than try:except or hasattr().

But this seems to be really stretching what Python ABCs are about. OK, that ship has sailed, and no, I'm not particularly happy about it. But you can make your own ABC with the particular combination of attributes you want -- wouldn't that make more sense than asking your users to call isinstance with 10 ABCs?

Which brings me back to the original proposal: 1) why not make your own ABC? And 2) can someone provide even a single use case? We've seen one example of a class that implemented __getitem__ in an unconventional way, but not an example of how having an ABC for it would be actually useful.

-CHB

-- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
On Sun, Sep 29, 2019 at 11:27:32AM +0100, Oscar Benjamin wrote:
That's the point that I would make as well. What can you do with an object that is only known to be Subscriptable?
I can subscript it. What did you expect the answer to be?

If you want to know if an object has a length, you can ask if it is Sized, without being forced to ask if it also supports a whole bunch of methods you don't care about. If you want to know if something supports the ``in`` operator, you can ask if it is a Container, without being forced to check for unneeded methods you have no interest in.

Nobody asks, "What can you do with an object that is only known to be a Container?" because the answer is obvious: you can check whether it contains things. Likewise for awaitable, callable, hashable etc objects. But subscripting (indexing) is a conspicuous exception. There's no ABC for testing whether something supports subscripting.

I don't want to get into an argument about "EAFP is better than LBYL", it's not 1999 any more and the Python community has moved past that holy war. We should acknowledge that sometimes it is better to test for an interface. That's why we have ABCs in the first place.

Whether I use ABC mixins, or write my own dunders, these interfaces are composable, orthogonal building blocks. I can give a class a __contains__ method without being forced to duplicate the entire list API, or even the entire Collection API. If I want to test for that __contains__ method, why should I check for the Collection API when I don't care about iteration or length?

That's a rhetorical question, because I don't have to check for unnecessary and unneeded methods that I don't care about. We have a Container ABC and I can just ask isinstance(obj, Container).

Coming back to my purposes...

(1) I have some classes which are infinite, lazily generated "sequences" of values. (I put sequences here in inverted commas because I am using the word in the regular English sense, not the collections.abc.Sequence sense.) I don't want to falsely advertise that these classes are Sequences by giving them an unwanted __len__ method. Is that unreasonable? I just want to not give them a __len__ method at all, which Python is perfectly happy with me doing. And I want to test for the minimal set of methods that I need, which is getitem, not a superset of methods "getitem plus len plus iter". Is that unreasonable?

(2) I have a function which does introspection on arbitrary objects. It needs to process objects that are subscriptable differently from objects that aren't subscriptable, but it doesn't care about the additional Sequence methods. In fact, it specifically needs to distinguish unsized "sequences" from sized Sequences.

For my own purposes, I can solve these problems. I can write my own Subscriptable test, but I'll probably get it wrong. Or I can copy and paste the useful bits from collections.abc, but that's an anti-pattern for various reasons that I trust I don't have to explain. For *myself*, it doesn't matter whether 3.9 gains this functionality or not.

We can mix-n-match hashing, iteration, indexing, containment etc in our classes, without being forced to add extra methods we don't need or care about (and sometimes, actively don't want) to meet a higher-order protocol like Mapping or Sequence. We can compose new classes from individual components, but we can't check for those individual components.

-- Steven
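Roughly what those two use cases look like in code. Subscriptable here is the proposed ABC sketched near the top of the thread (it does not exist in collections.abc), and describe is a hypothetical introspection helper:

    from collections.abc import Sized

    class Squares:
        """Infinite, lazily generated 'sequence': random access, deliberately no __len__."""
        def __getitem__(self, idx):
            return idx ** 2

    def describe(obj):
        # Hypothetical helper: subscriptable objects are processed differently,
        # and unsized "sequences" are distinguished from sized ones.
        if isinstance(obj, Subscriptable):   # proposed ABC, not in the stdlib
            return "sized subscriptable" if isinstance(obj, Sized) else "unsized subscriptable"
        return "not subscriptable"

    # describe(Squares())  -> "unsized subscriptable"
    # describe([1, 2, 3])  -> "sized subscriptable"
    # describe(42)         -> "not subscriptable"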
On Sep 29, 2019, at 16:41, Steven D'Aprano <steve@pearwood.info> wrote:
On Sun, Sep 29, 2019 at 11:27:32AM +0100, Oscar Benjamin wrote:
That's the point that I would make as well. What can you do with an object that is only known to be Subscriptable?
I can subscript it. What did you expect the answer to be?
I think the idea here is that your type follows the "old sequence protocol", where you can index it with 0, 1, … until you get an IndexError, without having to check __len__ first.

Since there are various bits of Python (including the iter function) that support this protocol, it's odd that there's no ABC for it. And it's not just that it's odd, it's different from almost every other collection-related protocol in Python.

I think the only reasonable argument against this is that "old-style" means "legacy", as in you shouldn't be creating new classes like this. But I don't think that argument is true. There are perfectly good use cases, like your infinite sequences, that are perfectly good old-style sequences and are not valid Sequences, and how else are you supposed to implement them? (Iteration is great, but sometimes random access matters.)

If I'm right about what you're asking for, I think it's a useful addition.

Of course the same protocol would accept all kinds of bizarre other things that support __getitem__ for different reasons, like the "I want to spell calling differently" thing, plus (somewhat less uselessly) typing._GenericAlias subclasses like typing.List. But I don't think that's a problem. Sure, it's not ideal that your test would accept typing.List, but who's going to pass the List pseudo-type to a function that clearly expects some kind of collection? If they get a different exception than they expected (a TypeError a few lines down, most likely), who cares? That seems like a consenting adults issue.
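For context, this is the fallback being referred to: iter() will happily consume a class that defines only __getitem__ (a sketch, not from the thread):

    class Alphabet:
        # Old sequence protocol: integer indexes from 0 upward, IndexError to stop.
        def __getitem__(self, i):
            if i >= 26:
                raise IndexError(i)
            return chr(ord('a') + i)

    letters = Alphabet()
    print(letters[3])                    # random access works: 'd'
    print(''.join(iter(letters))[:5])    # iter() falls back to the __getitem__ protocol: 'abcde'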
On Monday, September 30, 2019 at 12:24:30 AM UTC-4, Andrew Barnert via Python-ideas wrote:
On Sep 29, 2019, at 16:41, Steven D'Aprano <steve@pearwood.info> wrote:
On Sun, Sep 29, 2019 at 11:27:32AM +0100, Oscar Benjamin wrote:
That's the point that I would make as well. What can you do with an object that is only known to be Subscriptable?
I can subscript it. What did you expect the answer to be?
I think the idea here is that your type follows the “old sequence protocol”, where you can index it with 0, 1, … until you get an IndexError, without having to check __len__ first.
Since there are various bits of Python (including the iter function) that support this protocol, it’s odd that there’s no ABC for it. And it’s not just that it’s odd, it’s different from almost every other collection-related protocol in Python.
I think the only reasonable argument against this is that “old-style” means “legacy”, as in you shouldn’t be creating new classes like this. But I don’t think that argument is true. There are perfectly good use cases—like your infinite sequences—that are perfectly good old-style sequences and are not valid Sequences, and how else are you supposed to implement them? (Iteration is great, but sometimes random access matters.)
That's funny, I always thought of that as legacy. The iterator protocol has been a special case in so many proposals I've seen on this list. I think it's really ugly. Instead of collections.abc.OldStyleSequence, what do you think of adding something like InfiniteSequence to collections.abc instead? It's basically Sequence without __len__, __reversed__, or __count__. I don't see it getting much use though.
If I’m right about what you’re asking for, I think it’s a useful addition.
Of course the same protocol would accept all kinds of bizarre other things that support __getitem__ for different reasons, like the "I want to spell calling differently" thing, plus (somewhat less uselessly) typing._GenericAlias subclasses like typing.List. But I don't think that's a problem. Sure, it's not ideal that your test would accept typing.List, but who's going to pass the List pseudo-type to a function that clearly expects some kind of collection? If they get a different exception than they expected (a TypeError a few lines down, most likely), who cares? That seems like a consenting adults issue.
Steven D'Aprano writes:
On Sun, Sep 29, 2019 at 11:27:32AM +0100, Oscar Benjamin wrote:
That's the point that I would make as well. What can you do with an object that is only known to be Subscriptable?
I can subscript it. What did you expect the answer to be?
Technical questions: does "Subscriptable" mean non-negative ints only, or does it include the negative "count from the end" protocol? How about slices? Steve
On Mon, 30 Sep 2019 at 10:02, Stephen J. Turnbull <turnbull.stephen.fw@u.tsukuba.ac.jp> wrote:
Steven D'Aprano writes:
On Sun, Sep 29, 2019 at 11:27:32AM +0100, Oscar Benjamin wrote:
That's the point that I would make as well. What can you do with an object that is only known to be Subscriptable?
I can subscript it. What did you expect the answer to be?
Technical questions: does "Subscriptable" mean non-negative ints only, or does it include the negative "count from the end" protocol? How about slices?
I think we've established that what Steven wants is nothing more than a "robust" version of hasattr(obj, "__getitem__").

It's (in my view) sad that the simple hasattr test is no longer sufficient, and in particular that if you want robustness, Python has changed to the point where a pseudo subclass check is the "right" way to check for an object that has certain properties. But I guess this is the case, and therefore the omission of a Subscriptable ABC is something that may need to be addressed.

Maybe rather than proliferating ABCs like this, exposing the internal function in the ABC that does the "robust" version of the hasattr test would be more flexible? I guess that depends on whether you find the "subclass of an ABC" approach acceptable. To me, it feels too much like "object oriented everywhere" languages like Java, which was always an issue I had with ABCs, but as Steven said, that ship has probably sailed by now[1].

Paul

[1] Much like typing, where it's getting harder and harder if you work on larger projects to view it as "optional" when the project standards mandate it...
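A sketch of what such an exposed helper might look like; hasmethod is a hypothetical name used only for illustration, not an existing or proposed API:

    def hasmethod(obj, name):
        # A "robust hasattr": look the name up on the type rather than the instance,
        # and honour the "set it to None to opt out" convention, as collections.abc does.
        for base in type(obj).__mro__:
            if name in base.__dict__:
                return base.__dict__[name] is not None
        return False

    print(hasmethod([], '__getitem__'))   # True
    print(hasmethod(42, '__getitem__'))   # False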
On Sep 30, 2019, at 02:38, Paul Moore <p.f.moore@gmail.com> wrote:
It's (in my view) sad that the simple hasattr test is no longer sufficient, and in particular that if you want robustness, Python has changed to the point where a pseudo subclass check is the "right" way to check for an object that has certain properties. But I guess this is the case, and therefore the omission of a Subscriptable ABC is something that may need to be addressed.
The thing is, hasattr was _never_ the right way for some protocols, because the standard idiom to prevent inheriting a handful of special methods like __hash__ has always been to override it with __hash__ = None, and there have always been a few classes that did this for good reasons. I’m pretty sure it’s been like that since new-style classes were added, types and classes were unified, and all protocols were given dunder methods. And how else would you do this? You could add custom syntax (as C++ did with `= delete`), but you’d need some way to record somewhere that this method should not be looked up in the mro.
Maybe rather than proliferating ABCs like this, exposing the internal function in the ABC that does the "robust" version of the hasattr test would be more flexible?
I'm pretty sure I suggested this around 3.4, and it was rejected, but I can't remember why. At the time there _was_ no internal function, and the ABCs weren't all consistent about handling None. Now the function exists, and is used consistently, but it's private. (Actually, I think what exists is a function that checks N methods at once, while a public function you'd probably want to fit the API of hasattr… but that would be trivial.)

But meanwhile, as far as I know, nobody's asked for that function in the years since 3.5, and, AFAIK, this is only the second time (after Reversible) that someone has asked for a new implicit ABC that wasn't related to a brand-new protocol (like all the async stuff). So, I'm not sure we need to worry about proliferating too many of these too quickly. In theory there are dozens of dunder methods and vastly more possible combinations of them that you could come up with names for and ask for, but in practice that hasn't happened, so why worry?
I guess that depends on whether you find the "subclass of an ABC" approach acceptable. To me, it feels too much like "object oriented everywhere" languages like Java, which was always an issue I had with ABCs, but as Steven said, that ship has probably sailed by now[1].
But Python has always (or at least since 2.2) been even more objects-everywhere under the hood than Java. Python has types for classes, functions, bound methods, all kinds of things that aren't even accessible at runtime except through reflection APIs in Java. The difference is that Python allows duck-typing the interfaces of those types. You can stick anything that meets the descriptor protocol by returning something that meets the callable protocol in place of a method, and it works. Python never had a good way to check for its protocols until implicit ABCs.

And notice that implicit protocol ABCs like Iterable are not simulating inheritance-based subtyping like Java's interfaces, they're simulating structural subtyping like Go's protocols. (And even the explicit ABCs are often there to simulate something you just can't spell implicitly, like the distinction between Sequence and Mapping even though they use the same dunder method.)

I think the main discomfort is that Python allows both kinds of ABCs, spells them the same way, and gives you a bunch of both kinds in the stdlib, so it _feels_ like you're doing Java-style interfaces even though you usually aren't. Spelling them both the same way is weird, and I hated the idea when I first saw it, but in practice it turns out to work really well. (Having half of those ABCs also be mixins to add useful methods is a lot more weird and wrong theoretically, and even more useful practically.)

Also, what we're checking for really is subtyping. And if we'd done that by having collections, numbers, etc. sprout a whole bunch of new isspam functions instead of types to check with isinstance, imagine what a nightmare extending that to annotations would have been. We would have had to come up with some kind of syntax to let you write an arbitrary predicate or something as an annotation, so you could write something like `def f(things: isiterable(things) and all(isnumber(thing) for thing in things))` instead of `def f(things: Iterable[Number])`. (Of course if you hate static typing, that might have made you happy, because coming up with a way to statically check those annotations would have been a decades-long project like the one C++ is still fighting over for bounding its generics…)
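A small example (not from the thread) of the implicit, structural check that Iterable does, compared with Sequence, which has no such hook:

    from collections.abc import Iterable, Sequence

    class Countdown:
        def __iter__(self):
            return iter((3, 2, 1))

    print(isinstance(Countdown(), Iterable))   # True: Iterable's __subclasshook__ looks for __iter__
    print(isinstance(Countdown(), Sequence))   # False: Sequence requires inheritance or an
                                               # explicit Sequence.register(Countdown) call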
On Mon, Sep 30, 2019 at 10:08 AM Andrew Barnert via Python-ideas <python-ideas@python.org> wrote:
Also, what we’re checking for really is subtyping.
Is it? Subtyping in type theory satisfies some axioms, one of which is transitivity. The addition of the ABCs broke transitivity:

    >>> issubclass(list, object)
    True
    >>> issubclass(object, collections.abc.Hashable)
    True
    >>> issubclass(list, collections.abc.Hashable)
    False

ABC membership is a subtype relationship in some sense, and ordinary Python subclassing is a subtype relationship in some sense, but they aren't quite the same sense, and merging them creates an odd hybrid system in which I'm no longer sure which subclass relationships should hold, let alone which do. For example:

    >>> class A(collections.abc.Hashable):
    ...     __hash__ = None
    ...
    >>> issubclass(A, collections.abc.Hashable)
    True
    >>> hash(A())
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: unhashable type: 'A'

I didn't know what the issubclass call would return before I keyed in the example, and I can't decide what it should return. In contrast, I have no trouble deciding that the equivalent test implemented as a predicate ought to return False, since instances of A are in fact not hashable.

I don't know how predicates would work in type annotations. And the ship has sailed. But I do think there's something wrong with the ABCs.
On Tue, Oct 1, 2019 at 8:48 AM Ben Rudiak-Gould <benrudiak@gmail.com> wrote:
Is it? Subtyping in type theory satisfies some axioms, one of which is transitivity. The addition of the ABCs broke transitivity:
    >>> issubclass(list, object)
    True
    >>> issubclass(object, collections.abc.Hashable)
    True
    >>> issubclass(list, collections.abc.Hashable)
    False
I am not as much of a Pythonist as others here, but this looks like a broken API design to me. I checked the official doc and indeed it says there:

    issubclass(class, classinfo)
        Return true if class is a subclass (direct, indirect or *virtual*) of classinfo. A class is considered a subclass of itself. classinfo may be a tuple of class objects, in which case every entry in classinfo will be checked. In any other case, a TypeError exception is raised.

Where *virtual* is a hyperlink to an ABC definition.

`issubclass` will be fine without virtual and I am not sure why virtual has been added to it, if apparently it does something else (even though it pretends to be doing the same thing), and could be implemented as a function in abc.

Plus, while the property of being a subclass is pretty solid, from the discussion here about "Subscriptable" it looks like the ABC "subclassing" is partly wishful thinking: it may actually be subscriptable, but no one really knows what happens until __getitem__ is called, or it may be understood differently by different people, or different applications as well, as already mentioned here too:

On Tue, Oct 1, 2019 at 12:23 PM Steven D'Aprano <steve@pearwood.info> wrote:
On Mon, Sep 30, 2019 at 06:00:44PM +0900, Stephen J. Turnbull wrote:
Technical questions: does "Subscriptable" mean non-negative ints only, or does it include the negative "count from the end" protocol? How about slices?
It means the class defines a __getitem__ method. Like __call__, the semantics of that method, and the range of acceptable arguments, are out of scope of the ABC.
And since the semantics are really out of scope, why pretend that we know, when we can only guess, and the actual resulting information from the test is really "class or subclass defines a __getitem__ attribute and it is not None" (not even saying whether it is a function, and what this function might do). From that point of view, I would see the OP's suggestion:
if getclassattr(obj, '__getitem__', None) is not None
as better fitting (while ignoring for a moment that `getclassattr` does not exist), because it does not pretend to do something it does not, and is totally explicit about what the result is. Then it is up to the caller to decide whether this particular condition defines a "subscriptable" class for his particular purpose, or whether some other conditions should be met as well (__getitem__ is a function, returns a value, etc.).

Richard
On Tue, Oct 1, 2019 at 8:48 AM Ben Rudiak-Gould <benrudiak@gmail.com> wrote:
Is it? Subtyping in type theory satisfies some axioms, one of which is transitivity. The addition of the ABCs broke transitivity:
    >>> issubclass(list, object)
    True
    >>> issubclass(object, collections.abc.Hashable)
    True
    >>> issubclass(list, collections.abc.Hashable)
    False
This isn't really the fault of ABCs, it's a consequence of the fact that subclasses can nobble certain methods by setting them to None. The non-transitivity was already there, it's just that issubclass() doesn't always detect it. -- Greg
On Tue, Oct 1, 2019 at 7:02 AM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
    >>> issubclass(list, object)
    True
    >>> issubclass(object, collections.abc.Hashable)
    True
    >>> issubclass(list, collections.abc.Hashable)
    False
This isn't really the fault of ABCs, it's a consequence of the fact that subclasses can nobble certain methods by setting them to None.
Or replace them with methods that do ANYTHING -- always raise an exception, mutate the object, etc.

Python is a dynamic language -- ABCs do not change that one bit. They can be useful, as a way to express an intended interface, but they can be abused in an infinite number of ways. So the fact that calling issubclass() on an object and an ABC does not guarantee that the object will work as intended is simply part of the language. If you want guarantees about types, use a statically typed language.

All this is why I don't see a use for an ABC that indicates the presence of __getitem__ and nothing else. How is that useful?

By the way, the fact that you can derive from an ABC and override a method with None indicates that an isinstance check with an ABC isn't really more robust than checking for the attribute anyway. Using an ABC expresses the *intent* of the class author to provide a certain interface. Adding a certain magic method also expresses the intent to provide that particular functionality.

Thinking about it a bit more, where ABCs are most useful is when they capture the interface of a built-in type: folks can then write type-checked code (either statically or dynamically) that works with externally defined objects: e.g. MutableSequence rather than list.

This is actually more powerful than "classic" duck typing -- for example, if you have a MutableSequence, you not only know that you can index it, but that it takes integer indices, and will raise an IndexError if it is an invalid index. Whereas a MutableMapping will take hashable objects as keys, and you will get a KeyError if the key is invalid. Plus, of course, all sorts of other implied interface and behavior.

So we could have an ABC for each magic method, but what would that actually accomplish?
--
Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
On Oct 1, 2019, at 06:23, Richard Musil <risa2000x@gmail.com> wrote:
Where *virtual* is a hyperlink to an ABC definition.
`issubclass` will be fine without virtual and I am not sure why virtual has been added to it
Virtual subclassing is key to the way ABCs work. It would be nice if it were explained better in the docs, but I suppose usually people don't need to know the details unless they're actually getting involved in something like designing new ABCs, so…

The general feature is that any class can define a subclass-testing dunder method to tell issubclass what to do, and likewise for isinstance. That's what virtual subclassing means: your class's subclass hook can say that something is a subclass even though it didn't inherit from you.

This is what allows ABCs like Iterable to simulate structural subtyping. In Java, you can implement all the methods you want, but you won't be usable as an iterable unless you also inherit from the IIterable interface. In Go, if you implement the right methods, you are automatically usable as an Iterable without inheriting anything. ObjC allows you to define both kinds of relationships, because sometimes one is useful, sometimes the other.

Python gives you a general framework that lets you define both kinds of relationships, and more; e.g., you can handle relationships that aren't quite structurally (Go-style) checkable, like Sequence, and you can fudge around legacy issues rather than having to get everything perfect before you can use new features. And it does it without needing any special language-level magic like Go, by just treating issubclass itself (and likewise isinstance) as a protocol, with a dunder method, like __getattr__ or anything else.

The ABC metaclass gives you some helpers to make it easier to implement virtual subclassing correctly and consistently, like a single simpler method you can override to handle both hooks at once, a registry to allow classes to explicitly claim they match your ABC, and a few other things. The collections.abc module adds an additional helper on top of that to help its ABCs test for methods properly (because it's not as simple as it should be). But the real magic is making issubclass a protocol that's overridable the same as everything else in Python.
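A small illustration of the explicit-registration path mentioned above; the class names are made up for the example:

    from abc import ABC

    class Frobnicator(ABC):
        """An ABC with no structural hook: membership comes from inheritance or registration."""

    class Widget:
        """Unrelated class: it does not inherit from Frobnicator."""

    Frobnicator.register(Widget)   # explicitly claim that Widget satisfies Frobnicator

    print(issubclass(Widget, Frobnicator))    # True  -- a "virtual" subclass
    print(isinstance(Widget(), Frobnicator))  # True
    print(Frobnicator in Widget.__mro__)      # False -- no real inheritance involved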
On Tue, Oct 1, 2019 at 6:48 PM Andrew Barnert <abarnert@yahoo.com> wrote:
Virtual subclassing is key to the way ABCs work. It would be nice if it were explained better in the docs, but I suppose usually people don’t need to know the details unless they’re actually getting involved in something like designing new ABCs, so…
Thanks for the concise explanation. I guess my objection was not about the ABC (or how it works), but about the single fact that the function `issubclass` gives results which are counter-intuitive.

Yes, we know that those three calls:

    >>> issubclass(list, object)
    True
    >>> collections.ishashable(object)
    True
    >>> collections.ishashable(list)
    False

are correct, i.e. there is nothing wrong with each one individually. It is just that all together they are contradicting. And it is not because of the objects (how they are), but because of what the function does, and in particular that it does two different things (but pretends it is one).

If `issubclass` tested the inheritance, and there was a function `collections.abc.islikeabc` doing the other thing, I would not say a word.

The problem, as I perceive it, is not in the fact that a class can mangle its dunder methods in a way that it becomes "contradicting" itself, as in, for example, deriving from object and then disabling __hash__, or deriving from list and removing __getitem__, but in the fact that someone decided to merge two different and basically independent functionalities into one function. And these functionalities are different, because they answer two different questions. The first is about the class origin (heritage), the other is about the class behavior (interface), where neither one necessarily implies the other, but I digress.

For what concerns "issubscriptable" my only comment is that so far it does not seem there is a consensus about what exactly "subscriptable" means (or should mean) and the regular user (like myself) could probably be even more confused (like I am about `issubclass` right now).

Richard
On Oct 1, 2019, at 11:53, Richard Musil <risa2000x@gmail.com> wrote:
On Tue, Oct 1, 2019 at 6:48 PM Andrew Barnert <abarnert@yahoo.com> wrote:
Virtual subclassing is key to the way ABCs work. It would be nice if it were explained better in the docs, but I suppose usually people don’t need to know the details unless they’re actually getting involved in something like designing new ABCs, so…
Thanks for the concise explanation. I guess my objection was not about the ABC (or how it works), but about the single fact that the function `issubclass` gives results which are counter-intuitive.
But it only gives counterintuitive results when you do something counterintuitive. That’s how all consenting-adults features work:
Yes, we know that those three calls:
    >>> issubclass(list, object)
    True
    >>> collections.ishashable(object)
    True
    >>> collections.ishashable(list)
    False
are correct, i.e. there is nothing wrong with each one individually. It is just that all together they are contradicting. And it is not because of the objects (how they are), but because of what the function does, and in particular that it does two different things (but pretends it is one).
Not really. If you want to be a purist, Python doesn't do subtyping, or, at best, subtyping is defined as "whatever the subclass hooks say", which is meaningless because it's completely unrestricted. By the same argument, Python doesn't do attributes, or at best attribution is defined as "whatever the getattribute/etc. hooks say", which is meaningless because it's completely unrestricted.

But in practice, Python objects almost always act pretty much like things that have attributes. And in practice, Python objects almost always act pretty much like things that have subtyping relationships. The fact that Python allows you to do otherwise, and that there are occasional practical reasons to do otherwise, doesn't really change that. (At least not for people trying to read and write Python code; for someone trying to write an automated correctness prover, it radically breaks everything, and for someone trying to work on the design of ABCs, it makes less difference than that but still more than zero.)
If `issubclass` tested inheritance, and there was a function `collections.abc.islikeabc` doing the other thing, I would not say a word.
Inheritance is nearly always useless to test for. And islikeabc would also be nearly always useless. The only reason either one is worth testing for is to approximate subtyping.

You know that when isinstance(spam, Eggs) is true, you can use spam as an Eggs and it will generally work. That’s what you care about. You don’t care why it’s true, you just care that it’s true. And it’s generally (but not absolutely always, much less provably always) true if at least one of the following is true:

* Eggs is a base class of type(spam)
* Eggs is a structural-testing ABC where type(spam) meets the test
* Eggs is an ABC where someone explicitly called Eggs.register on type(spam)
* Eggs is a class with a complicated instance hook that does something very unusual but appropriate for this special case, and that unusual thing likes spam

And that’s why there’s a single function that returns true if any of those is true. None of those things on its own is useful; all of them are useful as approximations of subtyping; all of them combined are in practice a more useful approximation of subtyping than any of them on its own.
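For what it’s worth, the first three of those paths are easy to demonstrate with a toy ABC. This is only an illustrative sketch; Eggs, Fried, Scrambled and Substitute are invented names:

from abc import ABCMeta

class Eggs(metaclass=ABCMeta):
    # Toy ABC: anything whose class (or a superclass) defines crack()
    # passes the structural test.
    @classmethod
    def __subclasshook__(cls, C):
        if cls is Eggs and any("crack" in B.__dict__ for B in C.__mro__):
            return True
        return NotImplemented   # fall back to inheritance/registration

class Fried(Eggs):              # path 1: Eggs is a base class
    def crack(self): pass

class Scrambled:                # path 2: meets the structural test
    def crack(self): pass

class Substitute:               # path 3: explicitly registered
    pass

Eggs.register(Substitute)

for spam in (Fried(), Scrambled(), Substitute()):
    print(type(spam).__name__, isinstance(spam, Eggs))   # all True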
For what concerns "issubscriptable", my only comment is that so far there does not seem to be a consensus about what exactly "subscriptable" means (or should mean), and the regular user (like myself) could probably be even more confused (like I am about `issubclass` right now).
Python already defines the term “subscription” for the syntactic form spam[eggs], and has done so consistently since 1.x. And it’s not a word that has other meanings in the Python community, or in programming in general. Given that, the most obvious meaning of “subscriptable” is “can be used as the left thing in a subscription”. And that’s what Steven is asking for here.

The problem here is not that you’re confused about the meaning of subscriptable, but that it’s a concept that comes up rarely enough that you never even learned it. Which could be an argument that subscriptable isn’t useful. Or that we really need indexable and keyable as well/instead. Or even that Steven is wrong to think that he wants subscriptable, and what he really wants is indexable, so we don’t even have a use case for subscriptable. (I don’t know that these are very compelling arguments, but at least they’re conceivable ones.)

But I can’t see an argument that something called Subscriptable could (or should) actually mean indexable rather than subscriptable.
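Concretely, “subscription” in that sense covers both kinds of lookup, which is exactly why indexable and keyable could be wanted as separate notions; this is just standard interpreter output:

>>> [10, 20, 30][1]          # subscription with an integer index
20
>>> {'one': 1}['one']        # subscription with a key object
1
>>> hasattr(type([]), '__getitem__'), hasattr(type({}), '__getitem__')
(True, True)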
But I can’t see an argument that something called Subscriptable could (or should) actually mean indexable rather than subscriptable.
Now we are getting into naming things, which is hard, but not the critical step. But as you point out, it’s actually not that clear exactly what the OP (or anyone else advocating for a new ABC) wants. So we need to know: exactly what interface would this new ABC imply? How/why would it be used? Ideally with a non-toy example or two.

-CHB
-- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
On Oct 1, 2019, at 16:01, Christopher Barker <pythonchb@gmail.com> wrote:
But I can’t see an argument that something called Subscriptable could (or should) actually mean indexable rather than subscriptable.
Now we are getting into naming things, which is hard, but not the critical step.
But as you point out, it’s actually not that clear exactly what the OP (or anyone else advocating for a new ABC) wants.
Granted. Maybe I’m giving too much slack here because of who the OP is, assuming he will have a good answer to these questions as soon as people stop debating irrelevant stuff about whether the whole design of Python needs to be redone from scratch and ask him for those answers…

Conceptually, all of these things make sense:

* Subscriptable, which can be implicitly checked.
* Indexable and Keyable, which cannot be implicitly checked, which would inherit from Subscriptable, and would also be superclasses to Sequence and Mapping respectively (which also can’t be implicitly checked); a rough sketch follows below.

Practically, nobody has asked for Indexable and Keyable, and someone who generally knows what he’s talking about has asked for Subscriptable. But that’s just the start of the discussion, not the end.
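For concreteness, that hierarchy might be sketched roughly as follows, assuming Subscriptable does a structural check while Indexable and Keyable rely on explicit registration (all three names are hypothetical; none of this exists in collections.abc):

from abc import ABCMeta, abstractmethod

class Subscriptable(metaclass=ABCMeta):
    __slots__ = ()

    @abstractmethod
    def __getitem__(self, key):
        raise KeyError(key)

    @classmethod
    def __subclasshook__(cls, C):
        if cls is Subscriptable:
            # Implicit (structural) check: __getitem__ defined on the class
            # or a superclass, and not disabled by setting it to None.
            for B in C.__mro__:
                if "__getitem__" in B.__dict__:
                    if B.__dict__["__getitem__"] is None:
                        return NotImplemented
                    return True
        return NotImplemented

class Indexable(Subscriptable):
    # Subscripted by position; no structural test is possible, so classes
    # would have to opt in with register().
    __slots__ = ()

class Keyable(Subscriptable):
    # Subscripted by key object; likewise registration-only.
    __slots__ = ()

Indexable.register(list)
Keyable.register(dict)

print(issubclass(list, Subscriptable))   # True (structural check)
print(issubclass(list, Indexable))       # True (registered)
print(issubclass(list, Keyable))         # False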
On Tue, Oct 01, 2019 at 09:47:56AM -0700, Andrew Barnert via Python-ideas wrote:
On Oct 1, 2019, at 06:23, Richard Musil <risa2000x@gmail.com> wrote:
Where *virtual* is a hyperlink to an ABC definition.
`issubclass` will be fine without virtual and I am not sure why virtual has been added to it
Virtual subclassing is key to the way ABCs work.
Virtual subclassing is the key to the way ABCs work *in Python*, which means we could have designed them differently had we chosen to. In Python, ABCs combine the notions of *abstract* classes with virtual subclassing via registration. But we could have made them more like "abstract" in C++ and C# and required actual inheritance. In fact, Python ABCs also include the notion of *interface testing* and mixins, so they really conflate FIVE orthogonal but related concepts into one:

- the notion of abstract classes (which cannot be instantiated) and abstract methods (which must be overridden);
- mixins for default implementations of certain common operations;
- inheritance and nominal subtyping, where one class is included in the MRO of another class;
- virtual subclassing using registration;
- and protocol or interface testing (duck typing).

Aside from the first two, they involve three independent notions of subtyping:

- if I inherit from a duck, I am a duck;
- if I say I am a duck, I am a duck;
- if I quack like a duck and swim like a duck, I am a duck;

and being independent, we can construct examples where they conflict. If I inherit from a duck, but say I'm a goose, and honk rather than quack, what am I?

I don't wish to get into a debate over whether Python's design is good, bad or indifferent. Whatever the reasons, ABCs as defined in the abc and collections.abc modules are what we have to live with, and I'm not proposing any change to that. Nor am I proposing the addition of a dozen new ABCs. If anyone wants to redesign Python's ABCs and interface-testing, please write a PEP and don't hijack this thread :-)

All I'm proposing is one tiny but useful change, the addition of a single ABC for testing whether or not an object quacks like something subscriptable, akin to five existing ABCs:

- Callable tests for __call__
- Container tests for __contains__
- Hashable tests for __hash__
- Iterable tests for __iter__
- Sized tests for __len__

Nor am I proposing to change the way we test for these dunder methods. The current implementation tests for the existence of an *attribute* of that name on the class or a superclass, whether it is callable or not, so long as it is not None. That is a private implementation detail of the collections.abc module. If anyone wants to change the implementation of that test, please start a new thread and don't hijack this one :-)

-- Steven
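The duck/goose conflict above is easy to reproduce with the machinery we already have. A toy demonstration (Duck and Goose are invented names; registering Goose is the deliberate "saying" step):

import collections.abc as cabc

class Duck:
    def __hash__(self):              # quacks like a Hashable
        return 42

class Goose(Duck):                   # inherits from a duck...
    __hash__ = None                  # ...but does not actually quack

cabc.Hashable.register(Goose)        # ...and *says* it is hashable anyway

print(issubclass(Goose, Duck))            # True (inheritance)
print(issubclass(Goose, cabc.Hashable))   # True (registration)
try:
    hash(Goose())                         # but behaviourally it is not hashable
except TypeError as exc:
    print(exc)                            # unhashable type: 'Goose'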
On Sep 30, 2019, at 23:46, Ben Rudiak-Gould <benrudiak@gmail.com> wrote:
On Mon, Sep 30, 2019 at 10:08 AM Andrew Barnert via Python-ideas <python-ideas@python.org> wrote:
Also, what we’re checking for really is subtyping.
Is it? Subtyping in type theory satisfies some axioms, one of which is transitivity. The addition of the ABCs broke transitivity:
Python subtyping isn’t perfect. The very fact that you’re allowed to disable inherited methods by assigning them to None breaks this; you don’t need ABCs for that:

def spam(x: object):
    hash(x)

spam([])

Even though [] is-a object, and objects have the __hash__ method, this raises a TypeError because lists don’t have the __hash__ method. That violates substitutability. [] is-a object is wrong. The Hashable ABC correctly reflects that incorrect relationship:
>>> issubclass(list, object)
True
>>> issubclass(object, collections.abc.Hashable)
True
>>> issubclass(list, collections.abc.Hashable)
False
ABC membership is a subtype relationship in some sense, and ordinary Python subclassing is a subtype relationship in some sense, but they aren't quite the same sense,
But in this case, they actually match. Hashable is correctly checking for structural subtyping, and the problem is that list isn’t actually a proper subtype of object, not that object isn’t a proper subtype of Hashable.
and merging them creates an odd hybrid system in which I'm no longer sure which subclass relationships should hold, let alone which do.
Let’s say instead of ABCs that test structural subtyping, we added a bunch of callable-like predicates to do the tests. You would have the exact same problem here:

>>> issubclass(list, object)
True
>>> collections.ishashable(object)
True
>>> collections.ishashable(list)
False

The problem is with list is-a object, not with the way you test.
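(collections.ishashable doesn’t exist, of course; a purely structural predicate along those lines might be sketched like this:)

def ishashable(obj_or_type):
    # Hypothetical predicate: no inheritance, no registration, just
    # "is __hash__ defined on the class or a superclass, and not None?"
    tp = obj_or_type if isinstance(obj_or_type, type) else type(obj_or_type)
    for B in tp.__mro__:
        if "__hash__" in B.__dict__:
            return B.__dict__["__hash__"] is not None
    return False

print(ishashable(object))          # True
print(ishashable(list))            # False
print(issubclass(list, object))    # still True -- the same apparent contradiction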
For example:
>>> class A(collections.abc.Hashable):
...     __hash__ = None
...
>>> issubclass(A, collections.abc.Hashable)
True
This one is a direct consequence of the fact that you can lie to ABCs—if you inherit from an ABC you are treated as a subtype even if you don’t qualify, and the check isn’t perfect. You are explicitly, and obviously, lying to the system here.

Should ABCs check whether the required methods are actually methods (with a Method ABC, or by calling __get__ and checking callable on the result, or whatever)? Or at least not None? I don’t know. Maybe. But that wouldn’t eliminate the ability to lie to them, because they have the register method. Even if you couldn’t inherit from an ABC to lie to it, you could still register with it, and there is no check at all there, and that’s definitely working as designed.

And allowing you to lie isn’t really a bug; it’s a consenting-adults feature that can be misused but can also be useful for migrating legacy code. Of course register isn’t used only, or even primarily, for lying—Sequence can’t be tested structurally (at least not if you want to distinguish Sequence indexing from Mapping lookup when both protocols use the same dunder method), so list and tuple register with Sequence. But I believe range also used to register with Sequence to lie even before it became a proper sequence in 3.2, because it was “close enough” to being a Sequence and often used in real-life code that used sequences and worked. Registration can also be used for “file-like object” code that was written to the vague 2.x definition and worked fine in practice, but didn’t actually meet the ABCs in the io module which are no longer vague about what it means (because they require methods your legacy code never used). And so on.

If ABCs had been in Python since 2.2, I’m not sure this feature would be a good idea, but adding it after the fact, I think it was.
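A minimal example of that non-lying, migration-style use of register; LegacyRecord is an invented stand-in for old sequence-like code:

from collections.abc import Sequence

class LegacyRecord:
    # Pre-ABC code that already behaves like a read-only sequence.
    def __init__(self, *items):
        self._items = items
    def __len__(self):
        return len(self._items)
    def __getitem__(self, index):
        return self._items[index]

# Sequence has no structural test (it can't tell indexing from key lookup),
# so the class has to opt in explicitly:
Sequence.register(LegacyRecord)

print(isinstance(LegacyRecord(1, 2, 3), Sequence))   # True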
>>> hash(A())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'A'
I didn't know what the issubclass call would return before I keyed in the example, and I can't decide what it should return. In contrast, I have no trouble deciding that the equivalent test implemented as a predicate ought to return False, since instances of A are in fact not hashable.
What about this:

class A:
    __hash__ = None

ishashable.register(A)
ishashable(A)

Would you be surprised if this returned true? If that registration were never useful, it would be a bizarre feature, a pointless bug magnet. But if it were useful for lots of legacy code and added for that reason, and you deliberately misused the feature like this, would you say the bug is in ishashable, in the predicate system, or in your code?

Also, how would you write the issequence and ismapping predicates?
On Tue, Oct 1, 2019 at 9:26 AM Andrew Barnert <abarnert@yahoo.com> wrote:
On Sep 30, 2019, at 23:46, Ben Rudiak-Gould <benrudiak@gmail.com> wrote:
ABC membership is a subtype relationship in some sense, and ordinary Python subclassing is a subtype relationship in some sense, but they aren't quite the same sense,
But in this case, they actually match. Hashable is correctly checking for structural subtyping, and the problem is that list isn’t actually a proper subtype of object, not that object isn’t a proper subtype of Hashable.
If you take the perspective that the ABC notion of subtyping is the correct one, then list isn't a proper subtype of object. I wasn't taking the perspective that either one is the correct one. I think they're probably both okay in isolation.

I can understand the ordinary subclass relation in Python as the partial order generated by the inheritance dag, which is a perfectly good notion of subtyping. (Yes, a subclass instance might not *actually* work where a superclass instance is expected, but that's true in any OOP language. A 1234 may not work where an int instance is expected either. Perfect substitutability is not achievable, but at least there's a partial ordering of classes.)

I think I can understand the ABC subclass test as a subset relation, which is also a perfectly good notion of subtyping.

The problem is that when you combine them, you have neither of those things. I'm not even certain how they're being combined (is it just the union of the graphs of the two relations?) but the result has none of the properties that you'd expect a subtype relation to have, and that the two subtype relations do have in isolation.
>>> class A(collections.abc.Hashable):
...     __hash__ = None
...
>>> issubclass(A, collections.abc.Hashable)
True
This one is a direct consequence of the fact that you can lie to ABCs—if you inherit from an ABC you are treated as a subtype even if you don’t qualify, and the check isn’t perfect. You are explicitly, and obviously, lying to the system here.
My problem is not that I can't justify the current behavior, it's that if it behaved differently, I could justify that behavior too. I feel like you're using CPython as an oracle of what ABCs should do, and that if issubclass had returned False in this example, you would have been ready with an explanation for that too - namely that I broke the subtyping relation by deleting __hash__, the same explanation you used earlier in the case where it did return False.

What *should* it mean to inherit from an ABC? The module encourages you to use them as mixins, but maybe that isn't the intended meaning, just a side hack? Is the primary meaning to do the equivalent of registering the class with the predicate?

I was worried that someone would complain about my A not making sense, and thought about using a more complicated example:

class A(Hashable):
    def __hash__(self):
        return 4

class B(A):
    __hash__ = None

issubclass(B, Hashable)  # True

So empirically, inheriting from Hashable registers not only that class but all subclasses of it that may later be defined with the predicate. Is that the intended behavior, or is it an accidental side effect of combining two different notions of subclassing in a single test?

You could end up with a situation where you'd have to choose between using an ABC as a mixin and living with potentially incorrect ABC predicate tests in subclasses, or implementing the methods yourself (in the same way the mixin would have) to get the correct predicate behavior. Hashable isn't useful as a mixin and I think none of the other ABCs test deletable properties, but that doesn't seem to be a design principle, just a coincidence.
On Wed, Oct 2, 2019 at 5:18 AM Ben Rudiak-Gould <benrudiak@gmail.com> wrote:
On Tue, Oct 1, 2019 at 9:26 AM Andrew Barnert <abarnert@yahoo.com> wrote:
On Sep 30, 2019, at 23:46, Ben Rudiak-Gould <benrudiak@gmail.com> wrote:
ABC membership is a subtype relationship in some sense, and ordinary Python subclassing is a subtype relationship in some sense, but they aren't quite the same sense,
But in this case, they actually match. Hashable is correctly checking for structural subtyping, and the problem is that list isn’t actually a proper subtype of object, not that object isn’t a proper subtype of Hashable.
If you take the perspective that the ABC notion of subtyping is the correct one, then list isn't a proper subtype of object. I wasn't taking the perspective that either one is the correct one. I think they're probably both okay in isolation.
If you want to get REALLY technical, very few types - if any - are actually proper subtypes. A strict reading of the Liskov Substitution Principle would imply that the repr of every object should remain identical to object.__repr__ of that object. That's completely useless, of course, but the question is: if a subclass is allowed to make changes to its parent's behaviour, where do you draw the line?

For instance, it is extremely common to expect that x == x for any object x, but float is a subtype of object, and not every float maintains that property. The base object type compares by identity, so object() != object(). Is it violating LSP if two non-identical objects compare equal? The base object is immutable. Subtypes that are also immutable will follow the principle that if a == b now, then a == b forever. Is it violating LSP to have mutable objects that can compare equal now but unequal later?

The hashability issue is a logical consequence of accepting that the above violations are reasonable and practically useful. Since a==b must imply that hash(a)==hash(b), it logically follows that either the hash of every list is the same (thus destroying the value of hashing at all), or the hash of a list could change (thus destroying the very meaning of object hashes). I'm not sure what would break if every list object had a hash of 0x142857, but certainly it wouldn't be of much practical use, and would be a performance nightmare (you use a list as a dict key and suddenly every lookup is a linear search).

Yes, it's not a "proper subtype", but proper subtypes are impractically restrictive.

ChrisA
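Both kinds of violation are easy to see in the interpreter; this is just standard behaviour, shown for concreteness:

>>> x = float('nan')
>>> x == x                   # a float for which x == x fails
False
>>> a = [1, 2]; b = [1, 2]
>>> a == b                   # equal now...
True
>>> a.append(3)
>>> a == b                   # ...unequal later, so no stable hash is possible
False
>>> hash(a)
Traceback (most recent call last):
  ...
TypeError: unhashable type: 'list'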
Chris Angelico wrote:
The hashability issue is a logical consequence of accepting that the above violations are reasonable and practically useful.
A more principled way to handle this would be for object not to be hashable, and have another base type for hashable objects. Hashable would then be a subtype of object, not the other way around. -- Greg
On Wed, Oct 2, 2019 at 8:29 AM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Chris Angelico wrote:
The hashability issue is a logical consequence of accepting that the above violations are reasonable and practically useful.
A more principled way to handle this wouild be for object not to be hashable, and have another base type for hashable objects. Hashable would then be a subtype of object, not the other way around.
The question then would be: why is object() not hashable? It can't be mutable, because then you violate LSP the other way (for the same reason frozenset isn't a subclass of set), and there'd be no logical reason for equality to be defined in any way that would violate hashability. ChrisA
Chris Angelico wrote:
The question then would be: why is object() not hashable?
It's not hashable because it's supposed to be the ultimate base type for all other objects, and not every object is hashable. It only seems odd if you're used to the idea that you get a bunch of default behaviours from object, including hashability. But if you want strict subtyping and also an ultimate base type, the base type has to include very little behaviour. -- Greg
On Wed, Oct 2, 2019 at 8:51 AM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Chris Angelico wrote:
The question then would be: why is object() not hashable?
It's not hashable because it's supposed to be the ultimate base type for all other objects, and not every object is hashable.
It only seems odd if you're used to the idea that you get a bunch of default behaviours from object, including hashability. But if you want strict subtyping and also an ultimate base type, the base type has to include very little behaviour.
Should the default object be comparable? Or would there be a "Comparable" subtype? If object() can be compared for equality with another object(), then it'd be illogical to make it unhashable - there's no reason to deny hashing. If it can't be compared, why not? Why should the base type include THAT little behaviour?

Python's base object has a lot of extremely useful functionality, including being able to generate a string representation, introspect its attributes, pickle and unpickle it, and more. All of these features are available to all objects unless they choose not to. Do we really need separate subclasses for Picklable and Hashable, forcing every class author to subclass all of them? Or can we allow most objects to be indeed pickled and hashed, and then have open files choose to be unpickleable and lists choose to be unhashable?

ChrisA
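For example (again, just standard behaviour):

>>> o = object()
>>> hash(o) == hash(o)       # hashable by default (identity-based)
True
>>> o == o, o == object()    # default equality is identity
(True, False)
>>> hash([])                 # a list opts out of the default hashability
Traceback (most recent call last):
  ...
TypeError: unhashable type: 'list'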
On Oct 1, 2019, at 15:59, Chris Angelico <rosuav@gmail.com> wrote:
On Wed, Oct 2, 2019 at 8:51 AM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Chris Angelico wrote:
The question then would be: why is object() not hashable?
It's not hashable because it's supposed to be the ultimate base type for all other objects, and not every object is hashable.
It only seems odd if you're used to the idea that you get a bunch of default behaviours from object, including hashability. But if you want strict subtyping and also an ultimate base type, the base type has to include very little behaviour.
Should the default object be comparable? Or would there be a "Comparable" subtype? If object() can be compared for equality with another object(), then it'd be illogical to make it unhashable - there's no reason to deny hashing. If it can't be compared, why not? Why should the base type include THAT little behaviour?
This really is a practicality-vs.-purity question.

Theoretically, it is wrong to include any behavior in your top type that is not supported by every other type in the system. Practically, it is useful to include behavior in your top type that is used by _most_ types in the system and would be a pain to reimplement over and over again, even if it isn’t supported by all types.

Different languages solve part of that balance by having a lot of decorators, or mixin classes, or macros, or magic flags to the implementation. (Or you could even go Smalltalk style, and different environments will add different methods to Object.) Python does some of that too—functools.total_ordering, and some of the collections.abc types being their own mixins, etc. But it still gets tedious to include six mixins and four decorators on 90% of your types, so Python compromises purity and moves some of that stuff into object. So, object is hashable, and that means some types that are going to pass an is-a test for object are going to violate LSP.

(Obviously that’s not the actual design history of Python—Guido didn’t decide to add object.__hash__ only after imagining the future of multiple inheritance and decorators, and comparing how the three different design choices would impact all the dict code people were going to write in future decades. But you get the idea.)
On Oct 1, 2019, at 12:17, Ben Rudiak-Gould <benrudiak@gmail.com> wrote:
On Tue, Oct 1, 2019 at 9:26 AM Andrew Barnert <abarnert@yahoo.com> wrote:
On Sep 30, 2019, at 23:46, Ben Rudiak-Gould <benrudiak@gmail.com> wrote: ABC membership is a subtype relationship in some sense, and ordinary Python subclassing is a subtype relationship in some sense, but they aren't quite the same sense,
But in this case, they actually match. Hashable is correctly checking for structural subtyping, and the problem is that list isn’t actually a proper subtype of object, not that object isn’t a proper subtype of Hashable.
If you take the perspective that the ABC notion of subtyping is the correct one, then list isn't a proper subtype of object.
I don’t think anyone takes that perspective. The correct notion of subtyping for Python is the one that exactly reproduces duck typing in an LBYL-checkable way, which I’ll bet is provably equivalent to the halting problem. However, the whole mess of things that go into isinstance and issubclass in Python (including third-party code) together define a pretty-good-in-practice approximation of that, when used with most real-life code. And a much better one than either inheritance or structural tests alone.
My problem is not that I can't justify the current behavior, it's that if it behaved differently, I could justify that behavior too. I feel like you're using CPython as an oracle of what ABCs should do,
Well, I’m using all Python implementations, together with all of the popular third-party code out there, as used by thousands of people every day, allowing us to work with types in a way that’s more pleasant to read than Java or Go or Objective C or Ruby, as an argument that Python’s design is in practice pretty good.

I can make a good argument that inheritance alone, or structural subtyping alone, would be so terrible as to be nearly useless for Python. I can’t make an argument that no possible thing you could think of could be better than what Python currently does. (In fact, I can think of things I’d suggest be done differently if I had a time machine. Then again, I could improve even more things with even less work by taking that time machine even further back and giving Guido the 2.3 type system back in 0.9, or *args and **kw.)

But I don’t see why I have to make an argument for a years-old feature just because someone wants to add a minor extension that goes along with the way that feature has always been used. If you hate ABCs, you’re not going to use collections.abc.Subscriptable no matter how it’s bikeshedded.
What *should* it mean to inherit from an ABC? The module encourages you to use them as mixins, but maybe that isn't the intended meaning, just a side hack? Is the primary meaning to do the equivalent of registering the class with the predicate?
The intended meaning clearly is a mix of multiple things: some of them do structural subtyping, some don’t; some rely heavily on registration; some act as useful mixins, some don’t; etc. Theoretically, abc is a bit of a mess, and collections.abc, numbers, and io are a horrible pile of unusable crap. But in practice, they actually work pretty nicely in a wide range of code that people read every day.
On Mon, Sep 30, 2019 at 06:00:44PM +0900, Stephen J. Turnbull wrote:
Technical questions: does "Subscriptable" mean non-negative ints only, or does it include the negative "count from the end" protocol? How about slices?
It means the class defines a __getitem__ method. Like __call__, the semantics of that method, and the range of acceptable arguments, is out of scope of the ABC. I don't believe that Python has any way to check the semantics of a method call, except to call it and see what it does. -- Steve (the other one)
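To make that concrete: any __getitem__ at all would satisfy such a check, and its argument handling is entirely up to the class (Weird is an invented example):

>>> class Weird:
...     def __getitem__(self, whatever):
...         return 'spam'
...
>>> Weird()[0], Weird()['key'], Weird()[1:2]
('spam', 'spam', 'spam')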
On Tue, 1 Oct 2019 at 11:24, Steven D'Aprano <steve@pearwood.info> wrote:
On Mon, Sep 30, 2019 at 06:00:44PM +0900, Stephen J. Turnbull wrote:
Technical questions: does "Subscriptable" mean non-negative ints only, or does it include the negative "count from the end" protocol? How about slices?
It means the class defines a __getitem__ method. Like __call__, the semantics of that method, and the range of acceptable arguments, is out of scope of the ABC.
Then what use is this particular ABC? I assume you had a reason for wanting this when starting this thread but I still don't see what that is.

The distinction between a Mapping and a Sequence is important because basic usage is different e.g.:

for x in obj:
    print(obj[x])  # fine for Mapping but not for Sequence

for n in range(len(obj)):
    print(obj[n])  # fine for Sequence but not for Mapping

I can see situations where you might want to differentiate between these two in order to do something useful. I would still prefer not to write code that way myself and instead to simply document my expectation that e.g. a mapping is expected. I can see though why someone would want to use isinstance(obj, Mapping) in some situations.

With Subscriptable I don't see a situation where the isinstance test is actually useful. I'm assuming here that your reason for wanting the Subscriptable ABC is to branch on isinstance but maybe that's not right. This discussion would be easier if you would give a clear example where this might be useful.

-- Oscar
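(For concreteness, the sort of differentiating branch meant above might look like this sketch; print_items is an invented name:)

from collections.abc import Mapping, Sequence

def print_items(obj):
    # Branch on the ABCs rather than guessing from __getitem__ alone.
    if isinstance(obj, Mapping):
        for key in obj:
            print(key, obj[key])
    elif isinstance(obj, Sequence):
        for n, value in enumerate(obj):
            print(n, value)
    else:
        raise TypeError("expected a mapping or a sequence")

print_items({'a': 1})       # a 1
print_items(['x', 'y'])     # 0 x / 1 y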
On 30.09.2019 01:41, Steven D'Aprano wrote:
[...] But subscripting (indexing) is a conspicuous exception. There's no ABC for testing whether something supports subscripting.
+1 for adding an ABC to signal support for indexing.

BTW: Something I miss in the ABCs is the distinction between indexing using integers for the purpose of accessing a member by position (e.g. in a sequence) and that of indexing objects by way of a key object (let's say a string in e.g. a mapping).

Algorithms will typically only work with one type of indexing, and the two also use different exceptions to signal "no such member": IndexError for the positional lookups vs. KeyError for the key object lookups.

There's no way to map this to special methods, since both mechanisms use .__getitem__(), so the ABC would actually provide information which you can otherwise not easily determine.

Perhaps we could have PositionIndexable and KeyIndexable for this (or some other names). Indexable would then be their base class and not allow for the distinction or permit both.

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Experts (#1, Oct 01 2019)
Python Projects, Coaching and Consulting ... http://www.egenix.com/ Python Database Interfaces ... http://products.egenix.com/ Plone/Zope Database Interfaces ... http://zope.egenix.com/
::: We implement business ideas - efficiently in both time and costs ::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ http://www.malemburg.com/
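(The different failure exceptions mentioned above are easy to see; this is just standard behaviour:)

>>> [][0]
Traceback (most recent call last):
  ...
IndexError: list index out of range
>>> {}[0]
Traceback (most recent call last):
  ...
KeyError: 0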
participants (18)

- Andrew Barnert
- Ben Rudiak-Gould
- Brendan Barnwell
- Chris Angelico
- Christopher Barker
- Devin Jeanpierre
- Greg Ewing
- Ivan Levkivskyi
- Kyle Stanley
- M.-A. Lemburg
- MRAB
- Neil Girdhar
- Oscar Benjamin
- Paul Moore
- Richard Musil
- Ricky Teachey
- Stephen J. Turnbull
- Steven D'Aprano