Methods in Python are defined as functions in the class namespace. When you call a method of an object, the function is called with the object as the first argument. Furthermore, unbound methods can be called by passing self explicitly as the first argument. For example, str.upper('abc') returns 'ABC'. So a class can be considered a namespace for functions related to objects of the specified type.

For methods defined in Python you can pass an arbitrary object as self. But for methods defined in C it must be an instance of the class in which the method was defined. On one hand, this is very convenient -- you can be sure that self is binary compatible with the specified class. On the other hand, it restricts you.

I propose to add the METH_GENERAL flag, which is applicable to methods like METH_CLASS and METH_STATIC (and is mutually incompatible with them). If it is set, the check for the type of self will be omitted, and you can pass an arbitrary object as the first argument of the unbound method.

I have several use cases for this.

1. Bytes and bytearray methods. Bytes and bytearray have a lot of sequence-like and string-like methods. They also implement the buffer protocol. There are other objects which implement the buffer protocol: memoryview, BytesIO, array.array, mmap.mmap, ctypes arrays, NumPy arrays, but they lack most of these methods. To use these methods (for example index()) you need to copy the content to a bytes or bytearray object, which defeats the purpose of the buffer protocol, which was designed to avoid copying binary data.

With general methods we can make bytes.index() applicable to any object which supports the buffer protocol. bytes.index() and bytearray.index() will be equivalent, but bytes.split() and bytearray.split() will differ in the result type.

2. Set methods. Some set methods accept an arbitrary number of arguments and accept not only sets, but any iterables. So you can get a union of two lists, for example:
    >>> set().union([1, 2], [2, 3])
    {1, 2, 3}
This works because a union with an empty set is a no-op. The trick does not work with other methods; you have to convert the first iterable to a set explicitly:
    >>> set([1, 2]).symmetric_difference([2, 3])
    {1, 3}
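As a quick check of the behaviour described above, here is a small sketch run against current CPython (the exact error message is an implementation detail and may differ between versions). It shows that the unbound C method rejects a list as its first argument, while the explicit conversion and the set().union() trick both work:

```python
# Current CPython behaviour: the C-implemented unbound method checks the
# type of its first argument and rejects a plain list.
try:
    set.symmetric_difference([1, 2], [2, 3])
except TypeError as exc:
    rejected = True
    print("rejected:", exc)
else:
    rejected = False

# Converting the first iterable explicitly works today.
explicit = set([1, 2]).symmetric_difference([2, 3])
print(explicit)

# The set().union(...) trick also works, because only the later
# arguments are allowed to be arbitrary iterables.
union = set().union([1, 2], [2, 3])
print(union)
```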
With general methods we can make unbound set methods accept arbitrary iterables and convert the first one to a set implicitly (or avoid creating a set at all where possible). We could use set.symmetric_difference([1, 2], [2, 3]).

Maybe there are other use cases. But I do not suggest making all methods of builtin types general: only where there is a more general protocol (such as the buffer protocol or the iterator protocol) and there is a benefit in using that protocol without explicitly creating an object of the corresponding type. For example, I think that str methods should not be general, despite the fact that any object can be converted to a string. Implicit conversion to str has no benefit and may hide bugs. Another example -- float.as_integer_ratio() should not accept int, despite the fact that most functions which accept float implicitly convert int to float. It is a lossy conversion for large integers, and the loss would affect the result.

I used the term "general methods" to name this feature, but if there is a better suggestion I will use it.
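To make the bytes use case concrete, here is a rough pure-Python sketch, not the proposed C implementation. The helper name buffer_index is hypothetical and only emulates what a general bytes.index() might do: it searches any C-contiguous buffer through a memoryview instead of copying it into a bytes object first. The first part shows the current behaviour, where the unbound method rejects a bytearray.

```python
import array

data = bytearray(b"hello world")

# Today the unbound C method rejects anything that is not a bytes instance.
try:
    bytes.index(data, b"world")
except TypeError:
    rejected = True
else:
    rejected = False

# The usual workaround copies the whole buffer into a new bytes object.
copied_result = bytes(data).index(b"world")


def buffer_index(obj, sub):
    """Hypothetical stand-in for a general bytes.index(obj, sub)."""
    view = memoryview(obj).cast("B")  # byte view of the buffer, no copy
    sub = bytes(sub)
    n = len(sub)
    for i in range(len(view) - n + 1):
        if view[i:i + n] == sub:  # compares a small slice of the view
            return i
    raise ValueError("subsection not found")


print(buffer_index(data, b"world"))
print(buffer_index(array.array("B", b"abc"), b"c"))
```

A real implementation would reuse the existing C search routine rather than a Python loop; the point here is only that the buffer protocol already gives enough access to do the search without the copy.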
On Fri, May 08, 2020 at 10:46:45PM +0300, Serhiy Storchaka wrote:
I propose to add the METH_GENERAL flag, which is applicable to methods as METH_CLASS and METH_STATIC (and is mutually incompatible with them). If it is set, the check for the type of self will be omitted, and you can pass an arbitrary object as the first argument of the unbound method.
Does this affect code written in Python? As I understand, in Python code, unbound methods are just plain old functions, and there is no type-checking done on `self`.

    py> class C:
    ...     def method(self, arg):
    ...         return (self,)
    ...
    py> C.method(999, None)
    (999,)

So I think your proposal will only affect builtin methods written in C. Is that correct?

-- 
Steven
On Fri, May 8, 2020 at 3:47 PM Steven D'Aprano <steve@pearwood.info> wrote:
On Fri, May 08, 2020 at 10:46:45PM +0300, Serhiy Storchaka wrote:
I propose to add the METH_GENERAL flag, which is applicable to methods as METH_CLASS and METH_STATIC (and is mutually incompatible with them). If it is set, the check for the type of self will be omitted, and you can pass an arbitrary object as the first argument of the unbound method.
Does this affect code written in Python? As I understand, in Python code, unbound methods are just plain old functions, and there is no type-checking done on `self`.

    py> class C:
    ...     def method(self, arg):
    ...         return (self,)
    ...
    py> C.method(999, None)
    (999,)
So I think your proposal will only affect builtin methods written in C. Is that correct?
Yes, that is correct.

I think Serhiy's idea has merit. The key thing is that, even though it seems an obscure CPython implementation detail, it actually affects the language definition (or at least the specification of selected stdlib methods) because when we say that e.g. bytes.index(bytearray(...), b) works, we should document it in the stdlib docs, and other Python implementations will have to support this too.

-- 
--Guido van Rossum (python.org/~guido)
Pronouns: he/him (why is my pronoun here?) <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
On 09May2020 08:38, Steven D'Aprano <steve@pearwood.info> wrote:
On Fri, May 08, 2020 at 10:46:45PM +0300, Serhiy Storchaka wrote:
I propose to add the METH_GENERAL flag, which is applicable to methods as METH_CLASS and METH_STATIC (and is mutually incompatible with them). If it is set, the check for the type of self will be omitted, and you can pass an arbitrary object as the first argument of the unbound method.
Does this affect code written in Python? As I understand, in Python code, unbound methods are just plain old functions, and there is no type-checking done on `self`.
Aye. The whole purpose is to make it possible to provide this convenience in C for certain methods. From the post:

    For methods defined in Python you can pass arbitrary object as self. But in methods defined in C it should be an instance of the class in which the method was defined.

which is the impediment which motivates the proposal.

Cheers,
Cameron Simpson <cs@cskk.id.au>
On May 8, 2020, at 15:44, Steven D'Aprano <steve@pearwood.info> wrote:
On Fri, May 08, 2020 at 10:46:45PM +0300, Serhiy Storchaka wrote:
I propose to add the METH_GENERAL flag, which is applicable to methods as METH_CLASS and METH_STATIC (and is mutually incompatible with them). If it is set, the check for the type of self will be omitted, and you can pass an arbitrary object as the first argument of the unbound method.
Does this affect code written in Python? As I understand, in Python code, unbound methods are just plain old functions, and there is no type-checking done on `self`.

    py> class C:
    ...     def method(self, arg):
    ...         return (self,)
    ...
    py> C.method(999, None)
    (999,)
So I think your proposal will only affect builtin methods written in C. Is that correct?
Maybe the best way to see it is this: For classes implemented in Python, you have to go out of your way to typecheck self. For classes implemented in C, you have to go out of your way to _not_ typecheck self. It’s probably way too big of a change to make them consistent at this point, so Serhiy is just proposing a way to make it a lot easier for C methods to act like Python ones when you need them to. And, given that he has some solid use cases, it’s hard to see any problem with that.
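A small sketch of that asymmetry, using hypothetical classes (Celsius and Duck are made up for illustration). The plain Python method accepts any object that happens to fit, the Python method that wants a check has to write it by hand, and the C-implemented builtin checks without being asked:

```python
class Celsius:
    def __init__(self, degrees):
        self.degrees = degrees

    def describe(self):
        # No type check: any object with a .degrees attribute works.
        return f"{self.degrees} degrees"

    def strict_describe(self):
        # Python code has to go out of its way to check self.
        if not isinstance(self, Celsius):
            raise TypeError("expected a Celsius instance")
        return f"{self.degrees} degrees"


class Duck:
    degrees = 21


# Plain Python method called unbound with an unrelated object: accepted.
loose = Celsius.describe(Duck())

# Python method with a hand-written check: rejected.
try:
    Celsius.strict_describe(Duck())
except TypeError:
    strict_rejected = True
else:
    strict_rejected = False

# C-implemented builtin method: rejected without any explicit check
# in user code.
try:
    str.upper(b"abc")
except TypeError:
    builtin_rejected = True
else:
    builtin_rejected = False

print(loose, strict_rejected, builtin_rejected)
```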
participants (5)
- Andrew Barnert
- Cameron Simpson
- Guido van Rossum
- Serhiy Storchaka
- Steven D'Aprano