In a recent tracker issue about OrderedDict [1] we've had some discussion about the use of type(od) as a replacement for od.__class__. It came up because the pure Python implementation of OrderedDict uses self.__class__ in 3 different methods (__repr__, __reduce__, and copy). The patch in that issue changes the C implementation to use Py_TYPE(). [2] So I wanted to get some feedback on the practical implications of such a change and whether we need to clarify the difference more formally.

In this specific case [3] there are 3 questions:

* Should __repr__() for a stdlib class use type(self).__name__ or self.__class__.__name__?
* Should __reduce__() return type(self) or self.__class__?
* Should copy() use type(self) or self.__class__?

The more general question of when we use type(obj) vs. obj.__class__ applies to both the language and to all of the stdlib, as I expect consistency there would result in fewer surprises. I realize that there are some places where using obj.__class__ makes more sense (e.g. for some proxy support). There are other places where using type(obj) is the way to go (e.g. special method lookup). However, the difference is muddled enough that usage is inconsistent in the stdlib. For example, C-implemented types use Py_TYPE() almost exclusively.

So, would it make sense to establish some concrete guidelines about when to use type(obj) vs. obj.__class__? If so, what would those be? It may also be helpful to enumerate use cases for "type(obj) is not obj.__class__".

-eric

[1] http://bugs.python.org/issue25410
[2] I'm going to open a separate thread about the issue of compatibility and C accelerated types.
[3] https://hg.python.org/cpython/file/default/Lib/collections/__init__.py#l238
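To make the choice concrete, here is a small sketch (a hypothetical Box class and a deliberately lying Impostor subclass, not code from the patch) showing how the two spellings diverge once __class__ disagrees with type():

```python
# Hypothetical classes for illustration; not from the OrderedDict patch.
class Box:
    def __init__(self, items):
        self.items = items

    def __repr__(self):
        # The choice in question: type(self).__name__ vs self.__class__.__name__
        return f"{type(self).__name__}({self.items!r})"

    def copy(self):
        # Likewise: type(self) vs self.__class__ for the reconstructed object
        return type(self)(self.items)


class Impostor(Box):
    # A class attribute named __class__ makes the two spellings disagree.
    __class__ = dict


b = Impostor([1, 2])
print(type(b).__name__)      # Impostor
print(b.__class__.__name__)  # dict
print(repr(b))               # Impostor([1, 2]) -- because __repr__ uses type()
```

For a well-behaved class the two spellings give the same answer; the question in the thread is which one the stdlib should standardize on for the rare case where they differ.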
On 18.10.15 00:45, Eric Snow wrote:
In a recent tracker issue about OrderedDict [1] we've had some discussion about the use of type(od) as a replacement for od.__class__. [...] It may also be helpful to enumerate use cases for "type(obj) is not obj.__class__".
I want to add that in the common case type(obj) is the same as obj.__class__. When you assign to obj.__class__ (assignment is restricted by issue24912), type(obj) changes as well. You can make obj.__class__ differ from type(obj) by setting __class__ as a class attribute at class creation time, by making __class__ a property, or the like.
>>> class A: pass
...
>>> class B: __class__ = A
...
>>> type(B())
<class '__main__.B'>
>>> B().__class__
<class '__main__.A'>
The only places in the stdlib where obj.__class__ is made different from type(obj), besides tests, are Mock objects (hence related to tests) and the proxy stream in xml.sax.saxutils (I'm not sure that the latter use is correct).

About pickling: the default implementation of __reduce_ex__ uses obj.__class__ in protocols 0 and 1, and type(obj) in protocols 2+.
>>> B().__reduce_ex__(1)
(<function _reconstructor at 0xb705965c>, (<class '__main__.A'>, <class 'object'>, None))
>>> B().__reduce_ex__(2)
(<function __newobj__ at 0xb70596ec>, (<class '__main__.B'>,), None, None, None)
But the pickler rejects classes with mismatched type(obj) and obj.__class__ in protocols 2+.
>>> pickle.dumps(B(), 2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
_pickle.PicklingError: args[0] from __newobj__ args has the wrong class
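The protocol split described above can be seen end to end; a minimal sketch reusing the A/B classes from the session (the classes must live at module level so pickle can find them by name):

```python
import pickle

class A:
    pass

class B:
    __class__ = A  # type(B()) is B, but B().__class__ is A

# Protocols 0 and 1 pickle via obj.__class__, so a B round-trips as an A.
data = pickle.dumps(B(), protocol=1)
restored = pickle.loads(data)
print(type(restored))  # the A class

# Protocol 2+ pickles via type(obj) and rejects the mismatch.
rejected = False
try:
    pickle.dumps(B(), protocol=2)
except pickle.PicklingError as exc:
    rejected = True
    print("rejected:", exc)
```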
On Sat, Oct 17, 2015 at 03:45:19PM -0600, Eric Snow wrote:
In a recent tracker issue about OrderedDict [1] we've had some discussion about the use of type(od) as a replacement for od.__class__. [...] The more general question of when we use type(obj) vs. obj.__class__ applies to both the language and to all the stdlib as I expect consistency there would result in fewer surprises. I realize that there are some places where using obj.__class__ makes more sense (e.g. for some proxy support). There are other places where using type(obj) is the way to go (e.g. special method lookup). However, the difference is muddled enough that usage is inconsistent in the stdlib. For example, C-implemented types use Py_TYPE() almost exclusively.
So, would it make sense to establish some concrete guidelines about when to use type(obj) vs. obj.__class__? If so, what would those be? It may also be helpful to enumerate use cases for "type(obj) is not obj.__class__".
I for one would like to see a definitive explanation for when they are different, and when you should use one or the other. The only obvious example I've seen is the RingBuffer from the Python Cookbook: http://code.activestate.com/recipes/68429-ring-buffer/ -- Steve
On 18 October 2015 at 05:55, Steven D'Aprano <steve@pearwood.info> wrote:
On Sat, Oct 17, 2015 at 03:45:19PM -0600, Eric Snow wrote:
So, would it make sense to establish some concrete guidelines about when to use type(obj) vs. obj.__class__? If so, what would those be? It may also be helpful to enumerate use cases for "type(obj) is not obj.__class__".
I for one would like to see a definitive explanation for when they are different, and when you should use one or the other. The only obvious example I've seen is the RingBuffer from the Python Cookbook:
It looks like this example just assigns to the existing __class__ attribute, to switch to a different class. I haven’t seen this ability mentioned in the documentation, but I suspect it is meant to be supported. However assigning to __class__ like that should automatically update the type() return value, so type(ring_buffer) == ring_buffer.__class__ is still maintained.

Perhaps some of this confusion comes from Python 2. I don’t know the details, but I know in Python 2, type() can do something different, so you have to use __class__ directly if you want to be compatible with Python 2 classes.

But in Python 3 code I prefer using direct function calls like type() to “special attributes” like __class__ where practical. The documentation says that __*__ names are reserved for Python and its built-in library, rather than user code. So user code that creates a class attribute or property called __class__ is asking for trouble IMO, and we shouldn’t spend much effort accommodating such cases.

For __repr__() I would use type(), which seems to agree with what object.__repr__() uses.
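The point about plain __class__ assignment keeping the two in sync is easy to check (hypothetical A/B classes, ordinary heap types with compatible layouts so the assignment is allowed):

```python
class A:
    def who(self):
        return "A"

class B:
    def who(self):
        return "B"

obj = A()
obj.__class__ = B     # switch the instance's class on the fly

print(type(obj))      # the B class
print(obj.__class__)  # the B class -- still equal to type(obj)
print(obj.who())      # "B" -- methods now come from B
```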
It's mostly a historical accident -- for classic classes, type(inst) was always `instance' while inst.__class__ was the user-defined class object.

For new-style classes, the idea is that you can write a proxy class that successfully masquerades as another class. Because __class__ is an attribute, a proxy class can fake this attribute. But type() would reveal the proxy class. IIRC __class__ is used by the isinstance() implementation, although the code is complicated and I wouldn't be surprised if isinstance(x, type(x)) was also true for proxy instances. (I haven't looked at the code in a long time and it's not easy to follow, alas.)

C code that checks the type instead of __class__ is probably one reason why proxy classes have never taken off -- there just are too many exceptions, so the experience is never very smooth, and everyone ends up cursing the proxy class. Maybe this kind of "strong" proxy class is just not a good idea. And maybe then we needn't worry about the distinction between type() and __class__.

On Sat, Oct 17, 2015 at 10:55 PM, Steven D'Aprano <steve@pearwood.info> wrote:
I for one would like to see a definitive explanation for when they are different, and when you should use one or the other. [...]
-- Steve _______________________________________________ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/guido%40python.org
-- --Guido van Rossum (python.org/~guido)
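The kind of masquerading proxy described above can be sketched minimally (a hypothetical attribute-forwarding wrapper, not a stdlib class):

```python
class Proxy:
    """Forward attribute access to a wrapped object, including __class__."""
    def __init__(self, target):
        object.__setattr__(self, "_target", target)

    @property
    def __class__(self):
        # Lie about the class so class-based introspection sees the target.
        return type(object.__getattribute__(self, "_target"))

    def __getattr__(self, name):
        # Only reached for attributes not found on Proxy itself.
        return getattr(object.__getattribute__(self, "_target"), name)


class Widget:
    def ping(self):
        return "pong"


p = Proxy(Widget())
print(p.__class__)            # the Widget class -- the lie
print(type(p))                # the Proxy class  -- the truth
print(isinstance(p, Widget))  # True: isinstance falls back to __class__
print(p.ping())               # pong -- forwarded attribute access
```

This is exactly the split Guido describes: Python-level checks that go through __class__ (including isinstance()) see the target, while type() and C-level Py_TYPE() checks reveal the proxy.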
This recipe looks like a bad design to me to start with. It's too-clever-by-half, IMO.

If I were to implement RingBuffer, I wouldn't futz around with the __class__ attribute to change it into another thing when it was full. A much more obvious API for users would be simply to implement a RingBuffer.isfull() method, perhaps supported by an underlying RingBuffer._full boolean attribute. That's much friendlier than expecting people to introspect the type of the thing for a question that only occasionally matters; and when it does matter, the question is always conceived exactly as "Is it full?" not "What class is this currently?"

So I think I'm still waiting for a compelling example where type(x) != x.__class__ would be worthwhile (yes, of course it's *possible*).

On Sat, Oct 17, 2015 at 10:55 PM, Steven D'Aprano <steve@pearwood.info> wrote:
I for one would like to see a definitive explanation for when they are different, and when you should use one or the other. [...]
-- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.
I re-coded the "too clever by half" RingBuffer to use the same design but with delegation ... and it ran 50% slower. (Code available on request.) Then I changed it to switch implementations of append() and get() when it got full (the code is below) and it ran at essentially the same speed as the original. So, there's no need to be so clever with __class__. Of course, this trick of replacing a method is also "too clever by half"; but an instance variable for "full" slows it down by 15%.

class RingBuffer(object):
    def __init__(self, size_max):
        self.max = size_max
        self.data = []
        self.cur = 0

    def append(self, x):
        self.data.append(x)
        if len(self.data) == self.max:
            self.append = self.append_full

    def append_full(self, x):
        self.data[self.cur] = x
        self.cur = (self.cur + 1) % self.max

    def get(self):
        return self.data[self.cur:] + self.data[:self.cur]

On 18 October 2015 at 08:45, David Mertz <mertz@gnosis.cx> wrote:
This recipe looks like a bad design to me to start with. It's too-clever-by-half, IMO.
If I were to implement RingBuffer, I wouldn't futz around with the __class__ attribute to change it into another thing when it was full. A much more obvious API for users would be simply to implement a RingBuffer.isfull() method, perhaps supported by an underlying RingBuffer._full boolean attribute. That's much friendlier than expecting people to introspect the type of the thing for a question that only occasionally matters; and when it does matter, the question is always conceived exactly as "Is it full?" not "What class is this currently?"
So I think I'm still waiting for a compelling example where type(x) != x.__class__ would be worthwhile (yes, of course it's *possible*)
I'm not sure what benchmark you used to define the speed of RingBuffer. I'm sure you are reporting numbers accurately for your tests, but there are "lies, damn lies, and benchmarks", so "how fast" has a lot of nuance to it. In any case, redefining a method in a certain situation feels a lot less magic to me than redefining .__class__, and clarity and good API are much more important than micro-optimization for something unlikely to be on a critical path.

That's interesting about the `self._full` variable slowing it down, I think I'm not surprised (but obviously it depends on just how it's used). But one can also simply define RingBuffer.isfull() using `self.max==len(self.data)` if you prefer that approach. I doubt `myringbuffer.isfull()` is something you need to call in an inner loop.

That said, I think my implementation of RingBuffer would probably look more like (completely untested):

class RingBuffer(object):
    def __init__(self, size_max):
        self.data = [None] * size_max
        self.size_max = size_max
        self.used = 0
        self.cur = 0

    def append(self, val):
        self.data[self.cur] = val
        self.cur = (self.cur + 1) % self.size_max
        self.used = min(self.used + 1, self.size_max)

    def isfull(self):
        return self.used == self.size_max

Feel free to try this version against whatever benchmark you have in mind.

On Sun, Oct 18, 2015 at 5:09 PM, Peter Ludemann <pludemann@google.com> wrote:
I re-coded the "too clever by half" RingBuffer to use the same design but with delegation ... and it ran 50% slower. Then I changed it to switch implementations of append() and get() when it got full, and it ran at essentially the same speed as the original. So, there's no need to be so clever with __class__. [...]
On Mon, Oct 19, 2015 at 11:35 AM, David Mertz <mertz@gnosis.cx> wrote:
That said, I think my implementation of RingBuffer would probably look more like (completely untested): [...]
What does this provide that collections.deque(maxlen=size_max) doesn't? I'm a little lost. ChrisA
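For reference, the deque-based version alluded to here needs no custom class at all -- a deque with maxlen discards old items automatically:

```python
from collections import deque

rb = deque(maxlen=3)   # oldest items are discarded once maxlen is reached
for i in range(5):
    rb.append(i)

print(list(rb))              # [2, 3, 4] -- the 3 most recent items
print(len(rb) == rb.maxlen)  # True: a plain "is it full?" check
```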
On 18 October 2015 at 17:41, Chris Angelico <rosuav@gmail.com> wrote:
On Mon, Oct 19, 2015 at 11:35 AM, David Mertz <mertz@gnosis.cx> wrote:
That's interesting about the `self._full` variable slowing it down, I think
I'm not surprised (but obviously it depends on just how it's used). But one can also simply define RingBuffer.isfull() using `self.max==len(self.data)` if you prefer that approach. I doubt `myringbuffer.isfull()` is something you need to call in an inner loop.
That said, I think my implementation of RingBuffer would probably look more like (completely untested):
[...]
Feel free to try this version against whatever benchmark you have in mind.
What does this provide that collections.deque(maxlen=size_max) doesn't? I'm a little lost.
I was merely re-implementing the "clever" code in a slightly less clever way, for the same performance, to demonstrate that there's no need to assign to __class__. collections.deque is about 5x faster. (My simple benchmark tests the cost of x.append(i)) - p
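A sketch of the kind of micro-benchmark being discussed (absolute numbers and the ~5x ratio will vary by machine and Python version; the RingBuffer here is the method-swapping variant from earlier in the thread):

```python
import timeit
from collections import deque

class RingBuffer(object):
    # Method-swapping variant from earlier in the thread.
    def __init__(self, size_max):
        self.max = size_max
        self.data = []
        self.cur = 0

    def append(self, x):
        self.data.append(x)
        if len(self.data) == self.max:
            self.append = self.append_full  # swap in the "full" behavior

    def append_full(self, x):
        self.data[self.cur] = x
        self.cur = (self.cur + 1) % self.max

rb = RingBuffer(1000)
dq = deque(maxlen=1000)

# Cost of x.append(i), as in Peter's benchmark.
t_rb = timeit.timeit("rb.append(1)", globals={"rb": rb}, number=100_000)
t_dq = timeit.timeit("dq.append(1)", globals={"dq": dq}, number=100_000)
print(f"RingBuffer: {t_rb:.4f}s  deque: {t_dq:.4f}s")
```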
On Mon, Oct 19, 2015 at 11:41:44AM +1100, Chris Angelico wrote:
What does this provide that collections.deque(maxlen=size_max) doesn't? I'm a little lost.
The Ringbuffer recipe predates deque by quite a few years. These days I would consider it only useful in a pedagogical context, giving a practical use for changing the class of an object on-the-fly. -- Steve
On Sun, Oct 18, 2015 at 05:35:14PM -0700, David Mertz wrote:
In any case, redefining a method in a certain situation feels a lot less magic to me than redefining .__class__
That surprises me greatly. As published in the Python Cookbook[1], there is a one-to-one correspondence between the methods used by an object and its class. If you want to know what instance.spam() method does, you look at the class type(instance) or instance.__class__, and read the source code for spam.

With your suggestion of re-defining the methods on the fly, you no longer have that simple relationship. If you want to know what instance.spam() method does, first you have to work out what it actually is, which may not be that easy. In the worst case, it might not be possible at all:

class K:
    def method(self):
        if condition:
            self.method = random.choice(
                [lambda self: ..., lambda self: ..., lambda self: ...])

Okay, that's an extreme example, and one can write bad code using any technique. But even with a relatively straight-forward version:

    def method(self):
        if condition:
            self.method = self.other_method

I would classify "change the methods on the fly" as self-modifying code, which strikes me as much more hacky and hard to maintain than something as simple as changing the __class__ on the fly.

Changing the __class__ is just a straight-forward metamorphosis: what was a caterpillar, calling methods defined in the Caterpillar class, is now a butterfly, calling methods defined in the Butterfly class.

(The only change I would make from the published recipe would be to make the full RingBuffer a subclass of the regular one, so isinstance() tests would work as expected. But given that the recipe pre-dates the wide-spread use of isinstance, the author can be forgiven for not thinking of that.)

If changing the class on the fly is a metamorphosis, then it seems to me that self-modifying methods are like something from The Fly, where a horrible teleporter accident grafts body parts and DNA from one object into another object... or at least *repurposes* existing methods, so that what was your leg is now your arm.
I've done that, and found it harder to reason about than the alternative:

"okay, the object is a RingBuffer, but is the append method the RingBuffer.append method or the RingBuffer.full_append method?"

versus

"okay, the object is a RingBuffer, therefore the append method is the RingBuffer.append method".

In my opinion, the only tricky thing about the metamorphosis tactic is that:

obj = Caterpillar()
# later
assert type(obj) is Caterpillar

may fail. You need runtime introspection to see what the type of obj actually is. But that's not exactly unusual: if you consider Caterpillar to be a function rather than a class constructor (a factory perhaps?), then it's not that surprising that you can't know what *specific* type a function returns until runtime. There are many functions with polymorphic return types.

[1] The first edition of the Cookbook was edited by Python luminaries Alex Martelli and David Ascher, so this recipe has their stamp of approval. This isn't some dirty hack.

-- Steve
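Steven's metamorphosis, with the subclass tweak he suggests so isinstance() keeps working, can be sketched with hypothetical classes:

```python
class Caterpillar:
    def move(self):
        return "crawl"

    def metamorphose(self):
        # Change the instance's class on the fly.
        self.__class__ = Butterfly


class Butterfly(Caterpillar):
    # Subclassing keeps isinstance(obj, Caterpillar) true after the change.
    def move(self):
        return "fly"


obj = Caterpillar()
print(obj.move())                    # crawl
obj.metamorphose()
print(type(obj))                     # the Butterfly class
print(obj.move())                    # fly
print(isinstance(obj, Caterpillar))  # True -- thanks to the subclass tweak
print(type(obj) is Caterpillar)      # False -- the naive assert would fail
```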
My intuition differs from Steven's here. But that's fine. In any case, my simple implementation of RingBuffer in this thread avoids either rebinding methods or changing .__class__. And yes, of course collections.deque is better than any of these implementations. I was just trying to show that any such magic is unlikely to be necessary... and in particular that the recipe given as an example doesn't show it is.

But still, you REALLY want your `caterpillar = Caterpillar()` to become something of type "Butterfly" later?! Obviously I understand the biological metaphor. But I'd much rather have an API that provided me with .has_metamorphosed() than have to look for the type as something new.

Btw. Take a look at Alex' talk with Anna at PyCon 2015. They discuss various "best practices" that have been superseded by improved language facilities. They don't say anything about this "mutate the __class__" trick, but I somehow suspect he'd put that in that category.

On Sun, Oct 18, 2015 at 6:47 PM, Steven D'Aprano <steve@pearwood.info> wrote:
On Sun, Oct 18, 2015 at 05:35:14PM -0700, David Mertz wrote:
In any case, redefining a method in a certain situation feels a lot less magic to me than redefining .__class__
That surprises me greatly. As published in the Python Cookbook[1], there is a one-to-one correspondence between the methods used by an object and its class. If you want to know what instance.spam() method does, you look at the class type(instance) or instance.__class__, and read the source code for spam.
With your suggestion of re-defining the methods on the fly, you no longer have that simple relationship. If you want to know what instance.spam() method does, first you have to work out what it actually is, which may not be that easy. In the worst case, it might not be possible at all:
class K: def method(self): if condition: self.method = random.choice([lambda self: ..., lambda self: ..., lambda self: ...])
Okay, that's an extreme example, and one can write bad code using any technique. But even with a relatively straight-forward version:
def method(self): if condition: self.method = self.other_method
I would classify "change the methods on the fly" as self-modifying code, which strikes me as much more hacky and hard to maintain than something as simple as changing the __class__ on the fly.
Changing the __class__ is just a straight-forward metamorphosis: what was a caterpillar, calling methods defined in the Caterpillar class, is now a butterfly, calling methods defined in the Butterfly class.
(The only change I would make from the published recipe would be to make the full Ringbuffer a subclass of the regular one, so isinstance() tests would work as expected. But given that the recipe pre-dates the wide-spread use of isinstance, the author can be forgiven for not thinking of that.)
If changing the class on the fly is a metamorphosis, then it seems to me that self-modifying methods are like something from The Fly, where a horrible teleporter accident grafts body parts and DNA from one object into another object... or at least *repurposes* existing methods, so that what was your leg is now your arm.
I've done that, and found it harder to reason about than the alternative:
"okay, the object is an RingBuffer, but is the append method the RingBuffer.append method or the RingBuffer.full_append method?"
versus
"okay, the object is a RingBuffer, therefore the append method is the RingBuffer.append method".
In my opinion, the only tricky thing about the metamorphosis tactic is that:
obj = Caterpillar() # later assert type(obj) is Caterpillar
may fail. You need a runtime introspection to see what the type of obj actually is. But that's not exactly unusual: if you consider Caterpillar to be a function rather than a class constructor (a factory perhaps?), then it's not that surprising that you can't know what *specific* type a function returns until runtime. There are many functions with polymorphic return types.
[1] The first edition of the Cookbook was edited by Python luminaries Alex Martelli and David Ascher, so this recipe has their stamp of approval. This isn't some dirty hack.
-- Steve
Assigning __class__ is a precarious stunt (look at the implementation: it requires lots of checks for various things like __slots__ and implementation-specific special cases). The gesture that looks like "overriding a method" is merely setting a new instance attribute that hides the method, and is quite tame in comparison.

--Guido van Rossum (python.org/~guido)
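The point about "overriding a method" being mere attribute shadowing can be shown with a small sketch (class and method names are illustrative):

```python
class C:
    def greet(self):
        return "method"

obj = C()

# This looks like "overriding a method", but it only creates an
# instance attribute that shadows C.greet during attribute lookup:
obj.greet = lambda: "instance attribute"

assert obj.greet() == "instance attribute"   # the instance dict wins
assert type(obj).greet(obj) == "method"      # the class is unchanged
assert "greet" in vars(obj)                  # the shadow lives on the instance

del obj.greet                                # remove the shadow...
assert obj.greet() == "method"               # ...and the method reappears
```

No class machinery is touched at all, which is why this is "quite tame" compared with reassigning __class__.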
On 18.10.15 00:45, Eric Snow wrote:
So, would it make sense to establish some concrete guidelines about when to use type(obj) vs. obj.__class__? If so, what would those be? It may also be helpful to enumerate use cases for "type(obj) is not obj.__class__".
My conclusion of this discussion: in Python 3, type(obj) and obj.__class__ are the same in the common case. Assigning to obj.__class__ is a way to change type(obj): if the assignment succeeds, type(obj) becomes the same as obj.__class__. This is used in importlib for lazy importing and in some clever classes like the RingBuffer recipe. But __class__ assignment has many restrictions, and converting a Python implementation to C or adding __slots__ certainly adds new restrictions.

obj.__class__ differs from type(obj) in proxy classes like weakref proxies or Mock. isinstance() and pickle take __class__ into account to support proxies.

Unless we are writing a proxy class, or code that must handle proxy classes, we needn't care about the difference between type(obj) and obj.__class__, and can use whichever is more convenient. In Python that is obj.__class__ (it avoids a globals lookup), and in C it is type(obj) (much simpler and more reliable code).
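The Mock case mentioned above is easy to demonstrate: a Mock created with a spec reports the spec'd class via __class__, while type() still reveals the actual implementation type (a sketch using dict as the spec):

```python
from unittest.mock import Mock

m = Mock(spec=dict)

# The proxy masquerades as its spec'd class via __class__ ...
assert m.__class__ is dict
assert isinstance(m, dict)        # isinstance() honours __class__

# ... while type() reveals the real implementation type.
assert type(m) is not dict
assert issubclass(type(m), Mock)  # each Mock instance gets its own subclass
```

This is exactly why code that should handle proxies (isinstance, pickle) consults __class__ rather than type().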
On 20 October 2015 at 10:21, Serhiy Storchaka <storchaka@gmail.com> wrote:
On 18.10.15 00:45, Eric Snow wrote:
So, would it make sense to establish some concrete guidelines about when to use type(obj) vs. obj.__class__? If so, what would those be? It may also be helpful to enumerate use cases for "type(obj) is not obj.__class__".
My conclusion of this discussion: in Python 3, type(obj) and obj.__class__ are the same in the common case. Assigning to obj.__class__ is a way to change type(obj): if the assignment succeeds, type(obj) becomes the same as obj.__class__. This is used in importlib for lazy importing and in some clever classes like the RingBuffer recipe. But __class__ assignment has many restrictions, and converting a Python implementation to C or adding __slots__ certainly adds new restrictions.
obj.__class__ differs from type(obj) in proxy classes like weakref proxies or Mock. isinstance() and pickle take __class__ into account to support proxies.
Unless we are writing a proxy class, or code that must handle proxy classes, we needn't care about the difference between type(obj) and obj.__class__, and can use whichever is more convenient. In Python that is obj.__class__ (it avoids a globals lookup), and in C it is type(obj) (much simpler and more reliable code).
Right, this is a good summary. Weakref proxies provide one of the simplest demonstrations of cases where the two diverge:
    >>> from weakref import proxy
    >>> class C: pass
    ...
    >>> obj = C()
    >>> ref = proxy(obj)
    >>> type(ref)
    <class 'weakproxy'>
    >>> ref.__class__
    <class '__main__.C'>
When we use "obj.__class__", we're treating proxies as their target; when we use "type(obj)", we're treating them as the proxy object. Which of those to use depends greatly on what we're doing.

For Eric's original question that started the thread: proxy types shouldn't inherit from a concrete container type like OrderedDict, so type(self) and self.__class__ should *always* give the same answer, even in subclasses.

Cheers, Nick.

P.S. Proxy types are actually quite hard to write correctly, so if anyone *does* need to implement one, they're likely to be best served by starting with an existing library like wrapt: http://wrapt.readthedocs.org/en/latest/wrappers.html

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
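For a sense of the technique such libraries use (in a deliberately minimal, not production-ready form), a proxy can report its target's class by overriding __class__ as a property; the Proxy class here is purely illustrative:

```python
class Proxy:
    """Minimal illustrative proxy: masquerades as its target's class."""

    def __init__(self, target):
        object.__setattr__(self, "_target", target)

    @property
    def __class__(self):
        # Report the *target's* type, like weakref proxies and Mock do.
        return type(object.__getattribute__(self, "_target"))

    def __getattr__(self, name):
        # Delegate everything we don't define to the target.
        return getattr(object.__getattribute__(self, "_target"), name)

p = Proxy([1, 2, 3])
assert p.__class__ is list    # the proxy masquerades as its target
assert isinstance(p, list)    # isinstance() consults __class__
assert type(p) is Proxy       # type() sees through the disguise
assert p.count(2) == 1        # attribute access is delegated
```

Real proxies also need to forward special methods (which are looked up on the type, not the instance), which is precisely why a battle-tested library like wrapt is the better starting point.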
participants (9)
- Chris Angelico
- David Mertz
- Eric Snow
- Guido van Rossum
- Martin Panter
- Nick Coghlan
- Peter Ludemann
- Serhiy Storchaka
- Steven D'Aprano