Return type of alternative constructors

Some types have alternative constructors -- class methods used to create an instance of the class. For example: int.from_bytes(), float.fromhex(), dict.fromkeys(), Decimal.from_float().

But what should these methods return for subclasses? Should they return an instance of the base class or an instance of the subclass? Almost all alternative constructors return an instance of the subclass (the exceptions, new in 3.6, are bytes.fromhex() and bytearray.fromhex(), which return bare bytes and bytearray). But there is a problem, because this makes it possible to break invariants provided by the main constructor.

For example, there are only two instances of the bool class: False and True. But with the from_bytes() method inherited from int you can create new boolean values!

    >>> Confusion = bool.from_bytes(b'\2', 'big')
    >>> isinstance(Confusion, bool)
    True
    >>> Confusion == True
    False
    >>> bool(Confusion)
    True
    >>> Confusion
    False
    >>> not Confusion
    False

bool is just the most impressive example; the same problem exists with IntEnum and other enums derived from float, Decimal, datetime. [1]

The simplest solution is to return an instance of the base class. But this can break code, and for this case we should use a static method (like str.maketrans), not a class method.

Should an alternative constructor call the __new__ and __init__ methods? They can change signature in a derived class. Should it complain if __new__ or __init__ were overridden?

[1] http://bugs.python.org/issue23640
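As a small illustration of the behaviour described above, the following sketch calls three of the listed alternative constructors through trivial subclasses. The subclass names MyInt, MyDict and MyFloat are made up for this example. Each call hands back an instance of the subclass rather than of the base type. The bool session in the message may not reproduce on interpreters where the linked issue has since been addressed, so it is deliberately left out here.

```python
# Hypothetical subclasses, introduced only for this illustration.
class MyInt(int):
    pass


class MyDict(dict):
    pass


class MyFloat(float):
    pass


# Each alternative constructor is inherited from the built-in base type
# and is invoked through the subclass.
from_bytes_result = MyInt.from_bytes(b"\x02", "big")
fromkeys_result = MyDict.fromkeys("ab")
fromhex_result = MyFloat.fromhex("0x1.8p1")  # 1.5 * 2**1

for value in (from_bytes_result, fromkeys_result, fromhex_result):
    print(type(value).__name__, value)
```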

On 05/07/2016 03:39 PM, Serhiy Storchaka wrote:
Some types have alternative constructors -- class methods used to create an instance of the class. For example: int.from_bytes(), float.fromhex(), dict.fromkeys(), Decimal.from_float().
But what should these methods return for subclasses? Should they return an instance of the base class or an instance of the subclass? Almost all alternative constructors return an instance of the subclass (the exceptions, new in 3.6, are bytes.fromhex() and bytearray.fromhex(), which return bare bytes and bytearray). But there is a problem, because this makes it possible to break invariants provided by the main constructor.
Please ignore my comments in that issue. I actually prefer that class constructors go through the subclass' __new__ and __init__. Overriding parent class methods for the sole purpose of getting the subclass's type is quite irritating. -- ~Ethan~

IMO bool is a special case because it's meant to be a final class, and the implementation of int (which is in C and so can violate most rules) doesn't respect that. But in general I think the only reasonable approach is that a construction class method should return an instance of the subclass; these class methods have a signature that's constrained by their signature in the base class. OTOH operators like __add__ cannot be expected to return an instance of the subclass, because these typically construct an instance using __new__/__init__, whose signatures are *not* constrained by the base class.
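The contrast drawn in this message, between construction class methods and operators such as __add__, can be seen with a plain int subclass. This is only a sketch: Meters is a hypothetical name, and the subclass adds nothing to int.

```python
class Meters(int):
    """Hypothetical int subclass that adds no behaviour of its own."""


# The operator is implemented by int and knows nothing about Meters,
# so the sum comes back as a plain int.
total = Meters(3) + Meters(4)

# The construction class method is invoked through the subclass and
# returns an instance of it.
parsed = Meters.from_bytes(b"\x07", "big")

print(type(total).__name__, type(parsed).__name__)
```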

On 8 May 2016 at 08:39, Serhiy Storchaka <storchaka@gmail.com> wrote:
Should an alternative constructor call the __new__ and __init__ methods? They can change signature in a derived class.
I think this is typically the way to go (although, depending on the specific type, the unpickling related methods may be a more appropriate way for the alternate constructor to populate the instance state)
Should it complain if __new__ or __init__ were overridden?
If there are alternate constructors that depend on either the signature of __new__/__init__, unpickling support, or some other mechanism for creating new instances, this should be mentioned in the class documentation as a constraint on subclasses - if subclasses don't want to meet the constraint, they'll need to override the affected alternate constructors.

Cheers, Nick.

P.S. The potential complexity of that is one of the reasons the design philosophy of "prefer composition to inheritance" has emerged - subclassing is a powerful tool, but it does mean you often end up needing to care about more interactions between the subclass and the base class than you really wanted to.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
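One way to read the remark about unpickling-related methods is sketched below: the alternate constructor allocates the instance with cls.__new__ and fills in its state directly, without calling __init__, much as unpickling does by default. Point, NamedPoint and from_state are hypothetical names, and this is only one possible interpretation, not an established recipe.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    @classmethod
    def from_state(cls, state):
        # Allocate without running __init__, then populate the instance
        # state directly from the mapping.
        self = cls.__new__(cls)
        self.__dict__.update(state)
        return self


class NamedPoint(Point):
    # The subclass changes the constructor signature.
    def __init__(self, name, x, y):
        super().__init__(x, y)
        self.name = name


p = NamedPoint.from_state({"x": 1, "y": 2})
print(type(p).__name__, p.x, p.y, hasattr(p, "name"))
```

The call succeeds even though NamedPoint changed the __init__ signature, but the resulting object has no name attribute. That is the kind of constraint the message says should be documented, or handled by overriding the alternate constructor in the subclass.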

On Sun, May 8, 2016 at 4:49 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 8 May 2016 at 08:39, Serhiy Storchaka <storchaka@gmail.com> wrote:
Should an alternative constructor call the __new__ and __init__ methods? They can change signature in a derived class.
I think this is typically the way to go (although, depending on the specific type, the unpickling related methods may be a more appropriate way for the alternate constructor to populate the instance state)
Honestly, either of these sounds fragile, even though I really want the alternative constructor to return an instance of the subclass (else why invoke it through the subclass).
Should it complain if __new__ or __init__ were overridden?
If there are alternate constructors that depend on either the signature of __new__/__init__, unpickling support, or some other mechanism for creating new instances, this should be mentioned in the class documentation as a constraint on subclasses - if subclasses don't want to meet the constraint, they'll need to override the affected alternate constructors.
Putting this constraint in the docs sounds fragile too. :-(
Cheers, Nick.
P.S. The potential complexity of that is one of the reasons the design philosophy of "prefer composition to inheritance" has emerged - subclassing is a powerful tool, but it does mean you often end up needing to care about more interactions between the subclass and the base class than you really wanted to.
Indeed! We could also consider this a general weakness of the "alternative constructors are class methods" pattern. If instead these alternative constructors were folded into the main constructor (e.g. via special keyword args) it would be altogether clearer what a subclass should do. -- --Guido van Rossum (python.org/~guido)
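A minimal sketch of what folding an alternative constructor into the main constructor via a keyword argument might look like. The Color and NamedColor classes and the hex_string parameter are assumptions made up for this illustration. The subclass has a single entry point to extend, and it forwards the keyword to the base class.

```python
class Color:
    def __init__(self, rgb=None, *, hex_string=None):
        # Exactly one of the two input forms must be given.
        if (rgb is None) == (hex_string is None):
            raise TypeError("pass exactly one of rgb or hex_string")
        if hex_string is not None:
            rgb = tuple(int(hex_string[i:i + 2], 16) for i in (0, 2, 4))
        self.rgb = tuple(rgb)


class NamedColor(Color):
    def __init__(self, name, rgb=None, *, hex_string=None):
        # Only __init__ has to be extended; both input forms keep working.
        super().__init__(rgb, hex_string=hex_string)
        self.name = name


red = NamedColor("red", hex_string="ff0000")
green = NamedColor("green", (0, 255, 0))
print(red.name, red.rgb, green.name, green.rgb)
```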

Guido van Rossum wrote:
We could also consider this a general weakness of the "alternative constructors are class methods" pattern. If instead these alternative constructors were folded into the main constructor (e.g. via special keyword args) it would be altogether clearer what a subclass should do.
A useful guideline might be that class methods can be provided as sugar for alternative constructors, but they should all funnel through the main constructor. There's a convention like this in the Objective-C world where one of an object's "init" methods is supposed to be documented as the "designated initialiser", so that there is just one thing that subclasses need to override. -- Greg
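The guideline above could be sketched as follows, with hypothetical Duration and PositiveDuration classes. Every class method funnels through cls(...), so __init__ acts as the single designated initialiser, and a subclass that overrides only __init__ keeps its invariant for all the sugar constructors.

```python
class Duration:
    def __init__(self, seconds):
        # Designated initialiser: every other constructor ends up here.
        self.seconds = seconds

    @classmethod
    def from_minutes(cls, minutes):
        return cls(minutes * 60)

    @classmethod
    def from_hms(cls, hours, minutes, seconds):
        return cls(hours * 3600 + minutes * 60 + seconds)


class PositiveDuration(Duration):
    def __init__(self, seconds):
        # The subclass overrides one method and the check applies everywhere.
        if seconds < 0:
            raise ValueError("duration must not be negative")
        super().__init__(seconds)


two_minutes = PositiveDuration.from_minutes(2)
print(type(two_minutes).__name__, two_minutes.seconds)
```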

On 9 May 2016 at 08:50, Guido van Rossum <guido@python.org> wrote:
On Sun, May 8, 2016 at 4:49 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
P.S. The potential complexity of that is one of the reasons the design philosophy of "prefer composition to inheritance" has emerged - subclassing is a powerful tool, but it does mean you often end up needing to care about more interactions between the subclass and the base class than you really wanted to.
Indeed!
We could also consider this a general weakness of the "alternative constructors are class methods" pattern. If instead these alternative constructors were folded into the main constructor (e.g. via special keyword args) it would be altogether clearer what a subclass should do.
Unfortunately, even that approach gets tricky when the inheritance relationship crosses the boundary between components with independent release cycles.

In my experience, this timeline is the main one that causes the pain:

* Base class is released in Component A (e.g. CPython)
* Subclass is released in Component B (e.g. PyPI module)
* Component A releases a new base class construction feature

Question: does the new construction feature work with the existing subclass in Component B if you combine it with the new version of Component A?

When alternate constructors can be implemented as class methods that work by creating a default instance and using existing public API methods to mutate it, then the answer to that question is "yes", since the default constructor hasn't changed, and the new convenience constructor isn't relying on any other new features.

The answer is also "yes" for existing subclasses that only add new behaviour without adding any new state, and hence just use the base class __new__ and __init__ without overriding either of them.

It's when an existing subclass overrides __new__ or __init__ and one or both of the following is true that things can get tricky:

- you're working with an immutable type
- the API implementing the post-creation mutation is a new one

In both of those cases, the new construction feature of the base class probably won't work right without updates to the affected subclass to support the new capability (whether that's supporting a new parameter in __new__ and __init__, or adding their own implementation of the new alternate constructor).

I'm genuinely unsure that's a solvable problem in the general case - it seems to be an inherent consequence of the coupling between subclasses and base classes during instance construction, akin to the challenges with subclass compatibility of the unpickling APIs when a base class adds new state.

However, from a pragmatic perspective, the following approach seems to work reasonably well:

* assume subclasses don't change the signature of __new__ or __init__
* note the assumptions about the default constructor signature in the alternate constructor docs to let implementors of subclasses that change the signature know they'll need to explicitly test compatibility and perhaps provide their own implementation of the alternate constructor

You *do* still end up with some cases where a subclass needs to be upgraded before a new base class feature works properly for that particular subclass, but subclasses that *don't* change the constructor signature "just work".

Cheers, Nick.

P.S. It occurs to me that a sufficiently sophisticated typechecker might be able to look at all of the calls to "cls(*args, **kwds)" in class methods and "type(self)(*args, **kwds)" in instance methods, and use those to define a set of type constraints for the expected constructor signatures in subclasses, even if the current code base never actually invokes those code paths.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
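The favourable case described in this message, a convenience constructor that creates a default instance and then mutates it through the existing public API, might look like the sketch below. Tally, LoggingTally and from_items are hypothetical names. The subclass is written as if it predated from_items, and it keeps the no-argument constructor signature.

```python
class Tally:
    def __init__(self):
        self.counts = {}

    def add(self, item):
        self.counts[item] = self.counts.get(item, 0) + 1

    @classmethod
    def from_items(cls, items):
        # Imagine this was added in a later release of the base class.
        # It relies only on the default constructor and the public add().
        self = cls()
        for item in items:
            self.add(item)
        return self


class LoggingTally(Tally):
    # Written against the older base class; adds state but keeps the
    # constructor signature unchanged.
    def __init__(self):
        super().__init__()
        self.log = []

    def add(self, item):
        self.log.append(item)
        super().add(item)


tally = LoggingTally.from_items("aab")
print(type(tally).__name__, tally.counts, tally.log)
```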

On Sun, May 8, 2016 at 7:52 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 9 May 2016 at 08:50, Guido van Rossum <guido@python.org> wrote:
On Sun, May 8, 2016 at 4:49 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
P.S. The potential complexity of that is one of the reasons the design philosophy of "prefer composition to inheritance" has emerged - subclassing is a powerful tool, but it does mean you often end up needing to care about more interactions between the subclass and the base class than you really wanted to.
Indeed!
We could also consider this a general weakness of the "alternative constructors are class methods" pattern. If instead these alternative constructors were folded into the main constructor (e.g. via special keyword args) it would be altogether clearer what a subclass should do.
Unfortunately, even that approach gets tricky when the inheritance relationship crosses the boundary between components with independent release cycles.
In my experience, this timeline is the main one that causes the pain:
* Base class is released in Component A (e.g. CPython)
* Subclass is released in Component B (e.g. PyPI module)
* Component A releases a new base class construction feature
Question: does the new construction feature work with the existing subclass in Component B if you combine it with the new version of Component A?
When alternate constructors can be implemented as class methods that work by creating a default instance and using existing public API methods to mutate it, then the answer to that question is "yes", since the default constructor hasn't changed, and the new convenience constructor isn't relying on any other new features.
The answer is also "yes" for existing subclasses that only add new behaviour without adding any new state, and hence just use the base class __new__ and __init__ without overriding either of them.
It's when an existing subclass overrides __new__ or __init__ and one or both of the following is true that things can get tricky:
- you're working with an immutable type
- the API implementing the post-creation mutation is a new one
In both of those cases, the new construction feature of the base class probably won't work right without updates to the affected subclass to support the new capability (whether that's supporting a new parameter in __new__ and __init__, or adding their own implementation of the new alternate constructor).
OTOH it's not the end of the world -- until B is updated, you can't use the new construction feature with subclass B, you have to use the old way of constructing instances of B. Presumably that old way is still supported, otherwise the change to A has just broken all of B, regardless of the new construction feature. Which is possible, but it's a choice that A's author has to make after careful deliberation. Or maybe the "class construction" machinery in A is so prominent that it really is part of the interface between A and any of its subclasses, and then that API had better be documented. Just saying "you can subclass this" won't be sufficient.
I'm genuinely unsure that's a solvable problem in the general case - it seems to be an inherent consequence of the coupling between subclasses and base classes during instance construction, akin to the challenges with subclass compatibility of the unpickling APIs when a base class adds new state.
Yup. My summary of it is that versioning sucks.
However, from a pragmatic perspective, the following approach seems to work reasonably well:
* assume subclasses don't change the signature of __new__ or __init__
I still find that a distasteful choice, because *in general* there is no requirement like that and there are good reasons why subclasses might have a different __init__/__new__ signature. (For example dict and defaultdict.)
* note the assumptions about the default constructor signature in the alternate constructor docs to let implementors of subclasses that change the signature know they'll need to explicitly test compatibility and perhaps provide their own implementation of the alternate constructor
Yup, you end up having to design the API for subclasses carefully and then document it precisely. This is what people too often forget when they complain e.g. "why can't I subclass EventLoop more easily" -- we don't want to have a public API for that, so we discourage it, but people mistakenly believe that anything that's a class should be subclassable.
You *do* still end up with some cases where a subclass needs to be upgraded before a new base class feature works properly for that particular subclass, but subclasses that *don't* change the constructor signature "just work".
The key is that there's an API requirement and that you have to design and document that API with future evolution in mind. If you don't, and you let people write subclasses that just happen to work, you're in for a lot of pain. The interface between a base class and a subclass is just very complex, so most designers of base classes get this wrong initially.
Cheers, Nick.
P.S. It occurs to me that a sufficiently sophisticated typechecker might be able to look at all of the calls to "cls(*args, **kwds)" in class methods and "type(self)(*args, **kwds)" in instance methods, and use those to define a set of type constraints for the expected constructor signatures in subclasses, even if the current code base never actually invokes those code paths.
Could you restate that as a concrete code example? (Examples of the problems with "construction features" would also be helpful, probably -- abstract descriptions of problems often lead me astray.) -- --Guido van Rossum (python.org/~guido)
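The dict and defaultdict pair mentioned in this message can be checked directly. The sketch below shows the two constructor signatures side by side, and what the inherited fromkeys() produces when called through defaultdict, as observed in CPython: the result is a defaultdict, but one with no default factory.

```python
from collections import defaultdict

# defaultdict's constructor takes the default factory as its first
# argument, unlike dict's.
grouped = defaultdict(list)
grouped["x"].append(1)

# fromkeys() is inherited from dict and is called through each class.
plain = dict.fromkeys("ab")
inherited = defaultdict.fromkeys("ab")

print(type(inherited).__name__, inherited.default_factory, dict(inherited))
```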

On 10 May 2016 at 02:30, Guido van Rossum <guido@python.org> wrote:
On Sun, May 8, 2016 at 7:52 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
P.S. It occurs to me that a sufficiently sophisticated typechecker might be able to look at all of the calls to "cls(*args, **kwds)" in class methods and "type(self)(*args, **kwds)" in instance methods, and use those to define a set of type constraints for the expected constructor signatures in subclasses, even if the current code base never actually invokes those code paths.
Could you restate that as a concrete code example? (Examples of the problems with "construction features" would also be helpful, probably -- abstract descriptions of problems often lead me astray.)
Rectangle/Square is a classic example of the constructor signature changing, so I'll try to use that to illustrate the point with a "displaced_copy" alternate constructor:

    class Rectangle:
        def __init__(self, top_left_point, width, height):
            self.top_left_point = top_left_point
            self.width = width
            self.height = height

        @classmethod
        def displaced_copy(cls, other_rectangle, offset):
            """Create a new instance from an existing one"""
            return cls(other_rectangle.top_left_point + offset,
                       other_rectangle.width, other_rectangle.height)

    class Square(Rectangle):
        def __init__(self, top_left_point, side_length):
            super().__init__(top_left_point, side_length, side_length)

At this point, a typechecker *could* have enough info to know that "Square.displaced_copy(some_rectangle, offset)" is necessarily going to fail, even if nothing in the application actually *calls* Square.displaced_copy.

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
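A crude runtime stand-in for the static check imagined here can be put together with inspect.signature. It is only a sketch: the helper accepts_call is hypothetical, the top-left point is simplified to a number so that adding an offset works, and Square is assumed to derive from Rectangle. The helper asks whether a call shaped like the one inside displaced_copy could bind to a class's constructor signature, without constructing anything.

```python
import inspect


class Rectangle:
    def __init__(self, top_left_point, width, height):
        self.top_left_point = top_left_point
        self.width = width
        self.height = height

    @classmethod
    def displaced_copy(cls, other_rectangle, offset):
        # Calls cls(...) with the base class signature: three arguments.
        return cls(other_rectangle.top_left_point + offset,
                   other_rectangle.width, other_rectangle.height)


class Square(Rectangle):
    def __init__(self, top_left_point, side_length):
        super().__init__(top_left_point, side_length, side_length)


def accepts_call(cls, *args, **kwargs):
    """Hypothetical helper: could cls(*args, **kwargs) bind to the signature?"""
    try:
        inspect.signature(cls).bind(*args, **kwargs)
    except TypeError:
        return False
    return True


# The same three-argument call shape that displaced_copy uses.
base_ok = accepts_call(Rectangle, 0, 1, 1)
sub_ok = accepts_call(Square, 0, 1, 1)

# The actual call through the subclass fails in the same way.
try:
    Square.displaced_copy(Rectangle(0, 2, 3), 5)
except TypeError:
    call_failed = True
else:
    call_failed = False

print(base_ok, sub_ok, call_failed)
```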

On Tue, May 10, 2016 at 6:21 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On Sun, May 8, 2016 at 7:52 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
P.S. It occurs to me that a sufficiently sophisticated typechecker might be able to look at all of the calls to "cls(*args, **kwds)" in class methods and "type(self)(*args, **kwds)" in instance methods, and use those to define a set of type constraints for the expected constructor signatures in subclasses, even if the current code base never actually invokes those code paths.
On 10 May 2016 at 02:30, Guido van Rossum <guido@python.org> wrote:
Could you restate that as a concrete code example? (Examples of the problems with "construction features" would also be helpful, probably -- abstract descriptions of problems often lead me astray.)
Rectangle/Square is a classic example of the constructor signature changing, so I'll try to use that to illustrate the point with a "displaced_copy" alternate constructor:
    class Rectangle:
        def __init__(self, top_left_point, width, height):
            self.top_left_point = top_left_point
            self.width = width
            self.height = height

        @classmethod
        def displaced_copy(cls, other_rectangle, offset):
            """Create a new instance from an existing one"""
            return cls(other_rectangle.top_left_point + offset,
                       other_rectangle.width, other_rectangle.height)
(But why is it a class method? I guess the example could also use an instance method and it would still have the same properties relevant for this discussion.)
    class Square(Rectangle):
        def __init__(self, top_left_point, side_length):
            super().__init__(top_left_point, side_length, side_length)
At this point, a typechecker *could* have enough info to know that "Square.displaced_copy(some_rectangle, offset)" is necessarily going to fail, even if nothing in the application actually *calls* Square.displaced_copy.
The question remains of course whether the type checker should flag Square to be an invalid subclass or merely as not implementing displaced_copy(). Anyway, at this point I believe we're just violently agreeing, so no need for another response. Though Serhiy may be unhappy with the lack of guidance he's received... -- --Guido van Rossum (python.org/~guido)
participants (5)
- Ethan Furman
- Greg Ewing
- Guido van Rossum
- Nick Coghlan
- Serhiy Storchaka