[docs] [issue27561] Warn against subclassing builtins, and overriding their methods

Raymond Hettinger report at bugs.python.org
Mon Jul 18 17:31:02 EDT 2016


Raymond Hettinger added the comment:

This is an example of the open-closed principle (the class is open for extension but closed for modification).  It is a good thing to have because it allows subclassers to extend or override a method without unintentionally triggering behavior changes in other methods and without breaking the class's invariants.

We even do this in pure Python code; for example, inside the pure Python ordered dict code, the class-local call from __init__ to update() is done using name mangling.  This allows a subclasser to override update() without accidentally breaking __init__.
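
The name-mangling technique can be sketched roughly as follows.  This is a minimal illustration, not the actual ordered dict source; the class names Mapping and MappingSubclass and the attribute items_list are hypothetical.

```python
class Mapping:
    """Hypothetical base class (not the real OrderedDict code)."""

    def __init__(self, iterable=()):
        self.items_list = []
        # Inside the class body this is mangled to self._Mapping__update,
        # so it always reaches the base implementation.
        self.__update(iterable)

    def update(self, iterable):
        for item in iterable:
            self.items_list.append(item)

    # Private, class-local copy of the original update() method.
    __update = update


class MappingSubclass(Mapping):
    def update(self, keys, values):
        # A new signature for update(); __init__ is unaffected because
        # it calls the mangled private copy, not this override.
        for item in zip(keys, values):
            self.items_list.append(item)


m = MappingSubclass([("a", 1)])   # __init__ still works
m.update(["b"], [2])              # the override is used for direct calls
print(m.items_list)               # [('a', 1), ('b', 2)]
```

Without the private copy, __init__ would call the subclass's two-argument update() with a single argument and fail.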

Sometimes, this is inconvenient.  It means that a subclasser has to override every method whose behavior they want to change including get(), update(), and others.  However, there are offsetting benefits (protection of internal invariants, preventing implementation details from leaking from the abstraction, and allowing users to assume the methods are independent of one another).
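
To make the inconvenience concrete, here is a small sketch run against the builtin dict.  The subclass names UpperDict and UpperDictFull are made up for illustration.

```python
class UpperDict(dict):
    """Hypothetical subclass that overrides only __getitem__."""

    def __getitem__(self, key):
        return super().__getitem__(key).upper()


d = UpperDict(name="guido")
bracket = d["name"]        # uses the override
via_get = d.get("name")    # dict.get() does not call the override
print(bracket, via_get)    # GUIDO guido


class UpperDictFull(UpperDict):
    """To change get() as well, it has to be overridden too."""

    def get(self, key, default=None):
        try:
            return self[key]
        except KeyError:
            return default


full = UpperDictFull(name="guido")
print(full.get("name"))    # GUIDO
```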

This style (chosen by Guido from the outset) is the default for the builtin types (otherwise we would forever be fighting segfaulting invariant violations) and for some pure python classes.

We do document when there is a departure from the default.  For example, the cmd module uses the framework design pattern, letting the user define various do_action() methods.  Some of the http modules do the same, specifically documenting that the user's do_GET() method gets called and that some specifically named user handler method gets called.
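
As a short sketch of the documented-hook style in the cmd module: the Greeter class and its two commands are hypothetical, while cmd.Cmd, onecmd() and the do_*() naming convention come from the standard library.

```python
import cmd
import io


class Greeter(cmd.Cmd):
    """Hypothetical command interpreter with two user-defined hooks."""

    def do_greet(self, arg):
        # Called by the framework for input lines starting with "greet".
        self.stdout.write(f"hello {arg}\n")

    def do_quit(self, arg):
        # Returning a true value tells the framework to stop.
        return True


out = io.StringIO()
shell = Greeter(stdout=out)
shell.onecmd("greet world")        # dispatches to do_greet("world")
stop = shell.onecmd("quit")        # dispatches to do_quit("")
print(out.getvalue(), end="")      # hello world
```

Here the base class promises, in its documentation, to call the do_*() methods, so overriding them is the intended extension point.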

In the absence of specifically documented method hooks (i.e. those listed above or methods like __missing__), a subclasser should presume method independence.  Otherwise, how are you to know whether __getitem__ calls get() under the hood or vice-versa?
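
The __missing__ hook is a good illustration of how narrow a documented hook can be.  In this sketch (the class name DefaultZero is made up), only the bracket lookup consults the hook; the other accessors stay independent.

```python
class DefaultZero(dict):
    """Hypothetical dict subclass using the documented __missing__ hook."""

    def __missing__(self, key):
        # Called by dict.__getitem__ when the key is absent.
        return 0


counts = DefaultZero()
by_bracket = counts["x"]      # 0, supplied by __missing__
by_get = counts.get("x")      # None, get() does not consult __missing__
present = "x" in counts       # False, nothing was stored
print(by_bracket, by_get, present)   # 0 None False
```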

FWIW, this isn't unique to Python.  It comes up quite a bit in object-oriented programming.  Correctly designed classes either document root methods that affect the behavior of other methods or they are presumed to be independent.

There may need to be a FAQ for this, but nothing is broken or wrong here (other than Python having way too many dict variants to choose from).  Also, R David Murray is right that almost no one would be helped by a note in some random place in the docs.  If someone assumes or believes that __getitem__ must be called by the other accessor methods, they find out very quickly that the assumption is wrong (that is, if they run even minimal tests on the code).

----------
assignee: docs at python -> rhettinger
nosy: +rhettinger

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue27561>
_______________________________________

