[Python-Dev] Confirming status of new modules in 3.4

Sun Mar 16 03:45:52 CET 2014

On 16 March 2014 09:00, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Nick Coghlan wrote:
>>
>> On 16 March 2014 01:40, Guido van Rossum <guido at python.org> wrote:
>>
>>> This downside of using subclassing as an API should be well known by now
>>> and
>>> widely warned against.
>>
>>
>> I've actually pondered the idea of suggesting we explicitly recommend
>> the "procedural facade around an object oriented implementation" API
>> design model in PEP 8,
>
>
> I don't think I would call this a "procedural" API. To my
> mind, any API that exposes objects with methods is an
> object-oriented API, whether it encourages subclassing those
> objects or not.

There are actually two variants of the approach. One is obviously
procedural: you put basic data types (strings, numbers, containers,
etc) in, you get basic data types out, and the fact that an object
managed the calculation in the middle is completely opaque to you as a
user of the API. (From the outside, it still looks like a pure function,
even though the internal calculation may have been highly stateful.)
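
A minimal sketch of this first variant follows. The names _WordCounter
and count_words are hypothetical, chosen purely for illustration; the
point is only that basic types go in and come out, while the stateful
object stays private.

```python
class _WordCounter:
    """Private, stateful implementation detail (hypothetical example)."""

    def __init__(self):
        self._counts = {}

    def feed(self, text):
        # Accumulate state across calls; callers never see this object.
        for word in text.split():
            key = word.lower()
            self._counts[key] = self._counts.get(key, 0) + 1

    def result(self):
        # Hand back a plain dict copy, not the internal state itself.
        return dict(self._counts)


def count_words(lines):
    """Public API: an iterable of strings in, a plain dict out."""
    counter = _WordCounter()
    for line in lines:
        counter.feed(line)
    return counter.result()
```

Because callers only ever touch count_words(), the class behind it can
be renamed, split up, or replaced with a plain loop without anyone
noticing.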

The second variant is better exemplified by an API like
contextlib.contextmanager(). That decorator returns a custom object
type (now called contextlib._GeneratorContextManager). That type was
originally undocumented with the name
contextlib.GeneratorContextManager, and when the discrepancy was
pointed out, I resolved it by adding the missing underscore (in 3.3
IIRC, could have been 3.2), rather than by documenting something I
considered to be an implementation detail.
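
As a rough illustration of the second variant, the sketch below uses
contextlib.contextmanager and relies only on the documented behaviour
of what it returns. The tag() helper and the events list are
hypothetical names made up for this example.

```python
from contextlib import contextmanager


@contextmanager
def tag(name, log):
    # Hypothetical helper: record an opening and a closing marker.
    log.append("<%s>" % name)
    try:
        yield name
    finally:
        log.append("</%s>" % name)


events = []
cm = tag("p", events)

# The documented contract: the returned object is a context manager.
assert hasattr(cm, "__enter__") and hasattr(cm, "__exit__")

with cm as value:
    events.append(value)

# The concrete class of `cm` is an implementation detail. Code that
# subclasses it or does isinstance() checks against it is relying on
# something the API never promised.
```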

The key problem this addresses is that telling people "this is a
class" often overspecifies your API, because classes have a very broad
interface. They support inheritance, typechecks, etc, etc. It's a
really big commitment to your API users, and not one you should make
lightly. Subclassing an object is also an *extremely* high coupling
technique - it's genuinely difficult to refactor classes in any
meaningful way without risking breakage of subclasses.

As an example of unanticipated coupling issues that inhibit
refactoring, consider a class where "method B" is originally written
to call "method A". You later notice that this would be better
structured by having both methods A and B call a new helper method C,
rather than having B call A. If "method A" was a public API you now
have a problem, since after the change, a subclass that previously
only overrode A will now see different behaviour in method B (because
that is now bypassing the override in the subclass). It can all get
very, very messy, so experience has taught me that making your classes
part of your public API before you're 100% certain you're happy with
not only the public methods, but also the relationships between them,
is a fine recipe for locking in bad design decisions in a hard-to-fix
way. Sometimes the trade-off is worth it due to the power and
flexibility that API users gain, but it needs to be recognised as the
very high coupling technique that it is.
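
The A/B/C scenario above can be sketched as follows. All class and
method names here are hypothetical; the two base classes stand in for
the same class before and after the refactoring.

```python
class BeforeRefactor:
    """Original design: b() is implemented by calling a()."""

    def a(self):
        return "A"

    def b(self):
        return self.a() + "B"


class AfterRefactor:
    """Refactored design: a() and b() both call a new helper _c()."""

    def _c(self):
        return "A"

    def a(self):
        return self._c()

    def b(self):
        return self._c() + "B"


class SubclassBefore(BeforeRefactor):
    def a(self):
        return "X"  # override is picked up by b()


class SubclassAfter(AfterRefactor):
    def a(self):
        return "X"  # same override, but b() now bypasses it


print(SubclassBefore().b())  # XB
print(SubclassAfter().b())   # AB
```

Neither base class changed its public behaviour when used directly,
yet the subclass sees a different result from b() after the refactoring.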

By contrast, a callable API like contextlib.contextmanager just says
"given a certain set of inputs, you will get some kind of object back
with these characteristics, no more, no less". It is, in essence,
ducktyping as applied to return types :)
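
One way to picture this: a factory function that promises only "an
object with push() and pop()". In the sketch below, make_stack(),
make_stack_v2() and the two private classes are hypothetical, and the
two factories stand in for two releases of the same library.

```python
from collections import deque


class _ListStack:
    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()


class _DequeStack:
    def __init__(self):
        self._items = deque()

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()


def make_stack():
    """Return some object with push(item) and pop(). Nothing more."""
    return _ListStack()


def make_stack_v2():
    """A later release: same promise, different private type."""
    return _DequeStack()


def reverse(items, factory):
    # Client code that relies only on the promised characteristics.
    stack = factory()
    for item in items:
        stack.push(item)
    return [stack.pop() for _ in items]
```

Client code such as reverse() keeps working across the swap because it
never depended on the concrete return type.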

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia