[Python-ideas] sequence.apply(function)

Nick Coghlan ncoghlan at gmail.com
Sun Sep 2 04:14:50 CEST 2012


On Sun, Sep 2, 2012 at 8:02 AM, Antoine Pitrou <solipsis at pitrou.net> wrote:
> On Sun, 2 Sep 2012 00:55:39 +0300
> Yuval Greenfield <ubershmekel at gmail.com>
> wrote:
>> On Sat, Sep 1, 2012 at 8:06 PM, Guido van Rossum <guido at python.org> wrote:
>>
>> > It's less Pythonic, because every sequence-like type (not just list)
>> > would have to reimplement it.
>> >
>> > Similar things get proposed for iterators (e.g. it1 + it2, it[:n],
>> > it[n:]) regularly and they are (and should be) rejected for the same
>> > reason.
>> >
>> >
>> Python causes some confusion because some things are methods and others
>> builtins. Is there a PEP or rationale that defines what goes where?
>
> When something only applies to a single type or a couple of types, it is
> a method. When it is generic enough, it is a builtin.
> Of course there are grey areas but that's the basic idea.

Yes, it comes down to the fact that we are *very* reluctant to impose
required base classes. I believe the only ones currently enforced
anywhere are object, BaseException and str; everything else should
fall back to a protocol method, ABC or interface-specific registration
mechanism. Most interfaces that used to require actual integer objects
now use operator.index, or one of its C API equivalents.
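
As a rough sketch of the integer case (the Offset class is a made-up
example, not anything from the standard library), an object that is not
an int subclass can still be used for indexing and slicing by
implementing __index__:

```python
import operator


class Offset:
    """Hypothetical integer-like type; deliberately not an int subclass."""

    def __init__(self, value):
        self._value = value

    def __index__(self):
        # Called wherever the interpreter needs a genuine integer.
        return self._value


letters = ["a", "b", "c", "d"]
second = letters[Offset(1)]         # list indexing goes through __index__
head = letters[:Offset(2)]          # so do slice bounds
as_int = operator.index(Offset(3))  # the explicit spelling of the same protocol

print(second, head, as_int)  # b ['a', 'b'] 3
```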

In Python, we also actively discourage "reopening" classes to add new
methods (this is mostly a cultural thing, though - the language
doesn't actually contain any mechanism to stop you by default,
although it's possible to add such enforcement via metaclasses).
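
To illustrate both halves of that, here is a minimal sketch. The Sealed
metaclass and the Plain/Locked classes are hypothetical names invented
for this example, and this is only one of several ways such enforcement
could be written:

```python
class Sealed(type):
    """Hypothetical metaclass: refuses attribute assignment on its classes."""

    def __setattr__(cls, name, value):
        # Only runs for assignments made after the class has been created;
        # the class body itself is populated without going through here.
        raise TypeError(f"cannot add {name!r} to sealed class {cls.__name__}")


class Plain:
    pass


class Locked(metaclass=Sealed):
    pass


# By default nothing stops you from "reopening" a class.
Plain.greet = lambda self: "hello"

# With the metaclass in place, the same assignment is rejected.
try:
    Locked.greet = lambda self: "hello"
except TypeError as exc:
    blocked = str(exc)

print(Plain().greet())  # hello
print(blocked)          # cannot add 'greet' to sealed class Locked
```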

Thus, protocols are born which define "has this behaviour", rather
than "is one of these". That's why we have the len() builtin and
associated __len__() protocol to say "taking the length of this object
is a meaningful operation" rather than mandatory inheritance from a
Container class that has a ".len()" method.
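
A small sketch of what that looks like in practice (Playlist is a
made-up class for illustration): the only base class is object, and
len() works purely because __len__ is defined.

```python
class Playlist:
    """Hypothetical container; inherits from nothing but object."""

    def __init__(self, tracks):
        self._tracks = list(tracks)

    def __len__(self):
        # Declares "taking the length of this object is meaningful".
        return len(self._tracks)


playlist = Playlist(["intro", "verse", "outro"])
count = len(playlist)

# Truth testing also falls back to __len__ when __bool__ is not defined.
empty_is_falsy = not Playlist([])

print(count, empty_is_falsy)  # 3 True
```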

Protocols are most obviously beneficial when there are *multiple* protocols
that can be used to implement a particular behaviour. For example,
with iter(), the __iter__ protocol is only the first option tried. If
that fails, then it will instead check for __getitem__ and if that
exists, return a standard sequence iterator instead. Similarly,
reversed() checks for __reversed__ first, and then checks for __len__
and __getitem__, producing a reverse sequence iterator in the latter
case.
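
Both fallbacks can be seen with a type that defines neither __iter__
nor __reversed__. This is just a sketch, and Squares is a hypothetical
class made up for the purpose:

```python
class Squares:
    """Hypothetical sequence-like type: only __len__ and __getitem__."""

    def __init__(self, n):
        self._n = n

    def __len__(self):
        return self._n

    def __getitem__(self, index):
        # IndexError is what tells the fallback iterator to stop.
        if not 0 <= index < self._n:
            raise IndexError(index)
        return index * index


squares = Squares(4)

# No __iter__: iter() builds a sequence iterator that calls
# __getitem__ with 0, 1, 2, ... until IndexError.
forward = list(iter(squares))

# No __reversed__: reversed() uses __len__ and __getitem__ instead.
backward = list(reversed(squares))

print(forward)   # [0, 1, 4, 9]
print(backward)  # [9, 4, 1, 0]
```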

Likewise, next() was moved from a standard method to a builtin
function in 3.x. Why? Mainly to add the "if not found, return this
default value" behaviour. That kind of thing is much easier to add
when the object is only handling a piece of the behaviour, with
additional standard mechanisms around it (in this case, optionally
returning a default value when StopIteration is raised by the
iterator).
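
For instance, in the sketch below (CountTo is a hypothetical iterator
written for this example) the class only knows how to raise
StopIteration; the default handling comes entirely from the builtin:

```python
class CountTo:
    """Hypothetical iterator that counts from 1 up to a limit."""

    def __init__(self, limit):
        self._current = 0
        self._limit = limit

    def __iter__(self):
        return self

    def __next__(self):
        # The object's only job: produce a value or signal exhaustion.
        if self._current >= self._limit:
            raise StopIteration
        self._current += 1
        return self._current


counter = CountTo(2)

# next() turns StopIteration into the supplied default, so the
# iterator class needs no knowledge of defaults at all.
values = [next(counter, "done") for _ in range(4)]

print(values)  # [1, 2, 'done', 'done']
```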

Generators are another good illustration of the principle: For iter()
and next(), they follow the standard protocol and rely on the
corresponding builtins. However, g.send() and g.throw() require deep
integration with the interpreter's eval loop. There's currently no way
to implement either of those behaviours as an ordinary type, so
they're exposed as ordinary methods, since they're genuinely generator
specific.
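
A short sketch of the split (running_total is an invented generator
function, used only as an example): iteration goes through the next()
builtin, while send() and throw() are called as methods on the
generator object itself.

```python
def running_total():
    """Hypothetical generator: accumulates values passed in via send()."""
    total = 0
    try:
        while True:
            # send(x) resumes the generator with x as the value of this yield.
            value = yield total
            total += value
    except ValueError:
        # throw(ValueError) raises the exception at the paused yield.
        yield "reset"


gen = running_total()

start = next(gen)               # standard protocol: run to the first yield
after_five = gen.send(5)        # generator-specific method
after_two = gen.send(2)
thrown = gen.throw(ValueError)  # generator-specific method

print(start, after_five, after_two, thrown)  # 0 5 7 reset
```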

As to *why* this is a good thing: procedural APIs encourage low
coupling. Yes, object oriented programming is a good way to scale an
application architecture up to more complicated problems. The issue is
with fetishising OOP to the point where you disallow the creation of
procedural APIs that hide the OOP details. That approach sets a
minimum floor to the complexity of your implementations, as even if
you don't *need* the power of OOP, you're forced to deal with it
because the language doesn't offer anything else, and that way lies
Java. There's a reason Java is significantly more popular on large
enterprise projects than it is in small teams - it takes a certain,
rather high, level of complexity for the reasons behind any of that
boilerplate to start to become clear :)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


