[AstroPy] Coding Guidelines draft (comments encouraged)
Erik Bray
embray at stsci.edu
Tue Aug 2 10:15:44 EDT 2011
On 07/25/2011 04:45 PM, Erik Bray wrote:
> On 07/25/2011 08:09 AM, Erik Tollerud wrote:
>> I have made a new version of the coding guidelines and posted them to:
>> http://astropy.org/development/codeguide.html
>>
>> This version incorporates most of the feedback we've gotten so far on
>> this thread, so if you've suggested something, you might want to look
>> to make sure what's in there is what you had in mind. In particular,
>> Michael, you may want to look at the new phrasing for the data file
>> limit. Additionally, a few of you may want to look over the new
>> phrasing for when C extensions are acceptable.
>>
>>
>> One thing I have *not* yet altered is the section on super() and
>> multiple inheritance, as the discussion does not seem to have
>> concluded.
>
> Allow me to add a little more fuel to the discussion then :)
>
> First, for some additional motivating background, here's a post by Guido
> on the history of class method resolution orders (MROs) in Python, and
> the rationale for the current solution:
> http://python-history.blogspot.com/2010/06/method-resolution-order.html
>
> It's easier reading than any PEP on the topic, so I would consider it
> worth reading for anyone who would like more background.
>
> The recommendation given in the current Coding Guidelines completely
> breaks the intended MRO for __init__(). In this example A.__init__() is
> called twice (as well as object.__init__(), though this does nothing)!
> The effective MRO is D->B->A->object->C->A->object. You can see this
> for yourself if you take the original example and add some print statements:
>
> >>> class A(object):
> ... inits = 0
> ... def __init__(self):
> ... self.inits += 1
> ... if self.inits> 1:
> ... print 'Calling A.__init__() again!'
> ... else:
> ... print 'Calling A.__init__() for the first time.'
> ...
> >>> class B(A):
> ... def __init__(self):
> ... print 'Calling B.__init__().'
> ... A.__init__(self)
> ...
> >>> class C(A):
> ... def __init__(self):
> ... print 'Calling C.__init__().'
> ... A.__init__(self)
> ...
> >>> class D(C, B):
> ... def __init__(self):
> ... print 'Calling D.__init__().'
> ... B.__init__(self)
> ... C.__init__(self)
> >>> D()
> Calling D.__init__().
> Calling B.__init__().
> Calling A.__init__() for the first time.
> Calling C.__init__().
> Calling A.__init__() again!
> <__main__.D object at 0x2afb22dd7750>
>
> While this may be harmless in some cases, it's most likely not what the
> developer really wants. And it's completely different from what Python
> would otherwise do which is D->C->B->A->object. This is a much more
> sensible MRO (and based on a well thought out algorithm that handles
> numerous corner cases that you might not think of immediately). This is
> also very difficult to do without cooperative super() calls.
>
> Having some classes *not* using super() will break any code that does
> try to use them as part of a cooperative super() call. So the better
> policy is to *require* the use of super(), especially in __init__()
> which is the most common case--I can't emphasize this enough :).
>
> The problems with this, I think, are overstated: In the simplest and
> most common case of single-inheritance using super(Foo, self).__init__()
> is equivalent to Foo.__init__(self). In the case of inheriting from two
> classes with no overlapping functionality--orthogonal bases as James put
> it--again it's a moot point.
>
> super() *only* adds complexity and/or confusion in diamond-like or even
> more complex class hierarchies. But in these cases super() will always
> make sure that the "right" thing is done. What Python's algorithm
> considers "right" may not always be exactly what the developer wants,
> but in the case of multiple inheritance Python follows its Zen by having
> One Right Way to do it. And for a complex and confusing subject like
> multiple inheritance that's a good thing.
>
> Admittedly, the semantics for using super() in Python 2.x are
> cumbersome. Python 3 at least allows calling super() with no arguments,
> which is the same as super(__class__, self)--the most common way of
> calling it. But that's just Python 3 :)
I know people have been busy, myself included, but I wonder if anyone
has had time to consider the use of super() and the defense thereof.
I would change the coding guidelines read opposite of how they do
currently: Use of super() should be encouraged except where one knows
for sure that it's not going to be appropriate, due to the need for a
customized method resolution order. What *should* be discouraged is the
use of multiple inheritance except in the cases of orthogonal bases, or
if one is really sure what they're doing.
On an unrelated matter, I have one other suggestion for the coding
guidelines:
A top level package's __init__.py should be importable under all
circumstances, and should not too much "stuff" in it, for lack of a
better word. To explain: Several of our own packages have an
__init__.py that contains a lot of functionality, such that when I try
to import it, it might crash (due to some prerequisite not being met),
take a very long time (due to it importing many submodules
unnecessarily), or it might have actual side effects like writing to the
filesystem.
All of these behaviors should be discouraged. In particular it can
cause problems for installation and test frameworks which use the Python
import mechanism to discover packages and modules on the path. Merely
importing the package should not fail due to, for example, an
environment variable not being set.
There are valid reasons for the __init__.py containing functionality, in
particular for ease of use at the Python command line. But they should
be designed so that they can at least be imported without side-effects.
This would mean cleanly disabling certain functionality if any
prerequisites are not bet, and delaying any side-effects until it's
certain that they are necessary.
Just to give a concrete example (that's long since been fixed, actually)
matplotlib used to try to write out a default cache/config directory
upon import, which could fail if the process it was running in didn't
have permission to create that directory. That's the kind of thing that
should be avoided. I think that a note to that effect should be added
to the coding guidelines. I'd be happy to write up a draft if people
agree with this.
Thanks,
Erik
More information about the AstroPy
mailing list