[AstroPy] Coding Guidelines draft (comments encouraged)

Perry Greenfield perry at stsci.edu
Mon Jul 11 15:31:35 EDT 2011


On Jul 11, 2011, at 12:05 PM, Michael Droettboom wrote:

> With regard to: "C extensions will be allowed only when they provide  
> a significant performance enhancement over pure python. When C  
> extensions are used, the python interface must meet interface  
> guidelines, and the use of Cython is strongly recommended."
>
> In the case of something like pywcs, a C extension is used because  
> the WCS standard is so difficult to implement that it was easier to  
> use existing and well-tested code.  It's not clear whether there was  
> a performance advantage over a Numpy-based implementation (I'm not  
> taking a position either way, I'm saying that such an experiment was  
> never done, and performance was not a motivating factor.)  There are  
> a lot of good reasons beyond performance to use non-Python extension  
> code, so I don't know if we should artificially limit ourselves here.
>

That's a good point and we should be clearer about that. Where a good  
C library already exists, and it doesn't pose any serious design  
compromises to use it, it should be fine to do so (that's what much of  
scipy is all about after all).

> With respect to: "Packages can include data in [path tbd] as long as  
> it is less than about 100 kb. These data should be accessed via the  
> astropy.config.[funcname tbd] mechanism. If the data exceeds this  
> size, it should be hosted outside the source code repository and  
> downloaded using the astropy.config.[funcname tbd] mechanism."
>
> I worry that there are often version dependencies between the code  
> and the data.  (Matplotlib has a dependency on specific versions of  
> the STIX math fonts, for example).  Having the data in a separate  
> code repository makes keeping this synchronized trickier.  I would  
> instead say: large data required for functioning of the library goes  
> in the repository, but under a special directory that is not  
> distributed as part of the binary distribution.  The mechanism for  
> accessing this data when needed should be aware of the source code  
> revision and download the corresponding revision of the data.

The intent was to curb the inclination to just dump all binary  
data into repositories. This has happened with FITS files for us in  
the past and it wasn't a good thing to do in general (and there  
usually aren't significant version couplings there). We've had people  
continue to do it with some of our repositories even when the policy says  
not to. There are certainly good exceptions to this (and the  
matplotlib example is one of these). I suppose an exception has to be  
justified by likely version dependencies in the binary data. Does  
that seem reasonable? But I think the default should be to disallow it  
and require someone to argue for doing it rather than the other way  
around.
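
To make the two positions concrete, here is a rough Python sketch of how  
a "disallow by default, allow with a version-coupling argument" rule and a  
revision-aware data location might look. Everything in it is an  
assumption for illustration: the function names, the base URL, and the  
exact byte limit are hypothetical, since the draft still lists the  
astropy.config function names as "tbd".

```python
from urllib.parse import quote

# All names below are hypothetical; the draft leaves the real
# astropy.config function names "tbd".
BUNDLE_LIMIT_BYTES = 100 * 1024  # "about 100 kb" from the draft guideline
DATA_BASE_URL = "https://data.example.org/astropy"  # placeholder host


def should_bundle(size_bytes, has_version_coupling=False):
    """Return True if a data file may live in the source repository.

    Large files are disallowed by default; a demonstrated version
    coupling between code and data is the argued-for exception.
    """
    return size_bytes <= BUNDLE_LIMIT_BYTES or has_version_coupling


def data_url(filename, code_revision):
    """Build a download URL pinned to the revision of the code.

    Pinning the revision in the path keeps externally hosted data
    synchronized with the code that expects it.
    """
    return "{}/{}/{}".format(
        DATA_BASE_URL, quote(code_revision, safe=""), quote(filename)
    )
```

For example, should_bundle(5 * 1024 * 1024) would be False for a large  
FITS file with no version coupling, while passing  
has_version_coupling=True would record that someone has made the case  
for keeping it in the repository.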

Perry



