[AstroPy] Coding Guidelines draft (comments encouraged)
Perry Greenfield
perry at stsci.edu
Mon Jul 11 15:31:35 EDT 2011
On Jul 11, 2011, at 12:05 PM, Michael Droettboom wrote:
> With regard to: "C extensions will be allowed only when they provide
> a significant performance enhancement over pure python. When C
> extensions are used, the python interface must meet interface
> guidelines, and the use of Cython is strongly recommended."
>
> In the case of something like pywcs, a C extension is used because
> the WCS standard is so difficult to implement that it was easier to
> use existing and well-tested code. It's not clear whether there was
> a performance advantage over a Numpy-based implementation (I'm not
> taking a position either way, I'm saying that such an experiment was
> never done, and performance was not a motivating factor.) There are
> a lot of good reasons beyond performance to use non-Python extension
> code, so I don't know if we should artificially limit ourselves here.
>
That's a good point and we should be clearer about that. Where a good
C library already exists, and it doesn't pose any serious design
compromises to use it, it should be fine to do so (that's what much of
scipy is all about after all).
> With respect to: "Packages can include data in [path tbd] as long as
> it is less than about 100 kb. These data should be accessed via the
> astropy.config.[funcname tbd] mechanism. If the data exceeds this
> size, it should be hosted outside the source code repository and
> downloaded using the astropy.config.[funcname tbd] mechanism."
>
> I worry that there are often version dependencies between the code
> and the data. (Matplotlib has a dependency on specific versions of
> the STIX math fonts, for example). Having the data in a separate
> code repository makes keeping this synchronized trickier. I would
> instead say: large data required for functioning of the library goes
> in the repository, but under a special directory that is not
> distributed as part of the binary distribution. The mechanism for
> accessing this data when needed should be aware of the source code
> revision and download the corresponding revision of the data.
The intent was to discourage people from simply dumping all binary
data into repositories. This has happened to us with FITS files in
the past, and it wasn't a good thing to do in general (and there
usually aren't significant version couplings there). We've had people
continue to do it with some of our repositories even when the policy
says not to. There are certainly good exceptions to this (and the
matplotlib example is one of these). I suppose an exception has to be
justified by likely version dependencies in the binary data. Does
that seem reasonable? But I think the default should be to disallow it
and require someone to argue for doing it, rather than the other way
around.
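
To make the two options concrete, here is a rough sketch of the size
rule from the draft together with the revision-aware lookup suggested
in the quoted text above. Everything in it is an assumption made for
illustration: the function names, the URL layout, and the exact byte
limit are hypothetical, and none of it is an existing astropy.config
API (the draft still lists those names as tbd).

```python
# Hypothetical sketch only: these names are not part of astropy.

# "about 100 kb" in the draft guideline; the exact value is an assumption.
SIZE_LIMIT_BYTES = 100 * 1000


def belongs_in_package(size_bytes, limit=SIZE_LIMIT_BYTES):
    """Return True if a data file is small enough to ship with the code."""
    return size_bytes < limit


def data_url(base_url, code_revision, relative_path):
    """Build a download URL that pins a data file to a code revision.

    Putting the revision in the path means a given checkout of the code
    always asks for the matching version of the data.
    """
    base = base_url.rstrip("/")
    path = relative_path.lstrip("/")
    return "{0}/{1}/{2}".format(base, code_revision, path)
```

For example, data_url("http://example.invalid/data", "r1234",
"fonts/stix.ttf") points at the copy of the file that belongs to
revision r1234, so code and data stay in step even though the data
lives outside the binary distribution.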
Perry