[AstroPy] Organizing AstroPy (was Re: Proliferating py-astro-libs)

Mark Sienkiewicz sienkiew at stsci.edu
Mon Jun 13 11:46:42 EDT 2011

Marshall Perrin wrote:
> I'm going to be provocative here:  /As a community, we are doing 
> something wrong /if everyone wants to start their own new module 
> rather than contributing to a common shared open-source core.  We are 
> clearly doing something wrong if people repeatedly implement the same 
> basic functions rather than building on what's already there. What do 
> we need to do differently? How can we make it easier to use a shared 
> repository and shared namespace for all this?
>  Currently we're just forming dwarf galaxies; how can we get them to 
> accrete together to build a grand design spiral?

If you make a universe that contains a lot of hydrogen, all you have to 
do is wait, and you will get spiral galaxies. Our problem is more like 
trying to arrange the stars onto the faces of a cube and keep them there 
when they start to drift away.  It is not a naturally occurring state, 
but you can do it, if you apply enough energy in the right places.

If you make a universe that contains people who have similar software 
needs and expect each of them to do their own programming, you naturally 
get hundreds of disjoint pieces that do not fit together nicely, contain 
a lot of duplication, and are mostly unmaintained.  Except for small 
groups, communities and software projects are not self-organizing.  But 
you can organize the work into a single well-coordinated project, if you 
apply enough energy in the right places.

The most effective approach would require:

- one person willing to coordinate the project -- but maybe as many as 5 at 
most.  (Unless you subdivide the project into independent parts, it 
doesn't work as well with more.)  Basically, we need a Guido -- he acted 
as BDFL for python at the request of (and with the consent of) the 
python community.  He provided valuable organization that could not have 
existed without somebody working at it.

- some small number of people to act as core developers.  These people 
design the system and coordinate contributions.  This is also a lot of 
work, and you're looking for people with software design skills.  Maybe 
3 to 10 people.

- a bunch of people willing to be contributing developers.  These people 
contribute subsets.  This is less work than being a core developer, but 
still a bunch more than hacking together a custom solution for your 
specific needs and putting it up on your web site.

- users who are interested enough to overcome the learning curve.  Why 
should I use your simple coordinate transformation when I can write my 
own faster than I can figure out how to use yours?  But, implement and 
document enough useful capabilities in your package, and it is worth my 
time to use them instead of writing my own copy of everything.  
(Remember:  The user of your package only knows what you tell them -- as 
far as the user knows, anything that isn't documented doesn't exist.)

This rough model is typical of successful free software projects.  Think 
of examples like python, perl, linux, GCC, gnuplot, freebsd, or gnome.  
They all work this way.

I'm informed that this model is more top-down than this community likes, 
but you have to consider whether the benefits of organizing are greater 
than the costs of failing to organize.  After all, if you reject the 
possibility that someone will coordinate the activities, then the 
activities will necessarily be un-coordinated.

In fact, I think that this community _wants_ some organization -- if 
not, we would not be having this discussion.  (Again, check the mailing 
list archives.)

In an organization like what I am describing, the "management" layers 
actually work for the "lower" levels.  As a developer who makes an 
occasional contribution, I demand that the organizer give me the 
information necessary to make my contribution fit into the integrated 
whole.  If you think in terms of traditional business management models, 
it may look like the organizer is "assigning work", but in fact, the 
organizer is helping me by telling me how my piece can fit.

In other words, make it easy for me to contribute an integrated part.  
(I'm speaking of coming up with the right software, not the mechanism for 
sending my contribution.)

In the python community, the acronym "BDFL" stands for "Benevolent 
Dictator For Life", but in fact, Guido does not dictate anything.  He 
aids the community in resolving differences.  When the python community 
does not reach consensus, somebody (Guido) has to decide.  We all go 
along with his decision because we are better off staying organized 
together even when some specific choice does not go the way we like.  
(See my opinions on python 3, for example.)

We need an organizer to help the community reach consensus and to make 
choices when consensus is not achievable.  There is another problem, though:

Who can do it?

I can't answer that, but I've seen plenty of open source projects fail 
because they can't get an organizer.  If nobody loves the idea of 
AstroPy enough to be the organizer (maybe even to take it on independent 
of their primary job), then maybe we should accept that and lower our 
expectations.  There is a less labor-intensive alternative, where 
AstroPy becomes just a collection of different packages with no unified 
model and no attempt to make them fit nicely together.  This also needs 
an organizer, but with a lesser workload.  It doesn't make nearly as 
good a product, but it is better to succeed at the "best you can do with 
the resources available" than to fail at "the ideal result".

It all starts with a big decision.  Will we organize AstroPy?  Who will 
volunteer to work for us as the organizer?  Is our community willing to 
buy in to this development model?  If not, what is the alternative?

Mark S.

p.s.  I normally don't bother saying this because it is usually obvious, 
but:  These comments are mine, based on decades of involvement in 
programming and software engineering.  I do not speak for STScI.
