[AstroPy] Organizing AstroPy (was Re: Proliferating py-astro-libs)
sienkiew at stsci.edu
Mon Jun 13 11:46:42 EDT 2011
Marshall Perrin wrote:
> I'm going to be provocative here: /As a community, we are doing
> something wrong/ if everyone wants to start their own new module
> rather than contributing to a common shared open-source core. We are
> clearly doing something wrong if people repeatedly implement the same
> basic functions rather than building on what's already there. What do
> we need to do differently? How can we make it easier to use a shared
> repository and shared namespace for all this?
> Currently we're just forming dwarf galaxies; how can we get them to
> accrete together to build a grand design spiral?
If you make a universe that contains a lot of hydrogen, all you have to
do is wait, and you will get spiral galaxies. Our problem is more like
trying to arrange the stars onto the faces of a cube and keep them there
when they start to drift away. It is not a naturally occurring state,
but you can do it, if you apply enough energy in the right places.
If you make a universe that contains people who have similar software
needs and expect each of them to do their own programming, you naturally
get hundreds of disjoint pieces that do not fit together nicely, contain
a lot of duplication, and are mostly unmaintained. Except for small
groups, communities and software projects are not self-organizing. But
you can organize the work into a single well-coordinated project, if you
apply enough energy in the right places.
The most effective approach would require:
- one person willing to coordinate the project -- but maybe as many as 5 at
most. (Unless you subdivide the project into independent parts, it
doesn't work as well with more.) Basically, we need a Guido -- he acted
as BDFL for python at the request of (and with the consent of) the
python community. He provided valuable organization that could not have
existed without somebody working at it.
- some small number of people to act as core developers. These people
design the system and coordinate contributions. This is also a lot of
work, and you're looking for people with software design skills. Maybe
3 to 10 people.
- a bunch of people willing to be contributing developers. These people
contribute subsets. This is less work than being a core developer, but
still a bunch more than hacking together a custom solution for your
specific needs and putting it up on your web site.
- users who are interested enough to overcome the learning curve. Why
should I use your simple coordinate transformation when I can write my
own faster than I can figure out how to use yours? But, implement and
document enough useful capabilities in your package, and it is worth my
time to use them instead of writing my own copy of everything.
(Remember: The user of your package only knows what you tell them -- as
far as the user knows, anything that isn't documented doesn't exist.)
This rough model is typical of successful free software projects. Think
of examples like python, perl, linux, GCC, gnuplot, freebsd, or gnome.
They all work this way.
I'm informed that this model is more top-down than this community likes,
but you have to consider whether the benefits of organizing are greater
than the costs of failing to organize. After all, if you reject the
possibility that someone will coordinate the activities, then the
activities will necessarily be un-coordinated.
In fact, I think that this community _wants_ some organization -- if
not, we would not be having this discussion. (again - check the mailing
In an organization like what I am describing, the "management" layers
actually work for the "lower" levels. As a developer who makes an
occasional contribution, I demand that the organizer give me the
information necessary to make my contribution fit into the integrated
whole. If you think in terms of traditional business management models,
it may look like the organizer is "assigning work", but in fact, the
organizer is helping me by telling me how my piece can fit.
In other words, make it easy for me to contribute an integrated part.
(I'm speaking of coming up with the right software, not the mechanism for
sending my contribution.)
In the python community, the acronym "BDFL" stands for "Benevolent
Dictator For Life", but in fact, Guido does not dictate anything. He
aids the community in resolving differences. When the python community
does not reach consensus, somebody (Guido) has to decide. We all go
along with his decision because we are better off staying organized
together even when some specific choice does not go the way we like.
(See my opinions on python 3, for example.)
We need an organizer to help the community reach consensus and to make
choices when consensus is not achievable. There is another problem, though:
Who can do it?
I can't answer that, but I've seen plenty of open source projects fail
because they can't get an organizer. If nobody loves the idea of
AstroPy enough to be the organizer (maybe even to take it on independent
of their primary job), then maybe we should accept that and lower our
expectations. There is a less labor-intensive alternative, where
AstroPy becomes just a collection of different packages with no unified
model and no attempt to make them fit nicely together. This also needs
an organizer, but with a lesser workload. It doesn't make nearly as
good a product, but it is better to succeed at the "best you can do with
the resources available" than to fail at "the ideal result".
It all starts with a big decision. Will we organize AstroPy? Who will
volunteer to work for us as the organizer? Is our community willing to
buy in to this development model? If not, what is the alternative?
p.s. I normally don't bother saying this because it is usually obvious,
but: These comments are mine, based on decades of involvement in
programming and software engineering. I do not speak for STScI.