On Tue, Feb 24, 2009 at 5:20 PM, Anne Archibald <peridot.faceted@gmail.com> wrote:
2009/2/24 Robert Kern <robert.kern@gmail.com>:
> On Tue, Feb 24, 2009 at 15:13, Charles R Harris
> <charlesr.harris@gmail.com> wrote:
>
>> I think at this point we would be better off trying to recruit at least one
>> person to "own" each package. For new packages that is usually the person
>> who committed it but we also need ownership of older packages. Someone with
>> a personal stake in a package is likely to do more for quality assurance at
>> this point than any amount of required review.
>
> "Ownership" has a bad failure mode. Case in point: nominally, I am the
> "owner" of scipy.stats and numpy.random and completely failed to move
> Josef's patches along.

It seems to me that scipy's development model is a classic open-source
"scratch an itch": it bothered me that people were forever asking
questions that needed spatial data structures, so I took a weekend and
wrote some. I don't foresee this changing without some major change
(e.g. a company suddenly hiring ten people to work full-time on
scipy). So the question is how to make this model produce reliable
code.

Suggestions people have made to accomplish this:

(1) Don't allow anything into SVN without tests and documentation.
(2) Make sure everything gets reviewed before it goes in.
(3) Appoint owners for parts of scipy.

Of these, I strongly approve of (1). It's really not a barrier.
Writing tests is easy. Every programmer does *some* testing (well,
maybe not Knuth, but everybody else) to make sure the code does what
it's supposed to. Writing these tests in nose-compatible form really
isn't hard. Documentation is more of an obstacle, just because it's
extra work. But I think it's not too much to ask.
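
To illustrate how little is involved, here is a minimal sketch of a
nose-compatible test module. The function euclidean_distance is a
hypothetical stand-in written for this example, not anything in scipy;
the only convention relied on is that nose collects module-level
functions whose names start with "test".

```python
import math


def euclidean_distance(a, b):
    """Hypothetical function under test: distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def test_known_value():
    # A 3-4-5 right triangle gives an exactly representable answer.
    assert euclidean_distance((0, 0), (3, 4)) == 5.0


def test_symmetry():
    # Distance should not depend on argument order.
    p, q = (1.0, 2.0, 3.0), (-4.0, 0.5, 2.0)
    assert euclidean_distance(p, q) == euclidean_distance(q, p)


def test_zero_distance_to_self():
    p = (1.5, -2.5)
    assert euclidean_distance(p, p) == 0.0
```

Each test is a plain function with a bare assert, which is usually the
same check one would have typed at the interpreter anyway.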

(2) I'm not so sure of. For example, a few days ago I fixed a
couple of spatial bugs. In both cases, the bug fix was a one-line
change to scipy proper, plus a unit test that would have caught the
bug but now passes. What would be gained by waiting until somebody
else got around to looking at those fixes before committing them?

I am tempted to suggest a weaker standard: optional code review. If
you want to submit a piece of code to scipy and don't have SVN access,
or do but want someone else to take a look at it (as, e.g., I did for
scipy.spatial as a whole), post it; people can review it and when it's
been adequately reviewed it goes in. Of course, here we return to
infrastructure: as far as I know we don't have any reasonable tool for
doing these reviews, or for connecting them to bug reports.

(3) I am highly dubious of. Certainly we'll have informal owners - I
fixed the bugs in spatial in part because I wrote the code and was
embarrassed to see it broken. I know the spatial code pretty well, so
I will probably have an easier time assessing patches to it. But I am
often busy - if those spatial bugs had been reported a month earlier I
would not have been able to get to them any sooner. Making it my fault
if patches don't get into scipy.spatial - which is, really, what
we're talking about - is a recipe for driving people like me away from
developing scipy. Don't do it.


I don't think that's what we are "really talking about". Rather, I think we need folks who feel an informal ownership of parts of scipy. I simply pointed out where I felt responsible as an example; your sense of "owning" scipy.spatial is another. I think the best way to get folks attached to orphaned bits of code that have languished untouched all these years is to let them make actual changes without jumping through umpteen legal hoops. I also think we need more developers, and the place to find them is among folks who have contributed patches. We should actively offer commit privileges to such folks. The main advantage of a DVCS in such a situation is that commit privilege becomes less important, and additions can be reviewed offline and brought in easily when ready. But until we have such a system, I think more folks need the ability to touch SVN.

Chuck