Hi all,

Long time list lurker, thought I might chime in on this topic quickly --

On Mon, May 25, 2015 at 3:10 PM, <josef.pktd@gmail.com> wrote:


On Mon, May 25, 2015 at 3:37 PM, Jaime Fernández del Río <jaime.frio@gmail.com> wrote:
Hi all,

This year I am mentoring Aman for one of the GSoC projects we have underway, "Rewriting ndimage in Cython." By its very nature it doesn't conform very well to the "many small pull requests" model: from the point of view of scipy, things are going to be broken up until almost the very last commit. I am not sure what the best way to set up a collaborative code development environment would be, and so am asking for the collective wisdom to help guide us.

Aman could simply create one ginormous pull request that will grow, and grow, and not be merged until everything is ready. I don't like this idea too much, as it is going to eventually be a confusing mess, and I think it would also make it difficult for anyone other than Aman (that would mostly be me) to contribute code.

I think we could also use a branch, either on my fork of scipy or on Aman's, as the repository on which development would happen, and against which PRs would be created, and once completed send a single PR to the main scipy repo. This may work, but I don't like it much either.

What probably makes more sense is to create a new branch **in the main scipy repository**, and have PRs sent and merged against that branch, which would eventually be merged with master upon completion. NumPy seems to have a couple such experimental branches ('with_maskna' and 'enable_separate_by_default'), although there is none in SciPy that I see. This would also allow us to keep the project in a controlled environment, even if by the end of the summer not every single bit of ndimage has been ported.

If this third path is really the preferred way of doing things, I could probably set things up myself (Ralf gave me commit rights when I became a mentor for this project), but I'd like to hear what others think, before abusing my powers.



I don't see much difference between options two and three for working with GitHub, since it's easy to create and merge pull requests across forks.

One consideration is whether you want to trigger the TravisCI runs, which I guess would be automatic with a branch in the main scipy repo.
I think involving CI is nice (who doesn't love seeing the green check mark?). Since scipy already has all of the Travis plumbing, Jaime or Aman could develop on a branch and push to their respective fork after having "turned on" (they use the term "flick on") that branch on Travis.

This strategy would remove scipy list notifications while leveraging the usefulness of CI. The only issue is that if the work is restricted to Aman/Jaime's forks, then PR reviewing would likely have fewer eyes until the final PR into scipy.

Just a thought,
Matt

Another consideration is whether this triggers a large number of notifications for scipy developers who are subscribed to changes, PRs and issues (and whether it's easy for me to filter those out visually in gmail).

In statsmodels all the extra branches in the main repo are stale or stalled and are waiting for someone to pick them up. Actual development is in developer forks.

(I'm not involved enough in scipy development to have an opinion.)

Josef
 

Thanks!

Jaime

--
(\__/)
( O.o)
( > <) This is Bunny. Copy Bunny into your signature and help him with his plans for world domination.

_______________________________________________
SciPy-Dev mailing list
SciPy-Dev@scipy.org
http://mail.scipy.org/mailman/listinfo/scipy-dev




--
Matthew Gidden, Ph.D.
Postdoctoral Associate, Nuclear Engineering
The University of Wisconsin -- Madison
Ph. 225.892.3192