[Numpy-discussion] Issue Tracking

Tue May 1 15:34:49 EDT 2012

On Tue, May 1, 2012 at 9:12 AM, Charles R Harris
<charlesr.harris at gmail.com> wrote:

>
>
> On Tue, May 1, 2012 at 12:52 AM, Travis Oliphant <travis at continuum.io> wrote:
>
>>
>> On Apr 30, 2012, at 10:14 PM, Jason Grout wrote:
>>
>> On 4/30/12 6:31 PM, Travis Oliphant wrote:
>>
>> Hey all,
>>
>>
>> We have been doing some investigation of various approaches to issue
>> tracking. The last time the conversation left this list was with
>> Ralf's current list of preferences as:
>>
>>
>> 1) Redmine
>>
>> 2) Trac
>>
>> 3) Github
>>
>>
>> Since that time, Maggie, who has been doing a lot of work setting up
>> various issue tracking tools over the past couple of months, has set up a
>> Redmine instance and played with it. This is a possibility as a future
>> issue tracker.
>>
>>
>> However, today I took a hard look at what the IPython folks are doing
>> with their issue tracker and was very impressed by the level of community
>> integration that having issues tracked by Github provides. Right now, we
>> have a major community problem in that there are 3 conversations taking
>> place (well, at least 2 1/2): one on Github, one on this list, and one on
>> the Trac and its accompanying wiki.
>>
>>
>> I would like to propose just using Github's issue tracker. This just
>> seems like the best move overall for us at this point. I like how the
>> Pull Request mechanism integrates with the issue tracking. We could
>> set up a Redmine instance, but this would just re-create the same separation
>> of communities that currently exists with the pull-requests, the mailing
>> list, and the Trac pages. Redmine is nicer than Trac, but it's still a
>> separate space. We need to make Github the NumPy developer hub and not
>> have it spread throughout several sites.
>>
>>
>> The same is true of SciPy. I think if SciPy also migrates to use
>> Github issues, then together with IPython we can really be a voice that
>> helps Github. I will propose to NumFOCUS that the Foundation sponsor
>> migration of the Trac to Github for NumPy and SciPy. If anyone would
>> like to be involved in this migration project, please let me know.
>>
>>
>> Comments, concerns?
>>
>>
>> I've been pretty impressed with the lemonade that the IPython folks have
>> made out of what I see as pretty limiting shortcomings of the github
>> issue tracker.  I've been trying to use it for a much smaller project
>> (https://github.com/sagemath/sagecell/), and it is a lot harder, in my
>> (somewhat limited) experience, than using trac or the google issue
>> tracker.  None of these issues seems like it would be too hard to solve,
>> but since we don't even have the source to the tracker, we're somewhat
>> at github's mercy for any improvements.  Github does have a very nice
>> API for interacting with the data, which somewhat makes up for some of
>> the severe shortcomings of the web interface.
>>
>> In no particular order, here are a few that come to mind immediately:
>>
>> 1. No key:value pairs for labels (Fernando brought this up a long time
>> ago, I think).  This is brilliant in Google code's tracker, and allows
>> for custom fields that help in tracking workflow (like status, priority,
>> etc.).  Sure, you can do what the IPython folks are doing and just
>> create labels for every possible status, but that's unwieldy and takes a
>> lot of discipline to maintain.  Which means it takes a lot of developer
>> time or it becomes inconsistent and not very useful.
>>
>>
>> I'm not sure how much of an issue this is.  A lot of tools use single
>> tags for categorization and it works pretty well.  A simple "key:value"
>> label communicates about the same information together with good query
>> tools.
>>
>>
>> 2. The disjointed relationship between pull requests and issues.  They
>> share numberings, for example, and both support discussions, etc.  If
>> you use the API, you can submit code to an issue, but then the issue
>> becomes a pull request, which means that all labels on the issue
>> disappear from the web interface (but you can still manage to set labels
>> using the list view of the issue tracker, if I recall correctly).  If
>> you don't attach code to issues, it means that every issue is duplicated
>> in a pull request, which splits the conversation up between an issue
>> ticket and a pull request ticket.
>>
>>
>> Hmm.. So pull requests *are* issues. This sounds like it might
>> actually be a feature and also means that we *are* using the Github issue
>> tracker (just only those issues that have a pull-request attached).
>> Losing labels seems like a real problem (are they really lost or do they
>> just not appear in the pull-request view?)
>>
>>
>> 3. No attachments for issues (screenshots, supporting documents, etc.).
>> Having API access to data won't help you here.
>>
>>
>> Using gists and references to gists can overcome this. Also, using an
>> attachment service like http://uploading.com/ or Dropbox makes this
>> much less of a problem.
>>
>>
>> 4. No custom queries. We love these in the Sage trac instance; since we
>> have full access to the database, we can run any sort of query we want.
>> With API data access, you can build your own queries, so maybe this
>> isn't insurmountable.
>>
>>
>> Yes, you can build your own queries. This seems like an area where
>> Github can improve (and tools can be written which improve the experience).
>>
>>
>>
>> 5. Stylistically, the webpage is not very dense on information.  I get
>> frustrated when trying to see the issues because they only come 25 at a
>> time, and never grouped into any sort of groupings, and there are only 3
>> options for sorting issues.  Compare the very nice, dense layout of
>> Google Code issues or bitbucket.  Google Code issues also lets you
>> cross-tabulate the issues so you can quickly triage them.  Compare also
>> the pretty comprehensive options for sorting and grouping things in trac.
>>
>>
>> Yes, it looks like you can group via labels, milestones, and "your"
>> issues. This is also something that can be overcome with tools that use
>> the Github API.
>>
>>
>> It would be good to hear from users of the IPython github issue tracker
>> to see how they like it "in the wild". How problematic are these issues
>> in practice? Does it reduce or increase the participation in issue
>> tracking, both by users and by developers?
>>
>> Thanks,
>>
>> -Travis
>>
>>
>>
>>
>>
>> 6. Side-by-side diffs are nice to have, and I believe bitbucket and
>> google code both have them.  Of course, this isn't a deal-breaker
>> because you can always pull the branch down, but it would be nice to
>> have, and there's not really a way we can put it into the github tracker
>> ourselves.
>>
>> How does, for example, the JIRA github connector work?  Does it pull in
>> code comments, etc.?
>>
>> Anyways, I'm not a regular contributor to numpy, but I have been trying
>> to get used to the github tracker for about a year now, and I just keep
>> getting more frustrated at it.  I suppose the biggest frustrating part
>> about it is that it is closed source, so even if I did want to scratch
>> an itch, I can't.
>>
>> That said, it is nice to have code and dev conversations happening in
>> one place.  There are great things about github issues, of course.  But
>> I'm not so sure, for me, that they outweigh some of the administrative
>> issues listed above.
>>
>>
> I'm thinking we could do worse than simply take Ralf's top pick. Github
> definitely sounds a bit clunky for issue tracking, and while we could put
> together workarounds, I think Jason's point about the overall frustration
> is telling. And while we could, maybe, put together tools to work with it,
> I think what we want is something that works out of the box. Implementing
> workarounds for a frustrating system doesn't seem the best use of developer
> time.
>

Having looked at the IPython issues and Jason's example, it's still my
impression that Github is inferior to Trac/Redmine as a bug tracker -- but
not as much as I first thought. The IPython team has managed to make it
work quite well (assuming you can stand the multi-colored patchwork of
labels...).

At this point it's probably good to look again at the problems we want to
solve:
1. responsive user interface (must absolutely have)
2. mass editing of tickets (good to have)
3. usable API (good to have)
4. various ideas/issues mentioned at
http://projects.scipy.org/numpy/wiki/ImprovingIssueWorkflow

Note that Github does solve 1, 2 and 3 (as does Redmine). It does come with
some new problems that require workarounds, but we can probably live with
them. I'm not convinced that being on Github will actually get more eyes on
the tickets, but there certainly won't be fewer.

The main problem with Github (besides the issues/PRs thing and no
attachments, which I can live with) is that to make it work we'll have to
religiously label everything. And because users aren't allowed to attach
labels, it will require a larger time investment from maintainers. Are we
okay with that? If everyone else is and we can distribute this task, it's
fine with me.
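
To get a feel for how much of that labeling discipline could be checked
mechanically, here is a minimal sketch. It assumes labels follow a
"key:value" naming convention such as "priority:high", along the lines of
the key:value idea raised earlier in the thread. The sample issues, the
label names, and the helper functions are all made up for illustration, and
nothing here talks to Github; real issue data would first have to be fetched
through the API and reduced to this shape.

```python
from collections import defaultdict

# Hypothetical sample data: each issue has a number, a title, and a list of
# label names. The "key:value" naming is a convention we would have to adopt.
ISSUES = [
    {"number": 1, "title": "Segfault in take()",
     "labels": ["type:bug", "priority:high", "component:core"]},
    {"number": 2, "title": "Add nanmedian",
     "labels": ["type:enhancement", "priority:low"]},
    {"number": 3, "title": "Typo in docstring",
     "labels": ["type:bug", "easy-fix"]},
    {"number": 4, "title": "Build fails on some platform",
     "labels": []},
]


def parse_labels(labels):
    """Split 'key:value' labels into a dict; return plain labels separately."""
    fields = {}
    tags = []
    for label in labels:
        key, sep, value = label.partition(":")
        if sep and key and value:
            fields[key.strip()] = value.strip()
        else:
            tags.append(label)
    return fields, tags


def group_by(issues, key, missing="(unlabeled)"):
    """Group issue numbers by the value of one 'key:value' field."""
    groups = defaultdict(list)
    for issue in issues:
        fields, _ = parse_labels(issue["labels"])
        groups[fields.get(key, missing)].append(issue["number"])
    return dict(groups)


def missing_fields(issues, required=("type", "priority")):
    """Report which required fields each issue still lacks."""
    report = {}
    for issue in issues:
        fields, _ = parse_labels(issue["labels"])
        lacking = [key for key in required if key not in fields]
        if lacking:
            report[issue["number"]] = lacking
    return report


if __name__ == "__main__":
    print(group_by(ISSUES, "priority"))
    print(missing_fields(ISSUES))
```

A report like the one from missing_fields() could be run periodically so
that whoever is on triage duty sees only the tickets that still need labels,
which may help when distributing the task among maintainers.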

David has been investigating bug trackers since long before me, and Pauli has
done most of the work administering Trac as far as I know, so I'd like to
at least hear their preferences too before we make a decision. Then I hope
we can move this along quickly, because any choice will be a huge
improvement over the current situation.

Ralf