[SciPy-User] Proposal for a new data analysis toolbox

Keith Goodman kwgoodman at gmail.com
Wed Nov 24 17:39:31 EST 2010


On Wed, Nov 24, 2010 at 2:04 PM, Wes McKinney <wesmckinn at gmail.com> wrote:
> On Wed, Nov 24, 2010 at 12:05 PM, Keith Goodman <kwgoodman at gmail.com> wrote:
>> On Wed, Nov 24, 2010 at 4:43 AM, Wes McKinney <wesmckinn at gmail.com> wrote:
>>
>>> I am not for placing arbitrary restrictions or having a strict
>>> enumeration on what goes in this library. I think having a practical,
>>> central dumping ground for data analysis tools would be beneficial. We
>>> could decide about having "spin-off" libraries later if we think
>>> that's appropriate.
>>
>> I'd like to start small (I've already bitten off more than I can chew)
>> by delivering a well thought out (and implemented) small feature set.
>> Functions of the form:
>>
>> sum(arr, axis=None)
>> move_sum(arr, window, axis=0)
>> group_sum(arr, label, axis)
>>
>> where sum can be replaced by a long (to be decided) list of functions
>> such as std, max, median, etc.
>>
>> Once that is delivered and gets some use, I'm sure we'll want to push
>> into new territory. What do you suggest for the next feature to add?
>
> I have no problem if you would like to develop in this way, but I
> don't personally work well like that. I think having a library with 20
> 80% solutions would be better than a library with 5 100% solutions. Of
> course over time you eventually want to build out those 20 80%
> solutions into 100% solutions, but I think that approach is of greater
> utility overall.
>
>> So it could be that we are talking about the same end point but are
>> thinking about different development models. I cringe at the thought
>> of the package becoming a dumping ground.
>
> I find that the best and most useful code gets written (and gets
> written fastest) when the person writing it has a concrete problem
> they are trying to solve. So if someone comes along and says "I have
> problem X", where X lives in the general problem domain we are talking
> about, I might say, "Well I've never had problem X but I have no
> problem with you writing code to solve it and putting it in my library
> for this problem domain". So "dumping ground" here is a bit too
> pejorative but you get the idea. Personally if you or someone else
> told me "don't put that code here, we are only working on a small set
> of features for now" I would be kind of bothered (assuming that the
> code was related to the general problem domain).

Let's talk about a specific value of X, either now or when it pops up.
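
For concreteness, here is a minimal sketch of the three function forms
quoted above, assuming NumPy. Only the names and argument order
(move_sum(arr, window, axis=0), group_sum(arr, label, axis)) come from
the proposal. Everything else is an assumption made for illustration and
not code from any existing package: the NaN padding of the first
window - 1 outputs, the float conversion, and returning the unique
labels alongside the group sums.

```python
import numpy as np

# The plain reduction sum(arr, axis=None) is already covered by np.sum.


def move_sum(arr, window, axis=0):
    """Moving sum over `window` consecutive elements along `axis`.

    Assumption: the first `window - 1` positions along `axis` are NaN,
    since a full window is not yet available there.
    """
    arr = np.asarray(arr, dtype=float)
    if not 1 <= window <= arr.shape[axis]:
        raise ValueError("window must be between 1 and arr.shape[axis]")
    a = np.moveaxis(arr, axis, 0)       # work along the leading axis
    csum = np.cumsum(a, axis=0)
    out = np.full(a.shape, np.nan)
    out[window - 1] = csum[window - 1]  # first complete window
    out[window:] = csum[window:] - csum[:-window]
    return np.moveaxis(out, 0, axis)


def group_sum(arr, label, axis):
    """Sum the slices of `arr` along `axis` that share the same label.

    Assumption: returns (sums, unique_labels), with the groups ordered
    by sorted unique label along `axis`.
    """
    arr = np.asarray(arr, dtype=float)
    label = np.asarray(label)
    if label.shape != (arr.shape[axis],):
        raise ValueError("need exactly one label per element along axis")
    uniq, inverse = np.unique(label, return_inverse=True)
    a = np.moveaxis(arr, axis, 0)
    out = np.zeros((uniq.size,) + a.shape[1:])
    np.add.at(out, inverse, a)          # unbuffered accumulation per group
    return np.moveaxis(out, 0, axis), uniq
```

The cumulative-sum trick keeps move_sum at one pass over the data, at
the cost of some floating-point drift on long inputs. A production
version would also need to decide how to treat NaNs in the input, which
this sketch does not attempt.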


