Re: [SciPy-Dev] SciPy Goal

4 Jan 2012

On Wed, Jan 4, 2012 at 9:22 PM, Fernando Perez <fperez.net@gmail.com> wrote:
...
Hi all,
On Wed, Jan 4, 2012 at 5:43 PM, Travis Oliphant <travis@continuum.io> wrote:
...
What do others think is missing? Off the top of my head: basic wavelets
(dwt primarily) and more complete interpolation strategies (I'd like to
finish the basic interpolation approaches I started a while ago).
Originally, I used GAMS as an "overview" of the kinds of things needed in
SciPy. Are there other relevant taxonomies these days?

Well, probably not something that fits these ideas for scipy
one-to-one, but the Berkeley 'thirteen dwarfs' list from the 'View
from Berkeley' paper on parallel computing is not a bad starting
point. Summarized, they are:
   Dense Linear Algebra
   Sparse Linear Algebra [1]
   Spectral Methods
   N-Body Methods
   Structured Grids
   Unstructured Grids
   MapReduce
   Combinational Logic
   Graph Traversal
   Dynamic Programming
   Backtrack and Branch-and-Bound
   Graphical Models
   Finite State Machines
Descriptions of each can be found here:
http://view.eecs.berkeley.edu/wiki/Dwarf_Mine and the full study is
here:
http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-183.html

That list is biased towards the classes of codes used in
supercomputing environments, and some of the topics are probably
beyond the scope of scipy (say structured/unstructured grids, at least
for now).

But it can be a decent guiding outline for reasoning about what the
'big areas' of scientific computing are, so that scipy at least provides
building blocks that would be useful in these directions.

One area that hasn't been directly mentioned too much is the situation
with statistical tools. On the one hand, we have the phenomenal work
of pandas, statsmodels and sklearn, which together are helping turn
Python into a great tool for statistical data analysis (understood in
a broad sense). But it would probably be valuable to have enough of a
statistical base directly in numpy/scipy so that the 'out of the box'
experience for statistical work is improved. I know we have
scipy.stats, but it seems like it needs some love.

(I didn't send something like the first part earlier, because I didn't
want to talk so much.)

Every new piece of code and every new sub-package needs additional
topic-specific maintainers.

Pauli, Warren and Ralf are doing a great job as default, general
maintainers, and especially Warren and Ralf have been pushing
bug-fixes and enhancements into stats (and I have been reviewing
almost all of it).

If there is a well-defined set of enhancements that could go into
stats, then I wouldn't mind, but I don't see much reason to
duplicate code and maintenance work with statsmodels.

Of course there are large parts that statsmodels doesn't cover either,
and it is useful to extend the coverage of statistics in either
package.

However, adding code that is not low-maintenance (that is, fully
tested) or that lacks committed maintainers doesn't make much sense
in my opinion.

Cheers,

Josef
...
Cheers,
f
_______________________________________________
SciPy-Dev mailing list
SciPy-Dev@scipy.org
http://mail.scipy.org/mailman/listinfo/scipy-dev

josef.pktd＠gmail.com