On Wed, Jan 4, 2012 at 9:22 PM, Fernando Perez <fperez.net@gmail.com> wrote:
Hi all,
On Wed, Jan 4, 2012 at 5:43 PM, Travis Oliphant <travis@continuum.io> wrote:
What do others think is missing? Off the top of my head: basic wavelets (dwt primarily) and more complete interpolation strategies (I'd like to finish the basic interpolation approaches I started a while ago). Originally, I used GAMS as an "overview" of the kinds of things needed in SciPy. Are there other relevant taxonomies these days?
Well, probably not something that fits these ideas for scipy one-to-one, but the Berkeley 'thirteen dwarves' list from the 'View from Berkeley' paper on parallel computing is not a bad starting point. In summary, they are:
Dense Linear Algebra
Sparse Linear Algebra [1]
Spectral Methods
N-Body Methods
Structured Grids
Unstructured Grids
MapReduce
Combinational Logic
Graph Traversal
Dynamic Programming
Backtrack and Branch-and-Bound
Graphical Models
Finite State Machines
Descriptions of each can be found here:
http://view.eecs.berkeley.edu/wiki/Dwarf_Mine
and the full study is here:
http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-183.html
That list is biased towards the classes of codes used in supercomputing environments, and some of the topics are probably beyond the scope of scipy (say structured/unstructured grids, at least for now).
But it can be a decent guiding outline for reasoning about what the 'big areas' of scientific computing are, so that scipy at least provides building blocks that would be useful in these directions.
One area that hasn't been directly mentioned too much is the situation with statistical tools. On the one hand, we have the phenomenal work of pandas, statsmodels and sklearn, which together are helping turn python into a great tool for statistical data analysis (understood in a broad sense). But it would probably be valuable to have enough of a statistical base directly in numpy/scipy so that the 'out of the box' experience for statistical work is improved. I know we have scipy.stats, but it seems like it needs some love.
(I didn't send something like the first part earlier, because I didn't want to talk so much.)

Every new piece of code and every new sub-package needs additional topic-specific maintainers. Pauli, Warren and Ralf are doing a great job as the default, general maintainers, and Warren and Ralf especially have been pushing bug fixes and enhancements into stats (and I have been reviewing almost all of it).

If there is a well-defined set of enhancements that could go into stats, then I wouldn't mind, but I don't see much reason to duplicate code and maintenance work with statsmodels. Of course there are large parts that statsmodels doesn't cover either, and it is useful to extend the coverage of statistics in either package. However, adding code that is not low-maintenance (because it's fully tested) or that doesn't have committed maintainers doesn't make much sense in my opinion.

Cheers,
Josef
Cheers,
f