On Wed, Jan 4, 2012 at 9:29 PM, Travis Oliphant <travis@continuum.io> wrote:
This is a nice list, thanks!
On Jan 4, 2012, at 8:22 PM, Fernando Perez wrote:
> Hi all,
>
> On Wed, Jan 4, 2012 at 5:43 PM, Travis Oliphant <travis@continuum.io> wrote:
>> What do others think is missing? Off the top of my head: basic wavelets
>> (dwt primarily) and more complete interpolation strategies (I'd like to
>> finish the basic interpolation approaches I started a while ago).
>> Originally, I used GAMS as an "overview" of the kinds of things needed in
>> SciPy. Are there other relevant taxonomies these days?
>
> Well, probably not something that fits these ideas for scipy
> one-to-one, but the Berkeley 'thirteen dwarves' list from the 'View
> from Berkeley' paper on parallel computing is not a bad starting
> point; summarized here they are:
>
> Dense Linear Algebra
> Sparse Linear Algebra [1]
> Spectral Methods
> N-Body Methods
> Structured Grids
> Unstructured Grids
> MapReduce
> Combinational Logic
> Graph Traversal
> Dynamic Programming
> Backtrack and Branch-and-Bound
> Graphical Models
> Finite State Machines
Thanks for the links.
>
> Descriptions of each can be found here:
> http://view.eecs.berkeley.edu/wiki/Dwarf_Mine and the full study is
> here:
>
> http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-183.html
>
> That list is biased towards the classes of codes used in
> supercomputing environments, and some of the topics are probably
> beyond the scope of scipy (say structured/unstructured grids, at least
> for now).
>
> But it can be a decent guiding outline to reason about what are the
> 'big areas' of scientific computing, so that scipy at least provides
> building blocks that would be useful in these directions.
>
It seems like scipy stats has received quite a bit of attention. There is always more to do, of course, but I'm not sure what specifically you think is missing or needs work.
> One area that hasn't been directly mentioned too much is the situation
> with statistical tools. On the one hand, we have the phenomenal work
> of pandas, statsmodels and sklearn, which together are helping turn
> python into a great tool for statistical data analysis (understood in
> a broad sense). But it would probably be valuable to have enough of a
> statistical base directly in numpy/scipy so that the 'out of the box'
> experience for statistical work is improved. I know we have
> scipy.stats, but it seems like it needs some love.
Test coverage, for example. I recently fixed several wildly incorrect skewness and kurtosis formulas for some distributions, and I now have very little confidence that any of the other distributions are correct. Of course, most of them probably *are* correct, but without tests, all are in doubt.
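One way to build that kind of confidence is to compare each closed-form skewness and kurtosis against moments integrated numerically from the density. The sketch below is illustrative only and does not come from this thread or from the scipy test suite: it uses just the standard library rather than scipy.stats, and the helper names (gamma_pdf, simpson, numeric_skew_kurtosis) are hypothetical. The gamma distribution is picked as an example because its skewness 2/sqrt(k) and excess kurtosis 6/k are well known.

```python
import math

def gamma_pdf(x, k):
    """Density of the gamma distribution with shape k and unit scale."""
    return x ** (k - 1) * math.exp(-x) / math.gamma(k)

def simpson(f, a, b, n=20000):
    """Composite Simpson's rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def numeric_skew_kurtosis(pdf, lower, upper):
    """Skewness and excess kurtosis obtained by integrating the density."""
    mean = simpson(lambda x: x * pdf(x), lower, upper)

    def central(p):
        # p-th central moment, integrated numerically
        return simpson(lambda x: (x - mean) ** p * pdf(x), lower, upper)

    var = central(2)
    skew = central(3) / var ** 1.5
    excess_kurt = central(4) / var ** 2 - 3.0
    return skew, excess_kurt

k = 3.0  # example shape parameter (an assumption for illustration)
# The upper limit 80.0 truncates a tail that is negligible for k = 3.
skew, kurt = numeric_skew_kurtosis(lambda x: gamma_pdf(x, k), 0.0, 80.0)

# Closed-form values for the gamma distribution: 2/sqrt(k) and 6/k.
assert math.isclose(skew, 2.0 / math.sqrt(k), rel_tol=1e-6)
assert math.isclose(kurt, 6.0 / k, rel_tol=1e-6)
```

A real test would presumably loop over many distributions and parameter values, and would need more care with heavy tails and with densities that are singular at an endpoint.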
Warren
A big question to me is the impact of data-frames as the underlying data-representation of the algorithms and the relationship between the data-frame and a NumPy array.
-Travis
>
> Cheers,
>
> f
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev@scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev