[SciPy-Dev] Scipy 1.0 roadmap

Christopher Jordan-Squire cjordan1 at uw.edu
Sun Sep 22 15:25:45 EDT 2013


On Sun, Sep 22, 2013 at 1:52 AM, Ralf Gommers <ralf.gommers at gmail.com> wrote:
>
>
>
> On Sun, Sep 22, 2013 at 7:52 AM, Christopher Jordan-Squire <cjordan1 at uw.edu>
> wrote:
>>
>> For scipy stats, is there anything on the table regarding somehow
>> unifying the sampling in numpy.random and the distributions in
>> scipy.stats? I'm specifically thinking of two issues:
>>
>> (1) There's a lot of duplication between numpy.random and scipy.stats
>> but with different interfaces. This seems like something that ideally
>> would be reduced.
>
>
> numpy.random only provides sampling and only has about half the
> distributions of scipy.stats. Sampling is really only a small part of what
> scipy.stats provides (pdf, cdf, moments, fitting a distribution, etc.). So
> I'm not bothered by that duplication. If we'd want to reduce it I think it
> would have to be removed from numpy, which doesn't sound like a good idea.
>

Yeah, I'm not about to suggest removing sampling from numpy.random.
That'd be crazy.

There's still the API mismatch between the names. numpy.random, as a
rule, spells distribution names out in full, while scipy.stats, as a
rule, tends to abbreviate them. That often confuses me, though less
than I first expected, since at least each is consistent.
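
To make the mismatch concrete, here's a rough sketch pairing a few
numpy.random samplers with their scipy.stats counterparts. The pairs
are picked by hand for illustration and aren't exhaustive, and the
parameterizations don't always line up either (lognormal/lognorm in
particular), so treat it as a picture of the naming only.

```python
import numpy as np
from scipy import stats

# numpy.random name -> scipy.stats name (illustrative, not exhaustive)
name_pairs = {
    "exponential": "expon",
    "lognormal": "lognorm",
    "normal": "norm",
    "uniform": "uniform",
}

np.random.seed(0)
for np_name, sp_name in name_pairs.items():
    sampler = getattr(np.random, np_name)  # sampling function only
    dist = getattr(stats, sp_name)         # object with pdf, cdf, fit, rvs, ...
    draws = sampler(size=3)                # default parameters
    print(np_name, "->", sp_name, draws.shape, hasattr(dist, "cdf"))
```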

>>
>> (2) The interface for the distributions in scipy.stats seems to
>> explicitly be for scalar random variables, so there are no multivariate
>> normals, multinomials, dirichlet, wishart, etc. Instead the sampling
>> is in numpy.random, and the pdfs aren't there.
>
>
> Two days ago PR-2726 was merged, which adds a multivariate normal
> distribution. Others can be added. IIRC there has been an enhancement ticket
> for wishart somewhere and there's a Python implementation floating around.
>

A multivariate normal is a great addition.

Currently, dirichlet and multinomial are the only random variables you
can sample from in numpy.random that aren't in scipy.stats. My $0.02
for the scipy 1.0 roadmap is to add dirichlet and multinomial to
scipy.stats, as well as wishart/inverse-wishart. Then the distributions in
scipy.stats would be a superset of numpy.random, and scipy.stats would
include one of the most widely used distributions currently not in it.
(In addition to the implementations floating around, both scikit-learn
and pymc include bits and pieces of wishart-related code.)
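
For what it's worth, this is roughly what you end up writing by hand
today for dirichlet: sample with numpy.random, then compute the
density yourself. A minimal sketch, assuming x lies strictly inside
the simplex; dirichlet_logpdf is my own helper name, not an existing
scipy function.

```python
import numpy as np
from scipy.special import gammaln

def dirichlet_logpdf(x, alpha):
    """Log-density of Dirichlet(alpha) at x (hypothetical helper).

    Assumes x has strictly positive entries that sum to 1.
    """
    x = np.asarray(x, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    # log of the normalizing constant Gamma(sum(alpha)) / prod(Gamma(alpha))
    log_norm = gammaln(alpha.sum()) - gammaln(alpha).sum()
    return log_norm + ((alpha - 1.0) * np.log(x)).sum()

np.random.seed(0)
alpha = np.array([2.0, 3.0, 5.0])
sample = np.random.dirichlet(alpha)   # sampling lives in numpy.random
print(sample, dirichlet_logpdf(sample, alpha))
```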

Also, right now you can use scipy.stats.rv_discrete to create your own
discrete random variable, but only over an array of integers--so
[1,2,3] rather than ['apple', 'orange', 'banana']. That's fine, but
it also means a lot of code duplication/wrapper classes for everyone
who wants their random variable to be over a space of fruits rather
than integers. Not sure how many people that affects, though.
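
The kind of wrapper I mean looks something like the sketch below.
LabeledDiscrete is a made-up name for illustration; it just maps
labels to integer codes and hands those to rv_discrete through its
values=(xk, pk) argument.

```python
import numpy as np
from scipy import stats

class LabeledDiscrete:
    """Hypothetical wrapper: a discrete distribution over arbitrary labels."""

    def __init__(self, labels, probs):
        self.labels = list(labels)
        codes = np.arange(len(self.labels))  # integer stand-ins for the labels
        self._rv = stats.rv_discrete(values=(codes, probs))

    def pmf(self, label):
        # Look up the integer code for the label, then defer to rv_discrete.
        return self._rv.pmf(self.labels.index(label))

    def rvs(self, size=1):
        # Sample integer codes and translate them back into labels.
        codes = self._rv.rvs(size=size)
        return [self.labels[int(c)] for c in codes]

fruit = LabeledDiscrete(["apple", "orange", "banana"], [0.5, 0.3, 0.2])
print(fruit.pmf("orange"), fruit.rvs(size=5))
```

Every project that wants labeled outcomes ends up carrying some
variant of this class around.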

Not sure if these belong on the roadmap or just as enhancement requests.

Thanks,
Chris


> Cheers,
> Ralf
>
>>
>> Has this been discussed elsewhere?
>>
>> On Sat, Sep 21, 2013 at 8:03 PM, Blake Griffith
>> <blake.a.griffith at gmail.com> wrote:
>> >
>> >> sparse
>> >> ``````
>> >>
>> >> Don't emulate np.matrix behavior, drop 2-D?
>> >
>> >
>> > What is meant by this? Emulate np.array instead?
>> >
>> > _______________________________________________
>> > SciPy-Dev mailing list
>> > SciPy-Dev at scipy.org
>> > http://mail.scipy.org/mailman/listinfo/scipy-dev
>> >
>