[SciPy-User] Proposal for a new data analysis toolbox

Mon Nov 22 11:06:56 EST 2010

On Mon, Nov 22, 2010 at 7:52 AM,  <josef.pktd at gmail.com> wrote:
> On Mon, Nov 22, 2010 at 10:35 AM, Keith Goodman <kwgoodman at gmail.com> wrote:

>> The function signatures for these are easy: we copy numpy, scipy. (I
>> am tempted to change nanstd from scipy's bias=False to ddof=0.)
>
> scipy.stats.nanstd is supposed to switch to ddof, so don't copy
> inconsistent signatures that are supposed to be deprecated.

Great, I'll use ddof then.
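
For reference, here is a minimal sketch of what a ddof-based signature could look like. The name nanstd_sketch is hypothetical and this is not the toolbox's implementation. As far as I understand it, scipy's bias=False corresponds to ddof=1 and bias=True to ddof=0, so a default of ddof=0 would match ndarray.std rather than scipy.stats.nanstd.

```python
import numpy as np

def nanstd_sketch(a, axis=None, ddof=0):
    """Standard deviation that ignores NaNs; the divisor is (n - ddof).

    Hypothetical helper for illustration only.
    """
    a = np.asarray(a, dtype=float)
    valid = ~np.isnan(a)
    n = valid.sum(axis=axis)                       # non-NaN count per slice
    total = np.where(valid, a, 0.0).sum(axis=axis, keepdims=True)
    mean = total / valid.sum(axis=axis, keepdims=True)
    # Squared deviations, with NaN positions contributing zero.
    sq_dev = np.where(valid, (a - mean) ** 2, 0.0).sum(axis=axis)
    return np.sqrt(sq_dev / (n - ddof))

x = np.array([1.0, np.nan, 3.0, 5.0])
print(nanstd_sketch(x, ddof=0))   # divisor n = 3
print(nanstd_sketch(x, ddof=1))   # divisor n - 1 = 2, so sqrt(8 / 2) = 2.0
```

Slices that contain only NaNs are not handled here; they would divide by zero.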

> I would like statistics (scipy.stats and statsmodels) to stick with
> default axis=0.

I put my dates on axis=-1. It is much faster:

>> a = np.random.rand(1000,1000)
>> timeit a.sum(0)
100 loops, best of 3: 9.01 ms per loop
>> timeit a.sum(1)
1000 loops, best of 3: 1.17 ms per loop
>> timeit a.std(0)
10 loops, best of 3: 27.2 ms per loop
>> timeit a.std(1)
100 loops, best of 3: 11.5 ms per loop

But I'd like the default axis to be what a numpy user would expect it to be.
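
The timing gap above is most likely a memory-layout effect: np.random.rand returns a C-ordered (row-major) array, so elements along the last axis are adjacent in memory. A small sketch to check the layout and rerun the comparison follows. The actual timings depend on the NumPy version and the hardware, so no expected numbers are shown.

```python
import numpy as np
from timeit import timeit

a = np.random.rand(1000, 1000)

# In a C-ordered float64 array the last axis has the smallest stride
# (8 bytes), so a reduction over axis=1 reads memory sequentially.
print(a.flags["C_CONTIGUOUS"], a.strides)

t_axis0 = timeit(lambda: a.sum(0), number=20) / 20
t_axis1 = timeit(lambda: a.sum(1), number=20) / 20
print("sum(0): %.3f ms per loop" % (t_axis0 * 1e3))
print("sum(1): %.3f ms per loop" % (t_axis1 * 1e3))

# A Fortran-ordered copy swaps the strides, so axis 0 becomes the
# contiguous one.
f = np.asfortranarray(a)
print(f.flags["F_CONTIGUOUS"], f.strides)
```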

> I would be in favor of axis=None for nan extended versions of numpy
> functions and axis=0 for stats functions as defaults, but since it
> will be a standalone package with wider usage, I will be able to keep
> track of axis=-1.

What default axis would a numpy/scipy user expect for mov_sum? group_mean?
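
To make the question concrete, here is a rough sketch of a moving sum with an explicit axis argument. The name mov_sum_sketch, the NaN padding at the start of each window, and the axis=-1 default are all assumptions for illustration, not a proposed API. Whatever default is chosen, callers can always pass axis explicitly.

```python
import numpy as np

def mov_sum_sketch(a, window, axis=-1):
    """Moving sum over `window` elements along `axis`.

    Hypothetical helper; the first window - 1 positions are NaN
    because no full window is available there.
    """
    a = np.moveaxis(np.asarray(a, dtype=float), axis, -1)
    c = np.cumsum(a, axis=-1)
    out = np.full(a.shape, np.nan)
    out[..., window - 1:] = c[..., window - 1:]
    out[..., window:] -= c[..., :-window]   # subtract the sum that left the window
    return np.moveaxis(out, -1, axis)

a = np.arange(6.0).reshape(2, 3)
print(mov_sum_sketch(a, 2, axis=-1))   # windows run along each row
print(mov_sum_sketch(a, 2, axis=0))    # windows run down each column
```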