[SciPy-Dev] scipy.stats improvements

josef.pktd at gmail.com josef.pktd at gmail.com
Sun Mar 15 13:33:39 EDT 2015


On Sun, Mar 15, 2015 at 12:38 AM, Abraham Escalante <aeklant at gmail.com>
wrote:

> Hi Ralf, thanks for all the feedback.
>
> I have made some changes. You can find the second draft here:
> http://1drv.ms/1BFW6Pb
>
> I reckon that when it comes to the StatisticsCleanup issues, the schedule
> may change considering their varying scopes. However, I need to get the
> ball rolling with the community feedback since most of the issues don't
> have any. I also need to do my own work getting to know the functions more
> closely, which is the next step in my plan. Do you have any other
> suggestions?
>
> I provide an overview of the changes to the draft here for your
> convenience:
>
>
>> About the abstract and deliverables: I would state the overall goal as
>> "enhancement and addressing maintenance issues"
>>
>
> It did sound like more of a documentation project than a coding effort. I
> made a few changes and I hope it sounds more accurate now.
>

I would explicitly mention adding and checking unit tests.
I think some functions with insufficient test coverage should be verified,
if possible, against R or a similar package.
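A rough sketch of what such a verification test could look like. The
function name is made up for illustration, and the reference values here are
derived by hand from the closed-form t distribution with 4 degrees of
freedom; in a real test they would be pasted from the output of the external
package (for R, something like t.test(x, mu=2)).

```python
import numpy as np
from numpy.testing import assert_allclose
from scipy import stats


def test_ttest_1samp_against_reference():
    # Hypothetical test name; the data set is small enough to check by hand.
    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    popmean = 2.0

    # Reference statistic: mean = 3, sample std = sqrt(2.5), n = 5,
    # so t = (3 - 2) / sqrt(2.5 / 5) = sqrt(2).
    t_ref = np.sqrt(2.0)

    # Reference p-value from the closed-form CDF of Student's t with 4 df:
    # F(t) = 1/2 + 3/8 * u * (1 - u**2 / 12), with u = t / sqrt(1 + t**2 / 4).
    # In practice this number would come from R or a similar package.
    u = t_ref / np.sqrt(1.0 + t_ref**2 / 4.0)
    cdf_ref = 0.5 + 0.375 * u * (1.0 - u**2 / 12.0)
    p_ref = 2.0 * (1.0 - cdf_ref)

    t, p = stats.ttest_1samp(x, popmean)
    assert_allclose(t, t_ref, rtol=1e-12)
    assert_allclose(p, p_ref, rtol=1e-10)
```

The tolerances are a judgment call and would need to be loosened if the
reference values are only printed to a few digits.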



>
>> - the change to _chk_asarray gets too much attention I think, it's not
>> that big a deal (and effort will also be minor on the overall scale of
>> things).
>>
>
> I have removed some of the focus to it. It is also listed in the
> "community bonding" period because its purpose is to help me with the
> learning curve.
>
>
>
>> - you reserve separate time for PEP8 compliance, this should actually be
>> done at the moment you write any code. The TravisCI tests for Scipy will
>> check PEP8 automatically, so you can't even do it separately.
>>
>
> I've kept it as a deliverable because it is obviously required, but I
> removed it from the housekeeping buffer weeks.
>
>
>
>> - API changes for trimmed statistics functions will take longer than
>> other issues in StatisticsReview.
>>
>
> I moved the task to week 5. I also added a task at the "community bonding
> period" (although in reality this should start earlier and go along my
> learning curve) to make sure all the issues are defined in scope before the
> coding begins.
>
>
>> - ppcc_plot is already done in PR 4563, so doesn't need to be in your plan
>>
>
> Removed it and made a note at the deliverables section.
>
>
>
>> - making stats.mstats consistent with stats is also a larger job. I would
>> put it towards the end of your plan.
>>
>
> I moved this to the very end while keeping the last week as a buffer just
> in case this or any other tasks need some more work.
>
>
>> The other thing I recommend is to look at each function in your proposal,
>> and assess whether it just needs a few tweaks or a lot of work.
>>
>
> Agreed. This is basically what the scope definition task is meant to do
> and although it is listed to start at "community bonding" I plan to start
> right away.
>


Reviewing and fixing several of the functions on the list will not take
much time.

"Implement `alternative keyword` addition to all hypothesis tests"
This might be time-consuming or difficult for the hypothesis tests that are
not based on normal or t distributions, e.g. the KS tests, or essentially
impossible without writing new algorithms, e.g. fisher_exact, IIRC.
For normal- and t-based tests it is trivial once the pattern is
established; what remains is the decision on breaking backwards
compatibility (?!)

Another general enhancement that I would like to see, if there is time, is
a `missing` keyword for the functions, which could in a first stage just
delegate to the masked array functions.
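A minimal sketch of the delegation idea, using skew as the example. The
wrapper name and the option values "propagate" and "omit" are assumptions
made up for this illustration; the actual names would need to be decided.

```python
import numpy as np
from scipy import stats


def skew_with_missing(a, axis=0, missing="propagate"):
    """Hypothetical wrapper: skewness with optional handling of NaNs."""
    a = np.asarray(a, dtype=float)
    if missing == "propagate":
        # Current behaviour: NaNs in the input end up in the result.
        return stats.skew(a, axis=axis)
    if missing == "omit":
        # First stage: mask the invalid entries and delegate to mstats.
        masked = np.ma.masked_invalid(a)
        return stats.mstats.skew(masked, axis=axis)
    raise ValueError("missing must be 'propagate' or 'omit'")


x = [1.0, 2.0, np.nan, 3.0, 10.0]
with_nan = skew_with_missing(x)                  # NaN propagates
without_nan = skew_with_missing(x, missing="omit")  # NaN is ignored
```

Note that the mstats functions return masked arrays, so a second stage
would probably need to decide what return type the plain functions give.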

Josef


>
>
> Cheers,
> Abraham.
>