Re: [SciPy-Dev] SciPy-Dev Digest, Vol 174, Issue 31
How much of the DATASETS issues could be handled simply by references in the documentation to where users can find those datasets that are generally considered both "standard" and potentially useful, without "physically" incorporating those datasets into SciPy? E.g., could the ECG dataset be handled that way?

"You won't find the right answers if you don't ask the right questions!" (Robert Helmbold, 2013)

On Saturday, April 28, 2018 11:42:46 PM MST, scipy-dev-request@python.org <scipy-dev-request@python.org> wrote:

Today's Topics:
1. Re: New subpackage: scipy.data (Ralf Gommers)
2. Re: New subpackage: scipy.data (Robert Kern)
3. Re: New subpackage: scipy.data (Ralf Gommers)

----------------------------------------------------------------------
Message: 1
Date: Sat, 28 Apr 2018 22:58:44 -0700
From: Ralf Gommers <ralf.gommers@gmail.com>
To: SciPy Developers List <scipy-dev@python.org>
Subject: Re: [SciPy-Dev] New subpackage: scipy.data
Message-ID: <CABL7CQjuAKrHbVwSEWXd_V1uzLV-=XbokG=ZokMDt354hTpszw@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

On Tue, Apr 3, 2018 at 1:06 AM, Daπid <davidmenhur@gmail.com> wrote:
On 31 March 2018 at 02:17, Ralf Gommers <ralf.gommers@gmail.com> wrote:
On Fri, Mar 30, 2018 at 12:03 PM, Eric Larson <larson.eric.d@gmail.com> wrote:
Top-level module for them alone sounds overkill, and I'm not sure if
discoverability alone is enough.
Fine by me. And if we follow the idea that these should be added sparingly, we can maintain discoverability without it growing out of hand by populating the See Also sections of each function.
I agree with this: the 2 images and 1 ECG signal (to be added) that we have don't justify a top-level module. We don't want to grow beyond the absolute minimum of datasets. The package is already very large, which is problematic in certain cases. E.g. numpy + scipy still fits within the AWS Lambda limit of 50 MB, but there's not much margin.
The biggest subpackage is sparse, and there most of the space is taken by _sparsetools.cpython-35m-x86_64-linux-gnu.so. According to size -A -d, the biggest sections are the debug ones. The same goes for the second biggest subpackage, special. Can it run without those sections? On preliminary checks, it seems that stripping .debug_info and .debug_loc trims the size down from 38 to 3.7 MB, and the test suite still passes.
Should work. That's a lot more gain than I'd realized. Given that we hardly ever get useful gdb tracebacks, it may be worth considering doing that for releases.
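To make the size discussion above easier to reproduce, here is a minimal sketch that totals the compiled extension modules per first-level subpackage of an installed package directory. The helper names (extension_sizes, strip_debug) are hypothetical and not part of SciPy. The strip_debug helper assumes GNU binutils strip is on PATH; note that --strip-debug removes all debug sections, which is broader than the two sections named above.

```python
import subprocess
from collections import Counter
from pathlib import Path

# Suffixes treated as compiled extension modules (an assumption of this sketch).
EXTENSION_SUFFIXES = (".so", ".pyd")


def extension_sizes(package_dir):
    """Return (subpackage, total_bytes) pairs for compiled extensions, largest first."""
    root = Path(package_dir)
    totals = Counter()
    for path in root.rglob("*"):
        if path.is_file() and path.suffix in EXTENSION_SUFFIXES:
            rel = path.relative_to(root)
            # Files directly in the package root are grouped under ".".
            subpackage = rel.parts[0] if len(rel.parts) > 1 else "."
            totals[subpackage] += path.stat().st_size
    return totals.most_common()


def strip_debug(path):
    """Remove debug sections in place (assumes GNU binutils `strip` is installed)."""
    subprocess.run(["strip", "--strip-debug", str(path)], check=True)
```

Calling extension_sizes() on the installed scipy directory before and after running strip_debug() on each extension would show how much of the total is debug information.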
If we really need to trim down the size for installing in things like Lambda, could we have a scipy-lite for production environments that is the same as scipy but without the unnecessary debug information? I imagine tracebacks would not be as informative, but that shouldn't matter for production environments. My first thought was to remove docstrings, comments, tests, and data, but maybe they don't amount to enough to be worth the trouble.
Recipes for such things are floating around, and it makes sense to do that. I'd rather not maintain an official scipy-lite package though; I'd prefer to just make choices within scipy that enable third parties to do that.
Ralf
On the topic at hand, I would agree to having a few small datasets to showcase functionality. I think a few kilobytes can go a long way for demonstrations and benchmarks. As far as I can see, a top-level module is free: it wouldn't add any maintenance burden, and would make them easier to find.
/David.
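On the question at the top of this message, whether datasets could be referenced rather than shipped: one way to picture it is a loader that downloads a file on first use, verifies a checksum, and caches it locally. This is only a sketch of the idea using the standard library. The function name fetch_dataset, the cache layout, and any URL passed to it are hypothetical, and nothing here is an existing SciPy API.

```python
import hashlib
import urllib.request
from pathlib import Path


def fetch_dataset(url, sha256, cache_dir):
    """Download `url` into `cache_dir` once, verifying its SHA-256 digest.

    Returns the path of the cached file. All names here are illustrative.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Name the cached file by its digest so the cache key is unambiguous.
    target = cache_dir / sha256
    if not target.exists():
        with urllib.request.urlopen(url) as response:
            data = response.read()
        digest = hashlib.sha256(data).hexdigest()
        if digest != sha256:
            raise ValueError(f"checksum mismatch for {url}: got {digest}")
        target.write_bytes(data)
    return target
```

The documentation would then only need to list the URL and expected digest for each dataset, at the cost of requiring network access the first time an example runs.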
_______________________________________________ SciPy-Dev mailing list SciPy-Dev@python.org https://mail.python.org/mailman/listinfo/scipy-dev
Participants (1): The Helmbolds