Deprecate scipy.misc and Add scipy.datasets submodule
Hi all,
I just filed https://github.com/scipy/scipy/pull/15607 after an offline discussion with Ralf earlier this month. The summary of that PR is:
This is a good time to get rid of the `scipy.misc` submodule since it houses only a handful of functions. As of today, most of the functions have already been deprecated from the `misc` module, and they have generally moved to some other submodule. For the remaining dataset functions (ascent, face, ecg), it is best to create a `scipy.datasets` submodule that can be their home.
Adding a datasets submodule is not a new idea; Warren proposed it a couple of years ago (see https://github.com/scipy/scipy/pull/8707), but unfortunately it never went ahead. Adding this submodule is not only sensible from an organizational/structural point of view but will also help reduce the wheel size of the SciPy package. This is possible if we decouple the dataset files from the SciPy repository and use Pooch to download and cache datasets kept in separate repos. See the PR for more details. I also filed https://github.com/scipy/scipy/issues/15608.
Out of the 5 functions in `scipy.misc`, the three dataset functions can be moved to this datasets submodule and will slowly be deprecated from misc. The question about the other two functions, i.e. `derivative` and `central_diff_weights`, is up for discussion, and I'd like to understand what the best way to deprecate them would be. Should we move them to a submodule like integrate (keeping derivative makes sense to me)? On the other hand, since these two functions are not used extensively, one possibility is to deprecate both of them completely and remove them from SciPy.
Please share your opinions/concerns, and comment if you see a potential issue with this.
Cheers,
Anirudh
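For readers unfamiliar with the download-and-cache idea mentioned above, here is a minimal standard-library sketch of the pattern. It is not Pooch's API and not the code in the PR; the function names (`fetch_cached`, `sha256_of`) and the file name `example.dat` are hypothetical, and a local `file://` URL stands in for a remote dataset repo so the example runs offline.

```python
import hashlib
import tempfile
import urllib.request
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def fetch_cached(url: str, known_hash: str, cache_dir: Path) -> Path:
    """Download `url` into `cache_dir` unless a verified copy already exists."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / url.rsplit("/", 1)[-1]
    # Reuse the cached file only if its checksum still matches.
    if target.exists() and sha256_of(target) == known_hash:
        return target
    with urllib.request.urlopen(url) as response:
        data = response.read()
    # Refuse to cache data that does not match the registered checksum.
    if hashlib.sha256(data).hexdigest() != known_hash:
        raise ValueError(f"checksum mismatch for {url}")
    target.write_bytes(data)
    return target


# Offline demonstration: a local file stands in for a remote dataset repo.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "remote" / "example.dat"  # hypothetical dataset file
    source.parent.mkdir()
    source.write_bytes(b"\x00\x01\x02\x03")
    digest = sha256_of(source)

    cache = Path(tmp) / "cache"
    first = fetch_cached(source.as_uri(), digest, cache)
    source.unlink()  # the "remote" copy is gone; the cache must now suffice
    second = fetch_cached(source.as_uri(), digest, cache)
    demo_ok = first == second and second.read_bytes() == b"\x00\x01\x02\x03"
```

The point of the pattern is that the package only needs to ship a small table of file names and checksums, while the data itself lives elsewhere and is fetched on first use.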
On Wed, Feb 16, 2022 at 11:52 PM Anirudh Dagar <anirudhdagar6@gmail.com> wrote:
The question about the other two functions, i.e. `derivative` and `central_diff_weights`, is up for discussion, and I'd like to understand what the best way to deprecate them would be. Should we move them to a submodule like integrate (keeping derivative makes sense to me)? On the other hand, since these two functions are not used extensively, one possibility is to deprecate both of them completely and remove them from SciPy.
I would vote for removing them entirely.
If we were to keep them in SciPy, they might belong in scipy.optimize next to check_grad and approx_fprime. But I don't think these functions (as written) are very useful. They have obvious computational inefficiencies and very limited functionality. I would rather point users to a fully functioning library for finite differences like findiff: https://github.com/maroba/findiff
The implementation of these functions is ~50 lines of very straightforward NumPy code, so I can imagine moving them into something like a GitHub Gist or StackOverflow answer that users could copy & paste. This should resolve the backwards compatibility concerns when they are removed.
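To give a sense of how small such a copy & paste snippet can be, here is a sketch that computes central finite-difference weights by matching Taylor coefficients. It is not the SciPy implementation: it uses exact `fractions` arithmetic from the standard library instead of NumPy, and the name `central_weights` and its parameters are hypothetical.

```python
from fractions import Fraction
from math import factorial


def central_weights(num_points: int, deriv_order: int = 1) -> list[Fraction]:
    """Weights w_k such that sum_k w_k * f(x + k*h) / h**deriv_order
    approximates the deriv_order-th derivative of f at x.

    The stencil offsets are k = -m, ..., m with num_points = 2*m + 1.
    """
    if num_points % 2 == 0 or num_points <= deriv_order:
        raise ValueError("num_points must be odd and greater than deriv_order")
    m = num_points // 2
    offsets = range(-m, m + 1)
    # Taylor-matching conditions:
    #   sum_k w_k * k**j = j!  if j == deriv_order, else 0.
    rows = [
        [Fraction(k) ** j for k in offsets]
        + [Fraction(factorial(j)) if j == deriv_order else Fraction(0)]
        for j in range(num_points)
    ]
    # Gauss-Jordan elimination on the augmented matrix, in exact arithmetic.
    n = num_points
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    # After elimination the last column holds the solution.
    return [row[-1] for row in rows]


weights3 = central_weights(3)  # the familiar stencil [-1/2, 0, 1/2]
```

Exact fractions avoid any rounding in the weights themselves; a NumPy version would solve the same linear system in floating point.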
On Thu, Feb 17, 2022 at 5:40 PM Stephan Hoyer <shoyer@gmail.com> wrote:
I would vote for removing them entirely.
If we were to keep them in SciPy, they might belong in scipy.optimize next to check_grad and approx_fprime. But I don't think these functions (as written) are very useful. They have obvious computational inefficiencies and very limited functionality. I would rather point users to a fully functioning library for finite-differences like findiff: https://github.com/maroba/findiff
Thanks Stephan! I didn't hear about findiff before. Would you recommend it over https://github.com/pbrod/numdifftools?
The implementation of these functions is ~50 lines of very straightforward NumPy code, so I can imagine moving them into something like a GitHub Gist or StackOverflow answer that users could copy & paste. This should resolve the backwards compatibility concerns when they are removed.
I agree. Cheers, Ralf
On Tue, Feb 22, 2022 at 9:20 PM Ralf Gommers <ralf.gommers@gmail.com> wrote:
Thanks Stephan! I didn't hear about findiff before. Would you recommend it over https://github.com/pbrod/numdifftools?
I haven't used either of them, it just came up in a search for finite differences in Python.
On Wed, Feb 23, 2022 at 10:11 PM Stephan Hoyer <shoyer@gmail.com> wrote:
On Tue, Feb 22, 2022 at 9:20 PM Ralf Gommers <ralf.gommers@gmail.com> wrote:
Thanks Stephan! I didn't hear about findiff before. Would you recommend it over https://github.com/pbrod/numdifftools?
I haven't used either of them, it just came up in a search for finite differences in Python.
Okay, thanks Stephan. Both look good, so unless someone has practical experience and can make a recommendation for why one of these is preferred, we should probably list both in the deprecation notice. Cheers, Ralf
Hi all,
I started working on adding the *scipy.datasets* submodule and reviving the discussion around the deprecation of *scipy.misc* earlier this year with Ralf's help. Both PRs are ready for review, and I'd request maintainers to have a look at them and share their thoughts on GitHub. SciPy Datasets are implemented using pooch <https://github.com/fatiando/pooch>, but, as discussed, the PR doesn't add pooch as a new dependency.
1. Add scipy.datasets: https://github.com/scipy/scipy/pull/15607
2. Deprecate scipy.misc: https://github.com/scipy/scipy/pull/15901 (probably clear enough and doesn't need any discussion)
One thing still to be done, which is fairly easy to address, is to move the datasets with separate repos into the SciPy org and update the links <https://github.com/AnirudhDagar/scipy/blob/scipy-datasets/scipy/datasets/_re...> in the registry file. Currently, all of these are under https://github.com/scipy-datasets. Follow-ups will include removing the dataset files from the SciPy repo completely once the PR is approved and merged.
Just wanted to bring these updates to the mailing list. Please let me know on GitHub if you have any feedback. Thanks!
Best,
Anirudh
On Fri, Feb 25, 2022 at 3:58 PM Ralf Gommers <ralf.gommers@gmail.com> wrote:
Okay, thanks Stephan. Both look good, so unless someone has practical experience and can make a recommendation for why one of these is preferred, we should probably list both in the deprecation notice.
Cheers, Ralf
_______________________________________________ SciPy-Dev mailing list -- scipy-dev@python.org To unsubscribe send an email to scipy-dev-leave@python.org https://mail.python.org/mailman3/lists/scipy-dev.python.org/ Member address: anirudhdagar6@gmail.com
On Thu, Jul 7, 2022 at 8:58 PM Anirudh Dagar <anirudhdagar6@gmail.com> wrote:
Hi all,
I started working on adding *scipy.datasets* submodule and reviving the discussion around the deprecation of *scipy.misc* earlier this year with Ralf's help. Both the PRs are ready for review, and I'd request maintainers to have a look at them and share their thoughts on Github. SciPy Datasets are implemented using pooch <https://github.com/fatiando/pooch>, but the PR doesn't add pooch as a new dependency as discussed.
1. Add scipy.datasets: https://github.com/scipy/scipy/pull/15607
2. Deprecate scipy.misc: https://github.com/scipy/scipy/pull/15901 (probably clear enough and doesn't need any discussions)
Thanks for pushing this forward Anirudh! It looks about ready to go.
One thing to be done, fairly easy to address, is to move the datasets with separate repos into the SciPy org and update the links <https://github.com/AnirudhDagar/scipy/blob/scipy-datasets/scipy/datasets/_re...> in the registry file. Currently, all of these are under https://github.com/scipy-datasets. Some follow-ups will include getting rid of the dataset files from the SciPy repo completely once the PR is approved and merged.
Moving those repos into the SciPy GitHub org seems indeed preferred, to ensure we can reuse our normal permissions management workflow, and don't have to maintain duplicate sets of permissions. There will be quite a few repos over time, however given that they're named `dataset-xxx` I don't see an issue with that.
I plan to move these repos sometime next week. If anyone has a concern, please let me know.
Cheers, Ralf
On Sat, Jul 9, 2022 at 2:33 PM Ralf Gommers <ralf.gommers@gmail.com> wrote:
One thing to be done, fairly easy to address, is to move the datasets with separate repos into the SciPy org and update the links <https://github.com/AnirudhDagar/scipy/blob/scipy-datasets/scipy/datasets/_re...> in the registry file. Currently, all of these are under https://github.com/scipy-datasets. Some follow-ups will include getting rid of the dataset files from the SciPy repo completely once the PR is approved and merged.
Moving those repos into the SciPy GitHub org seems indeed preferred, to ensure we can reuse our normal permissions management workflow, and don't have to maintain duplicate sets of permissions. There will be quite a few repos over time, however given that they're named `dataset-xxx` I don't see an issue with that.
I plan to move these repos sometime next week. If anyone has a concern, please let me know.
Sounds good! I guess you are already the owner at https://github.com/scipy-datasets, so you should have all the rights to transfer the repo's ownership to SciPy. If not, feel free to create new ones in SciPy. Thanks, Anirudh
On Sat, Jul 9, 2022 at 4:02 PM Anirudh Dagar <anirudhdagar6@gmail.com> wrote:
On Sat, Jul 9, 2022 at 2:33 PM Ralf Gommers <ralf.gommers@gmail.com> wrote:
One thing to be done, fairly easy to address, is to move the datasets with separate repos into the SciPy org and update the links <https://github.com/AnirudhDagar/scipy/blob/scipy-datasets/scipy/datasets/_re...> in the registry file. Currently, all of these are under https://github.com/scipy-datasets. Some follow-ups will include getting rid of the dataset files from the SciPy repo completely once the PR is approved and merged.
Moving those repos into the SciPy GitHub org seems indeed preferred, to ensure we can reuse our normal permissions management workflow, and don't have to maintain duplicate sets of permissions. There will be quite a few repos over time, however given that they're named `dataset-xxx` I don't see an issue with that.
I plan to move these repos sometime next week. If anyone has a concern, please let me know.
Sounds good! I guess you are already the owner at https://github.com/scipy-datasets, so you should have all the rights to transfer the repo's ownership to SciPy. If not, feel free to create new ones in SciPy.
Thanks, Anirudh
This has all been completed now as proposed. `scipy.datasets` is a thing, and `scipy.misc` is deprecated. Thanks Anirudh for pushing this forward, and anyone else who helped get it merged! Cheers, Ralf
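For downstream code that has to run against SciPy releases from both before and after this change, one possible approach is to look a function up in the new location first and fall back to the old one. The helper below is a generic sketch using only `importlib`; the name `resolve_first` is hypothetical and nothing like it ships with SciPy.

```python
import importlib
from typing import Any, Sequence, Tuple


def resolve_first(candidates: Sequence[Tuple[str, str]]) -> Any:
    """Return the first attribute importable from the (module, name) pairs."""
    for module_name, attr in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # module missing in this installation; try the next one
        if hasattr(module, attr):
            return getattr(module, attr)
    raise ImportError(f"none of {list(candidates)} could be imported")


# Intended use (commented out because it needs SciPy installed):
# face = resolve_first([("scipy.datasets", "face"), ("scipy.misc", "face")])
```

Note that this only resolves the function; actually calling a `scipy.datasets` function may additionally need pooch installed and network access the first time the data is fetched.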
Hi all,
The first stage of adding the `scipy.datasets` submodule is done, as Ralf pointed out. Thanks to everyone involved! We now look forward to making some enhancements, perhaps adding more datasets that are important to the SciPy community. With pooch, it is now much easier to add a new dataset, since the dataset files no longer need to live inside the SciPy repo.
I've created a tracker issue for scipy.datasets <https://github.com/scipy/scipy/issues/16983> listing the next few to-do items. I'll get started on them after my vacation next week. Until then, it would be great if people here could share thoughts on datasets they would like to see in `scipy.datasets`, and point out any past requests for a dataset that could be helpful to the whole community. A few interesting/useful ones are already mentioned on the tracker, thanks to suggestions from Ralf.
Cheers,
Anirudh
On Mon, Sep 5, 2022 at 10:22 AM Ralf Gommers <ralf.gommers@gmail.com> wrote:
This has all been completed now as proposed. `scipy.datasets` is a thing, and `scipy.misc` is deprecated. Thanks Anirudh for pushing this forward, and anyone else who helped get it merged!
Cheers, Ralf
Ralf Gommers wrote:
On Wed, Feb 23, 2022 at 10:11 PM Stephan Hoyer shoyer@gmail.com wrote:
On Tue, Feb 22, 2022 at 9:20 PM Ralf Gommers ralf.gommers@gmail.com wrote:
If we were to keep them in SciPy, they might belong in scipy.optimize next to check_grad and approx_fprime. But I don't think these functions (as written) are very useful. They have obvious computational inefficiencies and very limited functionality. I would rather point users to a fully functioning library for finite-differences like findiff: https://github.com/maroba/findiff
Thanks Stephan! I didn't hear about findiff before. Would you recommend it over https://github.com/pbrod/numdifftools?
I haven't used either of them, it just came up in a search for finite differences in Python.
Okay, thanks Stephan. Both look good, so unless someone has practical experience and can make a recommendation for why one of these is preferred, we should probably list both in the deprecation notice.
Cheers, Ralf
Hi all,
I just came across the deprecation warning for the derivative function in scipy.misc. In my code, I need to be able to get the derivative of a complex mathematical function around a certain point. For me, scipy.misc.derivative does the job perfectly. I tried to switch to both findiff and numdifftools, but neither of them works the way scipy.misc.derivative does. They either require the mathematical function to be pre-evaluated into an array, or require the mathematical function to be real.
What I am going to do for my project is copy the SciPy code and store it locally. However, I would gladly import it from SciPy again if this function remains available in the library.
Thank you,
Mauro
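For anyone in a similar situation, a locally stored replacement can be quite short. The following is a minimal sketch, assuming a 3-point central difference is accurate enough and that "complex" here means complex-valued; it is not the SciPy code, and the name `central_derivative` and its parameters are hypothetical.

```python
import cmath


def central_derivative(func, x0, dx=1e-3, n=1):
    """Approximate the n-th derivative (n = 1 or 2) of `func` at `x0`
    with a 3-point central difference.

    `func` is called directly at the stencil points, so it may return
    complex values; only sums, differences and a division are involved.
    """
    if n == 1:
        return (func(x0 + dx) - func(x0 - dx)) / (2 * dx)
    if n == 2:
        return (func(x0 + dx) - 2 * func(x0) + func(x0 - dx)) / dx**2
    raise ValueError("this sketch only covers n = 1 and n = 2")


# f(x) = exp(i*x) is complex-valued for real x; its exact derivative
# is i*exp(i*x), which gives a reference value to compare against.
approx = central_derivative(lambda x: cmath.exp(1j * x), 0.5)
exact = 1j * cmath.exp(0.5j)
error = abs(approx - exact)
```

The step size `dx` trades truncation error against rounding error, so the default used here is only a starting point and may need tuning for a given function.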
participants (4)
- Anirudh Dagar
- mektigefjell2314@outlook.com
- Ralf Gommers
- Stephan Hoyer