RE: [SciPy-dev] MCMC, Kalman Filtering, AI for SciPy?
From: scipy-dev-bounces@scipy.net on behalf of eric jones
Robert Kern wrote:
Charles Harris wrote:
[snip]
I agree that search and indexing are the best ways to find stuff, but I am mostly concerned as to where to commit stuff. Clustering, where does that go?
scipy.cluster I would imagine. ;-)
Lattice methods, where do they go? How about useful data structures or combinatorics? So on and so forth. I think the upper level GAMS categories cover sufficient range that most things can be put into a directory without embarrassment. As to the detailed breakdown in the GAMS sub-classifications, I am not so sure.
To make the discussion a bit more concrete, here is an example directory structure corresponding to the top-level GAMS classifications. The names are all my own, so feel free to pretend they are something more to your liking.
<snip hierarchy>
Now that I see it, it is somewhat appealing. I would probably want to break up some of those into two or more top-level groups. I definitely don't want to see too many subpackages under each of the top-level groups ("Flat is better than nested.").
Here is where the current SciPy modules would likely get lumped in the GAMS hierarchy.
scipy/
    analysis/
    numbertheory/
    functions/           special
    linalg/              linalg, sparse
    interpolation/       interpolate
    rootfinding/
    optimization/        optimize, ga
    calculus/            integrate
    diffeq/
    integraltransforms/  fftpack
    approximation/
    probstat/            stats
    simulation/
    datahandling/        io
    symbolic/
    geometry/
    graphics/            xplt, gplt, plt
    service/             gui_thread
    develop/
    other/               cow, cluster ??, signal ??
(Cluster and signal didn't fit anywhere obvious to me)
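For anyone who wants to play with the proposal, the listing above can be written down as a plain mapping. This is only a sketch: the names GAMS_LAYOUT and find_group are made up for illustration, and cluster and signal are filed under "other" purely because the listing marks them with "??".

```python
# Proposed GAMS-style top-level directory -> current SciPy modules,
# transcribed from the listing above. GAMS_LAYOUT is a hypothetical name.
GAMS_LAYOUT = {
    "analysis": [],
    "numbertheory": [],
    "functions": ["special"],
    "linalg": ["linalg", "sparse"],
    "interpolation": ["interpolate"],
    "rootfinding": [],
    "optimization": ["optimize", "ga"],
    "calculus": ["integrate"],
    "diffeq": [],
    "integraltransforms": ["fftpack"],
    "approximation": [],
    "probstat": ["stats"],
    "simulation": [],
    "datahandling": ["io"],
    "symbolic": [],
    "geometry": [],
    "graphics": ["xplt", "gplt", "plt"],
    "service": ["gui_thread"],
    "develop": [],
    # Placement of cluster and signal is uncertain ("??" in the listing).
    "other": ["cow", "cluster", "signal"],
}


def find_group(module):
    """Return the proposed top-level group for a current module, or None."""
    for group, modules in GAMS_LAYOUT.items():
        if module in modules:
            return group
    return None


# Groups that no current module maps into.
empty_groups = [group for group, modules in GAMS_LAYOUT.items() if not modules]

print(find_group("sparse"))
print(empty_groups)
```

Listing the empty groups this way shows at a glance which GAMS categories have no current SciPy module behind them.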
GAMS has clustering under probstat. The union/find algorithm could also go under data, where they have trees, etc. Lattice stuff goes into numbertheory. Service also contains the machine parameters, eps and such. The Remez algorithm goes into approximation (L_inf). Weave goes into develop. Hmm, signal processing is missing somewhere. Markov chains go into simulation. Gray codes and permutations go into other, although GAMS has permutations under data. We could split diffeq into ode and pde, but it is probably OK as is. Control is under optimization (for optimal control) but could be brought to the top. The addition of rootfinding is good. Convolutions and such are under integraltransforms; Haar transforms would go there also. Filters are under probstat in time series analysis, so it might make sense to create a signal (time series?) directory. probstat seems to be an overloaded GAMS category that could use some upper-level subdivision.
The naming conventions are often quite similar. The SciPy names are generally shorter which is nice for typing. Where SciPy has multiple packages [(linalg, sparse), (optimize, ga), etc.], it is likely a good idea. Like you, I don't want to see a deep nesting in the package structure.
Looking at this, I don't see any real reason to reorganize top level package names. Are any of them that bad or misleading? On the other hand, I do think we should reorganize the functions within them some to fix the places where they are organized based on "build" convenience instead of actual function. This will probably necessitate the addition of new top level groups and maybe the pruning of one of the current ones. I've made a Wiki page to keep suggestions that people have:
http://www.scipy.org/wikis/featurerequests/PackageReorganization
If you update the page, you might also post to python-dev so that people know to go check on the Wiki (that is so painful...). We can obviously also just discuss it here and then transfer to the Wiki later. [Side note: using both a wiki and a mailing list for communication is also a little painful.]
Fernando, could you give an example or two where you would want to replicate a function across sub-packages? I'm wary of doing so as there is already the enormous amount of replication with respect to, at least, the base Numeric functions. Try scipy.special.<tab> in IPython. I realize what you're proposing doesn't even come close to that, but I'd like an example in any case.
I don't like the replication idea very well. I think things should live in one place. Otherwise people will wonder if two functions that are actually the same have different purposes, implementation, etc.
And since we are talking about re-organization, is there anything we can do about the problem I just mentioned? It wreaks havoc with not only tab-completion but also automatic documentation generation [1]. Is it practical to be careful about what we import into __init__.py? By which I mean not doing "from foo import *" in __init__.py where foo.py does "from scipy_base import *". On the other hand, explicitly listing all of the names in special is gonna be a major pain and fragile to boot.
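To make the star-import problem concrete, here is a minimal sketch using throwaway in-memory modules. All module names (demo_base, foo_leaky, foo_tidy) and the helpers are hypothetical stand-ins, not SciPy code; demo_base plays the role of scipy_base. It compares what "from foo import *" re-exports with and without an `__all__` list in foo, which is one standard Python way to limit a star import without listing names in `__init__.py` itself.

```python
import sys
import types


def make_module(name, source):
    """Create an in-memory module from source text and register it."""
    module = types.ModuleType(name)
    sys.modules[name] = module
    exec(source, module.__dict__)
    return module


# Hypothetical stand-in for scipy_base: a pool of base functions.
make_module("demo_base", "def sin(x): return x\ndef cos(x): return x\n")

# foo_leaky star-imports the base names and adds one function of its own.
make_module(
    "foo_leaky",
    "from demo_base import *\ndef gamma(x): return x\n",
)

# foo_tidy does the same but declares which names it actually exports.
make_module(
    "foo_tidy",
    "from demo_base import *\n__all__ = ['gamma']\ndef gamma(x): return x\n",
)


def star_import(name):
    """Return the public names that 'from <name> import *' would bind."""
    namespace = {}
    exec(f"from {name} import *", namespace)
    return sorted(key for key in namespace if not key.startswith("_"))


leaky_names = star_import("foo_leaky")
tidy_names = star_import("foo_tidy")

print(leaky_names)  # base names leak through alongside gamma
print(tidy_names)   # only the name listed in __all__
```

The catch is the one already noted: `__all__` is an explicit list that has to be maintained by hand, so for a package as large as special it is still tedious and fragile.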
<snip> _______________________________________________ Scipy-dev mailing list Scipy-dev@scipy.net http://www.scipy.net/mailman/listinfo/scipy-dev
Charles Harris wrote:
We could split diffeq into ode, pde, but it is probably ok as is.
How about DDEs (delay differential equations)? http://fde.usaaa.ru/mirrors/www.cs.kuleuven.ac.be/~koen/delay/software.shtml
--
Dr.-Ing. Nils Wagner
Institut A für Mechanik
Universität Stuttgart
Pfaffenwaldring 9
D-70550 Stuttgart
Tel.: (+49) 0711 685 6262
Fax.: (+49) 0711 685 6282
E-mail: nwagner@mecha.uni-stuttgart.de
URL: http://www.mecha.uni-stuttgart.de
participants (2): Charles Harris, Nils Wagner