[ I'm new here and this has the feel of an FAQ but I couldn't find anything at http://www.scipy.org/FAQ . If I should have looked somewhere else a URL would be gratefully received. ] What's the reasoning behind functions like sum() and cumsum() being provided both as module functions (numpy.sum(data, axis=1)) and as object methods (data.sum(axis=1)) but other functions - and I stumbled over diff() - only being provided as module functions?
>>> import numpy
>>> print numpy.__version__
1.1.0
>>> data = numpy.array([[1,2,3],[4,5,6]])
>>> numpy.sum(data,axis=1)
array([ 6, 15])
>>> data.sum(axis=1)
array([ 6, 15])
>>> numpy.diff(data,axis=1)
array([[1, 1],
       [1, 1]])
>>> data.diff(axis=1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'numpy.ndarray' object has no attribute 'diff'
On Mon, Jun 23, 2008 at 10:31 AM, Bob Dowling wrote:
> [ I'm new here and this has the feel of an FAQ but I couldn't find anything at http://www.scipy.org/FAQ . If I should have looked somewhere else a URL would be gratefully received. ]
>
> What's the reasoning behind functions like sum() and cumsum() being provided both as module functions (numpy.sum(data, axis=1)) and as object methods (data.sum(axis=1)) but other functions - and I stumbled over diff() - only being provided as module functions?
Hi Bob,

this is a very good question. I think the answers are:
a) historical reasons AND, more importantly, differing personal preferences;
b) I would file the missing data.diff() as a bug.

There are many inconsistencies left in a project as big as numpy. Filing bugs might be the best way of keeping track of them and getting them fixed eventually.

(A much more dangerous example is numpy.resize and data.resize, which do (slightly) different things!)

Others, please correct my .....

Welcome to the list.

- Sebastian Haase
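To make the numpy.resize / data.resize remark concrete, here is a minimal sketch. It assumes a reasonably recent NumPy imported as np; the variable names are made up for this illustration, and refcheck=False is passed only so the in-place call does not fail when something else (an interactive shell, for instance) still holds a reference to the array.

```python
import numpy as np

a = np.array([1, 2, 3])

# Function form: returns a NEW array and fills the extra slots by
# repeating the original contents. The input is left untouched.
grown_copy = np.resize(a, 5)

# Method form: resizes the array IN PLACE, returns None, and pads the
# extra slots with zeros.
b = np.array([1, 2, 3])
returned = b.resize(5, refcheck=False)

print(grown_copy.tolist())  # [1, 2, 3, 1, 2]
print(a.tolist())           # [1, 2, 3]
print(b.tolist())           # [1, 2, 3, 0, 0]
print(returned)             # None
```

So the two spellings differ both in what they return and in how the new elements are filled, which is why swapping one for the other can silently change results.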
On Mon, Jun 23, 2008 at 18:10, Sebastian Haase wrote:
> Hi Bob, this is a very good question. I think the answers are a) historical reasons AND, more importantly, differing personal preferences b) I would file the missing data.diff() as a bug.
It's not.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
  -- Umberto Eco
Robert Kern wrote:
> On Mon, Jun 23, 2008 at 18:10, Sebastian Haase wrote:
>> Hi Bob, this is a very good question. I think the answers are a) historical reasons AND, more importantly, differing personal preferences b) I would file the missing data.diff() as a bug.
>
> It's not.
Care to elaborate?

--
Ryan May
Graduate Research Assistant
School of Meteorology
University of Oklahoma
On Mon, Jun 23, 2008 at 19:35, Ryan May wrote:
> Robert Kern wrote:
>> It's not.
>
> Care to elaborate?
There is not supposed to be a one-to-one correspondence between the functions in numpy and the methods on an ndarray. There is some duplication between the two, but that is not a reason to make more duplication.

--
Robert Kern
> There is not supposed to be a one-to-one correspondence between the functions in numpy and the methods on an ndarray. There is some duplication between the two, but that is not a reason to make more duplication.

I would make a plea for consistency, to start with.

Those of us who write in an OO style are required to switch backwards and forwards between OO and not-OO, or to abandon OO altogether in our NumPy code. Neither is an attractive option.

The reason I tripped over this is that I am currently writing a course which introduces students to NumPy. I am going to be asked this question from the audience. As yet I don't have any answer except "history".
2008/6/24 Bob Dowling:
>> There is not supposed to be a one-to-one correspondence between the functions in numpy and the methods on an ndarray. There is some duplication between the two, but that is not a reason to make more duplication.
>
> I would make a plea for consistency, to start with.
>
> Those of us who write in an OO style are required to switch backwards and forwards between OO and not-OO, or to abandon OO altogether in our NumPy code. Neither is an attractive option.
>
> The reason I tripped over this is that I am currently writing a course which introduces students to NumPy. I am going to be asked this question from the audience. As yet I don't have any answer except "history".
As a rule, I personally avoid methods whenever reasonable. The primary reason is that the function versions generally work fine on lists, giving my functions some extra genericity for free. I generally make an exception only for attributes - X.shape, for example, rather than np.shape(X).

It's true, it's a mess, but I'd just set down some simple rules that work, and mention that some functions exist in other forms. There's something to be said for rationalizing numpy's rules - or at least writing them down! - but there's no need to use every version of every function. And I believe they're all accessible as module-level functions.

Anne
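The point about lists can be shown with a short sketch. It reuses the data from the original question but keeps it as a plain nested list; the variable names are invented for the example.

```python
import numpy as np

rows = [[1, 2, 3], [4, 5, 6]]  # a plain nested list, not an ndarray

# Module-level functions convert their argument to an array first,
# so they accept lists directly.
row_sums = np.sum(rows, axis=1)
row_diffs = np.diff(rows, axis=1)
list_shape = np.shape(rows)

# The method spelling needs an ndarray to begin with: a list has no .sum.
has_sum_method = hasattr(rows, "sum")

print(row_sums.tolist())   # [6, 15]
print(row_diffs.tolist())  # [[1, 1], [1, 1]]
print(list_shape)          # (2, 3)
print(has_sum_method)      # False
```

A helper written with the function spellings therefore works for callers who pass lists as well as for those who pass arrays.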
Hi Bob
2008/6/24 Bob Dowling:
> I would make a plea for consistency, to start with.
>
> Those of us who write in an OO style are required to switch backwards and forwards between OO and not-OO, or to abandon OO altogether in our NumPy code. Neither is an attractive option.
There are an infinite number of operations to be performed on arrays, so the question becomes: which of those belong as members of the class? In my opinion, none do; we have them simply for backward compatibility.

In general, my feeling (not the status quo) is to:

a) Use array methods for in-place operations and operations pertaining specifically to ndarrays. This would include `sort`, but not `sum` or `dump`.

b) Use numpy functions for operations that copy the object, or do calculations that yield new objects.

Even if you subscribe to such a rule, having x.sum() at hand is convenient, so many people use it. There's bound to be a big outcry if we try to remove them now. I'm not even sure most people would agree with these guidelines.

Regards
Stéfan
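The in-place versus copying distinction in guidelines a) and b) already shows up in how `sort` behaves today. The sketch below only illustrates that existing behaviour; it is not a statement of NumPy policy, and the variable names are made up.

```python
import numpy as np

x = np.array([3, 1, 2])

# Function spelling: leaves x alone and returns a sorted copy.
sorted_copy = np.sort(x)

# Method spelling: sorts the array in place and returns None.
y = np.array([3, 1, 2])
result = y.sort()

print(sorted_copy.tolist())  # [1, 2, 3]
print(x.tolist())            # [3, 1, 2]
print(y.tolist())            # [1, 2, 3]
print(result)                # None
```

Note that `sum` does not follow this split: both np.sum(x) and x.sum() return a new value, which is the duplication the thread is about.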
On Tue, Jun 24, 2008 at 02:33, Bob Dowling wrote:
>> There is not supposed to be a one-to-one correspondence between the functions in numpy and the methods on an ndarray. There is some duplication between the two, but that is not a reason to make more duplication.
>
> I would make a plea for consistency, to start with.

It's way too late to make changes like this.

> Those of us who write in an OO style are required to switch backwards and forwards between OO and not-OO, or to abandon OO altogether in our NumPy code. Neither is an attractive option.

OO does not mean "always use methods."

> The reason I tripped over this is that I am currently writing a course which introduces students to NumPy. I am going to be asked this question from the audience. As yet I don't have any answer except "history".
Well, "history," usually along with "it seemed like a good idea at the time," are valid reasons for things to continue to exist in any nontrivial software project with a userbase. Your students will need to learn this if they use software.

If you want a slightly better answer, the implementation of many of the C functions was somewhat easier to do as methods on ndarray than as separate functions, particularly since numpy.ndarray has subclasses. The functions could then be implemented similar to the following:

    def myfunc(a):
        return asanyarray(a).myfunc()

One thing you will notice about numpy.diff() is that it is a pure Python function rather than a C function, so it's certainly not going to be a method on ndarray.

--
Robert Kern
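The delegation pattern described above can be tried out with an existing method. The following is a sketch only: cumsum_via_method is a hypothetical name chosen for this illustration, and it is not claimed to be how numpy.cumsum is actually written. It uses numpy.ma's masked arrays as a convenient example of an ndarray subclass.

```python
import numpy as np

def cumsum_via_method(a):
    """Hypothetical wrapper: hand the work to the array's own cumsum method."""
    # asanyarray converts lists to ndarray but passes ndarray subclasses
    # through unchanged, so a subclass's own method gets called.
    return np.asanyarray(a).cumsum()

# A plain list is converted, and the result is an ordinary ndarray.
from_list = cumsum_via_method([1, 2, 3])

# A masked array is an ndarray subclass; it stays a masked array.
masked = np.ma.array([1, 2, 3], mask=[False, True, False])
from_masked = cumsum_via_method(masked)

print(from_list.tolist())                          # [1, 3, 6]
print(type(from_list).__name__)                    # ndarray
print(isinstance(from_masked, np.ma.MaskedArray))  # True
```

Because the function just forwards to the method, one short wrapper serves lists, plain arrays, and subclasses alike, which is the implementation convenience being described.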
On Tue, Jun 24, 2008 at 3:11 AM, Robert Kern wrote:
> There is not supposed to be a one-to-one correspondence between the functions in numpy and the methods on an ndarray. There is some duplication between the two, but that is not a reason to make more duplication.
Are you saying the duplication is "just random"? It would be better -- as in: principle of minimum surprise -- if there were some sort of "reasonable set" of duplicates. If there are only a handful of functions missing, why not try to make it complete?

(Again, a list of functions vs. methods on the wiki would clarify what we are talking about.)

Just thinking out loud, of course. Don't take this as an offense.

-Sebastian
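A rough version of such a functions-vs-methods list can be generated by introspection. This is a sketch under stated assumptions: it matches by name only, so a shared name does not guarantee identical behaviour, and the exact lists depend on the installed NumPy version. All variable names are invented for the example.

```python
import numpy as np

# Public callables on ndarray, i.e. its methods (attributes such as
# shape or T are not callable and are skipped).
method_names = [
    name for name in dir(np.ndarray)
    if not name.startswith("_") and callable(getattr(np.ndarray, name))
]

# Methods that also exist as a module-level callable of the same name.
also_functions = [
    name for name in method_names if callable(getattr(np, name, None))
]

# Methods with no same-named callable in the numpy namespace.
method_only = [name for name in method_names if name not in also_functions]

print(also_functions)
print(method_only)
print("diff" in method_names, hasattr(np, "diff"))  # False True
```

The last line reproduces the original observation: diff exists as numpy.diff but not as an ndarray method.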
Sebastian Haase wrote:
> Are you saying the duplication is "just random" ? It would be better -- as in: principle of minimum surprise -- if there would be some sort "reasonable set" of duplicates ....

Yes, it would be better. But how do you do it without breaking other people's code and while avoiding duplication? That's a trade-off: if we remove, say, numpy.sum, people will complain that numpy.sum does not exist. Adding every function as a method on numpy arrays, just to be able to say a.foo instead of foo(a), is not that great either. I think we can spend our time on more interesting/important problems.

> If there are only a handful functions missing, why not try to make it complete.

Duplication is bad, and should be avoided as much as possible. a.foo vs foo(a) does not sound like a good case to introduce more duplication to me.

cheers,
David
On Tue, Jun 24, 2008 at 02:43, Sebastian Haase wrote:
> On Tue, Jun 24, 2008 at 3:11 AM, Robert Kern wrote:
>> There is not supposed to be a one-to-one correspondence between the functions in numpy and the methods on an ndarray. There is some duplication between the two, but that is not a reason to make more duplication.
>
> Are you saying the duplication is "just random" ?
No. If you want a clearer (but still imperfect) dividing line, it is that all of the methods are implemented in C. numpy.diff(), for example, is not. A lot of the C functions in Numeric's core (but not FFT, SVD, etc.) got moved into methods, partly for implementation reasons (ndarray is subclassable now, so methods are easier to make generic), and partly for "it seemed like a good idea at the time." We couldn't remove the functions then, if we wanted any kind of continuity with Numeric, and we certainly can't now.

> It would be better -- as in: principle of minimum surprise -- if there would be some sort "reasonable set" of duplicates .... If there are only a handful functions missing, why not try to make it complete.

There aren't.

> ( Again, a list of functions vs. methods on the wiki would clarify what we are talking about ....)

Go ahead.

> Just thinking loudly of course. Don't take this as an offense .....

It's not that you're being offensive, but you are kicking up dust on an old argument that was settled before 1.0, when it actually mattered.

--
Robert Kern
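Since numpy.diff() is described in this thread as pure Python, its core idea can be sketched with ordinary slicing. The code below is a simplified first-order version written for illustration, not NumPy's actual source; the name first_difference is made up, and it assumes an input with at least one dimension.

```python
import numpy as np

def first_difference(a, axis=-1):
    """Hypothetical first-order difference along one axis, via slicing."""
    a = np.asanyarray(a)
    # Build two index tuples that select everything, except along `axis`,
    # where one takes elements 1..end and the other takes 0..end-1.
    upper = [slice(None)] * a.ndim
    lower = [slice(None)] * a.ndim
    upper[axis] = slice(1, None)
    lower[axis] = slice(None, -1)
    return a[tuple(upper)] - a[tuple(lower)]

data = np.array([[1, 2, 3], [4, 5, 6]])
print(first_difference(data, axis=1).tolist())  # [[1, 1], [1, 1]]
print(np.array_equal(first_difference(data, axis=0), np.diff(data, axis=0)))  # True
```

Nothing here needs access to the array's internals, which fits the explanation that such functions live at module level rather than as C-implemented methods.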
On Tue, Jun 24, 2008 at 7:40 PM, Robert Kern wrote:
> It's not that you're being offensive, but you are kicking up dust on an old argument that was settled before 1.0, when it actually mattered.
Just for the record: I like it the way it is. I did follow the discussion at the time, and while there was a real (historical) need to keep functions, there were arguments for supporting two paradigms and thus also having methods: "Numpy serves both people coming from the Matlab/IDL community and people coming from (strict / method-based) OO programming".

Regarding the OP: if he stumbled over this, like probably many other "newcomers", it should go into the FAQ section (at http://www.scipy.org/FAQ ) ("For implementation reasons some operations are only available as functions or as methods respectively").

Cheers,
- Sebastian Haase
Participants (7)
- Anne Archibald
- Bob Dowling
- David Cournapeau
- Robert Kern
- Ryan May
- Sebastian Haase
- Stéfan van der Walt