
On Thu, Jan 17, 2013 at 2:32 PM, Alan G Isaac <alan.isaac@gmail.com> wrote:
> Is it really better to have `permute` and `permuted` than to add a keyword? (Note that these are actually still ambiguous, except by convention.)

The convention in question, though, is that of English grammar. In practice everyone who uses numpy is a more-or-less skilled English speaker in any case, so re-using the conventions is helpful!

"Shake the martini!" <- an imperative command

This is a complete statement all by itself. You can't say "Hand me the shake the martini". In procedural languages like Python, there's a strong distinction between statements (whole lines, a = 1), which only matter because of their side-effects, and expressions (a + b) which have a value and can be embedded into a larger statement or expression ((a + b) + c). "Shake the martini" is clearly a statement, not an expression, and therefore clearly has a side-effect.

"shaken martini" <- a noun phrase

Grammatically, this is like plain "martini", you can use it anywhere you can use a noun. "Hand me the martini", "Hand me the shaken martini". In programming terms, it's an expression, not a statement. And side-effecting expressions are poor style, because when you read procedural code, you know each statement contains at least 1 side-effect, and it's much easier to figure out what's going on if each statement contains *exactly* one side-effect, and it's the top-most operation.

This underlying readability guideline is actually baked much more deeply into Python than the sort/sorted distinction -- this is why in Python, 'a = 1' is *not* an expression, but a statement. C allows you to say things like "b = (a = 1)", but in Python you have to say "a = 1; b = a".
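
To make the sort/sorted distinction above concrete, here is a minimal sketch using only the builtins. The helper `is_valid_python` and the variable names are made up for illustration; the point is just that the imperative form returns None, the noun-phrase form returns a value, and nested plain assignment does not parse.

```python
import ast

values = [3, 1, 2]

# Noun-phrase form: an expression that yields a new list and leaves its input alone.
ordered = sorted(values)

# Imperative form: a statement used for its side effect; it returns None.
result = values.sort()


def is_valid_python(source):
    """Hypothetical helper: return True if `source` parses as Python code."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


# Plain assignment is a statement, so it cannot be nested inside an expression.
nested_ok = is_valid_python("b = (a = 1)")
sequential_ok = is_valid_python("a = 1; b = a")
```

Because `values.sort()` returns None, accidentally writing `values = values.sort()` fails loudly soon afterwards, which is arguably the convention doing its job.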

> Btw, two separate issues seem to be running side by side.
>
> i. should in-place operations return their result?
> ii. how can we signal that an operation is inplace?
>
> I expect NumPy to do inplace operations when feasible, so maybe they could take an `out` keyword with a None default. Possibly recognize `out=True` as asking for the original array object to be returned (mutated); `out='copy'` as asking for a copy to be created, operated upon, and returned; and `out=a` to ask for array `a` to be used for the output (without changing the original object, and with a return value of None).

Good point that numpy also has a nice convention with out= arguments for ufuncs. I guess that convention is, by default return a new array, but also allow one to modify the same (or another!) array in-place, by passing out=. So this would suggest that we'd have

  b = shuffled(a)
  shuffled(a, out=a)
  shuffled(a, out=b)
  shuffle(a)  # same as shuffled(a, out=a)

and if people are bothered by having both 'shuffled' and 'shuffle', then we drop 'shuffle'. (And the decision about whether to include the imperative form can be made on a case-by-case basis; having both shuffled and shuffle seems fine to me, but probably there are other cases where this is less clear.)

There is also an argument that if out= is given, then we should always return None, in general. I'm having a lot of trouble thinking of any situation where it would be acceptable style (or even useful) to write something like:

  c = np.add(a, b, out=a) + 1

But, 'out=' is very large and visible (which makes the readability less terrible than it could be). And np.add always returns the out array when working out-of-place (so there's at least a weak countervailing convention). So I feel much more strongly that shuffle() should return None, than I do that np.add(out=...) should return None.

A compromise position would be to make all new functions that take out= return None when out= is given, while leaving existing ufuncs and such as they are for now.

-n
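
The shuffled/shuffle pairing and the "return None when out= is given" compromise described above can be sketched roughly as follows. Both `shuffled` and `shuffle` here are hypothetical functions written for illustration (they are not numpy API), and the `rng` parameter is likewise an assumption added so the sketch is self-contained. The last lines show the existing ufunc behaviour for comparison.

```python
import numpy as np


def shuffled(a, out=None, rng=None):
    """Hypothetical noun-phrase form: return a shuffled copy, or fill `out`."""
    rng = np.random.default_rng() if rng is None else rng
    result = rng.permutation(a)  # a new array; `a` has not been modified here
    if out is None:
        return result
    out[...] = result  # write into the caller's array (which may be `a` itself)
    return None        # the compromise: new out= functions return None


def shuffle(a, rng=None):
    """Hypothetical imperative form: same as shuffled(a, out=a)."""
    shuffled(a, out=a, rng=rng)


a = np.arange(10)
b = shuffled(a)                # new array; `a` is unchanged
target = np.empty_like(a)
r1 = shuffled(a, out=target)   # fills `target`, returns None
r2 = shuffle(a)                # reorders `a` in place, returns None

x = np.ones(3)
y = np.add(x, x, out=x)        # existing ufunc convention: returns the out array
```

With this shape, something like `c = shuffled(a, out=a) + 1` raises a TypeError straight away instead of silently mixing a side-effect into an expression.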