"eric jones"
writes: I think the consistency with Python is less of an issue than it seems. I wasn't aware that add.reduce(x) would generate the same results as the Python version of reduce(add,x) until Perry pointed it out to me. There are some inconsistencies between Python the language and Numeric because of the needs of the Numeric community. For instance, slices create views instead of copies as in Python. This was a correct break with consistency in a heavily used area of Python because of efficiency.
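To make the equivalence mentioned above concrete, here is a minimal pure-Python sketch. It is not Numeric itself: nested lists stand in for a 2-D array, and `add_rows` is a hypothetical helper playing the role of Numeric's elementwise `add`. Folding it over the rows with `reduce` collapses the first axis, which is the axis=0 behavior being discussed.

```python
from functools import reduce  # a builtin in the Python of this thread


def add_rows(a, b):
    """Elementwise sum of two equal-length rows (stand-in for Numeric's add)."""
    return [p + q for p, q in zip(a, b)]


x = [[1, 2, 3],
     [4, 5, 6]]

# Iterating over x yields its rows, so reduce() folds along the first axis.
column_sums = reduce(add_rows, x)
print(column_sums)  # [5, 7, 9]
```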
Ahh, a loaded example ;) I always thought that Numeric's view-slicing is a fairly problematic deviation from standard Python behavior and I'm not entirely sure why it needs to be done that way.
Couldn't one have both consistency *and* efficiency by implementing a copy-on-demand scheme (which is what MATLAB does, if I'm not entirely mistaken; a real copy is only created if either the original or the 'copy' is modified)?
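A toy illustration of the copy-on-demand idea, under loud assumptions: `CowList` and `lazy_copy` are hypothetical names invented for this sketch, it wraps a plain list rather than an array, and it does not handle slices (a real scheme would also track offsets and reference counts). It only shows the core mechanism: storage is shared until the first write on either side.

```python
class CowList:
    """Toy copy-on-write sequence: shares storage until the first write."""

    def __init__(self, data, shared=False):
        self._data = data
        self._shared = shared  # True while another object may share _data

    def lazy_copy(self):
        """Return a 'copy' that shares storage until either side writes."""
        self._shared = True
        return CowList(self._data, shared=True)

    def __getitem__(self, i):
        return self._data[i]

    def __setitem__(self, i, value):
        if self._shared:
            # First write: take a private copy so the other side is unaffected.
            self._data = list(self._data)
            self._shared = False
        self._data[i] = value


original = CowList([1, 2, 3])
duplicate = original.lazy_copy()          # no data copied yet
assert duplicate._data is original._data
duplicate[0] = 99                         # triggers the real copy
print(original[0], duplicate[0])          # 1 99
```

Without reference counting, the side that writes second also makes a (redundant) private copy; that costs time but not correctness.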
The current behavior seems problematic not just because it breaks consistency and hence user expectations; it also breaks code that is written with more pythonic sequences in mind (in a potentially hard to track down manner) and is, IMHO, generally undesirable and error-prone, for pretty much the same reasons that dynamic scope and global variables are generally undesirable and error-prone -- one can unwittingly create intricate interactions between remote parts of a program that can be very difficult to track down.
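The kind of breakage described above can be reproduced with the standard library alone. This is a sketch by analogy, not Numeric code: a `memoryview` over an `array.array` is used as a stand-in for an array type whose slices are views, and `normalized_head` is a hypothetical function written on the assumption that `seq[:n]` is a copy.

```python
from array import array


def normalized_head(seq, n):
    """Return the first n items scaled so the largest is 1.0.

    Written with list semantics in mind: seq[:n] is assumed to be a copy.
    """
    head = seq[:n]
    peak = max(head)
    for i in range(len(head)):
        head[i] = head[i] / peak
    return head


data_list = [2.0, 4.0, 8.0]
normalized_head(data_list, 2)
print(data_list)             # [2.0, 4.0, 8.0]  (caller's data untouched)

buf = array('d', [2.0, 4.0, 8.0])
normalized_head(memoryview(buf), 2)
print(buf.tolist())          # [0.5, 1.0, 8.0]  (caller's data silently modified)
```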
Obviously there *are* cases where one really wants a (partial) view of an existing array. It would seem to me, however, that these cases are exceedingly rare (in all my Numeric code I'm only aware of one instance where I actually want the aliasing behavior, so that I can manipulate a large array by manipulating its views and vice versa). Thus rather than being the default behavior, I'd rather see those cases accommodated by a special syntax that makes it explicit that an alias is desired and that care must be taken when modifying either the original or the view (e.g. one possible syntax would be ``aliased_vector = m.view[:,1]``). Again I think the current behavior is somewhat analogous to having variables declared in global (or dynamic) scope by default, which is not only error-prone, it also masks those cases where global (or dynamic) scope *is* actually desired and necessary.
Well, slices creating copies is definitely a bad idea (which is what I have heard proposed before) -- finite difference calculations (and others) would be very slow with this approach. Your copy-on-demand suggestion might work though. Its implementation would be more complex, but I don't think it would require cooperation from the Python core. It could be handled in the ufunc code. It would also require extension modules to make copies before they modified any values. Copy-on-demand doesn't really fit with Python's "assignments are references" approach to things though, does it? Using foo = bar in Python and then changing an element of foo will also change bar. So, I guess there would have to be a distinction made here. This adds a little more complexity. Personally, I like being able to pass views around because it allows for efficient implementations. The option to pass arrays into extension functions and edit them in-place is very nice. Copy-on-demand might allow for equal efficiency -- I'm not sure. I haven't found the current behavior very problematic in practice and haven't seen it be a major stumbling block to new users. I'm happy with the status quo on this. But, if copy-on-demand is truly efficient and didn't make extension writing a nightmare, I wouldn't complain about the change either. I have a feeling the implementers of numarray would though. :-) And talk about having to modify legacy code...
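The explicit-alias syntax suggested above (``m.view[:,1]``) could look roughly like the following. This is only a sketch under stated assumptions: `Vector` and `_ViewIndexer` are hypothetical classes, the example is 1-D for brevity (the suggestion in the thread is 2-D), and a `memoryview` over an `array.array` stands in for a real array view. Plain slicing copies; aliasing has to be requested through `.view`.

```python
from array import array


class _ViewIndexer:
    """Helper returned by Vector.view; slicing it aliases the parent's storage."""

    def __init__(self, buf):
        self._buf = buf

    def __getitem__(self, index):
        return memoryview(self._buf)[index]


class Vector:
    """Hypothetical 1-D container: slices copy, .view[...] aliases."""

    def __init__(self, values):
        self._buf = array('d', values)

    def __getitem__(self, index):
        result = self._buf[index]  # slicing an array.array makes a copy
        return result.tolist() if isinstance(index, slice) else result

    @property
    def view(self):
        return _ViewIndexer(self._buf)


v = Vector([1.0, 2.0, 3.0, 4.0])
copied = v[1:3]           # independent copy, as with a Python list
copied[0] = -1.0
aliased = v.view[1:3]     # explicit alias into v's storage
aliased[0] = 20.0
print(v[1], copied[0])    # 20.0 -1.0
```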
It might be that the problems associated with a copy-on-demand scheme outweigh the error-proneness and the interface breakage that the deviation from standard Python slicing behavior causes, but otherwise copying on slicing would be a backwards incompatibility in numarray I'd rather like to see (especially since one could easily add a view attribute to Numeric, for forwards-compatibility). I would also suspect that this would make it *a lot* easier to get numarray (or parts of it) into the core, but this is just a guess.
I think the two things Guido wants for inclusion of numarray are a consensus from our community on what we want, and (more importantly) a comprehensible code base. :-) If Numeric satisfied this 2nd condition, it might already be slated for inclusion... The 1st is never easy with such varied opinions -- I've about concluded that Konrad and I are anti-particles :-) -- but I hope it will happen.
I don't see choosing axis=-1 as a break with Python -- multi-dimensional arrays are inherently different and used differently than lists of lists in Python. Further, reduce() is a "corner" of the Python language that has been superseded by list comprehensions. Choosing an alternative [...]
Guido might nowadays think that adding reduce was a mistake, so in that sense it might be a "corner" of the Python language (although some people, including me, still rather like using reduce), but I can't see how you can generally replace reduce with anything but a loop. Could you give an example?
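For reference, a small sketch of the distinction at issue, using only the standard library (the variable names are illustrative): a list comprehension covers what map and filter do because the result is again a list, whereas collapsing a sequence to a single value needs reduce or an explicit loop.

```python
from functools import reduce
from operator import mul

values = [1, 2, 3, 4, 5]

# map and filter both produce a list, so a comprehension covers them.
squares_of_odds = [v * v for v in values if v % 2]

# reduce collapses a sequence to one value; the equivalent is an explicit loop.
product = 1
for v in values:
    product *= v

assert product == reduce(mul, values)
print(squares_of_odds, product)  # [1, 9, 25] 120
```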
You're right. You can't do it without a loop. List comprehensions only supersede filter and map since they always return a list. I think reduce is here to stay. And, like you, I would actually be disappointed to see it go (I like lambda too...). The point is that I wouldn't choose the definition of sum() or product() based on the behavior of Python's reduce operator. Hmmm. So I guess that is key -- it's really these *function* interfaces that I disagree with. So, how about add.reduce() keep axis=0 to match the behavior of Python, but sum() and friends default to axis=-1 to match the rest of the library functions? It does break with consistency across the library, so I think it is sub-optimal. However, the distinction is reasonably clear and much less likely to cause confusion. It also allows FFT and future modules (wavelets or whatever) to operate across the fastest axis by default while conforming to an intuitive standard. take() and friends would also become axis=-1 for consistency with all other functions. Would this be a reasonable compromise?
eric
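What the two defaults mean in practice can be shown with nested lists. This is a pure-Python sketch, not the Numeric API: `sum_axis` is a hypothetical helper limited to rectangular 2-D input, with axis=-1 chosen as its default purely to mirror the proposal above.

```python
def sum_axis(rows, axis=-1):
    """Sum a rectangular list of lists along axis 0 or the last axis (2-D only)."""
    if axis in (-1, 1):
        return [sum(row) for row in rows]          # collapse each row
    if axis == 0:
        return [sum(col) for col in zip(*rows)]    # collapse each column
    raise ValueError("axis must be 0, 1 or -1 for a 2-D input")


x = [[1, 2, 3],
     [4, 5, 6]]
print(sum_axis(x, axis=0))   # [5, 7, 9]  (what reduce over the rows gives)
print(sum_axis(x, axis=-1))  # [6, 15]    (sum along the last, fastest-varying axis)
```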
alex
--
Alexander Schmolck
Postgraduate Research Student
Department of Computer Science
University of Exeter
A.Schmolck@gmx.net
http://www.dcs.ex.ac.uk/people/aschmolc/