
[sorry for replying so late, an almost finished email got lost in a computer accident and I was rather busy.]

Konrad Hinsen <hinsen@cnrs-orleans.fr> writes:
Wouldn't an (almost) automatic solution be to simply replace (almost) all instances of a[b:c] with a.view[b:c] in your legacy code? Even for unusual
That would convert all slicing operations, even those working on strings, lists, and user-defined sequence-type objects.
Well, that's where the "(almost)" comes in ;) If you can tell at a glance for most instances in your code whether the ``foo`` in ``foo[a:b]`` is an array, then running a query-replace isn't that much trouble. Of course this might not be true. But the question really is: to what extent would it be more difficult to tell than what you already need to find out in all the other situations where code needs changing because of the incompatibilities numarray already introduces? (I think I have, for example, already found a slicing incompatibility -- unfortunately the list of the issues I hit upon so far has disappeared somewhere, so I'll have to try to reconstruct it sometime...)

If the answer is "not much", then you would have to regard these incompatibilities as even less acceptable than the introduction of copy-slicing semantics (because, as you've already agreed, these incompatibilities don't confer the same benefit), or otherwise it would be difficult to see why copy-slicing shouldn't be introduced as well.

View semantics have always bothered me, but if it weren't for the fact that numarray is going to cause me not inconsiderable inconvenience through various incompatibilities anyway, I would have been satisfied with the status quo. As things are, however, I must admit I feel a strong temptation to get this fixed as well, especially as most of the other laudable improvements of numarray (much nicer C code base, better handling of byteswapped data and very large arrays, etc.) wouldn't seem to be of great importance to me personally at the moment. So I fully admit to a selfish desire for either more gain or less pain (incompatibility), or maybe even a bit of both.
Of course I don't think these subjective desires of mine are a good standard to go by, but I am convinced that offering attractive improvements or few compatibility problems (or both) to the widest possible audience of current Numeric users is important in order to replace Numeric, quickly and cleanly, without any splitting.
autoconvert by inserting ``if type(foo) == ArrayType:...``, although
typechecks for every slicing or indexing operation (a[0] generates a view as well for a multidimensional array). Guaranteed to render most code unreadable, and of course slow down execution.
A further challenge for your code convertor:
f(a[0], b[2:3], c[-1, 1])
That makes eight type combination cases.
I'd say 4 (since c[-1,1] can't be a list) but that is beside the point. This was mainly intended as a demonstration that you *can* do it automatically, if you really need to. A function call would help the readability but obviously be even more inefficient. If I really had large amounts of code that needed that conversion, I'd be tempted to write such a function with an additional twist: have it monitor the input argument type whenever the program is run and if it isn't an array, the wrapping in this particular line can be discarded (with less confidence, if it always seems to be an array it could be converted into ``a.view[b:c]``, but that might need additional checking). In code that isn't reached, the wrapper just stays forever. I've always been looking for an excuse to write some self-modifying code :)
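The monitoring wrapper described above could be sketched roughly as follows. Everything here is made up for illustration: ``ToyArray`` is only a stand-in for a real array type (it borrows ``bytearray`` and ``memoryview`` from the standard library to get copy and view behaviour), its ``view`` attribute merely imitates the ``a.view[b:c]`` spelling proposed in this thread, and ``legacy_slice``, ``seen_types`` and ``removable_wrappers`` are hypothetical names. It does not rewrite any source code; it only records what it sees.

```python
from collections import defaultdict


class ToyArray:
    """Hypothetical stand-in for an array: plain slices copy, ``.view`` slices alias."""

    def __init__(self, values):
        self._buf = bytearray(values)

    def __getitem__(self, index):
        return self._buf[index]  # a slice of a bytearray is a fresh copy

    @property
    def view(self):
        return memoryview(self._buf)  # slices of a memoryview share the buffer


# call-site label -> names of the argument types observed there
seen_types = defaultdict(set)


def legacy_slice(obj, index, site):
    """Index ``obj`` with view semantics if it is an array, normally otherwise."""
    seen_types[site].add(type(obj).__name__)
    if isinstance(obj, ToyArray):
        return obj.view[index]
    return obj[index]


def removable_wrappers():
    """Call sites that have never seen an array, so the wrapper could be dropped."""
    return sorted(site for site, names in seen_types.items()
                  if "ToyArray" not in names)
```

A converted line would then read something like ``legacy_slice(foo, slice(a, b), "module.py:42")`` in place of ``foo[a:b]``. After enough runs, the sites listed by ``removable_wrappers()`` are candidates for changing back to plain slicing, while sites that were never reached keep their wrapper.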
Well, AFAIK there are actually three mutable sequence types in python core and all have copy-slicing behavior: list, UserList and array:
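A quick way to check the copy-slicing claim for these three types is sketched below. It is written for a current Python, where ``UserList`` is imported from ``collections`` rather than from a module of its own; ``slice_is_copy`` is a hypothetical helper name.

```python
from array import array
from collections import UserList


def slice_is_copy(seq):
    """Return True if mutating seq[0:2] leaves seq itself unchanged."""
    before = list(seq)
    part = seq[0:2]
    part[0] = 99  # modify the slice only
    return list(seq) == before


samples = [[1, 2, 3], UserList([1, 2, 3]), array("i", [1, 2, 3])]
results = {type(s).__name__: slice_is_copy(s) for s in samples}
```

Under these assumptions ``results`` maps each of ``list``, ``UserList`` and ``array`` to ``True``, i.e. a slice of any of them is independent of the original.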
UserList is not an independent type, it is merely a subclassable wrapper around lists. As for the array module, I haven't seen any code that uses it.
It is AFAIK the only way to work efficiently with large strings, so I guess it is important, although I agree that it is not used that often.
I would suppose that in the grand scheme of things numarray.array is intended as an eventual replacement for array.array, or not?
In the interest of those who rely on the current array module, I hope not.
As long as array is kept around for backwards-compatibility, why not? [...]
But reliability to me also includes the ability for growth -- I not only want my old code to work in a couple of years, I also want the tool I wrote it in to remain competitive and this can conflict with backwards-compatibility. I
In what way does the current slicing behaviour render your code non-competitive?
A single design decision obviously doesn't have such a huge negative impact that it immediately renders all your code non-competitive; unless it was a *really* bad design decision, it just means more bugs and less clear and less general code. But language warts are more like tumours: they grow over the years and become increasingly difficult to excise (just look at what a tremendous redesign effort the Perl people are going through at the moment). The closer warts come to the core language the worse, and since numarray aims for inclusion I think it must be held to a higher standard than other modules that don't.
like the balance python strikes here so far -- the language has
Me too. But there haven't been any incompatible changes in the documented core language, and only very few in the standard library (the to-be-abandoned re module comes to mind - anything else?).
I don't think this is true (and the documented core language is not necessarily a good standard to go by as far as python is concerned, because not quite everything one has to rely upon is actually documented; instead one can find things like: "XXX Can't be bothered to spell this out right now..."). Among the incompatible changes that I would strongly assume *were* documented before and after are: exceptions (strings -> classes), automatic conversion of ints to longs (instead of an exception) and the new division rules, whose stepwise introduction has already started. There are also quite a few things that used to work for all classes but no longer work with new-style classes, some of which can be quite annoying (you lose quite a bit of introspective and interactive power), but I'm not sure to what extent they were documented.
For a bad example, see the Python XML package(s). Lots of changes, incompatibilities between parsers, etc. The one decision I really regret is to have chosen an XML-based solution for documentation. Now I spend two days at every new release of my stuff to adapt the XML code to the fashion of the day.
I didn't do much xml processing, but as far as I can remember I was happy with 4suite: http://4suite.org/index.xhtml.
It is almost ironic that I appear here as the great anti-change advocate, since on many other occasions I have argued for improvement over excessive compatibility. Basically I favour motivated incompatible
I don't think a particularly conservative character is necessary to fill that role :) You've got a big code base, which automatically reduces the desire for incompatibilities because you have to pay a hefty cost that is difficult to offset by potential advantages for future code. But that side of the argument is clearly important and I think even if you don't like to be an anti-change advocate you still often make valuable points against changes you perceive as uncalled for.

alex

--
Alexander Schmolck
Postgraduate Research Student
Department of Computer Science
University of Exeter
A.Schmolck@gmx.net
http://www.dcs.ex.ac.uk/people/aschmolc/