Copy-on-demand requires the maintenance of a global list of all the active views associated with a particular array buffer. Here is a simple example:
>>> a = zeros((5000,5000))
>>> b = a[49:51,50]
>>> c = a[51:53,50]
>>> a[50,50] = 1
The assignment to a[50,50] must trigger a copy of the array b; otherwise b also changes. On the other hand, array c does not need to be copied since its view does not include element 50,50. You could instead copy the array a -- but that means copying a 100 Mbyte array while leaving the original around (since b and c are still using it) -- not a good idea!
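To make the bookkeeping concrete, here is a minimal sketch of the "copy only the affected views" policy. It is not numarray code: the names Buffer, LazyView, view, detach and tolist are hypothetical, the buffer is a flat Python list rather than a 2-D array, and views are plain index ranges. The registry holds weak references so that a view which has been garbage collected drops out of the list on its own.

```python
import weakref


class Buffer:
    """Hypothetical shared 1-D buffer that tracks its live views."""

    def __init__(self, size):
        self.data = [0] * size
        # Weak references: a view that is garbage collected leaves the registry.
        self.views = weakref.WeakSet()

    def view(self, start, stop):
        v = LazyView(self, start, stop)
        self.views.add(v)
        return v

    def __setitem__(self, index, value):
        # Before the write lands, detach only the views that cover `index`.
        for v in list(self.views):
            if v.start <= index < v.stop:
                v.detach()
        self.data[index] = value


class LazyView:
    """A lazy copy: shares the buffer until the region it covers is written."""

    def __init__(self, buffer, start, stop):
        self.buffer, self.start, self.stop = buffer, start, stop
        self.private = None  # filled in when the delayed copy happens

    def detach(self):
        # Take the delayed copy and stop sharing the original buffer.
        self.private = self.buffer.data[self.start:self.stop]
        self.buffer.views.discard(self)
        self.buffer = None

    def tolist(self):
        if self.private is not None:
            return list(self.private)
        return self.buffer.data[self.start:self.stop]


a = Buffer(100)
b = a.view(49, 51)   # covers index 50
c = a.view(51, 53)   # does not cover index 50
a[50] = 1            # forces b to copy; c keeps sharing the buffer
```

Even in this toy version every write pays for a scan of the view list, and a real implementation would need a proper overlap test for strided multi-dimensional views.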
Sure, if one wants to perform only the *minimum* amount of copying, things can get rather tricky, but wouldn't it be satisfactory for most cases if attempted modification of the original triggered the delayed copying of the "views" (lazy copies)? In those cases where it isn't satisfactory the user could still explicitly create real (i.e. alias-only) views.
I'm not sure what you mean. Are you saying that if anything in the buffer changes, force all views of the buffer to generate copies (rather than try to determine if the change affected only selected views)? If so, yes, it is easier, but it still is a non-trivial capability to implement.
The bookkeeping can get pretty messy (if you care about memory usage, which we definitely do). Consider this case:
>>> a = zeros((5000,5000))
>>> b = a[0:-10,0:-10]
>>> c = a[49:51,50]
>>> del a
>>> b[50,50] = 1
Now what happens? Either we can copy the array for b (which means two copies of the huge (5000,5000) array exist, one used by c and the new version used by b), or we can be clever and copy c instead.
``b`` and ``c`` are copied and then ``a`` is deleted.
What does numarray currently keep of a if I do something like the above or:
b = a.flat[::-10000]
del a
?
The whole buffer remains in both cases.
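The point that a small view keeps the whole buffer alive can be illustrated without numarray. The sketch below uses Python's built-in memoryview over a bytearray as a stand-in for an array and its buffer (an assumption made for illustration only; it says nothing about numarray's internals). Deleting the original name does not free the storage, because the strided view still references it.

```python
# Stand-in for the array buffer; 1,000,000 bytes keeps the demo small.
buf = bytearray(1_000_000)
b = memoryview(buf)[::-10000]   # strided view onto every 10000th byte
del buf                         # drops the name, not the storage

print(len(b))        # elements visible through the view: 100
print(len(b.obj))    # bytes the view still keeps alive: 1000000
```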
Even keeping track of the views associated with a buffer doesn't solve the problem of an array that is passed to a C extension and is modified in place. It would seem that passing an array into a C extension would always require all the associated views to be turned into copies. Otherwise we can't guarantee that views won't be modified.
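As a rough sketch of that conservative rule: since the callee is opaque, handing out writable storage would have to detach every lazy copy first. Everything here is hypothetical (Buffer, LazyCopy, export_writable, and the pure-Python fill_in_place standing in for a destructive C routine); it only illustrates the policy, not how Numeric or numarray pass arrays to extensions.

```python
import weakref


class Buffer:
    """Hypothetical buffer that hands out lazy copies and raw storage."""

    def __init__(self, size):
        self.data = [0] * size
        self.views = weakref.WeakSet()

    def lazy_copy(self, start, stop):
        v = LazyCopy(self, start, stop)
        self.views.add(v)
        return v

    def export_writable(self):
        # The callee may write anywhere, so every lazy copy
        # must become a real copy before the storage is handed out.
        for v in list(self.views):
            v.detach()
        return self.data


class LazyCopy:
    def __init__(self, buffer, start, stop):
        self.buffer, self.start, self.stop = buffer, start, stop
        self.private = None

    def detach(self):
        self.private = self.buffer.data[self.start:self.stop]
        self.buffer.views.discard(self)
        self.buffer = None

    def tolist(self):
        if self.private is not None:
            return list(self.private)
        return self.buffer.data[self.start:self.stop]


def fill_in_place(storage, value):
    # Stand-in for a destructive C extension routine.
    for i in range(len(storage)):
        storage[i] = value


a = Buffer(10)
b = a.lazy_copy(2, 4)
fill_in_place(a.export_writable(), 7)   # b is detached before the write
```

The cost is that every such call copies all outstanding views, whether or not the routine actually writes to the buffer.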
Yes -- but only if the C extension is destructive. In that case the user might well be making a mistake in current Numeric if he has views and doesn't want them to be modified by the operation (of course he might know that the inplace operation does not affect the view(s) -- but wouldn't such cases be rather rare?). If he *does* want the views to be modified, he would obviously have to explicitly specify them as such in a copy-on-demand scheme, and in the other case he has most likely been prevented from making an error (and can still explicitly use real views if he knows that the inplace operation on the original will not have undesired effects on the "views").
If the point is that views are susceptible to unexpected changes made in place by a C extension, yes, certainly (just as they are for changes made in place in Python). But I'm not sure what that has to do with the implied copy (even if delayed) being broken by extensions written in C. Promising a copy and not honoring it is not the same as not promising it in the first place. But I may be misunderstanding your point.

Perry