
Hi all, This looks like a bug to me.
>>> a = arange(6).reshape(2,3)
>>> a.resize((3,3))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: cannot resize this array: it does not own its data
Is there any reason resize should fail in this case? Resize should be returning a new array, no? There are several other things that look like bugs in this method, for instance:
>>> a = arange(6).resize((2,3))
>>> a
`a` has no value and no error is raised. The resize function works as expected:
>>> resize(a,(3,3))
array([[0, 1, 2],
       [3, 4, 5],
       [0, 1, 2]])
Chuck

Hi Charles
On Wed, Aug 29, 2007 at 09:42:50AM -0600, Charles R Harris wrote:
Hi all,
This looks like a bug to me.
>>> a = arange(6).reshape(2,3)
>>> a.resize((3,3))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: cannot resize this array: it does not own its data
From the docstring of a.resize:
    Change size and shape of self inplace. Array must own its own memory and not be referenced by other arrays. Returns None.
The reshaped array is a view on the original data, hence it doesn't own it:
In [15]: a = N.arange(6).reshape(2,3)
In [16]: a.flags
Out[16]:
  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False
>>> a = arange(6).resize((2,3))
>>> a
`a` has no value and no error is raised.
It is because `a` is now None.
Cheers
Stéfan
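The two points above (the reshaped array is a view that does not own its data, and the method returns None) can be checked with a short script. This is only a sketch against a current NumPy, assuming the `np` import alias rather than the `N` and star imports used in the thread; the names `base`, `view`, `resize_error` and `result` are made up for illustration.

```python
import numpy as np

base = np.arange(6)          # owns its buffer
view = base.reshape(2, 3)    # a view onto base's buffer

print(base.flags.owndata)    # True
print(view.flags.owndata)    # False
print(view.base is base)     # True

# Growing the view is refused because it does not own its data.
try:
    view.resize((3, 3))
    resize_error = None
except ValueError as exc:
    resize_error = exc
print(resize_error)

# The method works in place and returns None, so chaining it
# onto arange() throws the array away and keeps only None.
result = np.arange(6).resize((2, 3))
print(result)                # None
```

Binding the array to a name first and calling resize on that name avoids the None surprise.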

On 8/29/07, Stefan van der Walt <stefan@sun.ac.za> wrote:
Hi Charles
On Wed, Aug 29, 2007 at 09:42:50AM -0600, Charles R Harris wrote:
Hi all,
This looks like a bug to me.
>>> a = arange(6).reshape(2,3)
>>> a.resize((3,3))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: cannot resize this array: it does not own its data
From the docstring of a.resize:
Change size and shape of self inplace. Array must own its own memory and not be referenced by other arrays. Returns None.
The documentation is bogus:
>>> a = arange(6).reshape(2,3)
>>> a
array([[0, 1, 2],
       [3, 4, 5]])
>>> a.flags
  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False
>>> a.resize((3,2))
>>> a
array([[0, 1],
       [2, 3],
       [4, 5]])
The reshaped array is a view on the original data, hence it doesn't own it:
In [15]: a = N.arange(6).reshape(2,3)
In [16]: a.flags
Out[16]:
  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False
>>> a = arange(6).resize((2,3))
>>> a
`a` has no value and no error is raised.
It is because `a` is now None.
This behaviour doesn't match documentation elsewhere, which is why I am raising the question. What *should* the resize method do? It looks like it is equivalent to assigning a shape tuple to a.shape, so why do we need it? Apart from that, the reshape method looks like it would serve for most cases.
Chuck

Charles R Harris wrote:
What *should* the resize method do? It looks like it is equivalent to assigning a shape tuple to a.shape,
No, that's what reshape does.
so why do we need it?
resize() will change the SIZE of the array (number of elements), where reshape() will only change the shape, but not the number of elements. The fact that the size is changing is why it won't work if it doesn't own the data.
>>> a = N.array((1,2,3))
>>> a.reshape((6,))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: total size of new array must be unchanged
can't reshape to a shape that is a different size.
>>> b = a.resize((6,))
>>> repr(b)
'None'
resize changes the array in place, so it returns None, but a has been changed:
>>> a
array([1, 2, 3, 0, 0, 0])
Perhaps you want the function, rather than the method:
>>> b = N.resize(a, (12,))
>>> b
array([1, 2, 3, 0, 0, 0, 1, 2, 3, 0, 0, 0])
>>> a
array([1, 2, 3, 0, 0, 0])
a hasn't been changed, b is a brand new array.
-CHB
Apart from that, the reshape method looks like it would serve for most cases.
Chuck
-- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov
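The three behaviours described in this message can be put side by side in one script. This is a sketch for a current NumPy, assuming the `np` alias; `reshape_failed` and `returned` are illustrative names. The `refcheck=False` argument is an existing option of the method and is used here only so the snippet behaves the same in shells that hold extra references to `a`; no views of `a` exist at that point.

```python
import numpy as np

a = np.array((1, 2, 3))

# reshape keeps the element count fixed, so 3 -> 6 elements is an error.
try:
    a.reshape((6,))
    reshape_failed = False
except ValueError:
    reshape_failed = True

# The function returns a new array and leaves its argument alone.
b = np.resize(a, (6,))

# The method changes `a` itself and returns None.
returned = a.resize((6,), refcheck=False)

print(reshape_failed)   # True
print(returned)         # None
print(a.tolist())       # [1, 2, 3, 0, 0, 0]  (zero filled)
print(b.tolist())       # [1, 2, 3, 1, 2, 3]  (data repeated)
```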

On 8/29/07, Christopher Barker <Chris.Barker@noaa.gov> wrote:
Charles R Harris wrote:
What *should* the resize method do? It looks like it is equivalent to assigning a shape tuple to a.shape,
No, that's what reshape does.
No, reshape returns a view and the view doesn't own its data. Totally different behavior in this context.
so why do we need it?
resize() will change the SIZE of the array (number of elements), where reshape() will only change the shape, but not the number of elements. The fact that the size is changing is why it won't work if it doesn't own the data.
According to the documentation, the resize method changes the array inplace. How can it be inplace if the number of elements changes? Admittedly, it *will* change the size, but that is not consistent with the documentation. I suspect it reallocates memory and (hopefully) frees the old, but then that is what the documentation should say because it explains why the data must be owned -- a condition violated in some cases as demonstrated above. I am working on documentation and that is why I am raising these questions. There seem to be some inconsistencies that need clarification and/or fixing.
Chuck

On 8/29/07, Charles R Harris <charlesr.harris@gmail.com> wrote:
On 8/29/07, Christopher Barker <Chris.Barker@noaa.gov> wrote:
Charles R Harris wrote:
What *should* the resize method do? It looks like it is equivalent to assigning a shape tuple to a.shape,
No, that's what reshape does.
No, reshape returns a view and the view doesn't own its data. Totally different behavior in this context.
so why do we need it?
resize() will change the SIZE of the array (number of elements), where reshape() will only change the shape, but not the number of elements. The fact that the size is changing is why it won't work if it doesn't own the data.
According to the documentation, the resize method changes the array inplace. How can it be inplace if the number of elements changes?
It sounds like you and Chris are talking past each other on a matter of terminology. At a C-level, it's obviously not (necessarily) in place, since the array may get realloced as you surmise below. However, at the Python level, the change is in fact in place, in the same sense that appending to a Python list operates in-place, even though under the covers memory may get realloced there as well.
Admittedly, it *will* change the size, but that is not consistent with the documentation. I suspect it reallocates memory and (hopefully) frees the old, but then that is what the documentation should say because it explains why the data must be owned -- a condition violated in some cases as demonstrated above. I am working on documentation and that is why I am raising these questions. There seem to be some inconsistencies that need clarification and/or fixing.
The main inconsistency I see above is that resize appears to only require ownership of the data if in fact the number of items changes. I don't think that's actually a bug, but I don't like it much; I would prefer that resize be strict and always require ownership. However, I'm fairly certain that there are people that prefer "friendliness" over consistency, so I wouldn't be surprised to get some pushback on changing that.
--
tim.hochberg@ieee.org
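The list analogy can be made concrete. In the sketch below (current NumPy, `np` alias assumed, variable names invented for the example), a second name bound to the same object sees the change in both cases, which is what "in place at the Python level" means here, whatever happens to the underlying buffer. `refcheck=False` is needed only because the extra name raises the reference count; `a_alias` is the same object, not a view.

```python
import numpy as np

items = [1, 2, 3]
items_alias = items
items.append(4)                       # may reallocate internally
print(items_alias is items)           # True
print(items_alias)                    # [1, 2, 3, 4]

a = np.array([1, 2, 3])
a_alias = a                           # second name, same array object
a.resize((6,), refcheck=False)        # buffer may move, object stays
print(a_alias is a)                   # True
print(a_alias.shape)                  # (6,)
```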

On 29/08/2007, Timothy Hochberg <tim.hochberg@ieee.org> wrote:
The main inconsistency I see above is that resize appears to only require ownership of the data if in fact the number of items changes. I don't think that's actually a bug, but I don't like it much; I would prefer that resize be strict and always require ownership. However, I'm fairly certain that there are people that prefer "friendliness" over consistency, so I wouldn't be surprised to get some pushback on changing that.
It seems to me like inplace resize is a problem, no matter how you implement it --- is there any way to verify that no view exists of a given array? (refcounts won't do it since there are other, non-view ways to increase the refcount of an array.) If there's a view of an array, you resize() it in place, and realloc() moves the data, the views now point to bogus memory: you can cause the python interpreter to segfault by addressing their contents. I really can't see any way around this; why not remove inplace resize() (or make it raise exceptions if the size has to change) and allow only the function resize()?
Anne

On 8/29/07, Anne Archibald <peridot.faceted@gmail.com> wrote:
On 29/08/2007, Timothy Hochberg <tim.hochberg@ieee.org> wrote:
The main inconsistency I see above is that resize appears to only require ownership of the data if in fact the number of items changes. I don't think that's actually a bug, but I don't like it much; I would prefer that resize be strict and always require ownership. However, I'm fairly certain that there are people that prefer "friendliness" over consistency, so I wouldn't be surprised to get some pushback on changing that.
It seems to me like inplace resize is a problem, no matter how you implement it --- is there any way to verify that no view exists of a given array? (refcounts won't do it since there are other, non-view ways to increase the refcount of an array.)
I think that may be overstating the problem a bit; refcounts should work in the sense that they would prevent segfaults. They'll just be too conservative in many cases, preventing resizes in cases where they would otherwise work.
If there's a view of an array, you resize() it in place, and realloc() moves the data, the views now point to bogus memory: you can cause the python interpreter to segfault by addressing their contents. I really can't see any way around this; why not remove inplace resize() (or make it raise exceptions if the size has to change) and allow only the function resize()?
Probably because in a few cases, it's vastly more efficient to realloc the data than to copy it. FWIW, I don't use either the resize function or the resize method, but if I was going to get rid of one, personally I'd axe the function. Resizing is a confusing operation and the function doesn't have the possibility of better efficiency to justify its existence.
--
tim.hochberg@ieee.org

On Wed, Aug 29, 2007 at 11:31:12AM -0700, Timothy Hochberg wrote:
FWIW, I don't use either the resize function or the resize method, but if I was going to get rid of one, personally I'd axe the function. Resizing is a confusing operation and the function doesn't have the possibility of better efficiency to justify its existence.
My understanding of OOP is that I expect a method to modify an object in place, and a function to return a new object (or a view). Now this is not true with Python, as some objects are immutable and this is not possible, but at least there seems to be some logic that a method returns a new object only if the object is immutable. With numpy I often fail to see the logic, but I'd love to see one.
Gaël
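A small illustration of the mixed convention being described, as a sketch for a current NumPy with the `np` alias assumed: `sort` follows the "method mutates, function returns new" pattern, while the `reshape` method returns a new array object even though arrays are mutable.

```python
import numpy as np

a = np.array([3, 1, 2])

sorted_copy = np.sort(a)      # function: new array, `a` untouched
print(a.tolist(), sorted_copy.tolist())   # [3, 1, 2] [1, 2, 3]

returned = a.sort()           # method: sorts in place, returns None
print(returned, a.tolist())   # None [1, 2, 3]

reshaped = a.reshape(3, 1)    # method, yet it returns a new array object
print(reshaped is a, a.shape, reshaped.shape)   # False (3,) (3, 1)
```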

Anne Archibald wrote:
On 29/08/2007, Timothy Hochberg <tim.hochberg@ieee.org> wrote:
The main inconsistency I see above is that resize appears to only require ownership of the data if in fact the number of items changes. I don't think that's actually a bug, but I don't like it much; I would prefer that resize be strict and always require ownership. However, I'm fairly certain that there are people that prefer "friendliness" over consistency, so I wouldn't be surprised to get some pushback on changing that.
It seems to me like inplace resize is a problem, no matter how you implement it --- is there any way to verify that no view exists of a given array? (refcounts won't do it since there are other, non-view ways to increase the refcount of an array.)
Yes, as long as every view is created using the C API correctly. That's why Chuck saw the exception he did, because he tried to resize() an array that had a view stuck of it (or rather, he was trying to resize() the view, which didn't have ownership of the data).
In [8]: from numpy import *
In [9]: a = zeros(10)
In [10]: a.resize(15)
In [11]: b = a[:]
In [12]: a.resize(20)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/Users/rkern/src/VTK-5.0.2/<ipython console> in <module>()
ValueError: cannot resize an array that has been referenced or is referencing another array in this way. Use the resize function
Of course, if you muck around with the raw data pointer using ctypes, you might have problems, but that's ctypes for you.
--
Robert Kern
"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
 -- Umberto Eco
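The same check can be reproduced as a script rather than an IPython session. This is a sketch for a current NumPy (`np` alias assumed, `err_owner` and `err_view` are illustrative names); the exact error text differs between versions, so only the exception type is inspected.

```python
import numpy as np

a = np.zeros(10)
b = a[:]                      # b is a view sharing a's buffer

# Resizing the owner is refused while the view still refers to it.
try:
    a.resize(20)
    err_owner = None
except ValueError as exc:
    err_owner = exc

# Resizing the view is refused because it does not own the data.
try:
    b.resize(20)
    err_view = None
except ValueError as exc:
    err_view = exc

print(type(err_owner).__name__, type(err_view).__name__)
print(a.shape, b.shape)       # (10,) (10,), both unchanged
```

Passing `refcheck=False` switches off the reference-count part of this protection, so it is only safe when no view of the array exists.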

On 8/29/07, Timothy Hochberg <tim.hochberg@ieee.org> wrote:
On 8/29/07, Charles R Harris <charlesr.harris@gmail.com> wrote:
On 8/29/07, Christopher Barker <Chris.Barker@noaa.gov> wrote:
Charles R Harris wrote:
What *should* the resize method do? It looks like it is equivalent to assigning a shape tuple to a.shape,
No, that's what reshape does.
No, reshape returns a view and the view doesn't own its data. Totally different behavior in this context.
so why do we need it?
resize() will change the SIZE of the array (number of elements), where reshape() will only change the shape, but not the number of elements. The fact that the size is changing is why it won't work if it doesn't own the data.
According to the documentation, the resize method changes the array inplace. How can it be inplace if the number of elements changes?
It sounds like you and Chris are talking past each other on a matter of terminology. At a C-level, it's obviously not (necessarily) in place, since the array may get realloced as you surmise below. However, at the Python level, the change is in fact in place, in the same sense that appending to a Python list operates in-place, even though under the covers memory may get realloced there as well.
Admittedly, it *will* change the size, but that is not consistent with the documentation. I suspect it reallocates memory and (hopefully) frees the old, but then that is what the documentation should say because it explains why the data must be owned -- a condition violated in some cases as demonstrated above. I am working on documentation and that is why I am raising these questions. There seem to be some inconsistencies that need clarification and/or fixing.
The main inconsistency I see above is that resize appears to only require ownership of the data if in fact the number of items changes. I don't think that's actually a bug, but I don't like it much; I would prefer that resize be strict and always require ownership. However, I'm fairly certain that there are people that prefer "friendliness" over consistency, so I wouldn't be surprised to get some pushback on changing that.
I still don't see why the method is needed at all. Given the conditions on the array, the only thing it buys you over the resize function or a reshape is the automatic deletion of the old memory if new memory is allocated. And the latter is easily done as a = reshape(a, new_shape). I know there was a push to make most things methods, but it is possible to overdo it. Is this a Numarray compatibility issue?
Chuck

On 8/29/07, Charles R Harris <charlesr.harris@gmail.com> wrote:
I still don't see why the method is needed at all. Given the conditions on the array, the only thing it buys you over the resize function or a reshape is the automatic deletion of the old memory if new memory is allocated.
Can you explain this more? Both you and Anne seem to share the opinion that the resize method is useless, while the resize function is useful. So, now I'm worried I'm missing something since as far as I can tell the function is useless and the method is only mostly useless.
And the latter is easily done as a = reshape(a, new_shape). I know there was a push to make most things methods,
In general I think methods are easy to overdo, but I'm not on board for this particular case.
but it is possible to overdo it. Is this a Numarray compatibility issue?
Dunno about that.
--
tim.hochberg@ieee.org

On 8/29/07, Timothy Hochberg <tim.hochberg@ieee.org> wrote:
On 8/29/07, Charles R Harris <charlesr.harris@gmail.com> wrote:
I still don't see why the method is needed at all. Given the conditions on the array, the only thing it buys you over the resize function or a reshape is the automatic deletion of the old memory if new memory is allocated.
Can you explain this more? Both you and Anne seem to share the opinion that the resize method is useless, while the resize function is useful. So, now I'm worried I'm missing something since as far as I can tell the function is useless and the method is only mostly useless.
Heh. I might dump both. The resize function is a concatenation followed by reshape. It differs from the resize method in that it always returns a new array and repeats the data instead of filling with zeros. The inconsistency in the way the array is filled bothers me a bit, I would have just named the method realloc. I really don't see the need for either except for backward compatibility. Maybe someone can make a case.
Chuck
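"A concatenation followed by reshape" can be sketched as below. `resize_by_tiling` is a hypothetical helper written for this illustration, not NumPy's actual implementation; it assumes `new_shape` is a tuple of non-negative integers and a current NumPy imported as `np`.

```python
import math
import numpy as np

def resize_by_tiling(a, new_shape):
    """Hypothetical sketch: repeat the flattened data, trim, then reshape."""
    flat = np.ravel(a)
    total = math.prod(new_shape)
    if flat.size == 0 or total == 0:
        # Nothing to repeat (or nothing requested): return zeros.
        return np.zeros(new_shape, dtype=flat.dtype)
    repeats = -(-total // flat.size)           # ceiling division
    tiled = np.concatenate([flat] * repeats)   # the concatenation step
    return tiled[:total].reshape(new_shape)    # trim, then reshape

a = np.arange(6).reshape(2, 3)
ours = resize_by_tiling(a, (3, 3))
print(ours)
print(np.array_equal(ours, np.resize(a, (3, 3))))   # True
```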

Timothy Hochberg wrote:
On 8/29/07, Charles R Harris <charlesr.harris@gmail.com> wrote:
I still don't see why the method is needed at all. Given the conditions on the array, the only thing it buys you over the resize function or a reshape is the automatic deletion of the old memory if new memory is allocated.
Can you explain this more? Both you and Anne seem to share the opinion that the resize method is useless, while the resize function is useful. So, now I'm worried I'm missing something since as far as I can tell the function is useless and the method is only mostly useless.
The resize function docstring makes the following distinction:
Definition: numpy.resize(a, new_shape)
Docstring:
    Return a new array with the specified shape. The original array's total size can be any size. The new array is filled with repeated copies of a.
    Note that a.resize(new_shape) will fill the array with 0's beyond current definition of a.
So the method and the function are subtly different. As far as I can see, the method is causing more trouble than it is worth. Under what circumstances, in real code, can it provide enough benefit to override the penalty it is now exacting in confusion?
Eric
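If the zero-filling behaviour of the method is wanted without modifying anything in place, it can be written by hand. `zero_padded_copy` below is a hypothetical helper, not part of NumPy; the sketch assumes a current NumPy imported as `np` and contrasts the helper with the repeating fill of `np.resize`.

```python
import numpy as np

def zero_padded_copy(a, new_shape):
    """Hypothetical helper: new array with a's data first, zeros after."""
    out = np.zeros(new_shape, dtype=a.dtype)
    n = min(a.size, out.size)
    # out is freshly allocated and contiguous, so reshape(-1) is a view
    # and the assignment writes into out itself.
    out.reshape(-1)[:n] = a.ravel()[:n]
    return out

a = np.array([1, 2, 3])
print(zero_padded_copy(a, (2, 3)).tolist())   # [[1, 2, 3], [0, 0, 0]]
print(np.resize(a, (2, 3)).tolist())          # [[1, 2, 3], [1, 2, 3]]
print(a.tolist())                             # [1, 2, 3], untouched
```

This gives up the possible realloc saving mentioned earlier in the thread, since it always allocates and copies.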
participants (8)
- Anne Archibald
- Charles R Harris
- Christopher Barker
- Eric Firing
- Gael Varoquaux
- Robert Kern
- Stefan van der Walt
- Timothy Hochberg