I am trying to write some code that takes advantage of np.tile() on arbitrary array-like objects. I only want to tile along the first axis. Any other axes, if they exist, should be left alone. I first coerce the object using np.asanyarray(), tile it, and then coerce it back to the original type.
The problem seems to be that some of my array-like objects are being "over-coerced", particularly the list of tuples. I tried doing "np.asanyarray(a, dtype='O')", but that still turns it into a 2-D array.
Am I missing something?
Thanks, Ben Root
It appears that the only reliable way to do this may be to use a loop to fill an object array in place. Pandas has a version of this written in Cython: https://github.com/pydata/pandas/blob/c1a0dbc4c0dd79d77b2a34be5bc35493279013...
To quote Wes McKinney: "Seriously can't believe I had to write this function."
Best, Stephan
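A minimal sketch of the loop Stephan describes, in plain Python rather than Cython. This is not the pandas routine; the helper name to_object_1d is hypothetical and only illustrates filling a pre-allocated object array one slot at a time, so NumPy never gets a chance to inspect the items' inner structure.

```python
import numpy as np

def to_object_1d(seq):
    """Pack each top-level item of `seq` into one slot of a 1-D object array."""
    items = list(seq)
    out = np.empty(len(items), dtype=object)
    # Integer-index assignment stores the item itself (tuple, list, ...)
    # instead of trying to broadcast it across the array.
    for i, item in enumerate(items):
        out[i] = item
    return out

rows = [(0, 1, 2), (1, 2, 3)]
packed = to_object_1d(rows)
print(packed.shape, packed.dtype)
```

Because the array is allocated first and filled afterwards, the result stays 1-D even when every item has the same length.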
NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
On Mon, Feb 9, 2015 at 8:31 AM, Benjamin Root ben.root@ou.edu wrote:
I am trying to write up some code that takes advantage of np.tile() on arbitrary array-like objects. I only want to tile along the first axis. Any other axis, if they exist, should be left alone. I first coerce the object using np.asanyarray(), tile it, and then coerce it back to the original type.
The problem seems to be that some of my array-like objects are being "over-coerced", particularly the list of tuples. I tried doing "np.asanyarray(a, dtype='O')", but that still turns it into a 2-D array.
The default constructors will drill down until they find a scalar or a non-matching shape. So you get an array full of Python ints, or floats, but still 2-D:
>>> import numpy as np
>>> a = [tuple(range(j, j+3)) for j in range(5)]
>>> a
[(0, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, 5), (4, 5, 6)]
>>> np.asarray(a, dtype=object)
array([[0, 1, 2],
       [1, 2, 3],
       [2, 3, 4],
       [3, 4, 5],
       [4, 5, 6]], dtype=object)
If you add a non-matching item, e.g. an empty tuple, then all works fine for your purposes:
>>> a.append(())
>>> np.asarray(a, dtype=object)
array([(0, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, 5), (4, 5, 6), ()], dtype=object)
But you would then have to discard that item before tiling. The only other way is to first create the object array, then assign your array-like object to it:
>>> a.pop()
()
>>> a
[(0, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, 5), (4, 5, 6)]
>>> b = np.empty(len(a), object)
>>> b[:] = a
>>> b
array([(0, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, 5), (4, 5, 6)], dtype=object)
Not sure if this has always worked, or if it breaks down in some corner case, but Wes may not have had to write that function after all! At least not in Cython.
Jaime
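To connect Jaime's trick back to the original question, here is a rough sketch of the whole round trip: pre-allocate the object array, assign the list of tuples into it, tile along the first (and only) axis, then coerce back to the original type. It assumes the input is a plain list or tuple whose constructor accepts a list; the variable names are only for illustration.

```python
import numpy as np

a = [tuple(range(j, j + 3)) for j in range(5)]

# Allocate a 1-D object array first so the tuples are stored as items
# rather than being expanded into a second axis.
b = np.empty(len(a), dtype=object)
b[:] = a

# Tiling the 1-D object array repeats whole tuples.
tiled = np.tile(b, 2)

# Coerce back to the original container type (assumed to accept a list).
result = type(a)(tiled.tolist())
print(result)
```

The tiled array has ten slots, each still holding a three-element tuple.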
Yeah, well, you know Wes... in for a penny, in for a pound (or something like that). Significant portions of pandas already need Cython, so one might as well get as much performance as possible.
Btw, the edge case (if you want to call it that) is when it is given an N-dimensional array:
>>> import numpy as np
>>> a = np.zeros((4, 5))
>>> b = np.empty(4, object)
>>> b[:] = a
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: could not broadcast input array from shape (4,5) into shape (4)
I am already filtering out ndarrays anyway, so it isn't a big deal. I was just hoping to reduce the amount of code as much as possible by removing the filter.
This will do for now. Thank you for the clarifications. Ben Root
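For reference, one possible shape for the filter Ben mentions, as a sketch only. The function name tile_first_axis is hypothetical, and the non-ndarray branch assumes the input type can be rebuilt from a list (true for list and tuple, not guaranteed for other array-likes). Slots are filled one at a time here so the packing step does not depend on how slice assignment treats nested items.

```python
import numpy as np

def tile_first_axis(obj, reps):
    """Repeat `obj` `reps` times along its first axis only."""
    if isinstance(obj, np.ndarray):
        # Real arrays: repeat along axis 0, leave every other axis alone.
        return np.tile(obj, (reps,) + (1,) * (obj.ndim - 1))
    # Other sequences: keep each top-level item intact in a 1-D object array.
    items = list(obj)
    packed = np.empty(len(items), dtype=object)
    for i, item in enumerate(items):
        packed[i] = item
    # Assumes type(obj) can be constructed from a list.
    return type(obj)(np.tile(packed, reps).tolist())

print(tile_first_axis(np.zeros((4, 5)), 2).shape)
print(tile_first_axis([(0, 1), (2, 3)], 2))
```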