[Numpy-discussion] numpy.array() of mixed integers and strings can truncate data

Thouis Jones thouis.jones at curie.fr
Mon Dec 5 05:45:57 EST 2011


On Fri, Dec 2, 2011 at 18:53, Charles R Harris
<charlesr.harris at gmail.com> wrote:

> After sleeping on this, I think an object array in this situation would be
> the better choice and wouldn't result in lost information. This might change
> the behavior of some functions though, so would need testing.

I tried to come up with a simple patch to achieve this, but I think
this is beyond me, particularly since I think something different has
to happen for these two cases:

    np.array([1234, 'ab'])
    np.array([1234]).astype('|S2')
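
For illustration, the two cases can be put side by side as in the sketch below. The variable names are made up for this example, and the dtype that np.array() picks for the mixed list depends on the NumPy version, so the sketch only prints it instead of assuming a value. The explicit astype('|S2') call asks for a two-byte string, so truncation there is requested by the caller; dtype=object is shown as the lossless alternative suggested in the quoted message.

```python
import numpy as np

# Case 1: dtype is inferred from mixed input. The result depends on the
# NumPy version, so it is printed here and not assumed.
mixed = np.array([1234, 'ab'])
print(mixed.dtype, mixed)

# Case 2: the caller explicitly asks for a 2-byte string, so each element
# is cut down to two bytes on purpose.
explicit = np.array([1234]).astype('|S2')
print(explicit.dtype, explicit)

# Lossless alternative from the quoted message: keep the Python objects.
as_object = np.array([1234, 'ab'], dtype=object)
print(as_object.dtype, as_object)
```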

I tried a few things (changing the rules in PyArray_PromoteTypes() and
in other places), but I think I'm more likely to break some corner case
than to fix this cleanly.

I filed a ticket (#1990) and a pull request to add a test to the 1.6.x
maintenance branch, for someone more knowledgeable than me to address.
I tried to write the test so that it passes whether the fix chooses
dtype=object or dtype=<string of the required length>.
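
A check of that kind might look roughly like the sketch below. This is not the test from the pull request: preserves_values is a hypothetical helper written for this example. It compares the string form of each stored element with the string form of the original value, so it accepts an object array as well as a string array that is wide enough, and rejects a truncated one.

```python
import numpy as np

def preserves_values(values, arr):
    """Hypothetical helper: True if every element of arr still has the
    same string form as the corresponding original value."""
    for original, stored in zip(values, arr):
        # Byte-string arrays yield bytes elements; decode before comparing.
        if isinstance(stored, bytes):
            stored = stored.decode()
        if str(stored) != str(original):
            return False
    return True

values = [1234, 'ab']

# An object array keeps the original Python objects, so nothing is lost.
object_ok = preserves_values(values, np.array(values, dtype=object))

# A 2-byte string array cannot hold '1234', so the check fails.
truncated_ok = preserves_values([1234], np.array([1234]).astype('|S2'))

print(object_ok, truncated_ok)
```

Calling preserves_values(values, np.array(values)) applies the same check to whatever dtype the installed NumPy infers.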

Ray Jones


