[Numpy-discussion] string arrays - accessing data from C++

Jaroslav Hajek highegg at gmail.com
Fri Sep 18 15:23:29 EDT 2009


On Fri, Sep 18, 2009 at 7:08 PM, Christopher Barker
<Chris.Barker at noaa.gov> wrote:
> Jaroslav Hajek wrote:
>
>> Does PyArrayObject::data point to a single contiguous char[] buffer,
>> like with the old Numeric char arrays, with
>> PyArrayObject::descr->elsize being the maximum length?
>
> yes.
>
>> string lengths determined
>
> c-style null termination
>

Hmm, this didn't seem to work for me. But maybe I was doing something
else wrong. Thanks.

>> Finally, is there any way to create an array in NumPy (from within the
>> interpreter) that would have type == PyArray_CHAR?
>
> I think this will get you what you want:
>
> a = np.empty((3,4), dtype=np.character)
> or
> a = np.empty((3,4), dtype='c')
>

Are you sure? I think this is what I tried (I can't check at this
moment), and the result has descr->type equal to PyArray_STRING. Also,
note that even in the interpreter, the dtype shows itself as string:

>>> numpy.dtype('c')
dtype('|S1')
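
A quick way to check this from the interpreter is to look at the dtype
attributes directly. This is only a sketch: as far as I understand,
dtype.char and dtype.num mirror descr->type and descr->type_num on the C
side, but I have not verified what they print on every NumPy version.

```python
import numpy as np

dt = np.dtype('c')

# kind is the broad category, char the one-letter typecode,
# itemsize the element width in bytes, num the built-in type number.
print(dt.kind, dt.char, dt.itemsize, dt.num)

# Compare the type number with that of an ordinary one-byte string dtype.
print(dt.num == np.dtype('S1').num)
```

If the type numbers agree, code that checks only the type number cannot
tell a 'c' array from an 'S1' array.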


> You can learn a lot by experimenting at the command line (even better,
> ipython):
>
> In [27]: a = np.array(('this', 'that','a longer string','s'))
>
> In [28]: a
> Out[28]:
> array(['this', 'that', 'a longer string', 's'],
>       dtype='|S15')
>
>
> you can see that it is a dtype of '|S15', so each element can be up to
> 15 bytes.
>
> # which you can also find this way:
>
> In [30]: a.itemsize
> Out[30]: 15
>
> and, for a contiguous block, like this:
>
> In [31]: a.strides
> Out[31]: (15,)
>
> # now to look at the bytes themselves:
>
> In [37]: b = a.view(dtype=np.uint8).reshape((4,-1))
>
> In [38]: b
> Out[38]:
> array([[116, 104, 105, 115,   0,   0,   0,   0,   0,   0,   0,   0,   0,
>           0,   0],
>        [116, 104,  97, 116,   0,   0,   0,   0,   0,   0,   0,   0,   0,
>           0,   0],
>        [ 97,  32, 108, 111, 110, 103, 101, 114,  32, 115, 116, 114, 105,
>         110, 103],
>        [115,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
>           0,   0]], dtype=uint8)
>
>
> so you can see that it's null-terminated.
>

Even null-padded, apparently.
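
Note that the longest element above fills its whole 15-byte field, so it
has no terminator at all. A minimal Python sketch of reading the buffer
with that in mind follows; raw_elements is a made-up helper name, not a
NumPy function, and the scan is bounded by the field width (the
equivalent of strnlen rather than strlen on the C side).

```python
import numpy as np

# Same data as in the quoted session, written as bytes so the dtype is 'S15'.
a = np.array([b'this', b'that', b'a longer string', b's'])

def raw_elements(arr):
    """Return (length, bytes) per element, scanning each fixed-width field."""
    arr = np.ascontiguousarray(arr)
    width = arr.dtype.itemsize           # field width in bytes
    buf = arr.tobytes()                  # copy of the contiguous data block
    out = []
    for i in range(arr.size):
        field = buf[i * width:(i + 1) * width]
        end = field.find(b'\x00')        # first null byte, if any
        if end < 0:
            end = width                  # field is full: no terminator
        out.append((end, field[:end]))
    return out

for n, s in raw_elements(a):
    print(n, s)
```

An unbounded strlen on the third element would run into the next
element's bytes, which may be what went wrong in my first attempt.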

> I find it very cool that you can get at virtually all the c-level info
> for an array from python.
>

Yes.

-- 
RNDr. Jaroslav Hajek
computing expert & GNU Octave developer
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz


