[Numpy-discussion] size_t or npy_intp?
Kurt Smith
kwmsmith at gmail.com
Tue Jul 27 11:11:46 EDT 2010
On Tue, Jul 27, 2010 at 9:45 AM, Francesc Alted <faltet at pytables.org> wrote:
> On Tuesday 27 July 2010 15:20:47 Charles R Harris wrote:
>> On Tue, Jul 27, 2010 at 7:08 AM, Francesc Alted <faltet at pytables.org> wrote:
>> > Hi,
>> >
>> > I'm a bit confused about which datatype I should use when referring to NumPy
>> > ndarray lengths. On the one hand, I'd use `size_t`, which is the canonical way
>> > to refer to lengths of memory blocks. On the other hand, `npy_intp`
>> > seems to be the standard data type used in NumPy for this.
>>
>> They have different ranges: npy_intp is signed and, in later versions of
>> Python, is the same as Py_ssize_t, while size_t is unsigned. It would be a
>> bad idea to mix the two.
>
> Agreed that mixing the two is a bad idea. So I suppose that you are
> suggesting using `npy_intp`. But then, I'd say that `size_t`, being unsigned,
> is a better fit for describing a memory length.
>
> Mmh, I'll stick with `size_t` for the time being (unless anyone else can
> convince me that this is really a big mistake ;-)
This would be good to clear up; I've been confused about this myself
in my own project. The PyArrayObject struct is defined using
`npy_intp`s:
    typedef struct PyArrayObject {
        PyObject_HEAD
        char *data;             /* pointer to raw data buffer */
        int nd;                 /* number of dimensions, also called ndim */
        npy_intp *dimensions;   /* size in each dimension */
        npy_intp *strides;      /* bytes to jump to get to the
                                   next element in each dimension */
        PyObject *base;         /* This object should be decref'd
                                   upon deletion of array */
                                /* For views it points to the original array */
                                /* For creation from buffer object it points
                                   to an object that should be decref'd on
                                   deletion */
                                /* For UPDATEIFCOPY flag this is an array
                                   to-be-updated upon deletion of this one */
        PyArray_Descr *descr;   /* Pointer to type structure */
        int flags;              /* Flags describing array -- see below */
        PyObject *weakreflist;  /* For weakreferences */
    } PyArrayObject;
(numpy 1.4.1, numpy/core/include/numpy/ndarrayobject.h)
And because of that, Cython's numpy functionality uses `npy_intp`
everywhere. Perhaps this is required for backwards compat. in numpy,
but in an ideal world, should those be `npy_uintp`s?
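One thing that may argue for a signed type, at least for the strides: NumPy arrays can have negative strides (a reversed view such as `a[::-1]` steps backwards through the same buffer). Here is a minimal sketch of that idea in plain C, without the NumPy headers. The names `intp_t`, `strided_get` and `reversed_get` are made up for illustration, and `ptrdiff_t` is only assumed to be a reasonable stand-in for `npy_intp`:

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: intp_t is a hypothetical stand-in for npy_intp,
   i.e. a signed integer as wide as a pointer. */
typedef ptrdiff_t intp_t;

/* Return element i of a strided 1-D array of doubles.
   `stride` is in bytes and may be negative. */
double strided_get(const char *data, intp_t stride, intp_t i)
{
    return *(const double *)(data + i * stride);
}

/* Read element i of a reversed view over buf[0..n-1] without copying:
   start at the last element and step backwards through the buffer. */
double reversed_get(const double *buf, intp_t n, intp_t i)
{
    assert(n > 0 && i >= 0 && i < n);
    const char *last = (const char *)(buf + (n - 1));
    intp_t stride = -(intp_t)sizeof(double);   /* negative byte stride */
    return strided_get(last, stride, i);
}
```

With `double buf[4] = {1.0, 2.0, 3.0, 4.0}`, `reversed_get(buf, 4, 0)` reads `buf[3]`. An unsigned stride could not express that backwards step directly, which may be part of why the strides (and, for consistency, the dimensions) are signed.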
Looking at the bufferinfo struct for the buffer protocol, it uses `Py_ssize_t`:
    typedef struct bufferinfo {
        void *buf;
        Py_ssize_t len;
        int readonly;
        const char *format;
        int ndim;
        Py_ssize_t *shape;
        Py_ssize_t *strides;
        Py_ssize_t *suboffsets;
        Py_ssize_t itemsize;
        void *internal;
    } Py_buffer;
So everyone is using signed values where it would make more sense (to
me at least) to use unsigned. Any reason for this?
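For what it's worth, here is a small sketch of the kind of surprise that mixing the two can cause, which is probably what the "bad idea to mix" warning above is about. It is plain C with no NumPy or Python headers. `intp_t` is a hypothetical stand-in for `npy_intp` / `Py_ssize_t`, the function names are made up, and the explicit cast spells out what the implicit conversion does on typical platforms where both types have the same width:

```c
#include <assert.h>   /* for the checks shown in the note below */
#include <stddef.h>
#include <stdint.h>

/* Assumption: hypothetical stand-in for npy_intp / Py_ssize_t. */
typedef ptrdiff_t intp_t;

/* Mixed comparison: the signed index is converted to size_t first,
   so a negative index turns into a huge positive value. */
int less_mixed(intp_t index, size_t length)
{
    return (size_t)index < length;
}

/* Same comparison with both operands signed. */
int less_signed(intp_t index, intp_t length)
{
    return index < length;
}

/* Unsigned subtraction wraps around instead of going negative. */
size_t diff_unsigned(size_t a, size_t b)
{
    return a - b;
}

/* Signed subtraction gives the expected negative result. */
intp_t diff_signed(intp_t a, intp_t b)
{
    return a - b;
}
```

So `less_mixed(-1, 5)` is false while `less_signed(-1, 5)` is true, and `diff_unsigned(2, 3)` is `SIZE_MAX` while `diff_signed(2, 3)` is `-1`. This does not settle which type is "right" for a length, but it does suggest why sticking to one signedness throughout an API is less error-prone.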
I'm using `npy_intp` since Cython does it that way :-)
Kurt