[Python-Dev] idea for data-type (data-format) PEP

Alexander Belopolsky alexander.belopolsky at gmail.com
Wed Nov 1 21:52:43 CET 2006

Travis E. Oliphant <oliphant.travis <at> ieee.org> writes:

> What if we look at this from the angle of trying to communicate 
> data-formats between different libraries (not change the way anybody 
> internally deals with data-formats).
> For example, ctypes has one way to internally deal with data-formats 
> (using type objects).
> NumPy/Numeric has a way to internally deal with data-formats (using 
> PyArray_Descr * structure -- in Numeric it's just a C-structure but in 
> NumPy it's fleshed out further and also a Python object called the 
> data-type).

Ctypes and NumPy's Array Interface address two different needs.
With ctypes, the producers of type information are at the Python
level, whereas Array Interface information is produced in C code.
It is very convenient to write c_int*2*3 to specify a 2x3 integer
matrix in Python, but in C it is much easier to set the type code
to 'i' and populate the shape array with integers.
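
To make the contrast concrete, here is a minimal sketch using only the
standard library. The "desc" dictionary is a hypothetical stand-in for a
flat type-code-plus-shape description; it is not the actual Array
Interface structure. Note that ctypes reads c_int*2*3 as (c_int*2)*3, so
the outer array has length 3.

```python
import ctypes

# Python-level producer: the shape is baked into the type itself.
# c_int * 2 * 3 is evaluated as (c_int * 2) * 3: 3 rows of 2 ints each.
Matrix = ctypes.c_int * 2 * 3
m = Matrix()
m[2][1] = 7

# Hypothetical flat description of the same data: a one-character
# type code plus a shape tuple, the kind of thing that is easy to
# fill in from C.
desc = {"typecode": "i", "shape": (len(m), len(m[0]))}

total_items = desc["shape"][0] * desc["shape"][1]
```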

Consumers of type information are at the C level in both ctypes
and Array Interface applications, but in the case of ctypes, users
are not expected to write C code. It is typical for an Array
Interface consumer to switch on the type code. Single-character
(or numeric) type codes are much more convenient than verbose type
names in this case.
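
A rough Python analogue of such a consumer, as a sketch only: real
consumers would do this with a C switch statement. The function name
read_items and the choice of supported codes are my own assumptions;
the codes happen to match the struct module's format characters.

```python
import struct

def read_items(typecode, raw):
    """Decode a raw buffer by branching on a one-character type code."""
    if typecode == "i":      # 32-bit signed integer
        fmt = "=i"
    elif typecode == "d":    # 64-bit float
        fmt = "=d"
    else:
        raise TypeError("unsupported type code: %r" % (typecode,))
    # iter_unpack yields one-element tuples; flatten them into a list.
    return [value for (value,) in struct.iter_unpack(fmt, raw)]

ints = read_items("i", struct.pack("=3i", 1, 2, 3))
floats = read_items("d", struct.pack("=2d", 0.5, 1.5))
```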

I have used the Array Interface extensively, but only for simple types,
and I have studied ctypes from the Python level, but not from the C level.

I think the standard data type description object should build on
the strengths of both approaches.

I believe the first step should be to agree on a representation of
simple types. Just an agreement on the standard type codes that
every module could use would be a great improvement. (Personally,
I don't need anything else from the array interface.)

I don't like letter codes, however. I would prefer to use an enum
at the C level and verbose names at the Python level.
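
Something along these lines, sketched in Python with IntEnum standing in
for the C enum. The member names and numeric values are purely
illustrative; nothing here is an agreed standard.

```python
from enum import IntEnum

class DataType(IntEnum):
    """Hypothetical type codes: an integer for C, a name for Python."""
    INT8 = 0
    INT16 = 1
    INT32 = 2
    FLOAT64 = 3

# Item sizes in bytes, keyed by the enum member.
ITEMSIZE = {
    DataType.INT8: 1,
    DataType.INT16: 2,
    DataType.INT32: 4,
    DataType.FLOAT64: 8,
}

code = DataType.INT32
verbose_name = code.name   # what a Python user would see
numeric_code = int(code)   # what C code would switch on
```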

I would also like to mention one more difference between NumPy datatypes
and ctypes that I did not see discussed. In ctypes, arrays of different
shapes are represented using different types. As a result, if the object
exporting its buffer is resized, the datatype object cannot be reused; it
has to be replaced.
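
A small illustration of the point, using the standard array module as a
stand-in for an exporter whose element type is described separately from
its length (the comparison with array.array is my own, not something
from the proposals under discussion):

```python
import array
import ctypes

# In ctypes the length is part of the type, so a 4-element array and
# a 5-element array are described by two distinct type objects.
A4 = ctypes.c_int * 4
A5 = ctypes.c_int * 5
types_differ = A4 is not A5
lengths = (A4._length_, A5._length_)

# With a shape-independent element description, resizing the buffer
# leaves the type description untouched.
buf = array.array("i", [1, 2, 3, 4])
code_before = buf.typecode
buf.append(5)
code_after = buf.typecode
```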
