[Python-Dev] idea for data-type (data-format) PEP

Travis E. Oliphant oliphant.travis at ieee.org
Wed Nov 1 17:44:27 CET 2006


Thanks for all the comments that have been given on the data-type 
(data-format) PEP.  I'd like opinions on an idea for revising the PEP I 
have.

What if we look at this from the angle of trying to communicate 
data-formats between different libraries (not change the way anybody 
internally deals with data-formats).

For example, ctypes has one way to internally deal with data-formats 
(using type objects).

NumPy/Numeric has a way to internally deal with data-formats (using 
PyArray_Descr * structure -- in Numeric it's just a C-structure but in 
NumPy it's fleshed out further and also a Python object called the 
data-type).

Numarray has a way to internally deal with data-formats (using type 
objects).

The array module has a way to internally deal with data-formats (using a 
PyArray_Descr * structure -- and character codes to select one).

The struct module deals with data-formats using character codes.

The PIL deals with data-formats using image modes.

PyVTK deals with data-formats using it's own internal objects.

MPI deals with data-formats using it's own MPI_DataType structures.

This list goes on and on.

What I claim is needed in Python (to make it better glue) is to have a 
standard way to communicate data-format information between these 
extensions.  Then, you don't have to build in support for all the 
different ways data-formats are represented by different libraries.  The 
library only has to be able to translate their representation to the 
standard way that Python uses to represent data-format.

How is this goal going to be achieved?  That is the real purpose of the 
data-type object I previously proposed.

Nick showed that there are two (non-orthogonal) ways to think about this 
goal.

1) We could define a special string-syntax (or list syntax) that covers 
every special case.  The array interface specification goes this 
direction and it requires no new Python types.  This could also be seen 
as an extension of the "struct" module to allow for nested structures, etc.

2) We could define a Python object that specifically carries data-format 
information.


There is also a third way (or really 2b) that has been mentioned:  take 
one of the extensions and use what it does to communicate data-format 
between objects and require all other extensions to conform to that 
standard.

The problem with 2b is that what works inside an extension module may 
not be the best option when it comes to communicating across multiple 
extension modules.   Certainly none of the extension modules have argued 
that case effectively.

Does that explain the goal of what I'm trying to do better?







More information about the Python-Dev mailing list