[Python-Dev] idea for data-type (data-format) PEP
Travis E. Oliphant
oliphant.travis at ieee.org
Wed Nov 1 17:44:27 CET 2006
Thanks for all the comments that have been given on the data-type
(data-format) PEP. I'd like opinions on an idea for revising the PEP I
have.
What if we look at this from the angle of trying to communicate
data-formats between different libraries (not change the way anybody
internally deals with data-formats).
For example, ctypes has one way to internally deal with data-formats
(using type objects).
NumPy/Numeric has a way to internally deal with data-formats (using
PyArray_Descr * structure -- in Numeric it's just a C-structure but in
NumPy it's fleshed out further and also a Python object called the
data-type).
Numarray has a way to internally deal with data-formats (using type
objects).
The array module has a way to internally deal with data-formats (using a
PyArray_Descr * structure -- and character codes to select one).
The struct module deals with data-formats using character codes.
The PIL deals with data-formats using image modes.
PyVTK deals with data-formats using it's own internal objects.
MPI deals with data-formats using it's own MPI_DataType structures.
This list goes on and on.
What I claim is needed in Python (to make it better glue) is to have a
standard way to communicate data-format information between these
extensions. Then, you don't have to build in support for all the
different ways data-formats are represented by different libraries. The
library only has to be able to translate their representation to the
standard way that Python uses to represent data-format.
How is this goal going to be achieved? That is the real purpose of the
data-type object I previously proposed.
Nick showed that there are two (non-orthogonal) ways to think about this
goal.
1) We could define a special string-syntax (or list syntax) that covers
every special case. The array interface specification goes this
direction and it requires no new Python types. This could also be seen
as an extension of the "struct" module to allow for nested structures, etc.
2) We could define a Python object that specifically carries data-format
information.
There is also a third way (or really 2b) that has been mentioned: take
one of the extensions and use what it does to communicate data-format
between objects and require all other extensions to conform to that
standard.
The problem with 2b is that what works inside an extension module may
not be the best option when it comes to communicating across multiple
extension modules. Certainly none of the extension modules have argued
that case effectively.
Does that explain the goal of what I'm trying to do better?
More information about the Python-Dev
mailing list