[Python-Dev] PEP: Adding data-type objects to Python

Mon Oct 30 22:44:22 CET 2006

Jim Jewett wrote:
> Travis E. Oliphant wrote:
> 
> 
>>Two packages need to share a chunk of memory (the package authors do not
>>know each other and only have Python as a common reference).  They
>>both want to describe that the memory they are sharing has some
>>underlying binary structure.
> 
> 
> As a quick sanity check, please tell me where I went off track.
> 
> it sounds to me like you are assuming that:
> 
> (1)  The memory chunk represents a single object (probably an array of
> some sort)
> (2)  That subchunks can themselves be described by a (single?)
> repeating C struct.
> (3)  You can't just use the C header, since you want this at run-time.
> (4)  It would be enough if you could say
> 
> This is an array of 500 elements that look like
> 
> struct {
>       int  simple;
>       struct nested {
>            char name[30];
>            char addr[45];
>            int  amount;
>       } nested;
> };
> 

Sure.  I think that's pretty much it.  I assume you mean "object" in the 
general sense and not as in a Python object.

> (5)  But is it not acceptable to use Martin's suggested ctypes
> equivalent of (building out from the inside):

Part of the problem is that ctypes uses a lot of different Python types 
to accomplish its goal (that's what I mean by "multi-object").  What 
I'm looking for is a single Python type that can be passed around and 
that describes binary data.
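
To make the "multi-object" point concrete, here is a minimal sketch of 
how the struct quoted above might be described with ctypes, built out 
from the inside.  The class names Nested and Record are made up for 
illustration; only the field names come from the quoted struct.

```python
import ctypes

# Innermost piece first: the nested struct from the quoted example.
class Nested(ctypes.Structure):
    _fields_ = [
        ("name", ctypes.c_char * 30),
        ("addr", ctypes.c_char * 45),
        ("amount", ctypes.c_int),
    ]

# The outer struct refers to the Nested class defined above.
class Record(ctypes.Structure):
    _fields_ = [
        ("simple", ctypes.c_int),
        ("nested", Nested),
    ]

# "An array of 500 elements" is yet another, separately created type.
RecordArray = Record * 500

# Each layer of the description is an instance of a different
# ctypes metaclass (simple type, array type, structure type).
metaclasses = {type(ctypes.c_int), type(ctypes.c_char * 30), type(Record)}
print(sorted(m.__name__ for m in metaclasses))
print(ctypes.sizeof(Record), ctypes.sizeof(RecordArray))
```

A consumer that receives Record has to recognize each of these kinds of 
type object and walk _fields_ recursively to recover the layout.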

Remember the buffer protocol is in compiled code.  So, as a result,

1) It's harder to construct a class to pass through the protocol using 
the multiple-types approach of ctypes.

2) It's harder to interpret the object received through the buffer 
protocol.

Sure, it would be *possible* to use ctypes, but I think it would be very 
difficult.  Think about how you would write the get_data_format C 
function in the extended buffer protocol for NumPy if you had to import 
ctypes and then build a class just to describe your data.  How would you 
interpret what you get back?

The ctypes "format-description" approach is not as unified as the single 
Python type object that I'm proposing.

In NumPy, we have a very nice, compact description of complicated data 
already available.  Why not use what we've learned?
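
For comparison, a rough sketch of the same layout written as a NumPy 
data-type object.  It assumes NumPy is installed and uses the default 
packed layout (no alignment padding); the variable names are 
illustrative only.

```python
import numpy as np

# The nested struct and the outer record are both np.dtype instances.
nested = np.dtype([("name", "S30"), ("addr", "S45"), ("amount", "<i4")])
record = np.dtype([("simple", "<i4"), ("nested", nested)])

# An array of 500 such records.
data = np.zeros(500, dtype=record)

# Every level of the description can be inspected the same way.
print(record.names)
print(record.fields["nested"][0].names)
print(record.itemsize, data.nbytes)
```

Whatever the nesting depth, the receiver only ever has to deal with 
np.dtype objects and their fields.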

I don't think we should just *use ctypes because it's there* when the 
way it describes binary data was not constructed with the extended 
buffer protocol in mind.

The other option, of course, which would not introduce a new Python 
type, is to use the array interface specification and pass a list of 
tuples.  But I think this is also unnecessarily wasteful, because the 
sending object has to construct it and the receiving object has to 
deconstruct it.  The whole point of the (extended) buffer protocol is 
to communicate this information more quickly.
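
As a sketch of that construct/deconstruct round trip, the snippet below 
writes the quoted struct as an array-interface style list of tuples and 
walks it in pure Python.  The helper descr_itemsize is hypothetical, 
written only for this illustration, and it assumes a packed layout and 
type strings of the form byte-order character, type character, byte 
count (for example '<i4').

```python
def descr_itemsize(descr):
    """Return the packed size in bytes described by a list of tuples."""
    total = 0
    for field in descr:
        name, fmt = field[0], field[1]
        if isinstance(fmt, list):
            # A nested record: recurse into its own list of tuples.
            total += descr_itemsize(fmt)
        else:
            # Plain type string: the byte count follows the two
            # leading characters (byte order and type code).
            total += int(fmt[2:])
    return total

# The sender has to build this structure...
record_descr = [
    ("simple", "<i4"),
    ("nested", [
        ("name", "|S30"),
        ("addr", "|S45"),
        ("amount", "<i4"),
    ]),
]

# ...and the receiver has to take it apart again to learn the layout.
print(descr_itemsize(record_descr))
```

Both sides pay for building and parsing generic lists, tuples and 
strings each time the description is exchanged.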

-Travis