[Python-Dev] PEP: Adding data-type objects to Python
Travis E. Oliphant
oliphant.travis at ieee.org
Tue Oct 31 16:32:39 CET 2006
Michael Chermside wrote:
> In this email I'm responding to a series of emails from Travis
> pretty much in the order I read them:
>
>>
>> In the mean-time, how are other packages supposed to communicate
>> binary information about data with each other?
>
> Here we disagree.
>
> I haven't used C-types. I have no idea whether it is well-designed or
> horribly unusable. So if someone wanted to argue that C-types is a
> mistake and should be thrown out, I'd be willing to listen. Until
> someone tries to make that argument, I'm presuming it's good enough to
> be part of the standard library for Python.
My problem with this argument is twofold:
1) I'm not sure you really know what you're talking about since you
apparently haven't used either ctypes or NumPy (I've used both and so
forgive me if I claim to understand the strengths of the data-format
representations that each uses a bit better). Therefore, it's hard for
me to take your opinion seriously. I will try though. I understand you
have a preference for not wildly expanding the ways to do similar
things. I share that preference with you.
2) You are assuming that because ctypes is good enough for the standard
library, the way it describes data-formats (using a separate
Python type for each one) is the *one true way*. When was this
discussed? Frankly it's a weak argument because the struct module has
been around for a lot longer. Why didn't the ctypes module follow that
standard? Or the standard that's in the array module for describing
data-types. That's been there for a long time too. Why wasn't ctypes
forced to use that approach?
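To make the competing spellings concrete, here is a minimal sketch using only the standard library. The record layout (a 32-bit integer followed by a 32-bit float) and the names `record_format`, `values`, and `Record` are assumptions chosen for illustration; they are not taken from any of the proposals under discussion.

```python
import array
import ctypes
import struct

# struct module: the layout is described by a format string.
# "=" means native byte order with standard sizes and no padding.
record_format = struct.Struct("=if")  # int32 followed by float32

# array module: the element type is described by a one-character typecode.
values = array.array("f", [1.5, 2.5])

# ctypes: the layout is described by defining a new Python type.
class Record(ctypes.Structure):
    _fields_ = [("n", ctypes.c_int32), ("x", ctypes.c_float)]

# The struct and ctypes descriptions agree on the size of the record.
print(record_format.size, ctypes.sizeof(Record))

# Bytes packed through one description can be read through the other.
packed = record_format.pack(7, 1.5)
rec = Record.from_buffer_copy(packed)
print(rec.n, rec.x)
```

The struct and array descriptions are ordinary values (a format string and a typecode), while the ctypes description is a new Python class.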
The reason it wasn't is because it made sense for ctypes to use a
separate type for each data-format object so that you could call
C-functions as if they were Python functions. If this is your goal,
then it seems like a good idea (though not strictly necessary) to use a
separate Python type for each data-format.
But, there are distinct disadvantages to this approach compared to what
I'm trying to allow. Martin claims that the ctypes approach is
*basically* equivalent but this is just not true. It could be made more
true if the ctypes objects inherited from a "meta-type" and if Python
allowed meta-types to expand their C-structures. But, last I checked
this is not possible.
A Python type object is a very particular kind of Python-type. As far
as I can tell, it's not as flexible in terms of the kinds of things you
can do with the "instances" of a type object (i.e. what ctypes types
are) on the C-level.
The other disadvantage of what you are describing is: Who is going to
write the code?
I'm happy to have the data-format object live separate from ctypes and
leave it to the ctypes author(s) to support it if desired. But, the
claim that the extended buffer protocol should jump through all kinds of hoops
to conform to the "ctypes standard" when that "standard" was designed
with a different idea in mind is not acceptable.
Ctypes has only been in Python since 2.5 and the array interface was
around before that. Numeric has been around longer than ctypes. The
array and struct modules in Python have both been around longer than
ctypes as well.
Where is the discussion that crowned the ctypes way of doing things as
"the one true way"?
>
> In a different message, he writes:
>> It also bothers me that so many ways to describe binary data are
>> being used out there. This is a problem that deserves being solved.
>> And, no, ctypes hasn't solved it (we can't directly use the ctypes
>> solution).
>
> Really? Why? Is this a failing in C-types? Can C-types be "fixed"?
You can't grow C-function pointers onto an existing type object. You
are also carrying around a lot of weight in the Python type object that
is unnecessary if all you are doing is describing data.
>
> I just disagree. (1) I *DO* think we should "just use ctypes because it's
> there". After all, the problem we're trying to solve is one of
> COMPATIBILITY - you don't solve those by introducing competing standards.
> (2) From what I understand of it, I think ctypes is quite capable of
> describing data to be accessed via the buffer protocol.
Capable but not supporting all the things I'm talking about. The ctypes
objects don't have any of the methods or attributes (or C function
pointers) that I've described. Nor should they necessarily grow them.
>
> Why? Who cares? Seriously, if we were proposing to describe the layouts
> with a collection of rubber bands and potato chips, I'd say it was a
> crazy idea. But we're proposing using data structures in a computer
> memory. Why does it matter whether those data structures are of the same
> "python type" or different "python types"? I care whether the structure
> can be created, passed around, and interrogated. I don't care what
> Python type they are.
Sure, but the flexibility you have with an instance of a Python type is
different than when that instance must itself also be a Python type. It
*is* different. This is quite noticeable in C especially.
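The distinction can be seen from Python as well. The following is a minimal sketch, assuming only the standard library; `struct.Struct` stands in here for "a data description that is an ordinary instance" and is not the data-type object being proposed. The names `desc_instance` and `desc_type` are hypothetical.

```python
import ctypes
import struct

# A data description that is an ordinary instance of a Python type.
desc_instance = struct.Struct("=i")

# A data description that is itself a Python type.
desc_type = ctypes.c_int32

print(isinstance(desc_instance, type))  # the Struct object is not a type
print(isinstance(desc_type, type))      # the ctypes description is a type

# The ctypes description is created by a metaclass derived from `type`,
# so every description carries a full type object along with it.
print(issubclass(type(desc_type), type))
print(type(desc_type) is type)
```

In the first case the description's type can define whatever instance layout it likes; in the second, each description has to be a complete type object.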
>
>> I'm saying that I don't like the idea of forcing this approach on
>> everybody else who wants to describe arbitrary binary data just
>> because ctypes is included.
>
> And I'm saying that I *do*. Hey, if someone proposed getting rid of
> the current syntax for the array module (for Py3K) and replacing it with
> use of ctypes, I'd give it serious consideration. There should be only
> one way to describe binary structures. It should be powerful enough to
> describe almost any structure, easy-to-use, and most of all it should be
> used consistently everywhere.
I'm not opposed to convergence, but ctypes must be willing to come to us
too. Its development of a "standard" was not done with the array
interface in mind, so why should it be surprising that it does not fill
the need for us?
-Travis