On 1/6/07, Travis Oliphant <oliphant@ee.byu.edu> wrote:
Tim Hochberg wrote:
> Christopher Barker wrote:
> [SNIP]
>> I think the PEP has far more chances of success if it's seen as a
>> request from a variety of package developers, not just the numpy crowd
>> (which, after all, already has numpy).
> This seems eminently sensible. Getting a few developers from other
> projects on board would help a lot; it might also reveal some
> deficiencies to the proposal that we don't see yet.
It would help quite a bit.  Are there any suggestions of who to recruit
to review the proposal?

Before I can answer that, I need to ask you a question. How do you see this extension to the buffer protocol? Do you see it as a supplement to the earlier array protocol, or as a replacement?

The reason I ask is that the two projects I use regularly, wxPython and PIL, generally operate on relatively large data chunks, and it's not clear that they would see much benefit from this mechanism over the array protocol.

I imagine that between us, Chris Barker and I could hack together something for wxPython (not that I've asked him about it). Working code would probably go a long way toward convincing people what a great idea this is. However, all else being equal, it'd be a lot easier to do this for the array protocol, since there's no extra infrastructure involved.


>          1. Why do we need Py_ARRAYOF? Can't we get the same effect just
>             using longer shape and strides arrays?
Yes, this is true for a single data-format in isolation (and in fact
exactly what you get when you instantiate in NumPy a data-type that is
an array of another primitive data-type).   However, how do you describe
a structure whose second field is an array of a primitive type?  This is
where the ARRAYOF qualifier is needed.  In NumPy, actually, it's not
done this way, but a separate subarray field in the data-type object is
used.  After studying c-types,  however, I think this approach is better.

OK. Needed for recursive data structures, check.
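
To make the nested case concrete for myself, here is a minimal ctypes sketch of a structure whose second field is an array of a primitive type. The names Sample, ident, and xyz are made up for illustration, and ctypes is only standing in for the proposed data-format object; this is not how the PEP would spell it.

```python
import ctypes

class Sample(ctypes.Structure):
    # Hypothetical record layout, for illustration only.
    _fields_ = [
        ("ident", ctypes.c_int32),
        ("xyz", ctypes.c_double * 3),  # second field: an array of a primitive
    ]

# An array of two records: the outer shape is (2,), while the (3,)
# belongs to the xyz field alone.
records = (Sample * 2)()
records[1].xyz[2] = 1.5

print(len(records), len(records[1].xyz), list(records[1].xyz))
```

Longer shape and strides arrays describe the outer array of records, but they have nowhere to say that only the second field is three elements long, which I take to be the gap the ARRAYOF qualifier fills.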

>          2. Is there any type besides Py_STRUCTURE that can have names
>             and fields. If so, what and what do they mean. If not, you
>             should just say that.
Yes, you can add fields to a multi-byte primitive if you want.  This
would be similar to thinking about the data-format as a C-like union.
Perhaps the data-field has meaning as a 4-byte integer but the
most-significant and least-significant bytes should also be addressable.

Hmm. I think I understand this somewhat better now, but I can't decide if it's cool or overkill. Is this supporting a feature that ctypes has?
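
For reference, the union-like reading can be sketched with ctypes.Union, which does exist in ctypes. The class name Word and the field names value and octets are my own invention, and whether this maps onto what the proposal intends is an assumption on my part.

```python
import ctypes
import sys

class Word(ctypes.Union):
    # Hypothetical union: the same 4 bytes seen as one integer or as 4 octets.
    _fields_ = [
        ("value", ctypes.c_uint32),
        ("octets", ctypes.c_uint8 * 4),
    ]

w = Word()
w.value = 0x11223344

# Which octet is least significant depends on the platform's byte order.
lsb_index = 0 if sys.byteorder == "little" else 3
least_significant = w.octets[lsb_index]
most_significant = w.octets[3 - lsb_index]

print(hex(most_significant), hex(least_significant))
```

Here the data still has meaning as a 4-byte integer, while the most-significant and least-significant bytes are addressable on their own.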

>          3. And on this topic, why a tuple of ([names,..], {field})? Why
>             not simply a list of (name, dfobject, offset, meta) for
>             example? And what's the meta information if it's not PyNone?
>             Just a string? Anything at all?

The list of names is useful for having an ordered list so you can
traverse the structure in field order.   It is technically not necessary
but it makes it a lot easier to parse a data-format object in offset
order (it is used a bit in NumPy, for example).

Right, I got that. Between names and field you are simulating an ordered dict. What I still don't understand is why you chose to simulate this ordered dict using a list plus a dictionary rather than a list of tuples. This may well just be a matter of taste. However, for the small sizes I'd expect of these lists, a list of tuples would probably perform better than the dictionary solution.
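
To spell out the two layouts I'm comparing, here is a rough sketch. The field names, the offsets, and the strings standing in for data-format objects are all hypothetical; only the two container shapes come from the discussion.

```python
# Layout A: ordered list of names plus a dict keyed by name.
# The "int32"/"float64" strings are placeholders for data-format objects.
names = ["ident", "xyz"]
fields = {
    "ident": ("int32", 0, None),
    "xyz": ("float64", 8, {"units": "m"}),
}

# Layout B: a single list of (name, dfobject, offset, meta) tuples.
entries = [(name,) + fields[name] for name in names]

def offsets_from_names_and_dict(names, fields):
    """Walk layout A in field order: one dict lookup per name."""
    return [(name, fields[name][1]) for name in names]

def offsets_from_tuples(entries):
    """Walk layout B in field order: plain iteration, no lookups."""
    return [(name, offset) for name, _fmt, offset, _meta in entries]

def find_in_tuples(entries, wanted):
    """Lookup by name in layout B is a linear scan."""
    for entry in entries:
        if entry[0] == wanted:
            return entry
    raise KeyError(wanted)

print(offsets_from_names_and_dict(names, fields))
print(offsets_from_tuples(entries))
print(find_in_tuples(entries, "xyz"))
```

Both layouts give the same ordered traversal. The dict buys constant-time lookup by name, while the list of tuples keeps everything in one object at the cost of a linear scan, which shouldn't matter much for a handful of fields.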

The meta information is a place holder for field tags and future growth
(kind of like column headers in a spreadsheet).  It started as a place
to put a "longer" name or to pass along information about a field (like
units) through.


FWIW, the array protocol PEP seems more relevant to what I do: I'm sending big chunks of data back and forth, so I'm not that concerned with the overhead.