On 1/6/07, Travis Oliphant <oliphant@ee.byu.edu> wrote:
Tim Hochberg wrote:
> Christopher Barker wrote:
>
> [SNIP]
>
>> I think the PEP has far more chances of success if it's seen as a
>> request from a variety of package developers, not just the numpy crowd
>> (which, after all, already has numpy).
>>
> This seems eminently sensible. Getting a few developers from other
> projects on board would help a lot; it might also reveal some
> deficiencies to the proposal that we don't see yet.
>
It would help quite a bit. Are there any suggestions of who to recruit
to review the proposal?
Before I can answer that, I need to ask you a question. How do you see this extension to the buffer protocol? Do you see it as a supplement to the earlier array protocol, or do you see it as a replacement?
The reason I ask is that the two projects I use regularly, wxPython and PIL, generally operate on relatively large data chunks, and it's not clear that they would see much benefit from this mechanism over the array protocol.
I imagine that between us Chris Barker and I could hack together something for wxPython (not that I've asked him about it). And working code would probably go a long way toward convincing people what a great idea this is. However, all else being equal, it'd be a lot easier to do this for the array protocol since there's no extra infrastructure involved.
[SNIP]
> 1. Why do we need Py_ARRAYOF? Can't we get the same effect just
> using longer shape and strides arrays?
>
Yes, this is true for a single data-format in isolation (and in fact
exactly what you get when you instantiate in NumPy a data-type that is
an array of another primitive data-type). However, how do you describe
a structure whose second field is an array of a primitive type? This is
where the ARRAYOF qualifier is needed. In NumPy, actually, it's not
done this way, but a separate subarray field in the data-type object is
used. After studying c-types, however, I think this approach is better.
OK. Needed for recursive data structures, check.
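To check my own understanding, here is a minimal sketch of the case described above, written with ctypes rather than with the proposed data-format object. The names Record, ident and samples are made up for illustration and are not part of the proposal. It shows a structure whose second field is itself an array of a primitive type, which longer shape and strides arrays on the outer object cannot express.

```python
import ctypes

# Hypothetical array-of-primitive type: three C doubles.
ArrayOfDouble = ctypes.c_double * 3


class Record(ctypes.Structure):
    # Hypothetical record: an integer id followed by an array field.
    _fields_ = [
        ("ident", ctypes.c_int32),
        ("samples", ArrayOfDouble),
    ]


rec = Record(7, ArrayOfDouble(1.0, 2.0, 3.0))

# The array-ness lives on the field's type, not on the outer structure.
field_types = dict(Record._fields_)
samples_type = field_types["samples"]
print(samples_type._type_, samples_type._length_)
print(rec.ident, list(rec.samples))
```

As far as I can tell, this is roughly the role the ARRAYOF qualifier plays inside a structure description.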
> 2. Is there any type besides Py_STRUCTURE that can have names
> and fields. If so, what and what do they mean. If not, you
> should just say that.
>
Yes, you can add fields to a multi-byte primitive if you want. This
would be similar to thinking about the data-format as a C-like union.
Perhaps the data-field has meaning as a 4-byte integer but the
most-significant and least-significant bytes should also be addressable
individually.
Hmm. I think I understand this somewhat better now, but I can't decide if it's cool or overkill. Is this supporting a feature that ctypes has?
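For comparison, this is how I would get the "4-byte integer with addressable bytes" effect in ctypes today, using a Union. It is only a sketch of the C-like-union idea, not of the proposed field mechanism, and the names IntWithBytes, value and raw are my own.

```python
import ctypes
import sys


class IntWithBytes(ctypes.Union):
    # Both members overlay the same four bytes of storage.
    _fields_ = [
        ("value", ctypes.c_uint32),
        ("raw", ctypes.c_uint8 * 4),
    ]


u = IntWithBytes()
u.value = 0x11223344

# Which index holds the least-significant byte depends on the
# native byte order, so pick the indices accordingly.
if sys.byteorder == "little":
    lsb, msb = u.raw[0], u.raw[3]
else:
    lsb, msb = u.raw[3], u.raw[0]

print(hex(lsb), hex(msb))
```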
> 3. And on this topic, why a tuple of ([names,..], {field})? Why
> not simply a list of (name, dfobject, offset, meta) for
> example? And what's the meta information if it's not PyNone?
> Just a string? Anything at all?
>
The list of names is useful for having an ordered list so you can
traverse the structure in field order. It is technically not necessary
but it makes it a lot easier to parse a data-format object in offset
order (it is used a bit in NumPy, for example).
Right, I got that. Between names and fields you are simulating an ordered dict. What I still don't understand is why you chose to simulate this ordered dict using a list plus a dictionary rather than a list of tuples. This may well just be a matter of taste. However, for the small sizes I'd expect of these lists, a list of tuples would probably perform better than the dictionary solution.
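To make the comparison concrete, here is a rough sketch of the two layouts in plain Python. The field names, the format strings ("f8", "S4") and the meta values are placeholders I invented, not the proposal's actual data-format objects.

```python
# Layout A: an ordered list of names plus a dict keyed by name.
# Each dict value is (format, offset, meta).
names = ["x", "y", "label"]
fields = {
    "x": ("f8", 0, None),
    "y": ("f8", 8, None),
    "label": ("S4", 16, "units: none"),
}

# Layout B: a single list of (name, format, offset, meta) tuples.
field_list = [
    ("x", "f8", 0, None),
    ("y", "f8", 8, None),
    ("label", "S4", 16, "units: none"),
]


def walk_names_dict(names, fields):
    """Visit fields in declared order using the list plus the dict."""
    return [(name, fields[name][1]) for name in names]


def walk_tuples(field_list):
    """Visit fields in declared order using the list of tuples."""
    return [(name, offset) for name, _fmt, offset, _meta in field_list]


def find_in_tuples(field_list, wanted):
    """Look up one field by name with a linear scan."""
    for entry in field_list:
        if entry[0] == wanted:
            return entry
    raise KeyError(wanted)


print(walk_names_dict(names, fields))
print(walk_tuples(field_list))
print(find_in_tuples(field_list, "label"))
```

Both layouts give the same ordered traversal. The difference is lookup by name, which is a dict access in A and a linear scan in B. For a handful of fields I doubt that scan matters, but I haven't measured it.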
The meta information is a place holder for field tags and future growth
(kind of like column headers in a spreadsheet). It started as a place
to put a "longer" name or to pass along information about a field (like
units) through.
OK.