On Sun, Aug 5, 2012 at 2:41 AM, Stefan Krah <stefan@bytereef.org> wrote:
Nick Coghlan <ncoghlan@gmail.com> wrote:
Think about trying to specify the buffer protocol using only C++ references rather than pointers. In Java, it's a lot easier to say "this value must be a reference to 'B'" than it is to say "this value must be NULL". (My Java is a little rusty, but I'm still pretty sure you can only get NullPointerException by messing about with the JNI).
I think it's worth defining an "OR" clause for each of the current "X must be NULL" cases, where it is legal for the provider to emit an appropriate non-NULL value, as long as that value is consistent with what the consumer would assume based on its request.
I think any implementation that doesn't use the Py_buffer struct directly in a C-API should just always return a full buffer if a specific request can be met according to the rules.
Since Jeff is talking about an inspired-by API, rather than using the C API directly, I think that's the way Jython should go: *require* that those fields be populated appropriately, rather than allowing them to be None.
For the C-API, I would be cautious:
- The number of case splits in testing getbuffer flags is already staggering. Defining an "OR" clause would introduce new cases.
- Consumers may simply rely on the status quo.
As I said in my earlier mail, for Python 4.0, I'd rather see that buffers have mandatory full information. Querying individual Py_buffer fields for NULL should be replaced by a set of flags that would determine contiguity, buffer "history" (has the buffer been cast to unsigned bytes?) etc.
Making a switch to mandatory full information later suggests that we need to at least make it optional now. I do agree with what you suggest though, which is that, if a buffer chooses to always publish full and accurate information it must do so for *all* fields. That should reduce the combinatorial explosion. It does place a constraint on consumers that they can't assume those fields will be NULL just because they didn't ask for them, but I'm struggling to think of any reason why a client would actually *check* that instead of just assuming it. I guess the dodgy Py_buffer-copying code in the old memoryview implementation only mostly works because those fields are almost always NULL, but that approach was just deeply broken in general.
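As a rough illustration of that consumer-side constraint, here is a minimal Python-level sketch (not the C API). It assumes Python 3.3 or later, where memoryview exposes `format`, `ndim` and `c_contiguous`; the helper name `as_simple_bytes` is hypothetical. The consumer checks the properties it actually relies on, rather than checking that unrequested fields are absent:

```python
def as_simple_bytes(obj):
    """Return a memoryview over obj, accepting it only if the
    'simple' assumptions (unsigned bytes, one dimension, C-contiguous)
    actually hold. Hypothetical helper for illustration only."""
    view = memoryview(obj)
    # The provider is free to publish full information (format, shape,
    # strides). What matters is that the implied assumptions are valid.
    if view.format != "B" or view.ndim != 1 or not view.c_contiguous:
        view.release()
        raise BufferError("exporter does not satisfy a simple byte request")
    return view


if __name__ == "__main__":
    v = as_simple_bytes(bytearray(b"abc"))
    print(v.format, v.shape, v.strides)
```

A bytes or bytearray object passes this check, while something like `array.array("i", ...)` is rejected because its format is not unsigned bytes.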
The main reason is that it turns out that in any general C function that takes a Py_buffer argument one has to reconstruct full information anyway, otherwise obscure cases *will* be overlooked (in the absence of a formal proof that takes care of all case splits).
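To make "reconstruct full information" concrete, here is a small sketch in Python rather than C. `BufferView` is a hypothetical stand-in for Py_buffer, with None playing the role of NULL, and `reconstruct` is likewise a made-up name. The defaults it applies (format "B" when missing, a flat run of bytes when shape is missing, C-contiguous strides when strides are missing) follow my reading of the documented conventions and should be treated as assumptions:

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class BufferView:
    """Hypothetical stand-in for Py_buffer; None plays the role of NULL."""
    nbytes: int
    itemsize: int = 1
    format: Optional[str] = None
    shape: Optional[Tuple[int, ...]] = None
    strides: Optional[Tuple[int, ...]] = None


def reconstruct(view: BufferView) -> BufferView:
    """Return a copy of view with every optional field filled in."""
    if view.shape is None:
        # No shape: treat the memory as a flat run of unsigned bytes.
        fmt, itemsize, shape = "B", 1, (view.nbytes,)
    else:
        fmt = view.format if view.format is not None else "B"
        itemsize, shape = view.itemsize, view.shape
    if view.strides is None:
        # Missing strides are taken to mean C-contiguous: the last
        # dimension varies fastest.
        rev, step = [], itemsize
        for extent in reversed(shape):
            rev.append(step)
            step *= extent
        strides = tuple(reversed(rev))
    else:
        strides = view.strides
    return BufferView(view.nbytes, itemsize, fmt, shape, strides)


if __name__ == "__main__":
    print(reconstruct(BufferView(nbytes=12)))
    print(reconstruct(BufferView(nbytes=24, itemsize=4, format="i", shape=(2, 3))))
```

Even this toy version has several case splits, which is roughly the point: every general-purpose consumer ends up writing some variant of it.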
Right, that's why I think we should declare it legal to *provide* full information even if the consumer didn't ask for it, *as long as* any consumer assumptions implied by the limited request (such as unsigned byte data, a single dimension or C contiguity) remain valid. Consumers that can't handle that correctly (which would likely include the pre-3.3 memoryview) are officially broken. As you say, we likely can't make providing full information mandatory during the 3.x cycle, but we can at least pave the way for it.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia