[Python-Dev] Understanding the buffer API

Sun Aug 5 12:08:58 CEST 2012

- Summary:

The PEP, or sometimes just the documentation, definitely requires that 
features not requested shall be NULL.

The API would benefit from:

    a. stored flags that tell you the actual structural features.
    b. requiring exporters to provide full information (e.g. strides =
    {1}, format = "B") even when trivial.

It could and possibly should work this way in Python 4.0.

Nick thinks we could *allow* exporters to behave this way (PEP change) 
in Python 3.x. Stefan thinks not, because "Perhaps there is code that 
tests for shape==NULL to determine C-contiguity."

Jython exporters should return full information unconditionally from the 
start:  "any implementation that doesn't use the Py_buffer struct 
directly in a C-API should just always return a full buffer" (Stefan); 
"I think that's the way Jython should go: *require* that those fields be 
populated appropriately" (Nick).

- But what I now think is:

_If the only problem really is_ "code that tests for shape==NULL to 
determine C-contiguity", or makes similar deductions, I agree that 
providing unasked-for information is _safe_. I think the stipulation in 
the PEP/documentation has some efficiency value: on finding shape!=NULL 
the code has to do a more complicated test, as in 
PyBuffer_IsContiguous(). I have the option to provide an isContiguous 
that has the answer written down already, so the risk is only from/to 
ported code. If it is only a risk to the efficiency of ported code, I'm 
relaxed: I hesitate only to check that there's no circumstance that 
logically requires nullity for correctness. Whether it was safe was the 
key question.
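
To make the efficiency point concrete, here is a minimal Python sketch of 
the two paths a consumer can take. It is a model, not the C 
implementation: is_c_contiguous is a hypothetical helper, None stands in 
for NULL, and strides are assumed to be in bytes as in Py_buffer.

```python
def is_c_contiguous(shape, strides, itemsize):
    """Model of a consumer's deduction; None stands in for NULL."""
    if shape is None or strides is None:
        # Cheap path: missing arrays are taken to mean plain
        # C-contiguous data, so no arithmetic is needed.
        return True
    # Expensive path: walk the dimensions from last to first and check
    # that each stride equals the size of the block it steps over.
    expected = itemsize
    for dim, stride in zip(reversed(shape), reversed(strides)):
        if dim > 1 and stride != expected:
            return False
        expected *= dim
    return True
```

An exporter that always fills in shape and strides pushes every consumer 
written this way onto the loop, which costs time but still gives the 
right answer.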

In the hypothetical Python 4.0 buffer API (and in Jython) where feature 
flags are provided, the efficiency is still useful, but complicated 
deductive logic in the consumer should be deprecated in favour of 
(functions for) interrogating the flags.

An example illustrating the semantics would then be:
1. consumer requests a buffer, saying "I can cope with strided arrays" 
(PyBUF_STRIDED);
2. exporter provides a strides array, but in the feature flags 
STRIDED=0, meaning "you don't need the strides array";
3. consumer (optionally) uses efficient, non-strided access.
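
The three steps above can be sketched in Python as follows. Everything 
here is illustrative: View, its strided field (standing in for the 
proposed stored feature flag), export_bytes and read_all are hypothetical 
names, and the model is one-dimensional with an item size of one byte.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class View:
    data: bytes
    shape: Tuple[int, ...]
    strides: Optional[Tuple[int, ...]]
    strided: bool  # hypothetical stored feature flag


def export_bytes(data):
    # Step 2: full information is supplied even though it is trivial,
    # and the flag records that the strides array is not needed.
    return View(data, (len(data),), (1,), strided=False)


def read_all(view):
    # Step 3: the code reading the data consults the flag, not the
    # presence or absence of the strides array.
    if not view.strided:
        return list(view.data[:view.shape[0]])
    (n,), (step,) = view.shape, view.strides
    return [view.data[i * step] for i in range(n)]
```

With this shape of API the fast path is chosen by reading one stored 
answer, whether or not the exporter filled in the strides array.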

_I do not think_ that full provision by the exporter has to be 
_mandatory_, as the discussion has gone on to suggest. I know your 
experience is that you have often had to regenerate the missing 
information to write generic code, but I think this does not continue 
once you have the feature flags. An example would be:
1. consumer requests a buffer, saying "I can cope with N-dimensional 
but not strided arrays" (PyBUF_ND);
2. exporter sets strides=NULL, and the feature flag STRIDED=0;
3. consumer accesses the data, without reference to the strides array, 
as it planned;
4. new generic code that respects the feature flag STRIDED=0, does not 
reference the strides array;
5. old generic code, ignorant of the feature flags, finds the 
strides=NULL and so does not dereference strides.
Insofar as the strides array is not necessary, there is some efficiency in not 
providing it. There would only be a problem with broken code that both 
ignores the feature flag and uses the strides array unchecked. But this 
code was always broken.
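
A small Python model of the three kinds of generic code mentioned above 
may help. It is only a sketch: NDView, its strided flag, c_strides and 
the item_* functions are hypothetical names, None models strides=NULL, 
and the item size is assumed to be one byte.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class NDView:
    data: bytes
    shape: Tuple[int, ...]
    strides: Optional[Tuple[int, ...]]  # None models strides=NULL
    strided: bool = False               # hypothetical feature flag


def c_strides(shape, itemsize=1):
    # Regenerate C-order strides from the shape alone.
    out, step = [], itemsize
    for dim in reversed(shape):
        out.append(step)
        step *= dim
    return tuple(reversed(out))


def item_new(view, index):
    # New generic code: respects the flag and never touches
    # view.strides when STRIDED=0.
    strides = view.strides if view.strided else c_strides(view.shape)
    return view.data[sum(i * s for i, s in zip(index, strides))]


def item_old(view, index):
    # Old generic code: knows nothing of the flag but checks for NULL.
    strides = view.strides
    if strides is None:
        strides = c_strides(view.shape)
    return view.data[sum(i * s for i, s in zip(index, strides))]


def item_broken(view, index):
    # Broken code: ignores the flag and uses strides unchecked.
    return view.data[sum(i * s for i, s in zip(index, view.strides))]
```

Given a view with strides=None, item_new and item_old agree, while 
item_broken fails; in this model that failure is a TypeError, where the C 
equivalent would be a NULL dereference.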

Really useful discussion this.
Jeff