[Numpy-discussion] moving forward around ABI/API compatibilities (was numpy 1.7.x branch)

Tue Jun 26 09:40:29 EDT 2012

On 06/26/2012 01:48 PM, David Cournapeau wrote:
> Hi,
>
> I am just continuing the discussion around ABI/API, the technical side
> of things that is, as this is unrelated to 1.7.x. release.
>
> On Tue, Jun 26, 2012 at 11:41 AM, Dag Sverre Seljebotn
> <d.s.seljebotn at astro.uio.no>  wrote:
>> On 06/26/2012 11:58 AM, David Cournapeau wrote:
>>> On Tue, Jun 26, 2012 at 10:27 AM, Dag Sverre Seljebotn
>>> <d.s.seljebotn at astro.uio.no>    wrote:
>>>> On 06/26/2012 05:35 AM, David Cournapeau wrote:
>>>>> On Tue, Jun 26, 2012 at 4:10 AM, Ondřej Čertík <ondrej.certik at gmail.com> wrote:
>>>>>
>>>>>>
>>>>>> My understanding is that Travis is simply trying to stress "We have to
>>>>>> think about the implications of our changes on existing users." and
>>>>>> also that little changes (with the best intentions!) that however mean
>>>>>> either a breakage or confusion for users (due to historical reasons)
>>>>>> should be avoided if possible. And I very strongly feel the same way.
>>>>>> And I think that most people on this list do as well.
>>>>>
>>>>> I think Travis is more concerned about API than ABI changes (in that
>>>>> example for 1.4, the ABI breakage was caused by a change that was
>>>>> pushed by Travis IIRC).
>>>>>
>>>>> The relative importance of API vs ABI is a tough one: I think ABI
>>>>> breakage is as bad as API breakage (but matters in different
>>>>> circumstances), but it is hard to improve the situation around our ABI
>>>>> without changing the API (especially everything around macros and
>>>>> publicly accessible structures). Changing this is politically
>>>>
>>>> But I think it is *possible* to get to a situation where ABI isn't
>>>> broken without changing API. I have posted such a proposal.
>>>> If one uses the kind of C-level duck typing I describe in the link
>>>> below, one would do
>>>>
>>>> typedef PyObject PyArrayObject;
>>>>
>>>> typedef struct {
>>>>      ...
>>>> } NumPyArray; /* used to be PyArrayObject */
>>>
>>> Maybe we're just in violent agreement, but whatever ends up being used
>>> would require changing the *current* C API, right? If one wants to
>>
>> Accessing arr->dims[i] directly would need to change. But that's been
>> discouraged for a long time. By "API" I meant access through the macros.
>>
>> One of the changes under discussion here is to change PyArray_SHAPE from
>> a macro that accepts both PyObject* and PyArrayObject* to a function
>> that only accepts PyArrayObject* (hence breakage). I'm saying that under
>> my proposal, assuming I or somebody else can find the time to implement
>> it, you can both make it a function and have it accept both
>> PyObject* and PyArrayObject* (since they are the same), undoing the
>> breakage while still allowing the ABI to be hidden.
>>
>> (It doesn't give you full flexibility in ABI, it does require that you
>> somewhere have an "npy_intp dims[nd]" with the same lifetime as your
>> object, etc., but I don't consider that a big disadvantage).
>>
>>> allow for changes in our structures more freely, we have to hide them
>>> from the headers, which means breaking the code that depends on the
>>> structure binary layout. Any code that accesses those directly will need
>>> to be changed.
>>>
>>> There is the particular issue of iterators, which seem quite difficult
>>> to make "ABI-safe" without losing significant performance.
>>
>> I don't agree (for some meanings of "ABI-safe"). You can export the data
>> (dataptr/shape/strides) through the ABI, then the iterator uses these in
>> whatever way it wishes consumer-side. Sort of like PEP 3118 without the
>> performance degradation. The only sane way IMO of doing iteration is
>> building it into the consumer anyway.
>
> (I have not read the whole cython discussion yet)

I'll try to write a summary and post it when I can get around to it.

>
> What do you mean by "building iteration in the consumer" ? My

"consumer" is the user of the NumPy C API. So I meant that the iteration 
logic is all in C header files and compiled again for each such 
consumer. Iterators don't cross the ABI boundary.

> understanding is that any data export would be done through a level of
> indirection (dataptr/shape/strides). Conceptually, I can't see how one
> could keep ABI without that level of indirection without some compile.
> In the case of iterators, that means multiple pointer chasing per
> sample -- i.e. the tight loop issue you mentioned earlier for
> PyArray_DATA is the common case for iterators.

Even if you do indirection, iterator utilities that are compiled in the 
"consumer"/user code can cache the data that's retrieved.

Iterators just do

// setup crossing ABI
npy_intp *shape = PyArray_DIMS(arr);
npy_intp *strides = PyArray_STRIDES(arr);
...
// performance-sensitive code just accesses the cached pointers and
// doesn't cross the ABI

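
To make that a bit more concrete, here is a minimal, self-contained sketch.
It does not use the real NumPy headers: `hidden_array`, `array_handle`,
`intp_t`, the `array_*` accessors and `sum_2d` are all hypothetical stand-ins
I made up for illustration (the accessors play the role of PyArray_DATA,
PyArray_DIMS and PyArray_STRIDES as plain functions behind the ABI). The point
is only that the consumer-side loop calls across the ABI once during setup and
then works on cached locals.

```c
#include <assert.h>
#include <stddef.h>

/* --- "Library" side: the struct layout is private; only the accessor
 *     functions below are part of the (hypothetical) ABI. --- */
typedef ptrdiff_t intp_t;               /* stand-in for npy_intp */

typedef struct {
    char   *data;
    int     nd;
    intp_t *shape;
    intp_t *strides;                    /* in bytes, as in NumPy */
} hidden_array;                         /* consumers would never see this */

typedef void array_handle;              /* opaque handle given to consumers */

char   *array_data(array_handle *a)    { return ((hidden_array *)a)->data; }
int     array_ndim(array_handle *a)    { return ((hidden_array *)a)->nd; }
intp_t *array_shape(array_handle *a)   { return ((hidden_array *)a)->shape; }
intp_t *array_strides(array_handle *a) { return ((hidden_array *)a)->strides; }

/* --- "Consumer" side: this would live in a header and be compiled again
 *     into each user of the API. --- */
static inline double sum_2d(array_handle *arr)
{
    /* Setup: cross the ABI once per array, not once per element. */
    assert(array_ndim(arr) == 2);
    const char   *data    = array_data(arr);
    const intp_t *shape   = array_shape(arr);
    const intp_t *strides = array_strides(arr);
    const intp_t n0 = shape[0],   n1 = shape[1];
    const intp_t s0 = strides[0], s1 = strides[1];

    /* Tight loop: only cached locals, no calls across the ABI. */
    double total = 0.0;
    for (intp_t i = 0; i < n0; ++i) {
        const char *row = data + i * s0;
        for (intp_t j = 0; j < n1; ++j)
            total += *(const double *)(row + j * s1);
    }
    return total;
}
```

Because the loop only depends on dataptr/shape/strides, the same sum_2d works
unchanged on transposed or sliced views, and the library stays free to
rearrange hidden_array as long as the accessors keep their signatures.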
We're probably in violent agreement and just talking past one another...?

>
> I can only see two ways of doing fast (special casing) iteration:
> compile-time special casing or runtime optimization. Compile-time
> requires access to the internals (even if one were to use C++ with
> advanced template magic ala STL/iterator, I don't think one can get
> performance if everything is not in the headers, but maybe C++
> compilers are super smart these days in ways I can't comprehend). I
> would think runtime is the long-term solution, but that's far away,

Going slightly OT: IMO, the *only* long-term solution in 2012 is 
LLVM. That allows you to do any level of inlining and special casing and 
optimization at run-time, which is the only way of matching needs for 
performance with using Python at all.

Mark Florisson is heading down that road this summer with his 'minivect' 
project (essentially, code generation for optimal iteration over NumPy 
(or NumPy-like) arrays that can be used both by Cython (C code 
generation backend) and Numba (LLVM code generation backend)).

Relying on C++ metaprogramming to implement iterators is like using the 
technology of the 80's to build the NumPy of the 2010's. It can only be 
exported to Python in a crippled form, so kind of useless. (C++ to 
implement the core that sits behind an ABI is another matter, I don't 
have an opinion on that. But iterators can't be behind the ABI, as I 
think we agree.)

Dag