[Numpy-discussion] NumPy C API question

Stefan Seefeld stefan at seefeld.name
Thu May 22 09:30:06 EDT 2014


Hi Nathaniel,

On 2014-05-21 20:15, Nathaniel Smith wrote:
> Hi Stefan,
>
> One possibility that comes to mind: you may want in any case some way
> to temporarily "pin" an object's memory in place (e.g., to prevent one
> thread trying to migrate it while some other thread is working on it).
> If so then the Python wrapper could acquire a pin when the ndarray is
> allocated, and release it when it is released. (The canonical way to
> do this is to create a little opaque Python class that knows how to do
> the acquire/release, and then assign it to the 'base' attribute of
> your array -- the semantics of 'base' are simply that deallocating the
> ndarray will decref whatever object is in 'base'.)

That's an interesting thought. So instead of creating an ndarray with a
lifetime as long as the wrapped C++ object, I would create an ndarray
only temporarily, as a view into my C++ object, and the storage would be
pinned to host memory for as long as that ndarray lives. The (Python)
API needs to make it clear that, while it is ok to hold references to
vector and matrix objects, references to their "array" members should be
confined to small scopes, since within those scopes the underlying
memory is pinned and no operation that would relocate the data (such as
an OpenCL kernel) may be called. Not following these rules may result in
deadlocks...
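
To make the idea concrete, here is a rough Python-level sketch, for
illustration only. All names in it (HostStorage, _Pin, pin, unpin,
as_array) are hypothetical stand-ins and not part of NumPy or of my
library. A ctypes buffer plays the role of the C++ host storage, and the
pin object exposes it through NumPy's __array_interface__ protocol, so
the resulting ndarray keeps the pin object alive. A real binding would
more likely do this from C with PyArray_SetBaseObject.

```python
import ctypes
import numpy as np


class HostStorage:
    """Hypothetical stand-in for a C++ object owning migratable storage."""

    def __init__(self, n):
        self._buf = (ctypes.c_double * n)()  # host-side memory
        self.size = n
        self.pin_count = 0                   # > 0 means "must not migrate"

    def pin(self):
        self.pin_count += 1

    def unpin(self):
        self.pin_count -= 1

    def as_array(self):
        """Return an ndarray view; storage stays pinned while it lives."""
        return np.asarray(_Pin(self))


class _Pin:
    """Opaque helper: acquires a pin on creation, releases it when freed."""

    def __init__(self, storage):
        self._storage = storage
        storage.pin()

    @property
    def __array_interface__(self):
        # Describe the borrowed memory to NumPy (no copy is made).
        return {
            "version": 3,
            "shape": (self._storage.size,),
            "typestr": np.dtype(np.float64).str,
            "data": (ctypes.addressof(self._storage._buf), False),
        }

    def __del__(self):
        self._storage.unpin()


storage = HostStorage(4)
view = storage.as_array()  # pin acquired here
view[:] = 1.0              # writes go straight to the host buffer
del view                   # last reference dropped; on CPython the pin
                           # is released right away
```

Note that the release relies on CPython's reference counting; any view
derived from the array also keeps the pin alive until it is dropped.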

I think I like that approach. Explicit is better than implicit. :-)

Thanks!

        Stefan



>
> -n
>
> On Thu, May 22, 2014 at 12:44 AM, Stefan Seefeld <stefan at seefeld.name> wrote:
>> Hi Nathaniel,
>>
>> thanks for the prompt and thorough answer. You are entirely right, I
>> hadn't thought things through properly, so let me back up a bit.
>>
>> I want to provide Python bindings to a C++ library I'm writing, which is
>> based on vector/matrix/tensor data types. In my naive view I would
>> expose these data types as NumPy arrays, creating PyArrayObject
>> instances as "wrappers" that borrow raw pointers to the storage
>> managed by the C++ objects. To make things slightly more interesting,
>> those C++ objects have their own storage management mechanism, which
>> allows data to migrate across different address spaces (such as from
>> host to GPU-device memory), and thus whether the host storage is valid
>> (i.e., contains up-to-date data) or not depends on where the last
>> operation was performed (which is controlled by an operation dispatcher
>> that is part of the library, too).
>>
>> It seems that if I let Python control the data lifetime and borrow the
>> data temporarily from C++, I may be fine. However, I may also want to
>> expose pre-existing C++ objects to Python, and it sounds like that
>> might be dangerous unless I am willing to clone the data so the Python
>> runtime can hold on to it even after my C++ runtime has released
>> its own. But that changes the semantics, as the Python runtime no longer
>> sees the same data as the C++ runtime, unless I keep the two in sync
>> each time I cross the language boundary, which may be quite a costly
>> operation...
>>
>> Does all that sound sensible?
>>
>> It seems I have some more design to do.
>>
>> Thanks,
>>         Stefan
>>
>> --
>>
>>       ...ich hab' noch einen Koffer in Berlin...
>>
>
>


-- 

      ...ich hab' noch einen Koffer in Berlin...



