Handle type conversion in C API

Hello,

I am writing a Python binding for one of our libraries. The binding is intended to vectorize the function calls. For example:

double foo(double, double)

will be bound to Python as:

<numpy.array of double> module.foo(<numpy.array>, <numpy.array>)

and the function foo will be called like this:

for (int i = 0; i < size; ++i) outarr[i] = foo(inarr0[i], inarr1[i]);

My question is how to handle type conversion of the input arrays, preferably efficiently, given that each input array may require a different input type.

Currently I simply enforce the types and perform no conversion, but I would like to relax that. I thought of several possibilities, starting with the obvious one of recasting in the inner loop, which would give:

for (int i = 0; i < size; ++i) {
    if (inarr0[i] needs recast) in0 = recast(inarr0[i]); else in0 = inarr0[i];
    [... same for all input parameters ...]
    outarr[i] = foo(in0, in1, ...);
}

This solution is memory efficient, but not computationally efficient.

The second solution is to copy and recast the entire input arrays, but that is not memory efficient. My final thought is to mix the two by chunking the second method, i.e. converting N inputs in a row, then applying the function to them, and so on until the whole array is processed.

Thus my questions are:
- Is there another way to do what I want?
- Is there an existing or recommended way to do it?

And a side question: I use PyArray_FROM_OTF, but I do not understand its semantics well. If I pass a Python list, it is converted to the desired type and requirements; when I pass a non-contiguous array, it is converted to a contiguous one; but when I pass a NumPy array of a type other than the one specified, I do not get the conversion. Is there a function that does the conversion unconditionally? Did I miss something?

Thank you in advance for your help.

Best regards
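
For illustration, the chunked variant described above might look roughly like the following sketch. It is plain C with no NumPy calls. The body of foo, the input types (int64 and float), the name foo_chunked and the chunk size of 256 are all assumptions made up for the example, not part of the library being wrapped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the wrapped library function. */
static double foo(double a, double b) { return a + 2.0 * b; }

enum { CHUNK = 256 };  /* arbitrary chunk size, chosen only for illustration */

/* Apply foo element-wise to an int64 input and a float input,
 * converting CHUNK elements at a time into small stack buffers. */
static void foo_chunked(const int64_t *in0, const float *in1,
                        double *out, size_t size)
{
    double buf0[CHUNK];
    double buf1[CHUNK];

    assert(size == 0 || (in0 != NULL && in1 != NULL && out != NULL));

    for (size_t start = 0; start < size; start += CHUNK) {
        size_t n = (size - start < CHUNK) ? size - start : CHUNK;

        /* Conversion pass: tight loops with no per-element type test. */
        for (size_t i = 0; i < n; ++i) buf0[i] = (double)in0[start + i];
        for (size_t i = 0; i < n; ++i) buf1[i] = (double)in1[start + i];

        /* Compute pass on the converted chunk. */
        for (size_t i = 0; i < n; ++i) out[start + i] = foo(buf0[i], buf1[i]);
    }
}
```

The extra memory is bounded by two buffers of CHUNK doubles regardless of the array size, and the decision about which conversion to apply is made once per array rather than once per element.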

Hi Benoit,

Since you have a function that takes two scalars to one scalar, it sounds to me as though you would be best off creating a ufunc. This will then handle the conversion to arrays, the looping over them, etc. for you.

The documentation is available here: https://numpy.org/doc/1.18/user/c-info.ufunc-tutorial.html

Regards,
Eric

On Tue, Mar 10, 2020 at 12:28 PM Benoit Gschwind <gschwind@gnu-log.net> wrote:
[...]
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@python.org https://mail.python.org/mailman/listinfo/numpy-discussion
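
As a rough sketch of what a ufunc inner loop for a (double, double) -> double function could look like, the block below is modelled on the loop shape used in the ufunc tutorial linked above. To keep it compilable without the NumPy headers, intp_t is a stand-in for NumPy's npy_intp, and foo and foo_dd_d are hypothetical names. Module setup and registration of the loop are not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for npy_intp so the sketch compiles without NumPy headers. */
typedef ptrdiff_t intp_t;

/* Hypothetical stand-in for the wrapped library function. */
static double foo(double a, double b) { return a * b; }

/* Inner loop: args[0] and args[1] are the inputs, args[2] is the output,
 * dimensions[0] is the element count, steps[] are strides in bytes. */
static void foo_dd_d(char **args, const intp_t *dimensions,
                     const intp_t *steps, void *data)
{
    char *in0 = args[0], *in1 = args[1], *out = args[2];
    intp_t n = dimensions[0];

    (void)data;  /* unused in this sketch */
    assert(n >= 0);

    for (intp_t i = 0; i < n; ++i) {
        *(double *)out = foo(*(double *)in0, *(double *)in1);
        /* Advance by byte strides, so non-contiguous arrays work too. */
        in0 += steps[0];
        in1 += steps[1];
        out += steps[2];
    }
}
```

Because the loop walks the data by byte strides, the ufunc machinery can hand it non-contiguous inputs directly; the loop itself never needs to know the shape of the original arrays.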

Hello Eric,

Thank you for pointing out ufuncs. I implemented my binding using one; it works and it is simpler than my previous implementation. However, I still do not have the flexibility to dynamically recast input arrays, e.g. to use an int64 array as input where an int32 is expected. For instance, to test my code I use numpy.random, which provides int64 arrays, so I have to convert each array manually before calling my ufunc, which is somewhat annoying.

For functions that have 1 or 2 parameters it is practical to write 4 variants of the function, but with 6-8 parameters it becomes more difficult.

Best regards

On Tuesday, March 10, 2020 at 13:13 -0400, Eric Moore wrote:
[...]

There are a variety of ways to resolve your issues. You can try the optional keyword arguments casting, dtype or signature, which work for all ufuncs (see https://numpy.org/doc/1.18/reference/ufuncs.html#optional-keyword-arguments). These allow you to override the default type checks.

What you'll find for many ufuncs in, for instance, scipy.special is that the underlying function in Cython, C or Fortran is only defined for doubles, but the ufunc also has a loop with signature f->f, which just casts the input and output inside the loop.

Generally, downcasting (e.g. int64 to int32) is not permitted without explicit opt-in, since it is not a safe cast: there are many values that an int64 can hold that an int32 cannot.

Generally speaking, you do have to manage the types of your arrays when the defaults aren't what you want. There really isn't any way around it.

Eric

On Wed, Mar 11, 2020 at 4:43 AM Benoit Gschwind <gschwind@gnu-log.net> wrote:
[...]
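
The "cast inside the loop" pattern mentioned above for float loops could be sketched as below, here for a two-input function. As before, intp_t stands in for npy_intp so that the block compiles without NumPy headers, and foo and foo_ff_f are hypothetical names. Registering this loop alongside a double loop (for example through the funcs and types arrays passed to PyUFunc_FromFuncAndData) is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for npy_intp so the sketch compiles without NumPy headers. */
typedef ptrdiff_t intp_t;

/* Hypothetical library function, only defined for doubles. */
static double foo(double a, double b) { return a + b; }

/* Float loop: widen each input to double, call foo, narrow the result. */
static void foo_ff_f(char **args, const intp_t *dimensions,
                     const intp_t *steps, void *data)
{
    char *in0 = args[0], *in1 = args[1], *out = args[2];
    intp_t n = dimensions[0];

    (void)data;  /* unused in this sketch */
    assert(n >= 0);

    for (intp_t i = 0; i < n; ++i) {
        double a = (double)*(float *)in0;
        double b = (double)*(float *)in1;
        *(float *)out = (float)foo(a, b);
        in0 += steps[0];
        in1 += steps[1];
        out += steps[2];
    }
}
```

Each additional loop of this kind covers one combination of input types, which is why the number of variants grows quickly with the number of parameters; the keyword arguments mentioned above are the alternative when writing more loops is not practical.
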
participants (2)
- Benoit Gschwind
- Eric Moore