[Numpy-discussion] vectorizing recursive sequences

Jaime Fernández del Río jaime.frio at gmail.com
Fri Oct 25 15:32:36 EDT 2013

I recently came up with a way of vectorizing some recursive sequence
calculations. While it works, I am afraid it is relying on implementation
details potentially subject to change. The basic idea is illustrated by
this function, calculating the first n items of the Fibonacci sequence:

def fibonacci(n):

    fib = np.empty(n, dtype=np.intp)

    fib[:2] = 1

    np.add(fib[:-2], fib[1:-1], out=fib[2:])

    return fib

>>> fibonacci(10)

array([ 1,  1,  2,  3,  5,  8, 13, 21, 34, 55], dtype=int64)

I believe that the biggest issue that could break this is if the ufunc
decided to buffer the arrays, as this is relying on the inputs and outputs
of np.add sharing the same memory.

You can use the same idea to do more complicated things, for instance to
calculate the items of the sequence:

f[0] = a[0]

f[n] = a[n] + x * f[n-1]

from numpy.lib.stride_tricks import as_strided

from numpy.core.umath_tests import inner1d

def f(a, x):

    out = np.array(a, copy=True, dtype=np.double)

    n = len(a)

    out_view = as_strided(out, shape=(n-1, 2), strides=out.strides*2)

    inner1d(out_view, [x, 1], out=out[1:])

    return out

>>> f(np.arange(10), 0.1)

array([ 0.        ,  1.        ,  2.1       ,  3.21      ,  4.321     ,

        5.4321    ,  6.54321   ,  7.654321  ,  8.7654321 ,  9.87654321])

Again, I think  buffering is the clearest danger of doing something like
this, as the first input and output must share the same memory for this to
work. That this is a real concern is easy to see: since `inner1d` only has
loops registered for long ints and double floats:

>>> inner1d.types
['ll->l', 'dd->d']

the above function `f` doesn't work if the `out` array is created, e.g. as
np.float32, since there will be buffering happening because of the type

So I have two questions:

1. Is there some other reason, aside from buffering, that could go wrong,
or change in a future release?
2. As far as buffering is concerned, I thought of calling `np.setbuffer(1)`
before any of these functions, but it complains and requests that the value
be a multiple of 16. Is there any other way to ensure that the data is
fetched from an updated array in every internal iteration?



( O.o)
( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes
de dominación mundial.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20131025/8fbfb85d/attachment.html>

More information about the NumPy-Discussion mailing list