
Hello,

numpy arrays are great for interfacing Python with libraries that expect contiguous memory buffers for data passing. However, libraries interfacing to data acquisition hardware often use those buffers as ring buffers: once the buffer has been filled, new data is written over the data at the beginning of the buffer. The application is supposed to have read that data in the meantime.

Processing the data efficiently requires addressing the content of the ring buffer efficiently. In C or Cython this is very easy, just compute the wrap-around when accessing the single elements of the array:

    buffer = np.empty(size)
    adc.setup(buffer)
    start = 0
    for n in adc.read():
        # data processing
        for i in range(start, start + n):
            element = buffer[i % size]
            ....
        start += n

My current approach to do the same thing in Python is to use the np.roll() function to "linearize" the buffer access:

    buffer = np.empty(size)
    adc.setup(buffer)
    start = 0
    for n in adc.read():
        data = np.roll(buffer, -start)[:n]
        start += n
        # data processing
        process(data)

Since np.roll() returns a view on the array, I suppose this is very efficient. Does anyone have a better idea on how to do this?

Thank you. Cheers,
Daniele

On Mon, May 6, 2013 at 9:51 AM, Daniele Nicolodi <daniele@grinta.net> wrote:
> Hello,
>
> numpy arrays are great for interfacing Python with libraries that expect contiguous memory buffers for data passing. However, libraries interfacing to data acquisition hardware often use those buffers as ring buffers: once the buffer has been filled, new data is written over the data at the beginning of the buffer. The application is supposed to have read that data in the meantime.
>
> Processing the data efficiently requires addressing the content of the ring buffer efficiently. In C or Cython this is very easy, just compute the wrap-around when accessing the single elements of the array:
>
>     buffer = np.empty(size)
>     adc.setup(buffer)
>     start = 0
>     for n in adc.read():
>         # data processing
>         for i in range(start, start + n):
>             element = buffer[i % size]
>             ....
>         start += n
>
> My current approach to do the same thing in Python is to use the np.roll() function to "linearize" the buffer access:
>
>     buffer = np.empty(size)
>     adc.setup(buffer)
>     start = 0
>     for n in adc.read():
>         data = np.roll(buffer, -start)[:n]
>         start += n
>         # data processing
>         process(data)
>
> Since np.roll() returns a view on the array, I suppose this is very efficient. Does anyone have a better idea on how to do this?
np.roll() copies all of the data every time. It does not return a view. Try a function like this instead:

    [~]
    |5> def ring_window(buffer, start, n):
    ..>     length = len(buffer)
    ..>     start %= length
    ..>     end = start + n
    ..>     if end <= length:
    ..>         window = buffer[start:end]
    ..>     else:
    ..>         end %= length
    ..>         window = np.concatenate((buffer[start:length], buffer[0:end]))
    ..>     return end, window
    ..>

    [~]
    |6> buffer = np.arange(20)

    [~]
    |7> start = 0

    [~]
    |8> for i in range(5):
    ..>     start, window = ring_window(buffer, start, 6)
    ..>     print start, window
    ..>
    6 [0 1 2 3 4 5]
    12 [ 6  7  8  9 10 11]
    18 [12 13 14 15 16 17]
    4 [18 19  0  1  2  3]
    10 [4 5 6 7 8 9]

--
Robert Kern
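As a side note that is not part of the thread, the same "copy only the window" idea behind ring_window() above can also be sketched with np.take() and its mode="wrap" option, which wraps out-of-range indices around the array. The helper name ring_take is hypothetical, and unlike ring_window() this version copies the n requested elements on every call, even when the window does not wrap:

```python
import numpy as np

def ring_take(buffer, start, n):
    """Return (new_start, window) for n elements starting at `start`.

    Hypothetical helper: indices past the end wrap around to the beginning.
    The result is always a copy of n elements, never of the whole buffer.
    """
    size = len(buffer)
    indices = np.arange(start, start + n)
    window = np.take(buffer, indices, mode="wrap")
    return (start + n) % size, window

buffer = np.arange(20)
# Start close to the end so that the window wraps around.
start, window = ring_take(buffer, 18, 6)
print(start, window)
```

Whether this is preferable to the two-slice version probably depends on how often the window wraps: the fancy-indexing path is simpler but gives up the zero-copy slice in the common non-wrapping case.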

On 06/05/2013 11:01, Robert Kern wrote:
> np.roll() copies all of the data every time. It does not return a view.

Are you sure about that? Either I'm missing something, or it returns a view in my testing (with a fairly old numpy, though):

    In [209]: np.__version__
    Out[209]: '1.6.2'

    In [210]: v1 = np.arange(10)

    In [211]: v1.flags['OWNDATA']
    Out[211]: True

    In [212]: v2 = np.roll(v1, -1)

    In [213]: v2.flags['OWNDATA']
    Out[213]: False

Cheers,
Daniele

On Mon, 2013-05-06 at 11:39 +0200, Daniele Nicolodi wrote:
> On 06/05/2013 11:01, Robert Kern wrote:
>> np.roll() copies all of the data every time. It does not return a view.
>
> Are you sure about that? Either I'm missing something, or it returns a view in my testing (with a fairly old numpy, though):
>
>     In [209]: np.__version__
>     Out[209]: '1.6.2'
>
>     In [210]: v1 = np.arange(10)
>
>     In [211]: v1.flags['OWNDATA']
>     Out[211]: True
>
>     In [212]: v2 = np.roll(v1, -1)
>
>     In [213]: v2.flags['OWNDATA']
>     Out[213]: False
Don't trust OWNDATA in that regard: np.roll() does return a view, but a view into a copy. Results that do not own their data are very common, for example when subclasses are involved. Try np.may_share_memory(v1, v2) instead.

- Sebastian
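The check suggested above can be sketched as follows. This is only an illustration reusing the v1 and v2 names from the session; the OWNDATA value printed for the np.roll() result may differ between NumPy versions, which is exactly why it is not a reliable test:

```python
import numpy as np

v1 = np.arange(10)
v2 = np.roll(v1, -1)

# OWNDATA only reports whether v2 allocated its own buffer. It says nothing
# about whose memory v2 points at, and its value here is version dependent.
print(v2.flags['OWNDATA'])

# may_share_memory() compares the memory bounds of the two arrays.
print(np.may_share_memory(v1, v2))       # False: v2 lives in a separate copy
print(np.may_share_memory(v1, v1[2:5]))  # True: a plain slice is a real view

# Writing to the rolled array leaves the original untouched.
v2[0] = -1
print(v1[1])                             # 1
```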
> Cheers,
> Daniele

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

On 06/05/2013 11:39, Daniele Nicolodi wrote:
> On 06/05/2013 11:01, Robert Kern wrote:
>> np.roll() copies all of the data every time. It does not return a view.
>
> Are you sure about that? Either I'm missing something, or it returns a view in my testing (with a fairly old numpy, though):
Oops... Yes, I missed something: np.roll() returns a view, but not a view on the original array, which is indeed copied.

Your method, however, also copies the data into a temporary buffer so that it can be operated on as a single chunk. It just reduces the copy to the region of interest of the buffer array.

It looks like there is no way to use numpy to operate on a ring buffer without copying the data, and in my use case I would like to avoid the copying. I'll write it in Cython...

Cheers,
Daniele

On Mon, May 6, 2013 at 10:52 AM, Daniele Nicolodi <daniele@grinta.net> wrote:
> On 06/05/2013 11:39, Daniele Nicolodi wrote:
>> On 06/05/2013 11:01, Robert Kern wrote:
>>> np.roll() copies all of the data every time. It does not return a view.
>>
>> Are you sure about that? Either I'm missing something, or it returns a view in my testing (with a fairly old numpy, though):
>
> Oops... Yes, I missed something: np.roll() returns a view, but not a view on the original array, which is indeed copied.
>
> Your method, however, also copies the data into a temporary buffer so that it can be operated on as a single chunk. It just reduces the copy to the region of interest of the buffer array.
Yes, but only in the one case where the window extends past the end of the buffer. The tradeoff depends on the window size and the buffer size. Alternatively, you can write your processing code to work with one or two chunks; then you don't need to do any copying.

--
Robert Kern
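The "one or two chunks" idea could look something like the sketch below. The name ring_chunks is hypothetical and not from the thread, and the sketch assumes that n is never larger than the buffer. Both chunks are plain slices, so no data is copied even when the window wraps:

```python
import numpy as np

def ring_chunks(buffer, start, n):
    """Return (new_start, chunks) where chunks is a list of one or two views.

    Hypothetical helper, assuming n <= len(buffer). The chunks are basic
    slices of `buffer`, so they share its memory.
    """
    size = len(buffer)
    start %= size
    end = start + n
    if end <= size:
        chunks = [buffer[start:end]]
    else:
        # The window wraps: tail of the buffer first, then the head.
        chunks = [buffer[start:], buffer[:end - size]]
    return end % size, chunks

buffer = np.arange(20)
start, chunks = ring_chunks(buffer, 18, 6)

# Processing code then has to loop over the chunks, for example a sum:
total = sum(int(chunk.sum()) for chunk in chunks)
print(start, [c.tolist() for c in chunks], total)
```

This only helps when the processing can be expressed chunk by chunk (sums, running statistics, writing to a file). Anything that needs the window as one contiguous array still requires a copy at the wrap point.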

On Mon, May 6, 2013 at 10:39 AM, Daniele Nicolodi <daniele@grinta.net> wrote:
> On 06/05/2013 11:01, Robert Kern wrote:
>> np.roll() copies all of the data every time. It does not return a view.
>
> Are you sure about that? Either I'm missing something, or it returns a view in my testing (with a fairly old numpy, though):
>
>     In [209]: np.__version__
>     Out[209]: '1.6.2'
>
>     In [210]: v1 = np.arange(10)
>
>     In [211]: v1.flags['OWNDATA']
>     Out[211]: True
>
>     In [212]: v2 = np.roll(v1, -1)
>
>     In [213]: v2.flags['OWNDATA']
>     Out[213]: False
It may return a view on something, but it isn't a view on the original array. It can't be, because the rolled result cannot be represented as uniformly-strided memory access on the original data. Check the source:

https://github.com/numpy/numpy/blob/master/numpy/core/numeric.py#L1173

The reason you see a "view" is that final `reshape()` call. But it first copies the original data using `take()`.

--
Robert Kern
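One way to see why the rolled result cannot be a strided view is to note that, for a 1-D array, it is equivalent to gluing two slices together. The sketch below is illustrative only and does not reproduce the NumPy source linked above:

```python
import numpy as np

a = np.arange(10)
k = 3

rolled = np.roll(a, -k)

# The same result built by hand: the tail a[k:] followed by the head a[:k].
# Within each piece the elements are one stride apart in memory, but at the
# seam the address jumps backwards, so no single (offset, stride) pair can
# describe the whole sequence as a view of `a`.
manual = np.concatenate((a[k:], a[:k]))

print(np.array_equal(rolled, manual))
print(np.may_share_memory(a, rolled))
```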
participants (3): Daniele Nicolodi, Robert Kern, Sebastian Berg