Is there a pure numpy recipe for this?

I am working on solving a recent recreational mathematical problem on Project Euler <http://projecteuler.net>. I have a solution which works fine for small N up to 10^5, but it takes too long to compute for the actual problem, where N is of the order 2*10^7. The problem is nested loops, and I am hoping to avoid one level of the loops by using clever numpy magic (as I have done often before). However, there is one step which I cannot figure out how to do using numpy operations alone, and I am hoping for some help.

The subproblem is that I have, in principle, k = 1, ..., N sets of boolean arrays f_k and g_k, each of length N. For each k I need to find the number of elements i where both f_k[i] and g_k[i] are True, and sum that up over all N values of k. A problem of the order 4*10^14 if I just do it brute force, which takes way too long (there is a one-minute rule).

However, after a lot of thinking, and by using some properties of the f_k and g_k, I have managed to construct, using only pure numpy functions and a single loop over k, arrays

    f_k_changes_at
    g_k_changes_at

which contain the indices i at which the functions change their boolean value from True to False or False to True. It so happens that the number of changes is only a small fraction of N (and the fraction decreases with larger N), so each changes_at array contains perhaps only 1000 elements instead of 10000000 for each k, a significant reduction of complexity.

Now, my problem is to figure out for how many values of i both f_k and g_k are True, given only the changes_at arrays. As this may be a little hard to understand, here is a specific example of how these arrays can look for k = 2 and N = 150:

    f_2_changes_at = [  2   3  39  41  58  59  65  66  93 102 145]
    g_2_changes_at = [  2  94 101 146 149]

with the boundary condition that f_2[0] = g_2[0] = False. This expands to:

      i   f_2  g_2  f_2 and g_2
      0    F    F    F
      1    F    F    F    <-
      2    T    T    T    <-
      3    F    T    F
      4    F    T    F
    ...
     38    F    T    F    <-
     39    T    T    T
     40    T    T    T    <-
     41    F    T    F
     42    F    T    F
    ...
     57    F    T    F    <-
     58    T    T    T    <-
     59    F    T    F
     60    F    T    F
    ...
     64    F    T    F    <-
     65    T    T    T    <-
     66    F    T    F
    ...
     92    F    T    F    <-
     93    T    T    T    <-
     94    T    F    F
    ...
    100    T    F    F    <-
    101    T    T    T    <-
    102    F    T    F
    ...
    144    F    T    F    <-
    145    T    T    T    <-
    146    T    F    F
    147    T    F    F
    148    T    F    F    <-
    149    T    T    T    <-

with the sum of elements fulfilling the condition being (see arrows)

    (2 - 1) + (40 - 38) + (58 - 57) + (65 - 64) + (93 - 92) + (101 - 100) + (145 - 144) + (149 - 148)
    = 1 + 2 + 1 + 1 + 1 + 1 + 1 + 1 = 9

So, is there a numpy recipe for doing the equivalent process without expanding everything into the full arrays?

I have tried looping over each element in the changes_at arrays and building up the sums, but that is too inefficient, as I then have an inner for loop containing conditional branching code.

Thanks in advance,
Slaunger
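For concreteness, a minimal brute-force check of the k = 2 example above (not part of the original post), assuming the convention just described, i.e. the boolean value toggles at every listed index and starts from False at i = 0:

import numpy as np

def expand(changes_at, N):
    # Toggle at every listed index, starting from False at i = 0.
    toggles = np.zeros(N, dtype=int)
    toggles[changes_at] = 1
    return np.cumsum(toggles) % 2 == 1

f_2_changes_at = np.array([2, 3, 39, 41, 58, 59, 65, 66, 93, 102, 145])
g_2_changes_at = np.array([2, 94, 101, 146, 149])
N = 150

f_2 = expand(f_2_changes_at, N)
g_2 = expand(g_2_changes_at, N)
print(np.sum(f_2 & g_2))   # 9 for this example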

On Thu, Mar 27, 2014 at 1:18 AM, Slaunger <Slaunger@gmail.com> wrote:
I am working on solving a recent recreational mathematical problem on Project Euler <http://projecteuler.net>. I have a solution, which works fine for small N up to 10^5 but it takes too long to compute for the actual problem, where N is of the order 2*10^7.
[...]
So, is there a numpy recipe for doing the equivalent process without expanding it into the full arrays?
Thanks in advance, Slaunger
Can you provide a link to the problem itself? -- JD

Jaidev Deshpande wrote
Can you provide a link to the problem itself?
-- JD
I'd rather not state the problem number, since it should not be too easy to search for it and find this thread, but I can say that, for the time being, it is the problem with the highest problem number (released this Saturday).

On Wed, Mar 26, 2014 at 3:48 PM, Slaunger <Slaunger@gmail.com> wrote:
I am working on solving a recent recreational mathematical problem on Project Euler <http://projecteuler.net> . I have a solution, which works fine for small N up to 10^5 but it takes too long to compute for the actual problem, where N is of the order 2*10^7. The problem is nested loops, and I am hoping to avoid one level in the loop by using clever numpy magic (as I have done often before). However, there is one step, which I cannot figure out how to do using numpy operations alone, and I am hoping for some help
The subproblem is that I have in principle k = 1, ..., N sets of boolean arrays f_k and g_k each of length N.
For each k I need to find the number of elements i where both f_k[i] and g_k[i] are True and sum that up over all N values of k.
IIUC,

[~/] [1]: np.logical_and([True, False, True], [False, False, True])
[1]: array([False, False,  True], dtype=bool)

You can avoid looping over k since they're all the same length

[~/] [3]: np.logical_and([[True, False], [False, True], [False, True]],
                         [[False, False], [False, True], [True, True]])
[3]: array([[False, False],
            [False,  True],
            [False,  True]], dtype=bool)

[~/] [4]: np.sum(np.logical_and([[True, False], [False, True], [False, True]],
                                [[False, False], [False, True], [True, True]]), axis=0)
[4]: array([0, 2])
[...]

jseabold wrote
IIUC,
[...]
You can avoid looping over k since they're all the same length
[...]
Well, yes, if you work with the pure f_k and g_k that is true, but this two-dimensional array would have 4*10^14 elements and would exhaust my memory. That is why I have found a more efficient method for finding only the much fewer changes_at elements for each k. These arrays have unequal lengths and have to be considered for each k (which is tolerable as long as I avoid a further inner loop over the elements for each k in explicit Python).

I could implement this in C and get it done sufficiently efficiently. I would just like to make a point of demonstrating that this is also doable in finite time in Python/numpy.

On Wed, Mar 26, 2014 at 1:28 PM, Slaunger <Slaunger@gmail.com> wrote:

See if you can make sense of the following. It is a little cryptic, but it works:

f_change = np.array([2, 3, 39, 41, 58, 59, 65, 66, 93, 102, 145])
g_change = np.array([2, 94, 101, 146, 149])
N = 150

if len(f_change) % 2:
    f_change = np.append(f_change, N)
if len(g_change) % 2:
    g_change = np.append(g_change, N)

idx = np.searchsorted(f_change, g_change)
f_change_exp = np.insert(np.insert(f_change, idx, g_change),
                         idx + np.arange(len(idx)), g_change)
idx2 = np.searchsorted(g_change, f_change_exp)
f_change_lens = f_change_exp[1::2] - f_change_exp[::2]
true_true_intervals = idx2[1::2] % 2 != 0
total = np.sum(f_change_lens[true_true_intervals])

total
9

I'll gladly elaborate on what's going on, just ask!

Jaime

--
(\__/)
( O.o)
( > <) This is Conejo. Copy Conejo into your signature and help him with his plans for world domination.
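Read in pairs, the padded change arrays list the half-open [start, stop) runs where each function is True. A brute pairwise interval-intersection cross-check (not Jaime's method, and not from the thread) gives the same total:

import numpy as np

# Padded change arrays from the example above (even length, so pairs are [start, stop) True runs).
f_change = np.array([2, 3, 39, 41, 58, 59, 65, 66, 93, 102, 145, 150])
g_change = np.array([2, 94, 101, 146, 149, 150])

f_int = f_change.reshape(-1, 2)     # rows are runs where f is True
g_int = g_change.reshape(-1, 2)     # rows are runs where g is True
lo = np.maximum(f_int[:, None, 0], g_int[None, :, 0])   # pairwise overlap starts
hi = np.minimum(f_int[:, None, 1], g_int[None, :, 1])   # pairwise overlap stops
print(np.sum(np.clip(hi - lo, 0, None)))                # 9, matching the total above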

Jaime Fernández del Río wrote
On Wed, Mar 26, 2014 at 1:28 PM, Slaunger <Slaunger@gmail.com> wrote:
See if you can make sense of the following. It is a little cryptic, but it works:
[...]
I'll gladly elaborate on what's going on, just ask!
Jaime
Hola Jaime!

YOU ARE A GENIUS! I understand exactly what you are doing!

You know what, I just had a shower (nothing like having a shower for thinking about hard problems), and I got the same idea: make the changes_at arrays even length, if needed, by appending N, then use searchsorted to figure out what values to merge and where!

Only I did not know about the append and insert methods. Very, very nice! (I only knew concatenate, which would be clumsy for just appending one element.) And for the insert I would have concatenated and sorted in place, but your solution is much more elegant!

You saved my evening! Actually, my head has been spinning about this problem for the last three evenings without my having been able to nail it down.

Hugs from Denmark,
Slaunger
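A tiny illustration (not from the thread) of the np.append / np.searchsorted / np.insert combination that Jaime's recipe builds on:

import numpy as np

a = np.array([10, 20, 30, 40])
b = np.array([25, 33])
print(np.append(a, 99))        # [10 20 30 40 99], a new array; a itself is unchanged
idx = np.searchsorted(a, b)    # positions where b's values would slot into a: [2 3]
print(np.insert(a, idx, b))    # [10 20 25 30 33 40]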

On Wed, Mar 26, 2014 at 2:23 PM, Slaunger <Slaunger@gmail.com> wrote:
You saved my evening! Actually, my head has been spinning about this problem the last three evenings without having been able to nail it down.
I had to quit Project Euler about 5 years ago because it was taking a huge toll on my mental health. I did learn/remember a ton of math, but was staying up all night banging my head against the problems much too often. Every now and then I do peek back and sometimes attempt a problem or two, but try to stay away for my own good.

If you want to be projecteuler friends, I'm jfrio over there...

Jaime

Without looking ahead, here is what I came up with; but I see more elegant solutions have been found already.

import numpy as np

def as_dense(f, length):
    i = np.zeros(length+1, np.int)
    i[f[0]] = 1
    i[f[1]] = -1
    return np.cumsum(i)[:-1]

def as_sparse(d):
    diff = np.diff(np.concatenate(([0], d)))
    on, = np.nonzero(diff)
    on = on if on.size%2==0 else np.append(on, len(d))
    return on.reshape(-1,2).T

def join(f, g):
    on  = np.sort(np.concatenate((f[0], g[0])))
    off = np.sort(np.concatenate((f[1], g[1])))
    I = np.argsort(np.concatenate((on, off))).argsort().reshape(2,-1)
    Q = -np.ones((2, I.size), np.int)
    Q[0, I[0]] = on
    Q[1, I[1]] = off
    idx_on  = np.logical_and(Q[0,1:]*Q[0,:-1] < 0, Q[0,:-1] != -1)
    idx_off = np.logical_and(Q[1,1:]*Q[1,:-1] < 0, Q[1,1:] != -1)
    idx_on  = np.concatenate((idx_on, [False]))
    idx_off = np.concatenate(([False], idx_off))
    return np.array((Q[0, idx_on], Q[1, idx_off]))

length = 150
f_2_changes_at = np.array([2, 3, 39, 41, 58, 59, 65, 66, 93, 102, 145, length])
g_2_changes_at = np.array([2, 94, 101, 146, 149, length])

f = f_2_changes_at.reshape(-1,2).T
g = g_2_changes_at.reshape(-1,2).T

dense_result  = as_sparse(np.logical_and(as_dense(f, length), as_dense(g, length)))
sparse_result = join(f, g)

print np.allclose(dense_result, sparse_result)

On Wed, Mar 26, 2014 at 10:50 PM, Jaime Fernández del Río <jaime.frio@gmail.com> wrote:
[...]

On Wed, Mar 26, 2014 at 2:23 PM, Slaunger <Slaunger@gmail.com> wrote:
Only I did not know about the append and insert methods. Very, very nice! (I only knew concatenate, which would be clumsy for just appending one element),
Sorry -- I don't have the time to actually figure out what you are doing, but: note that numpy arrays are not re-sizable, so np.append() and np.insert() have to make a new array and copy all the old data over. If you are appending one element at a time, this can be pretty darn slow.

I wrote a "grow_array" class once; it was a wrapper around a numpy array that pre-allocated extra data to make appending more efficient. It's kind of half-baked code now, but let me know if you are interested.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker@noaa.gov
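Chris's grow_array class is not shown in the thread; a minimal sketch of the idea he describes (pre-allocate extra space, copy only when the buffer fills, doubling the capacity each time), with hypothetical names, might look like this:

import numpy as np

class GrowArray(object):
    """Hypothetical append buffer: over-allocate and double capacity as needed."""
    def __init__(self, capacity=16, dtype=float):
        self._data = np.empty(capacity, dtype=dtype)
        self._n = 0

    def append(self, value):
        if self._n == self._data.shape[0]:
            # Double the capacity so copies are amortized over many appends.
            self._data = np.concatenate((self._data, np.empty_like(self._data)))
        self._data[self._n] = value
        self._n += 1

    @property
    def values(self):
        # View of only the filled part of the buffer.
        return self._data[:self._n]

buf = GrowArray()
for x in range(5):
    buf.append(x * x)
print(buf.values)   # [ 0.  1.  4.  9. 16.]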

Chris Barker - NOAA Federal wrote
note that numpy arrays are not re-sizable, so np.append() and np.insert() have to make a new array, and copy all the old data over. If you are appending one at a time, this can be pretty darn slow.
I wrote a "grow_array" class once, it was a wrapper around a numpy array that pre-allocated extra data to make appending more efficient. It's kind of half-baked code now, but let me know if you are interested.
Hi Chris,

Yes, it is a good point and I am aware of it. For some of these functions it would have been nice if I could have passed a preallocated, properly sliced array to the functions, which I could then reuse in each iteration step. It is indeed the memory allocation which appears to take more time than the actual calculations. Still, it is much faster to create a few arrays than to loop through a thousand individual elements in pure Python.

Interesting with the grow_array class. I think that what I have for now is sufficient, but I will keep your offer in mind :)

--Slaunger
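One way to get the kind of reuse Slaunger describes, at least for ufuncs such as np.logical_and, is the out= argument; a small aside, not from the thread:

import numpy as np

n = 100000
scratch = np.empty(n, dtype=bool)     # allocated once, outside the loop over k
a = np.random.rand(n) < 0.5
b = np.random.rand(n) < 0.5
np.logical_and(a, b, out=scratch)     # result written into scratch, no new array allocated
print(scratch.sum())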

I find this interesting, since I work with medical data sets of 100s of MB, and regularly run into memory allocation problems when doing a lot of Fourier analysis, waterfalls, etc. The per-process limit seems to be about 1.3 GB on this 6 GB quad-i7 with Win7.

For live data collection routines I simply create zeros() of, say, 300 MB and trim the array when saving to disk. memmaps are also limited to RAM, and take a looooong time to create (seconds). So, I've been investigating Pandas and segmentaxis -- just a bit so far.

- Ray Schumacher

At 12:02 AM 3/27/2014, you wrote:
[...]

You might want to look at hdf5 if you're routinely running out of RAM. I'm using h5py with multi-gigabyte data on an SSD right now. It is very fast. You still have to be careful with your computations and try to avoid creating copies, though.

h5py: www.h5py.org

aaron

On Thu 27 Mar, RayS wrote:
[...]
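A rough sketch, not from the thread, of the kind of chunked h5py workflow Aaron describes; the file name, dataset names, sizes and chunk shape are made up for illustration:

import h5py
import numpy as np

N, chunk = 2 * 10**6, 10**5

with h5py.File("scratch.h5", "w") as h5:
    a = h5.create_dataset("f", shape=(N,), dtype="uint8", chunks=(chunk,))
    b = h5.create_dataset("g", shape=(N,), dtype="uint8", chunks=(chunk,))
    for i in range(0, N, chunk):
        # Fill in chunks so only one chunk lives in RAM at a time.
        a[i:i+chunk] = np.random.randint(0, 2, chunk).astype(np.uint8)
        b[i:i+chunk] = np.random.randint(0, 2, chunk).astype(np.uint8)

total = 0
with h5py.File("scratch.h5", "r") as h5:
    a, b = h5["f"], h5["g"]
    for i in range(0, N, chunk):
        # Slicing an h5py dataset reads just that block from disk.
        total += np.logical_and(a[i:i+chunk], b[i:i+chunk]).sum()
print(total)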

On Thu, 27 Mar 2014 16:19:54 +0000 "Aaron O'Leary" <aaron.oleary@gmail.com> wrote:
You might want to look at hdf5 if you're routinely running out of ram. I'm using h5py with multi gigabyte data on an ssd right now. It is very fast. You still have to be careful with your computations and try to avoid creating copies though.
Both for h5py and for memmapped files ... switching from Windows to Linux is likely to help ...

--
Jérôme Kieffer
tel +33 476 882 445

On Thu, Mar 27, 2014 at 7:42 AM, RayS <rays@blue-cove.com> wrote:
I find this interesting, since I work with medical data sets of 100s of MB, and regularly run into memory allocation problems when doing a lot of Fourrier analysis, waterfalls etc. The per-process limit seems to be about 1.3GB on this 6GB quad-i7 with Win7.
This sounds like 32 bit -- have you tried a 64-bit Python/numpy? Not that you won't have issues anyway, but you should be able to do better than 1.3 GB...
memmaps are also limited to RAM,
I don't think so, no -- but they are limited to 2 GB (I think) if you're using a 32-bit process.

There is also a compressed array package out there -- I can't remember what it's called -- but if you have large compressible arrays, that might help.

-CHB

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker@noaa.gov

On Wed, Mar 26, 2014 at 4:28 PM, Slaunger <Slaunger@gmail.com> wrote:
jseabold wrote
[...]
Well, yes, if you work with the pure f_k and g_k that is true, but this two-dimensional array will have 4*10^14 elements and will exhaust my memory.
That is why I have found a more efficient method for finding only the much fewer changes_at elements for each k, and these arrays have unequal length, and has to be considered for eack k (which is tolerable as long as I avoid a further inner loop for each k in explicit Python).
I could implement this in C and get it done sufficiently efficient. I just like to make a point in demonstrating this is also doable in finite time in Python/numpy.
If you want to attack it straight on and keep it conceptually simple, this looks like it would work. Fair warning, I've never done this and have no idea if it's actually memory and computationally efficient, so I'd be interested to hear from experts. I just wanted to see if it would work from disk. I wonder if a solution using PyTables would be faster.

Provided that you can chunk your data into a memmap array, then something you *could* do

N = 2*10**7
chunk_size = 100000

farr1 = 'scratch/arr1'
farr2 = 'scratch/arr2'

arr1 = np.memmap(farr1, dtype='uint8', mode='w+', shape=(N, 4))
arr2 = np.memmap(farr2, dtype='uint8', mode='w+', shape=(N, 4))

for i in xrange(0, N, chunk_size):
    arr1[i:i+chunk_size] = np.random.randint(2, size=(chunk_size, 4)).astype(np.uint8)
    arr2[i:i+chunk_size] = np.random.randint(2, size=(chunk_size, 4)).astype(np.uint8)

del arr1
del arr2

arr1 = np.memmap(farr1, mode='r', dtype='uint8', shape=(N, 4))
arr2 = np.memmap(farr2, mode='r', dtype='uint8', shape=(N, 4))

equal = np.logical_and(arr1[:chunk_size], arr2[:chunk_size]).sum(0)
for i in xrange(chunk_size, N, chunk_size):
    equal += np.logical_and(arr1[i:i+chunk_size], arr2[i:i+chunk_size]).sum(0)

Skipper

jseabold wrote
[...]
If you want to attack it straight on and keep it conceptually simple, this looks like it would work. Fair warning, I've never done this and have no idea if it's actually memory and computationally efficient, so I'd be interested to hear from experts. I just wanted to see if it would work from disk. I wonder if a solution using PyTables would be faster.
[...]
Skipper
Thanks for the proposal, Skipper. I have used memmap before, and this may work, but still the number of elementary 'and' operations needed (although hidden under the hood of chunked logical_and) will be about a factor of 1000 larger than what is actually needed, due to the sparsity in the "roots" of the logical functions I actually have, and that will result in hours or days of computation instead of minute(s).

I think I will first give it a go using the procedure described by Jaime (tomorrow, no more time today), as I have gone through a lot of pain constructing the changes_at arrays using a fast and efficient method.

--Slaunger
participants (9)
- Aaron O'Leary
- Chris Barker
- Eelco Hoogendoorn
- Jaidev Deshpande
- Jaime Fernández del Río
- Jerome Kieffer
- RayS
- Skipper Seabold
- Slaunger