
[apologies if this is a resend, my mail just flaked out]
I have a boolean array and would like to find the lowest index "ind" where N contiguous elements are all True. E.g., if x is
In [101]: x = np.random.rand(20)>.4
In [102]: x
Out[102]:
array([False, True, True, False, False, True, True, False, False, True,
       False, True, False, True, True, True, False, True, False, True], dtype=bool)
I would like to find ind=1 for N=2 and ind=13 for N=3. I assume that with the right cumsum, diff and maybe repeat magic, this can be vectorized, but the proper incantation is escaping me.
For N==3, I thought of
In [110]: x = x.astype(int)
In [112]: y = x[:-2] + x[1:-1] + x[2:]
In [125]: ind = (y==3).nonzero()[0]
In [126]: if len(ind): ind = ind[0]
In [128]: ind
Out[128]: 13
Thanks,
JDH
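A minimal sketch of the cumsum idea mentioned in the question, generalizing the N==3 sum to any window length. It assumes NumPy is imported as np; the function name first_run_cumsum and the -1 "not found" convention are hypothetical choices for illustration, not something from the original post.

```python
import numpy as np

def first_run_cumsum(x, n):
    """Return the lowest index where n contiguous elements of x are True, or -1."""
    x = np.asarray(x, dtype=bool)
    if n < 1 or n > len(x):
        return -1
    # c[k] is the number of True values in x[:k]
    c = np.concatenate(([0], np.cumsum(x)))
    # window_sums[i] is the number of True values in x[i:i+n]
    window_sums = c[n:] - c[:-n]
    hits = np.flatnonzero(window_sums == n)
    return int(hits[0]) if hits.size else -1

# The 20-element array from the question
x = np.array([False, True, True, False, False, True, True, False, False, True,
              False, True, False, True, True, True, False, True, False, True])
print(first_run_cumsum(x, 2))  # 1
print(first_run_cumsum(x, 3))  # 13
print(first_run_cumsum(x, 4))  # -1 (no run of four)
```

The cost is one pass over x regardless of n, which is why the cumsum route is attractive for large windows.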

On Fri, Feb 29, 2008 at 11:53 AM, John Hunter <jdh2358@gmail.com> wrote:
[apologies if this is a resend, my mail just flaked out]
I have a boolean array and would like to find the lowest index "ind" where N contiguous elements are all True. Eg, if x is
In [101]: x = np.random.rand(20)>.4
In [102]: x Out[102]: array([False, True, True, False, False, True, True, False, False, True, False, True, False, True, True, True, False, True, False, True], dtype=bool)
I would like to find ind=1 for N=2 and ind=13 for N=3. I assume that with the right cumsum, diff and maybe repeat magic, this can be vectorized, but the proper incantation is escaping me.
For smallish N (< 100 perhaps), I'd do something like this:
In [57]: from numpy import *
In [58]: prng = random.RandomState(1234567890)
In [59]: x = prng.random_sample(50) < 0.5
In [60]: x
Out[60]:
array([False, False, False, False, True, False, True, False, False, False,
       True, False, True, False, True, True, True, True, True, False,
       False, False, True, False, True, False, False, False, True, True,
       True, True, False, False, True, False, False, False, False, False,
       False, False, False, True, False, False, True, False, True, False], dtype=bool)
In [61]: N = 2
In [62]: mask = ones(len(x) - N + 1, dtype=bool)
In [63]: for i in range(N):
   ....:     mask &= x[i:len(x)-N+1+i]
   ....:
In [64]: mask
Out[64]:
array([False, False, False, False, False, False, False, False, False, False,
       False, False, False, False, True, True, True, True, False, False,
       False, False, False, False, False, False, False, False, True, True,
       True, False, False, False, False, False, False, False, False, False,
       False, False, False, False, False, False, False, False, False], dtype=bool)
In [65]: nonzero(mask)[0][0]
Out[65]: 14
In [66]: x[13:20]
Out[66]: array([False, True, True, True, True, True, False], dtype=bool)
--
Robert Kern
"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco
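The loop above can be wrapped as a function. This is only a sketch under stated assumptions: the name first_run_mask and the -1 return value for "no such run" are hypothetical additions, since nonzero(mask)[0][0] as written raises IndexError when no window matches.

```python
import numpy as np

def first_run_mask(x, n):
    """AND together n shifted views of x; return the first all-True window start, or -1."""
    x = np.asarray(x, dtype=bool)
    m = len(x) - n + 1  # number of candidate window starts
    if n < 1 or m < 1:
        return -1
    mask = np.ones(m, dtype=bool)
    for i in range(n):
        # mask[j] stays True only if x[j+i] is True for every offset i
        mask &= x[i:i + m]
    hits = np.flatnonzero(mask)
    return int(hits[0]) if hits.size else -1

# The 20-element array from the original question
x = np.array([False, True, True, False, False, True, True, False, False, True,
              False, True, False, True, True, True, False, True, False, True])
print(first_run_mask(x, 2))  # 1
print(first_run_mask(x, 3))  # 13
```

The loop runs n times over arrays of length about len(x), so the work grows with n, which matches the "smallish N" caveat.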

On Fri, Feb 29, 2008 at 10:53 AM, John Hunter <jdh2358@gmail.com> wrote:
[apologies if this is a resend, my mail just flaked out]
I have a boolean array and would like to find the lowest index "ind" where N contiguous elements are all True. Eg, if x is
In [101]: x = np.random.rand(20)>.4
In [102]: x Out[102]: array([False, True, True, False, False, True, True, False, False, True, False, True, False, True, True, True, False, True, False, True], dtype=bool)
I would like to find ind=1 for N=2 and ind=13 for N=3. I assume that with the right cumsum, diff and maybe repeat magic, this can be vectorized, but the proper incantation is escaping me.
for N==3, I thought of
In [110]: x = x.astype(int)
In [112]: y = x[:-2] + x[1:-1] + x[2:]
In [125]: ind = (y==3).nonzero()[0]
In [126]: if len(ind): ind = ind[0]
In [128]: ind
Out[128]: 13
This may be more involved than you want, but
In [37]: prng = random.RandomState(1234567890)
In [38]: x = prng.random_sample(50) < 0.5
In [39]: y1 = concatenate(([False], x[:-1]))
In [40]: y2 = concatenate((x[1:], [False]))
In [41]: beg = ind[x & ~y1]
In [42]: end = ind[x & ~y2]
In [43]: cnt = end - beg + 1
In [44]: i = beg[cnt == 4]
In [45]: i
Out[45]: array([28])
In [46]: x
Out[46]:
array([False, False, False, False, True, False, True, False, False, False,
       True, False, True, False, True, True, True, True, True, False,
       False, False, True, False, True, False, False, False, True, True,
       True, True, False, False, True, False, False, False, False, False,
       False, False, False, True, False, False, True, False, True, False], dtype=bool)
produces a list of the indices where sequences of length 4 begin.
Chuck

On Fri, Feb 29, 2008 at 11:12 PM, Charles R Harris < charlesr.harris@gmail.com> wrote:
This may be more involved than you want, but
In [37]: prng = random.RandomState(1234567890)
In [38]: x = prng.random_sample(50) < 0.5
In [39]: y1 = concatenate(([False], x[:-1]))
In [40]: y2 = concatenate((x[1:], [False]))
In [41]: beg = ind[x & ~y1]
In [42]: end = ind[x & ~y2]
In [43]: cnt = end - beg + 1
In [44]: i = beg[cnt == 4]
In [45]: i Out[45]: array([28])
In [46]: x Out[46]: array([False, False, False, False, True, False, True, False, False, False, True, False, True, False, True, True, True, True, True, False, False, False, True, False, True, False, False, False, True, True, True, True, False, False, True, False, False, False, False, False, False, False, False, True, False, False, True, False, True, False], dtype=bool)
produces a list of the indices where sequences of length 4 begin.
Chuck
Oops, the code above needs ind = arange(len(x)) defined first. I suppose nonzero would work as well.
Chuck
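With that correction folded in, the run-edge approach might be sketched as follows. The function name run_starts and the exact flag are hypothetical additions for illustration; flatnonzero stands in for indexing arange(len(x)), as the "nonzero would work as well" remark suggests. The test array is transcribed from Out[46] rather than regenerated from the seed.

```python
import numpy as np

def run_starts(x, n, exact=True):
    """Start indices of runs of True with length == n (or >= n when exact=False)."""
    x = np.asarray(x, dtype=bool)
    prev = np.concatenate(([False], x[:-1]))  # x shifted right by one
    nxt = np.concatenate((x[1:], [False]))    # x shifted left by one
    beg = np.flatnonzero(x & ~prev)  # True preceded by False: a run begins
    end = np.flatnonzero(x & ~nxt)   # True followed by False: a run ends
    cnt = end - beg + 1              # length of each run
    keep = (cnt == n) if exact else (cnt >= n)
    return beg[keep]

# True positions transcribed from Out[46] in the message above
x = np.zeros(50, dtype=bool)
x[[4, 6, 10, 12, 14, 15, 16, 17, 18, 22, 24, 28, 29, 30, 31, 34, 43, 46, 48]] = True

print(run_starts(x, 4))               # [28]
print(run_starts(x, 4, exact=False))  # [14 28]
```

Note that cnt == 4 selects runs of exactly four, so the five-long run starting at 14 is skipped; for the original question ("N contiguous elements are all True") the exact=False variant, taking the first element of the result, is probably the closer fit.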

On 01/03/2008, Charles R Harris <charlesr.harris@gmail.com> wrote:
On Fri, Feb 29, 2008 at 10:53 AM, John Hunter <jdh2358@gmail.com> wrote:
I have a boolean array and would like to find the lowest index "ind" where N contiguous elements are all True. Eg, if x is
[...]
Oops, ind = arange(len(x)). I suppose nonzero would work as well.
I'm guessing you're alluding to the fact that diff(nonzero(x)[0]) - 1 gives you the lengths of the runs of Falses between consecutive Trues in x (any leading or trailing run of Falses is not included).
If you have a fondness for the baroque, you can try
numpy.where(numpy.convolve(x, [1]*N, 'valid') == N)
For large N this can even use Fourier-domain convolution (though you'd then have to be careful about round-off error). Silly, really: with M = len(x), it's O(NM) or O(M log M) instead of O(M).
Anne
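A small sketch of the convolution route, for comparison. The function name first_run_convolve and the -1 "not found" value are hypothetical; the explicit cast to int is an assumption made so the window sums are exact integers rather than relying on how boolean input is promoted.

```python
import numpy as np

def first_run_convolve(x, n):
    """Sliding-window sum via convolution; return first index whose window sums to n, or -1."""
    x = np.asarray(x, dtype=bool)
    if n < 1 or n > len(x):
        return -1
    # In 'valid' mode, sums[i] is the number of True values in x[i:i+n]
    sums = np.convolve(x.astype(int), np.ones(n, dtype=int), mode="valid")
    hits = np.flatnonzero(sums == n)
    return int(hits[0]) if hits.size else -1

# The 20-element array from the original question
x = np.array([False, True, True, False, False, True, True, False, False, True,
              False, True, False, True, True, True, False, True, False, True])
print(first_run_convolve(x, 2))  # 1
print(first_run_convolve(x, 3))  # 13
```

Because the arithmetic here is done in integers, the round-off concern applies only if the direct convolution is replaced by an FFT-based one.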
participants (4)
- Anne Archibald
- Charles R Harris
- John Hunter
- Robert Kern