[Numpy-discussion] slicing with boolean in numpy master

josef.pktd at gmail.com josef.pktd at gmail.com
Wed Jun 26 14:16:53 EDT 2013


On Wed, Jun 26, 2013 at 1:56 PM,  <josef.pktd at gmail.com> wrote:
> On Wed, Jun 26, 2013 at 1:16 PM, Sebastian Berg
> <sebastian at sipsolutions.net> wrote:
>> On Wed, 2013-06-26 at 12:52 -0400, josef.pktd at gmail.com wrote:
>>> On Wed, Jun 26, 2013 at 12:01 PM, Sebastian Berg
>>> <sebastian at sipsolutions.net> wrote:
>>> > On Wed, 2013-06-26 at 11:30 -0400, josef.pktd at gmail.com wrote:
>>> >> Is there a change in the behavior of boolean slicing in current master?
>>> >>
>>> >
>>> > Yes, but I think this is probably a bug in statsmodels. I would expect
>>> > you should be using "..." and not ":" here, because ":" requires the
>>> > dimension to actually exist, and I *expect* that your mask actually has
>>> > the same dimensionality as the array itself.
>>> >
>>> > I.e.:
>>> >
>>> > x = np.arange(16).reshape(4,4)
>>> > mask = np.ones_like(x, dtype=bool)
>>> > x[mask,:] # should NOT work, but this was buggy before current master.
>>>
>>> Why should this not work?

that's fine, I didn't see that mask is 2d
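
A small sketch of the 2-d mask case, assuming a numpy newer than 1.7.1
(the array uses 16 elements so that the reshape to (4,4) is valid, and
the names flat, same and raised are only for illustration):

```python
import numpy as np

x = np.arange(16).reshape(4, 4)
mask = np.ones_like(x, dtype=bool)  # 2-d boolean mask, same shape as x

flat = x[mask]       # the mask consumes both dimensions, result is 1-d
same = x[mask, ...]  # "..." may stand for zero remaining dimensions

try:
    x[mask, :]       # ":" asks for a third dimension that does not exist
    raised = False
except IndexError:
    raised = True
```

With a 1-d mask over the rows, x[mask, :] is still fine, because the
mask then only takes the place of the first dimension.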

>>>
>>> How do you select rows that don't have nans in them?
>>>
>>> mask = np.isfinite(x).all(1)
>>> x[mask, :]
>>>
>>> or columns with switched axis.
>>>
>>> >>> x[mask[:, None]]
>>> array([ 1.,  1.,  1.,  1.])
>>> ???
>>>
>>
>> I assume you wanted to write x[:, mask] there. Boolean masks do
>> *not* broadcast; instead they eat away as many dimensions as they have.
>>
>> Maybe these examples will help explain why the new behaviour is correct:
>>
>> x = np.random.random((3,3))
>> mask = np.ones((3,3), dtype=np.bool_)
>>
>> # Check slices:
>> x[:,:] # OK, result 2-d
>> x[:,:,:] # too many indices.
>>
>> # replace first dimension with the mask:
>> x[mask[:,0], :] # OK, result 2-d
>
> Good, if this still works, then I'll go hunting in our code for the "too
> many indices"

It was easy to find given the clues, even without being able to get the
error messages from numpy > 1.7.1

Thanks,

Josef
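
For the row and column selection discussed in this thread, a minimal
sketch (the array contents and the names row_mask and col_mask are made
up for illustration; this should behave the same in 1.7.1 and in
master, since each mask is 1-d and replaces exactly one dimension):

```python
import numpy as np

x = np.ones((5, 3))
x[1, 2] = np.nan  # one bad value, in row 1 and column 2

# 1-d masks: one entry per row, or one entry per column
row_mask = np.isfinite(x).all(axis=1)  # shape (5,)
col_mask = np.isfinite(x).all(axis=0)  # shape (3,)

rows = x[row_mask, :]  # keep the rows without nan
cols = x[:, col_mask]  # keep the columns without nan
```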

>
> Thanks for the clarification,



>
> Josef
>
>> x[mask[:,0], :, :] # too many indices.
>>
>> # replace *both* slices with a (single) mask:
>> x[mask] # OK, result 1-d (i.e. there is nothing more than the mask)
>> x[mask, :] # too many indices! But it still works in 1.7.
>>
>> # In fact we can make this absurd:
>> x[mask, :, :, :, :, :] # Too many slices even without the mask!
>>
>> The last case used to work in pre-master due to a bug.
>>
>> - Sebastian
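
The "eats as many dimensions as it has" rule can also be checked on a
3-d array. This is only a sketch under the assumption of a numpy newer
than 1.7.1; the shapes and names are illustrative and not from the
messages above:

```python
import numpy as np

x = np.arange(24).reshape(2, 3, 4)

# 2-d mask over the first two dimensions, with two True entries
mask = np.zeros((2, 3), dtype=bool)
mask[0, 1] = True
mask[1, 2] = True

picked = x[mask]        # mask replaces dims 0 and 1, dim 2 is left over
with_slice = x[mask, :] # one ":" for the single remaining dimension

try:
    x[mask, :, :]       # 2 (mask) + 2 (slices) = 4 indices for a 3-d array
    raised = False
except IndexError:
    raised = True
```

Here picked has shape (2, 4): one row per True entry in the mask, and
the untouched last dimension.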
>>
>>
>>> (I have to check the usage in statsmodels, but I thought this is standard.)
>>>
>>> Josef
>>>
>>> >
>>> > - Sebastian
>>> >
>>> >> If not I have to find another candidate in numpy master.
>>> >>
>>> >> (py27d) E:\Josef\testing\tox\py27d\Scripts>python
>>> >> Python 2.7.1 (r271:86832, Nov 27 2010, 18:30:46) [MSC v.1500 32 bit
>>> >> (Intel)] on win32
>>> >> Type "help", "copyright", "credits" or "license" for more information.
>>> >> >>> import numpy as np
>>> >> >>> np.__version__
>>> >> '1.7.1'
>>> >> >>> x = np.ones((5,3))
>>> >> >>> mask = np.arange(5) < 4
>>> >> >>> x[mask, :]
>>> >> array([[ 1.,  1.,  1.],
>>> >>        [ 1.,  1.,  1.],
>>> >>        [ 1.,  1.,  1.],
>>> >>        [ 1.,  1.,  1.]])
>>> >>
>>> >>
>>> >> We get errors like the following when running the statsmodels tests
>>> >> with a current or recent numpy master, but not with numpy 1.7.1
>>> >>
>>> >> ======================================================================
>>> >> ERROR: Failure: IndexError (too many indices)
>>> >> ----------------------------------------------------------------------
>>> >> Traceback (most recent call last):
>>> >>   File "/Users/tom/python2.7/lib/python2.7/site-packages/nose/loader.py",
>>> >> line 518, in makeTest
>>> >>     return self._makeTest(obj, parent)
>>> >>   File "/Users/tom/python2.7/lib/python2.7/site-packages/nose/loader.py",
>>> >> line 577, in _makeTest
>>> >>     return MethodTestCase(obj)
>>> >>   File "/Users/tom/python2.7/lib/python2.7/site-packages/nose/case.py",
>>> >> line 345, in __init__
>>> >>     self.inst = self.cls()
>>> >>   File "/Users/tom/python2.7/lib/python2.7/site-packages/statsmodels-0.5.0-py2.7-macosx-10.8-x86_64.egg/statsmodels/emplike/tests/test_aft.py",
>>> >> line 19, in __init__
>>> >>     super(Test_AFTModel, self).__init__()
>>> >>   File "/Users/tom/python2.7/lib/python2.7/site-packages/statsmodels-0.5.0-py2.7-macosx-10.8-x86_64.egg/statsmodels/emplike/tests/test_aft.py",
>>> >> line 12, in __init__
>>> >>     self.mod1 = sm.emplike.emplikeAFT(endog, exog, data.censors)
>>> >>   File "/Users/tom/python2.7/lib/python2.7/site-packages/statsmodels-0.5.0-py2.7-macosx-10.8-x86_64.egg/statsmodels/emplike/aft_el.py",
>>> >> line 248, in __init__
>>> >>     self.uncens_endog = self.endog[np.bool_(self.censors), :].\
>>> >> IndexError: too many indices
>>> >>
>>> >> Thanks,
>>> >>
>>> >> Josef
>>> >> _______________________________________________
>>> >> NumPy-Discussion mailing list
>>> >> NumPy-Discussion at scipy.org
>>> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>> >>
