On Thu, Aug 25, 2016 at 4:37 PM, Sebastian Berg <sebastian@sipsolutions.net> wrote:
On Thu, 2016-08-25 at 10:36 -0400, Joseph Fox-Rabinovitz wrote:
> This issue recently came up on Stack Overflow:
> http://stackoverflow.com/questions/39145795/masking-a-series-with-a-boolean-array
> The poster attempted to index an ndarray with a pandas boolean Series
> object (all False), but the result was as if he had indexed with an
> array of integer zeros.
>
> Can someone explain this behavior? I can see two obvious
> possibilities:
> 1. ndarray checks whether the input to __getitem__ is of exactly the
>    right type, rather than using isinstance.
> 2. pandas actually uses a wider datatype than boolean internally, so
>    indexing with the series is in fact indexing with an integer array.

You are overthinking it ;). The reason is quite simply that the logic
used to be:

 * Boolean array? -> think about boolean indexing.
 * Everything "array-like" (not caught earlier) -> convert to `intp`
   array and do integer indexing.

Now you might wonder why, but most likely it is simply because
boolean indexing was tacked on later.
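
A minimal sketch of the two branches described above. It does not rely on
NumPy's own dispatch (which depends on the installed version); instead it
emulates each path with an explicit conversion. The names `arr` and
`mask_like` are made up for illustration, and a plain list stands in for the
all-False boolean array-like from the original question.

```python
import numpy as np

arr = np.array([10, 20, 30])
mask_like = [False, False, False]  # hypothetical all-False boolean array-like

# Branch 1: a genuine boolean array selects the elements where the mask is True.
as_boolean = arr[np.asarray(mask_like, dtype=bool)]

# Branch 2 (emulated): convert the array-like to intp and do integer indexing.
# Every False becomes 0, so each position picks element 0 of arr.
as_integer = arr[np.asarray(mask_like).astype(np.intp)]

print(as_boolean.tolist())  # []
print(as_integer.tolist())  # [10, 10, 10]
```

The second result is the "array of integer zeros" behaviour reported in the
question: the output has the same length as the mask and repeats the first
element.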

- Sebastian


> In my attempt to reproduce the poster's results, I got the following
> warning:
> FutureWarning: in the future, boolean array-likes will be handled as
> a boolean array index
> This indicates that the issue is probably #1 and that a fix is
> already on the way. Please correct me if I am wrong. Also, where does
> the code for ndarray.__getitem__ live?
> Thanks,
>     -Joe
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> https://mail.scipy.org/mailman/listinfo/numpy-discussion


This makes perfect sense. I would like to help fix it if a fix is desired and has not been done already. Could you point me to where the "Boolean array?" decision happens? I have had trouble navigating to `__getitem__` (which I assume is somewhere in the np.core.multiarray C code).

    -Joe