On Thu, Aug 25, 2016 at 4:37 PM, Sebastian Berg <sebastian@sipsolutions.net> wrote:

On Thu, 2016-08-25 at 10:36 -0400, Joseph Fox-Rabinovitz wrote:

> This issue recently came up on Stack Overflow:

> http://stackoverflow.com/questions/39145795/masking-a-series-with-a-boolean-array.

> The poster attempted to index an ndarray with a pandas boolean Series

> object (all False), but the result was as if he had indexed with an

> array of integer zeros.

>

> Can someone explain this behavior? I can see two obvious

> possibilities:

> 1. ndarray checks if the input to __getitem__ is of exactly the right

> type, not using isinstance.

> 2. pandas actually uses a wider datatype than boolean internally, so

> indexing with the series is in fact indexing with an integer array.

You are overthinking it ;). The reason is quite simply that the logic

used to be:

* Boolean array? -> think about boolean indexing.

* Everything "array-like" (not caught earlier) -> convert to `intp`

array and do integer indexing.
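
To make the difference between the two branches concrete, here is a minimal sketch (not from the original thread) that reproduces both by hand. The array `a` and the all-False `mask` are made up for illustration, and the explicit cast to `np.intp` is only a stand-in for what the old array-like branch did implicitly, as described above; it is not the actual C code path.

```python
import numpy as np

# Hypothetical data, chosen only for illustration.
a = np.arange(5) * 10             # array([ 0, 10, 20, 30, 40])
mask = np.zeros(5, dtype=bool)    # all False, like the Series in the question

# Branch 1: a real boolean ndarray selects the elements where mask is True.
as_boolean = a[mask]

# Branch 2: emulate the old array-like branch by casting to intp by hand.
# Every False becomes 0, so every entry picks a[0].
as_integer = a[mask.astype(np.intp)]

print(as_boolean.tolist())   # []
print(as_integer.tolist())   # [0, 0, 0, 0, 0]
```

With an all-False mask the boolean branch returns an empty array, while the integer branch returns `a[0]` repeated five times, which matches the "array of integer zeros" behavior the poster saw.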

Now you might wonder why, but probably it is quite simply because

boolean indexing was tacked on later.

- Sebastian

> In my attempt to reproduce the poster's results, I got the following

> warning:

> FutureWarning: in the future, boolean array-likes will be handled as

> a boolean array index

> This indicates that the issue is probably #1 and that a fix is

> already on the way. Please correct me if I am wrong. Also, where does

> the code for ndarray.__getitem__ live?

> Thanks,

> -Joe

>

This makes perfect sense. I would like to help fix it if a fix is desired and has not been done already. Could you point me to where the "Boolean array?, etc." decision happens? I have had trouble navigating to `__getitem__` (which I assume is somewhere in the np.core.multiarray C code).

-Joe