On Thu, Aug 25, 2016 at 4:37 PM, Sebastian Berg <sebastian@sipsolutions.net> wrote:
On Do, 2016-08-25 at 10:36 -0400, Joseph Fox-Rabinovitz wrote:
This issue recently came up on Stack Overflow: http://stackoverflow.com/questions/39145795/masking-a-series-with-a-boolean-array. The poster attempted to index an ndarray with a pandas boolean Series object (all False), but the result was as if he had indexed with an array of integer zeros.
Can someone explain this behavior? I can see two obvious possibilities:
1. ndarray checks if the input to __getitem__ is of exactly the right type, not using isinstance.
2. pandas actually uses a wider datatype than boolean internally, so indexing with the series is in fact indexing with an integer array.
You are overthinking it ;). The reason is quite simply that the logic used to be:
* Boolean array? -> think about boolean indexing.
* Everything "array-like" (not caught earlier) -> convert to `intp` array and do integer indexing.
Now you might wonder why, but probably it is quite simply because boolean indexing was tacked on later.
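The two branches above can be illustrated with a small sketch. It does not call into the actual dispatch code; it only forces each conversion by hand with `np.asarray`, so the result does not depend on the NumPy version. The names `arr` and `mask` are made up for the illustration.

```python
import numpy as np

arr = np.arange(10, 15)                      # [10, 11, 12, 13, 14]
mask = [False, False, False, False, False]   # boolean array-like, all False

# Branch 1: treated as a boolean index. An all-False mask selects nothing.
as_bool = arr[np.asarray(mask, dtype=bool)]

# Branch 2: converted to an intp array and used as an integer index.
# Every False becomes 0, so element 0 is picked once per mask entry.
as_intp = arr[np.asarray(mask, dtype=np.intp)]

print(as_bool)   # empty array
print(as_intp)   # five copies of arr[0]
```

The second result matches what the poster saw: indexing "as if with an array of integer zeros".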
- Sebastian
In my attempt to reproduce the poster's results, I got the following warning:

FutureWarning: in the future, boolean array-likes will be handled as a boolean array index

This indicates that the issue is probably #1 and that a fix is already on the way. Please correct me if I am wrong. Also, where does the code for ndarray.__getitem__ live?

Thanks,
-Joe
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion
This makes perfect sense. I would like to help fix it if a fix is desired and has not been done already. Could you point me to where the "Boolean array?, etc." decision happens? I have had trouble navigating to `__getitem__` (which I assume is somewhere in the np.core.multiarray C code).
-Joe