[Numpy-discussion] Boolean arrays with nulls?
Stefan van der Walt
stefanv at berkeley.edu
Thu Apr 18 12:45:41 EDT 2019
Hi Stuart,
On Thu, 18 Apr 2019 09:12:31 -0700, Stuart Reynolds wrote:
> Is there an efficient way to represent bool arrays with null entries?
You can use the bool dtype:
In [5]: x = np.array([True, False, True])
In [6]: x
Out[6]: array([ True, False,  True])
In [7]: x.dtype
Out[7]: dtype('bool')
Note that this stores one True/False value per byte, so it is not
optimal in terms of memory use. There is no easy way to do bit-arrays
with NumPy, because strides, which determine how to move from one
element to the next in memory, are measured in whole bytes.
See also: https://www.reddit.com/r/Python/comments/5oatp5/one_bit_data_type_in_numpy/
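To make the memory cost concrete, here is a minimal sketch. It assumes that packing into a uint8 array with np.packbits is acceptable for storage; the packed array is then no longer a bool array that you can index element by element.

```python
import numpy as np

x = np.array([True, False, True] * 8)   # 24 boolean values

# Each element occupies one byte, and the stride (the step between
# consecutive elements) is likewise expressed in whole bytes.
print(x.itemsize, x.strides, x.nbytes)

# packbits stores eight booleans per byte in a uint8 array.
packed = np.packbits(x)
print(packed.dtype, packed.nbytes)

# unpackbits restores the bits; slicing trims any padding bits that
# packbits added to fill the last byte.
restored = np.unpackbits(packed)[:x.size].astype(bool)
```

Packing trades convenience for space: to index or combine the values with other arrays you have to unpack them first.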
> What I’m hoping for is that there’s a structure that is ‘viewed’ as
> nan-able float data, but backed by a more efficient structure
> internally.
There are good implementations of this idea, such as:
https://github.com/ilanschnell/bitarray
Those structures cannot typically utilize the NumPy machinery, though.
With the new __array_function__ protocol (NEP 18), you should at least
be able to build something close to the NumPy API.
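As a rough illustration of that idea, here is a minimal sketch of a container that keeps boolean values alongside a validity mask and handles one NumPy function through __array_function__. The class name NullableBool and its to_float method are hypothetical, made up for this example. It assumes a NumPy version where the protocol is enabled by default (1.17 or later), and it makes no attempt to save memory.

```python
import numpy as np

class NullableBool:
    """Hypothetical container: boolean values plus a validity mask."""

    def __init__(self, values, valid):
        self.values = np.asarray(values, dtype=bool)
        self.valid = np.asarray(valid, dtype=bool)   # False marks a null

    def __array_function__(self, func, types, args, kwargs):
        # Only np.sum is handled in this sketch: count the True entries
        # that are not null. Every other function is declined, which
        # makes NumPy raise a TypeError.
        if func is np.sum:
            return int(np.count_nonzero(self.values & self.valid))
        return NotImplemented

    def to_float(self):
        # Expose the data as nan-able floats, with nulls shown as nan.
        out = self.values.astype(float)
        out[~self.valid] = np.nan
        return out

a = NullableBool([True, False, True, True],
                 valid=[True, True, False, True])
total = np.sum(a)          # dispatched to NullableBool.__array_function__
as_float = a.to_float()
```

A real implementation would need to cover many more functions and could store both arrays bit-packed; this only shows where the dispatch hook sits.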
Best regards,
Stéfan