[Numpy-discussion] Best dtype for Boolean values

Anne Archibald peridot.faceted at gmail.com
Mon Apr 12 16:52:55 EDT 2010


On 12 April 2010 11:59, John Jack <itsmilesdavis at gmail.com> wrote:
> Hello all.
> I am (relatively) new to python, and 100% new to numpy.
> I need a way to store arrays of booleans and compare the arrays for
> equality.
> I assume I want arrays of dtype Boolean, and I should compare the arrays
> with array_equal
> >>> tmp.all_states
> array([False,  True, False], dtype=bool)
> >>> tmp1.all_states
> array([False, False, False], dtype=bool)
> >>> tmp1.all_states[1]=True
> >>> tmp1.all_states
> array([False,  True, False], dtype=bool)
> >>> array_equal(tmp.all_states,tmp1.all_states)
> True
> >>> any(tmp.all_states)
> True
> Would this be (a) the cheapest way (w.r.t. memory) to store Booleans and (b)
> the most efficient way to compare two lists of Booleans?

The short answer is yes and yes.

The longer answer is that a bool array uses one byte per Boolean, which is a
tradeoff. In some sense, modern machines are happier working with 32-
or 64-bit quantities, so loading a one-byte Boolean requires a small
amount of byte-shuffling. On the other hand, if you're really short of
memory, 8 bits for a Boolean is really wasteful. In fact, since modern
machines are almost always limited by memory bandwidth, a packed
Boolean data structure would probably be much faster for almost all
operations in spite of the bit-fiddling required. But such a
representation is incompatible with the whole infrastructure of numpy
and so would require a great deal of messy code to support.
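
To make the one-byte-per-Boolean point concrete, here is a minimal
sketch that rebuilds the arrays from the question and checks their
storage cost and equality. The names a and b are placeholders of mine,
standing in for tmp.all_states and tmp1.all_states.

```python
import numpy as np

# Stand-ins for tmp.all_states and tmp1.all_states from the question.
a = np.array([False, True, False], dtype=bool)
b = np.array([False, False, False], dtype=bool)
b[1] = True

# Each element of a bool array occupies one byte, not one bit.
print(a.itemsize)  # 1
print(a.nbytes)    # 3

# array_equal compares shape and contents in one call.
print(np.array_equal(a, b))  # True

# The array's own any() method avoids going through Python's builtin any().
print(a.any())  # True
```

For arrays of this size the storage overhead is negligible; it only
starts to matter when the arrays get very large.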

So yes, it's the best representation of Booleans available, unless
you're dealing with mind-bogglingly large arrays of them, in which
case some sort of packed-Boolean representation would be better. This
can even be partially supported by numpy, using uint8s, bitwise
operations, and manually-specified bitmasks. There are probably not
many applications for which this is worth the pain.
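
For what it's worth, a packed representation along these lines could
be sketched as follows. np.packbits and np.unpackbits are existing
numpy functions; get_bit and set_bit are hypothetical helpers I made
up for illustration, and the flag values are arbitrary. Treat this as
a rough sketch, not a recommended implementation.

```python
import numpy as np

flags = np.array([False, True, False, True, True,
                  False, False, True, True, False], dtype=bool)
n = flags.size

# Pack 8 Booleans per uint8; the last byte is zero-padded.
# By default the first Boolean lands in the most significant bit.
packed = np.packbits(flags)
print(packed.nbytes)  # 2 bytes instead of 10

def get_bit(packed, i):
    """Hypothetical helper: read flag i using a manually built bitmask."""
    mask = np.uint8(0x80 >> (i % 8))
    return bool(packed[i // 8] & mask)

def set_bit(packed, i, value):
    """Hypothetical helper: set or clear flag i in place."""
    mask = np.uint8(0x80 >> (i % 8))
    if value:
        packed[i // 8] |= mask
    else:
        packed[i // 8] &= ~mask

# Comparing packed arrays is only meaningful if both were packed
# from Boolean arrays of the same length.
other = packed.copy()
print(np.array_equal(packed, other))  # True

# Elementwise logical AND of all flags, eight at a time.
both = packed & other

# Change one flag, then unpack and drop the padding bits.
set_bit(other, 0, True)
restored = np.unpackbits(other)[:n].astype(bool)
```

Note how much bookkeeping (padding, bit order, the original length)
ends up in user code; that is the pain referred to above.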

Anne
P.S. There's actually at least one python package for bit vectors,
outside numpy; I can't speak for how good it is, though. -A

> Thanks for your advice.
> -JJ.