# BitVector

Ross Ridge rridge at caffeine.csclub.uwaterloo.ca
Mon Mar 10 19:52:00 CET 2008

```DanielJohnson  <diffuser78 at gmail.com> wrote:
>I am trying to solve a genetic algorithm problem where I want to read
>a bitvector of very large size (say 10000) and manipulate bits based
>on certain algorithms.
>
>I am a newbie in Python. What data structure are good to read such
>huge data set. Are there any built in classes for bit fiddling.

A 10000 bit data set is by no means huge.  Depending on exactly what
operations you want to perform converting your "bitvector" to a byte
array, one bit per byte may be fast way to go.  This increases the size
of data set by 8 times, but it's probably the fastest way in Python to
test and set individual bits.

Here's a couple of functions I wrote for very quickly converting a
"bitvector" in the form of a string (eg. read from a file) in to a
byte array:

import string
import array
import binascii

_tr_16 = string.maketrans("0123456789abcdef",
"\x00\x01\x02\x03"
"\x10\x11\x12\x13"
"\x20\x21\x22\x23"
"\x30\x31\x32\x33")
_tr_4 = string.maketrans("0123",
"\x00\x01"
"\x10\x11")
_tr_2 = string.maketrans("01", "\x00\x01")

def string_to_bit_array(s):
"""Convert a string to an array containing a sequence of bits."""
s = binascii.hexlify(s).translate(_tr_16)
s = binascii.hexlify(s).translate(_tr_4)
s = binascii.hexlify(s).translate(_tr_2)
a = array.array('B', s)
return a

_tr_rev_2 = string.maketrans("\x00\x01", "01")
_tr_rev_4 = string.maketrans("\x00\x01"
"\x10\x11",
"0123")
_tr_rev_16 = string.maketrans("\x00\x01\x02\x03"
"\x10\x11\x12\x13"
"\x20\x21\x22\x23"
"\x30\x31\x32\x33",
"0123456789abcdef")
def bit_array_to_string(a):
"""Convert an array containing a sequence of bits to a string."""
remainder = len(a) % 8
if remainder != 0:
a.fromlist([0] * (8 - remainder))
s = a.tostring()
s = binascii.unhexlify(s.translate(_tr_rev_2))
s = binascii.unhexlify(s.translate(_tr_rev_4))
return binascii.unhexlify(s.translate(_tr_rev_16))

I've used these functions to implement a data compression algorithim
in Python.  The algorithm still runs very slow in Python, but it runs
much faster than it would have if I had used Python's bitwise operators.

Ross Ridge

--
l/  //	  Ross Ridge -- The Great HTMU
[oo][oo]  rridge at csclub.uwaterloo.ca
-()-/()/  http://www.csclub.uwaterloo.ca/~rridge/
db  //

```