# How to locate the bit in bits string?

Dave Angel davea at ieee.org
Thu Apr 30 04:35:20 CEST 2009

```casevh wrote:
> On Apr 28, 5:39 pm, Li Wang <li.wan... at gmail.com> wrote:
>
>> 2009/4/29 Tim Chase <python.l... at tim.thechases.com>:
>>
>>
>>>> I want to concatenate two bits string together: say we have '1001' and
>>>> '111' which are represented in integer. I want to concatenate them to
>>>> '1001111' (also in integer form), my method is:
>>>> ('1001' << 3) | 111
>>>> which is very time consuming.
>>>>
>>> You omit some key details -- namely how do you know that "1001" is 4 bits
>>> and not "00001001" (8-bits)?  If it's a string (as your current code shows),
>>> you can determine the length.  However, if they are actually ints, your code
>>> should work fine & be O(1).
>>>
>> Actually, what I have is a list of integer numbers [3,55,99,44], and
>> by using Huffman coding or fixed length coding, I will know how the
>> bits-length for each number. When I try to concatenate them (say
>> 10,000 items in the list) all together, the speed is going down
>> quickly (because of the shifting operations of python long).
>>
>>
>>
>>
>>> This can be abstracted if you need:
>>>
>>>  def combine_bits(int_a, int_b, bit_len_b):
>>>    return (int_a << bit_len_b) | int_b
>>>
>>>  a =x09
>>>  b =x07
>>>  print combine_bits(a, b, 3)
>>>
>>> However, if you're using gargantuan ints (as discussed before), it's a lot
>>> messier.  You'd have to clarify the storage structure (a byte string?  a
>>> python long?)
>>>
>> I am using a single python long to store all the items in the list
>> (say, 10,000 items), so the work does become messier...
>>
>
> Using GMPY (http://code.google.com/p/gmpy/) may offer a performance
> improvement. When shifting multi-thousand bit numbers, GMPY is several
> times faster than Python longs. GMPY also support functions to scan
> for 0 or 1 bits.
>
>
>>> -tkc
>>>
>>> PS:  You may want to CC the mailing list so that others get a crack at
>>>
>> Sorry, this is the first time I am using mail-list....and always
>>
>> Thank you very much:D
>>
>>
>>
>> --
>> Li
>> ------
>> Time is all we have
>> and you may find one day
>> you have less than you think
>>
Seems to me you'd do better with a byte array than a Python long.  At least while you're doing all those shifts.  That way you do the shifts in an int, then update a byte in the array whenever you have 8 bits accumulated.

You can then look up a bit in the array trivially.  And if you really need a Python long, I predict the single conversion will be much faster than doing all those shifts on the long as you go.

```