How to locate the bit in bits string?
davea at ieee.org
Thu Apr 30 04:35:20 CEST 2009
> On Apr 28, 5:39 pm, Li Wang <li.wan... at gmail.com> wrote:
>> 2009/4/29 Tim Chase <python.l... at tim.thechases.com>:
>>>> I want to concatenate two bits string together: say we have '1001' and
>>>> '111' which are represented in integer. I want to concatenate them to
>>>> '1001111' (also in integer form), my method is:
>>>> ('1001' << 3) | 111
>>>> which is very time consuming.
>>> You omit some key details -- namely how do you know that "1001" is 4 bits
>>> and not "00001001" (8-bits)? If it's a string (as your current code shows),
>>> you can determine the length. However, if they are actually ints, your code
>>> should work fine & be O(1).
>> Actually, what I have is a list of integer numbers [3,55,99,44], and
>> by using Huffman coding or fixed length coding, I will know how the
>> bits-length for each number. When I try to concatenate them (say
>> 10,000 items in the list) all together, the speed is going down
>> quickly (because of the shifting operations of python long).
>>> This can be abstracted if you need:
>>> def combine_bits(int_a, int_b, bit_len_b):
>>> return (int_a << bit_len_b) | int_b
>>> a =x09
>>> b =x07
>>> print combine_bits(a, b, 3)
>>> However, if you're using gargantuan ints (as discussed before), it's a lot
>>> messier. You'd have to clarify the storage structure (a byte string? a
>>> python long?)
>> I am using a single python long to store all the items in the list
>> (say, 10,000 items), so the work does become messier...
> Using GMPY (http://code.google.com/p/gmpy/) may offer a performance
> improvement. When shifting multi-thousand bit numbers, GMPY is several
> times faster than Python longs. GMPY also support functions to scan
> for 0 or 1 bits.
>>> PS: You may want to CC the mailing list so that others get a crack at
>>> answering your questions...I've been adding it back in, but you've been
>>> replying just to me.
>> Sorry, this is the first time I am using mail-list....and always
>> forgot "reply to all"
>> Thank you very much:D
>> Time is all we have
>> and you may find one day
>> you have less than you think
Seems to me you'd do better with a byte array than a Python long. At least while you're doing all those shifts. That way you do the shifts in an int, then update a byte in the array whenever you have 8 bits accumulated.
You can then look up a bit in the array trivially. And if you really need a Python long, I predict the single conversion will be much faster than doing all those shifts on the long as you go.
More information about the Python-list