How to locate the bit in bits string?
Tim Chase
python.list at tim.thechases.com
Tue Apr 28 21:34:07 EDT 2009
>> You omit some key details -- namely how do you know that
>> "1001" is 4 bits and not "00001001" (8-bits)? If it's a
>> string (as your current code shows), you can determine the
>> length. However, if they are actually ints, your code
>> should work fine & be O(1).
>
> Actually, what I have is a list of integer numbers
> [3,55,99,44], and by using Huffman coding or fixed length
> coding, I will know how the bits-length for each number. When
> I try to concatenate them (say 10,000 items in the list) all
> together, the speed is going do
Ah, this makes more sense!
I'd try creating a bitwriter object that takes fragments of bytes
and writes them sequentially to a file-like object. Something
like this untested class:
class BitWriter:
def __init__(self, outstream):
self.out = outstream
self.datum = 0 # the data accrued
self.bits = 0 # the number of bits we've accrued
def write(self, byte_data, bit_count):
# maybe add some code to ensure
# that byte_data doesn't have any
# significant bits beyond bit_count
datum = (self.datum << bit_count) | byte_data
self.bits += bit_count
if self.bits >= 8:
overflow = self.bits - 8
self.out.write(chr(datum >> overflow))
#discard the stuff we wrote
datum -= (datum >> overflow) << overflow
self.bits -= 8
def close(self):
if self.bits: # we have unflushed data
data = self.datum << (8 - self.bits)
self.out.write(chr(data))
self.out.close()
out = file('compressed.dat', 'wb')
bw = BitWriter(out)
for data, bit_count in source():
out.write(data, bit_count)
out.close()
As mentioned, it's 100% untested, I don't know what sort of
performance it has, nor the endian'ness of your data streams, so
you might have to twiddle bits in the opposite directions and
test the performance.
-tkc
More information about the Python-list
mailing list