Add count (and dtype) to packbits
In my application I need to pack bits of a specified group size into integral values. Currently np.packbits only packs into full bytes. For example, I might have a string of bits encoded as a np.uint8 vector, with each uint8 item specifying a single bit (1/0). I want to encode them 4 bits at a time into a np.uint32 vector.

Python code to implement this:

---------------
import numpy as np

def pack_bits(inp, bits_per_word, dir=1, dtype=np.int32):
    # The output word must be wide enough to hold one group of bits.
    assert bits_per_word <= np.dtype(dtype).itemsize * 8
    # The input must split evenly into groups.
    assert len(inp) % bits_per_word == 0
    out = np.empty(len(inp) // bits_per_word, dtype=dtype)
    i = 0
    o = 0
    while i < len(inp):
        ret = 0
        for b in range(bits_per_word):
            # int() keeps the shift in Python integer arithmetic, so a uint8
            # input bit is not truncated when shifted by 8 or more.
            if dir > 0:
                # First bit of the group is the least significant.
                ret |= int(inp[i]) << b
            else:
                # First bit of the group is the most significant.
                ret |= int(inp[i]) << (bits_per_word - b - 1)
            i += 1
        out[o] = ret
        o += 1
    return out
---------------

It looks like unpackbits has a "count" parameter but packbits does not. It would also be good to be able to specify an output dtype.
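To make the asymmetry mentioned above concrete, here is a small sketch using only the documented `count` parameter of `np.unpackbits`; the input value is an arbitrary illustration. It shows that `unpackbits` can stop after a given number of bits, while `packbits` always zero-pads to a full byte.

```python
import numpy as np

# One byte, 0b10110000; only the leading 4 bits are of interest.
packed = np.array([0b10110000], dtype=np.uint8)

# unpackbits can trim its output to `count` bits (default bit order is 'big').
bits = np.unpackbits(packed, count=4)
print(bits)      # [1 0 1 1]

# packbits has no such parameter: 4 input bits are zero-padded to a full byte.
repacked = np.packbits(bits)
print(repacked)  # [176]
```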
On Wed, Jul 21, 2021 at 2:40 PM Neal Becker <ndbecker2@gmail.com> wrote:
In my application I need to pack bits of a specified group size into integral values. Currently np.packbits only packs into full bytes. For example, I might have a string of bits encoded as a np.uint8 vector with each uint8 item specifying a single bit 1/0. I want to encode them 4 bits at a time into a np.uint32 vector.
Can't you just `packbits` into a uint8 array and then convert that to uint32? If I change `dtype` in your code from `np.int32` to `np.uint32` (as you mentioned in your email) I can do this:

    rng = np.random.default_rng()
    arr = (rng.uniform(size=32) < 0.5).astype(np.uint8)
    group_size = 4
    original = pack_bits(arr, group_size, dtype=np.uint32)
    new = np.packbits(arr.reshape(-1, group_size), axis=-1, bitorder='little').ravel().astype(np.uint32)
    print(np.array_equal(new, original))  # True

There could be edge cases where the result dtype is too small, but I haven't thought about that part of the problem. I assume this would work as long as `group_size <= 8`.

András
It looks like unpackbits has a "count" parameter but packbits does not. Also would be good to be able to specify an output dtype.

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion
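A deterministic version of the reshape-and-`packbits` idea may make the bit ordering easier to see. This is only a sketch for `group_size <= 8`; the input bits are made up for illustration. One detail worth noting: `np.packbits` zero-pads each short row at the end, so with `bitorder='big'` the padding lands in the low bits and the result needs a right shift to match the `dir <= 0` branch of the loop version.

```python
import numpy as np

# Two groups of 4 bits: [1, 0, 0, 0] and [0, 1, 0, 0].
bits = np.array([1, 0, 0, 0, 0, 1, 0, 0], dtype=np.uint8)
group_size = 4  # this approach assumes group_size <= 8
rows = bits.reshape(-1, group_size)

# First bit of each group is the least significant (like dir > 0).
little = np.packbits(rows, axis=-1, bitorder='little').ravel().astype(np.uint32)

# First bit of each group is the most significant (like dir <= 0).
# The zero padding sits in the low bits here, so shift it back out.
big = np.packbits(rows, axis=-1, bitorder='big').ravel().astype(np.uint32)
big >>= 8 - group_size

print(little)  # [1 2]
print(big)     # [8 4]
```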
Well that's just the point, I wanted to consider group size > 8.

On Wed, Jul 21, 2021 at 8:53 AM Andras Deak <deak.andris@gmail.com> wrote:
There could be edge cases where the result dtype is too small, but I haven't thought about that part of the problem. I assume this would work as long as `group_size <= 8`.
András
-- Those who don't understand recursion are doomed to repeat it
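For group sizes above 8, where the `np.packbits` route no longer applies, one possible vectorized alternative to the Python loop is to shift each bit into position and OR-reduce every group. This is only a sketch, not an existing NumPy API: `pack_bits_vectorized` and its `direction` parameter are hypothetical names (`direction` plays the role of `dir` in the loop version), and it assumes a 1-D input containing only 0/1 values.

```python
import numpy as np

def pack_bits_vectorized(inp, bits_per_word, direction=1, dtype=np.uint32):
    """Hypothetical helper: pack groups of 0/1 values into words of `dtype`."""
    dtype = np.dtype(dtype)
    inp = np.asarray(inp)
    if bits_per_word > dtype.itemsize * 8:
        raise ValueError("bits_per_word does not fit in the output dtype")
    if inp.size % bits_per_word != 0:
        raise ValueError("input length must be a multiple of bits_per_word")

    # One row per output word, already widened to the output dtype.
    groups = inp.reshape(-1, bits_per_word).astype(dtype)

    # Shift amount for each position within a group.
    shifts = np.arange(bits_per_word, dtype=dtype)
    if direction <= 0:
        shifts = shifts[::-1]  # first bit becomes the most significant

    # Move every bit into place, then OR the bits of each group together.
    return np.bitwise_or.reduce(groups << shifts, axis=1)

# A single 12-bit group with bits set at positions 0, 2 and 11.
bits = np.array([1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1], dtype=np.uint8)
print(pack_bits_vectorized(bits, 12))                # [2053]
print(pack_bits_vectorized(bits, 12, direction=-1))  # [2561]
```

Widening to the output dtype before shifting is the point of the design: the shifts then happen in, say, uint32 rather than uint8, so bits moved past position 7 are not lost.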
participants (2)
- Andras Deak
- Neal Becker