[Numpy-discussion] Proposal for adding bit_count
Sebastian Berg
sebastian at sipsolutions.net
Mon Aug 2 13:50:14 EDT 2021
On Thu, 2021-07-29 at 21:46 +0530, Ganesh Kathiresan wrote:
> Hi All,
>
>
>
> I am working <https://github.com/numpy/numpy/pull/19355> on a new
> UFunc, `
> bit_count <https://en.wikipedia.org/wiki/Hamming_weight>` (popcount
> in
> other languages) that aims to count the number of 1-bits in
> the absolute value of an Integer.
>
Thanks for the proposal! Since `int.bit_count()` is now a Python
builtin (as a method on integers), I feel it is a good idea to add it
to NumPy.
It is requested fairly commonly as well.
Aside from comments about the inclusion in NumPy, I had two questions
where I would love input:
* Should `np.ndarray.bit_count()` exist? I tend against this;
but we should have it on (integer) scalars to mirror the
Python `int`.
* The return value is currently the same type as the input. That
means that: `np.bit_count(uint8)` returns the count as `uint8`
while `np.bit_count(int32)` returns it as `int32`, etc.
I think `bit_count` is different from typical math functions,
so I am not quite sure this is what we want? The main
alternative I see right now would be returning the default
integer (usually int64) – unless otherwise specified.
As an aside, I am not sure what is returned for booleans right now
int8, uint8, or boolean? (Returning boolean for a count seems an
oversight though).
Cheers,
Sebastian
>
> Implementation
> ----------------------------------
>
> The primary reference for the implementation is CountBitsSetParallel
> <
> http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel
> >.
> Here we take 12 operations to achieve the result which is the same as
> the
> lookup table method but does not suffer from memory issues or cache
> misses.
>
> The implementation is aimed at unsigned integers, absolute value of
> signed
> integers and objects that support the operation.
>
>
> Usage
> --------------
>
> >>> np.bit_count(1023)
>
> 10
>
> >>> a = np.array([2**i - 1 for i in range(16)])
>
> >>> np.bit_count(a)
>
> array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,
> 14, 15])
>
> >>> np.int32(1023).bit_count()
>
> 10
>
>
> Notes
> -------------
>
> 1. Python has included this method here
> <
> https://github.com/python/cpython/commit/8bd216dfede9cb2d5bedb67f20a30c99844dbfb8
> >
> (3.10+). Tracking issue <https://bugs.python.org/issue29882>
>
> 2. NumPy tracking issue
> <https://github.com/numpy/numpy/issues/16325>
>
> 3. Interesting read
> <
> https://archive.org/details/dr_dobbs_journal_vol_08/page/n185/mode/2up
> > on
> how we get the magic number. Needed a bit of digging :)
>
>
> Please let us know what you think about the implementation and where
> we can
> improve in terms of performance or interface.
>
> Regards,
> Ganesh
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: This is a digitally signed message part
URL: <https://mail.python.org/pipermail/numpy-discussion/attachments/20210802/97ddc81a/attachment.sig>
More information about the NumPy-Discussion
mailing list