
Hello,

I wonder what the alternatives are to efficiently represent 128-bit integers. I found a GitHub issue proposing support for an int128 data type - https://github.com/numpy/numpy/issues/9992 - but it is closed now, and I was wondering what the options are in the meantime.

The context of my question: ultimately, I am looking for an efficient way to work with UUID values in Pandas (sorry, it comes down to that). Since I am not doing computations on this data, only using it as identifiers, I am mostly interested in efficient memory usage (unless there are other related reasons I could take a performance hit?).

With Python's regular integer type, a 128-bit integer uses 44 bytes (according to `sys.getsizeof`). Interestingly, a 16-byte bytes object in Python uses 49 bytes, so I'm still better off with an integer. I understand that even with a theoretical np.int128 scalar I'd get 40 bytes (24 + 16), but hey, it is less than 44 anyway. I could artificially split 128-bit integers into two int64 numbers (the high 64 bits and the low 64 bits) and treat the pair as a single value for all meaningful usages, but this is quite inconvenient and would actually use more space in the end.

So I wonder, are there other options? From the GitHub issue I understood that NumPy internally uses 128-bit integers for some purpose; where can I find out more about that?

Thank you!
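(For reference, the sizes quoted above can be checked directly. The exact numbers depend on the CPython version and platform; the values in the comments are what a 64-bit CPython 3.11 reports.)

```python
import sys

v = (1 << 127) | 1  # a 128-bit Python integer

# CPython ints: 24-byte object header plus 4 bytes per 30-bit digit,
# so a 128-bit value needs 5 digits -> 24 + 20 = 44 bytes.
print(sys.getsizeof(v))

# A 16-byte bytes object carries 33 bytes of object overhead -> 49 bytes.
print(sys.getsizeof(bytes(16)))

# Either way, well above the 16 bytes of actual payload.
```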

You could use a composite data type of two int64s: https://numpy.org/doc/stable/user/basics.rec.html. It would not work with arithmetic, but you said you don't care about that. You'd just need a helper function to convert the data to a UUID string.

Aaron Meurer
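A minimal sketch of that structured-dtype approach (the field names "hi"/"lo" and the helper functions are illustrative, not from the thread):

```python
import uuid
import numpy as np

# Structured dtype holding the high and low 64 bits of a 128-bit value.
# itemsize is exactly 16 bytes per element, with no per-element Python overhead.
uuid_dtype = np.dtype([("hi", np.uint64), ("lo", np.uint64)])

def uuid_to_record(u):
    """Split a uuid.UUID into a (high 64 bits, low 64 bits) tuple."""
    return (u.int >> 64, u.int & 0xFFFFFFFFFFFFFFFF)

def record_to_uuid(rec):
    """Reassemble a (hi, lo) record into a uuid.UUID."""
    return uuid.UUID(int=(int(rec["hi"]) << 64) | int(rec["lo"]))

ids = [uuid.uuid4() for _ in range(3)]
arr = np.array([uuid_to_record(u) for u in ids], dtype=uuid_dtype)

assert arr.itemsize == 16            # 16 bytes per UUID
assert record_to_uuid(arr[0]) == ids[0]  # round-trips losslessly
```

Equality comparisons and sorting still work on the structured array (fields compare in order, so hi-then-lo gives the same ordering as the full 128-bit integer), which is usually all you need for identifier columns.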
_______________________________________________
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-leave@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
participants (2)
- Aaron Meurer
- Tim Candid