The easiest way to do this would be to write a pure Python implementation of a masked integer sampler using Python ints. You could draw unsigned integers and treat them as a bit pool. You would then take the number of bits needed for your integer, convert them to a Python int, and finally apply the mask.
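
A minimal sketch of that idea, assuming a NumPy `Generator` as the bit source. The helper names `masked_randbelow` and `integers_object` are made up for illustration and are not part of NumPy; `Generator.bytes` stands in for the pool of unsigned integers, and `size` is assumed to be a plain int:

```python
import numpy as np


def masked_randbelow(rng, high):
    """Return a Python int drawn uniformly from [0, high) (hypothetical helper)."""
    if high <= 0:
        raise ValueError("high must be positive")
    max_val = high - 1
    nbits = max_val.bit_length()        # bits needed to represent high - 1
    if nbits == 0:                      # high == 1: only 0 is possible
        return 0
    mask = (1 << nbits) - 1
    nbytes = (nbits + 7) // 8           # whole bytes to pull from the bit pool
    while True:
        # Draw raw bytes, turn them into a Python int, keep the low nbits.
        value = int.from_bytes(rng.bytes(nbytes), "little") & mask
        if value <= max_val:            # otherwise reject and draw again
            return value


def integers_object(rng, high, size):
    """Fill a 1-D object array with Python ints from [0, high) (hypothetical helper)."""
    out = np.empty(size, dtype=object)
    for i in range(size):
        out[i] = masked_randbelow(rng, high)
    return out


rng = np.random.default_rng(12345)
sample = integers_object(rng, 20**20, 5)
```

Because the mask leaves at most one bit of slack above `high - 1`, each draw is accepted with probability greater than one half, so the loop should terminate quickly.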

This is how integers are generated in the legacy `RandomState` code.

Kevin


On Sat, Aug 19, 2023, 15:43 Dan Schult <dschult@colgate.edu> wrote:
How can we use numpy's random `integers` function to get uniformly selected integers from an arbitrarily large `high` limit? This is important when dealing with exact probabilities in combinatorially large solution spaces.

I propose that we add the capability for `integers` to construct arrays of dtype `object_` by having it construct Python `int`s as the objects in the returned array. This would allow arbitrarily large integers.

The Python random library's `randrange` constructs values for arbitrary upper limits -- and they are exact when using subclasses of `random.Random` with a `getrandbits` method (which includes the default rng for most operating systems).

Numpy's random `integers` function rightfully raises on `integers(20**20, dtype=int64)` because the upper limit is above what can be held in an `int64`. But Python `int` objects store arbitrarily large integers, so I would expect `integers(20**20, dtype=object)` to create random integers on the desired range. Instead, a TypeError is raised: `Unsupported dtype dtype('O') for integers`. It seems we could provide support for dtype('O') by constructing Python `int` values, and this would allow arbitrarily large ranges of integers.

The core of this functionality would be close to the seven lines used in [the code of random.Random._randbelow](https://github.com/python/cpython/blob/eb953d6e4484339067837020f77eecac61f8d4f8/Lib/random.py#L242) which
1) finds the number of bits needed to describe the `high` argument.
2) generates that number of random bits.
3) converts them to a Python int and checks whether it is greater than or equal to the input `high`. If so, repeat from step 2.
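
As a rough sketch, those three steps could look like the following. The function name `randbelow` is hypothetical, `high` is treated as an exclusive upper bound, and the standard library's `random.Random.getrandbits` is used only as a stand-in bit source; a NumPy version would need an equivalent way to get bits from the `Generator`:

```python
import random


def randbelow(getrandbits, high):
    """Return a Python int uniformly from [0, high) using rejection (illustrative)."""
    if high <= 0:
        raise ValueError("high must be positive")
    k = high.bit_length()       # step 1: number of bits needed to describe high
    r = getrandbits(k)          # step 2: k random bits, so 0 <= r < 2**k
    while r >= high:            # step 3: out of range, so draw again
        r = getrandbits(k)
    return r


stdlib_rng = random.Random(12345)
values = [randbelow(stdlib_rng.getrandbits, 20**20) for _ in range(5)]
```

Since `2**k` is less than or equal to twice `high`, each draw lands in range with probability of at least one half.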

I realize that people can just use `random.randrange` to obtain this functionality, but that doesn't return an array, and it uses a different RNG, possibly requiring tracking two RNG states.

This text was also used to create [Issue #24458](https://github.com/numpy/numpy/issues/24458)
_______________________________________________
NumPy-Discussion mailing list -- numpy-discussion@python.org