Consistency of random number functions APIs.
Some NumPy random number generation functions take a dtype parameter whereas others don't. Some of them take an out parameter whereas others don't. Just glancing at it, there seems to be no rhyme or reason why this would be the case but is there some hidden consistency underneath the hood to explain why some have these params and others don't? Is there any reason that things like random.randn and numpy.random.Generator.normal don't take a dtype and out parameters? If I need to create a huge array of random numbers whose dtype is float16 or float32 then what is the BKM to do this when the routine I would like to use generates an array of float64 and with the 64-bit data type the array won't fit in memory?
On Sat, Oct 30, 2021 at 9:44 PM Todd Anderson <drtodd13@comcast.net> wrote:
Some NumPy random number generation functions take a dtype parameter whereas others don't. Some of them take an out parameter whereas others don't. Just glancing at it, there seems to be no rhyme or reason why this would be the case but is there some hidden consistency underneath the hood to explain why some have these params and others don't? Is there any reason that things like random.randn and numpy.random.Generator.normal don't take a dtype and out parameters?
Let's not compare the legacy and the new API; the former is what it is by now. The Generator API does indeed look a little inconsistent, though. There aren't many methods; see https://numpy.org/devdocs/reference/random/generator.html#simple-random-data. Basic methods like `integers` and `random` should be consistent, I'd think, so one having an `out=` keyword and the other not is a little odd. Cheers, Ralf
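To make the inconsistency concrete, here is a small probe, offered as a sketch rather than an authoritative survey of the API. The helper name `accepts` is made up for illustration; it only checks whether a call raises `TypeError`, and the answers it gives depend on the installed NumPy version.

```python
import numpy as np

rng = np.random.default_rng(12345)


def accepts(method, *args, **kwargs):
    """Return True if the call succeeds, False if it raises TypeError.

    Hypothetical helper for illustration only: a TypeError is taken to
    mean that the method rejected one of the keyword arguments.
    """
    try:
        method(*args, **kwargs)
    except TypeError:
        return False
    return True


float_buf = np.empty(4, dtype=np.float64)
int_buf = np.empty(4, dtype=np.int64)

# Probe a few Generator methods for `out=` and `dtype=` support.
support = {
    "random(out=)": accepts(rng.random, out=float_buf),
    "integers(out=)": accepts(rng.integers, 0, 10, size=4, out=int_buf),
    "standard_normal(dtype=)": accepts(rng.standard_normal, 4, dtype=np.float32),
    "normal(dtype=)": accepts(rng.normal, size=4, dtype=np.float32),
}

for call, ok in support.items():
    print(f"{call:26s} accepted: {ok}")
```

The results are printed rather than hard-coded here because the set of supported keywords could change in a later release.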
If I need to create a huge array of random numbers whose dtype is float16 or float32 then what is the BKM to do this when the routine I would like to use generates an array of float64 and with the 64-bit data type the array won't fit in memory?
There is definitely inconsistency. The `out` keyword was added to a core set of floating-point generators as part of a feature request to the project that became Generator, and the `dtype` argument came in the same way. `integers` already had dtype support, so it was different from the core generators that have 32-bit implementations. The floating-point generators that have `dtype` support all use bit-efficient methods internally, so they are not just `astype(np.float32)` wrappers around the double-precision implementations.

IMO a redesign of the internals would be needed to support `out` and `dtype` across the board. There is an issue about rewriting the variate construction functions as ufuncs so that one would get all of these for free. This would need a dedicated individual to undertake it, since it would not be possible (as far as I know) to do this using Cython. This would also raise the maintenance bar for many users. Another possible approach would be to use C++ and templates, now that this is (slowly) moving into the mainstream.

In the meantime, Generator is meant to be extensible using either compiled code (C/Cython) or JIT code (Numba), so that you can pretty easily produce fast, consistent generation code across a wide range of use cases.

Finally, as noted previously, all of the generators in the np.random namespace, which are methods of a RandomState object, are forever frozen in their API (per NEP, and also effectively frozen in implementation as well). The only place for changes is np.random.Generator (or any successor).

Kevin
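For the original float16/float32 question, one possible workaround is to build on `Generator.standard_normal`, which does accept `dtype=np.float32` and `out=`. This is a minimal sketch and not a recommendation from the thread: the function names `normal_float32` and `normal_float16` and the chunk size are hypothetical. It assumes that scaling a standard normal in place is an acceptable substitute for `Generator.normal`, and that float16 has to go through a float32 scratch buffer because the `dtype` argument only covers float32 and float64.

```python
import numpy as np


def normal_float32(rng, loc, scale, size):
    """Fill a float32 array with N(loc, scale**2) draws, no float64 temporary."""
    out = np.empty(size, dtype=np.float32)
    # standard_normal has a native float32 path and can write into `out`.
    rng.standard_normal(dtype=np.float32, out=out)
    # Scale and shift in place so no second full-size array is allocated.
    out *= np.float32(scale)
    out += np.float32(loc)
    return out


def normal_float16(rng, loc, scale, n, chunk=1_000_000):
    """Fill a 1-D float16 array chunk by chunk via a float32 scratch buffer."""
    result = np.empty(n, dtype=np.float16)
    scratch = np.empty(min(chunk, n), dtype=np.float32)
    for start in range(0, n, chunk):
        stop = min(start + chunk, n)
        block = scratch[: stop - start]  # contiguous view, reused each pass
        rng.standard_normal(dtype=np.float32, out=block)
        block *= np.float32(scale)
        block += np.float32(loc)
        result[start:stop] = block  # assignment casts float32 to float16
    return result


rng = np.random.default_rng(2021)
a = normal_float32(rng, loc=1.0, scale=2.0, size=(1000,))
b = normal_float16(rng, loc=0.0, scale=1.0, n=2500, chunk=1000)
print(a.dtype, a.shape, b.dtype, b.shape)
```

Peak extra memory in the float16 case is one scratch buffer of `chunk` float32 values, so the chunk size trades memory for the number of calls into the generator.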
participants (3): Kevin Sheppard, Ralf Gommers, Todd Anderson