[Numpy-discussion] ENH: Add complex random normal generator (PR #9561)

Fri Sep 8 06:56:45 EDT 2017

I would like to add a complex random normal generator to
mtrand/RandomState.

A scalar complex normal is a (double) bivariate normal.  The main
motivation is to simplify the construction of complex normals where are
generally parameterized in terms of three values: location, covariance and
relation.  location is the same as in a standard normal.  The covariance
and the relation jointly determine the variance of the real and imaginary
parts as well as the covariance between the two.

#1 The initial implementation in the PR has followed the standard template
for scalar RV generators with three paths, scalar->scalar, scalar->array
and array->array.  It is bulky since the existing array fillers that handle
the scalar->array and array->array for double rvs cannot be used.  It
supports broadcasting and has a similar API to other scalar RV generators
(e.g. normal).

#2 The PR discussion has moved towards exploiting the relationship with the
multivariate normal. Currently the MV normal doesn't broadcast, and so
following this path would only allow the scalar->scalar and the
scalar->array paths.  This could theoretically be extended to allow
broadcasting if multivariate_normal was extended to allow broadcasting.

#3 If broadcasting is off the table, then it might make more sense to skip
a scalar complex normal and just move directly to a
multivariate_complex_normal since this is also just a higher dimension
(double) multivariate normal. This function could just wrap
multivariate_normal and would be relatively straightforward forward.  The
only down side of this path is that it would not easily support  a
scalar->scalar path, although this could be added.

This probably isn't much of a performance hit for following #2 or #3.  I
checked how normal and multivariate normal perform for large draws:

%timeit np.random.normal(2.0,4.0,size=1000000)
30.8 ms ± 125 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit np.random.multivariate_normal([2.0],[[4.0]],size=1000000)
32.2 ms ± 308 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

For smaller draws the performance difference is larger:

%timeit np.random.normal(2.0,4.0,size=10)
2.95 µs ± 16.5 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

%timeit np.random.multivariate_normal([2.0],[[4.0]],size=10)
49.4 µs ± 249 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

And for scalars the scalar path is about 30x faster than the multivariate
path.  It is also worth noting that multivariate_normal will only return a
vector even if the inputs only generate a single scalar.

%timeit np.random.normal(2.0,4.0)
1.42 µs ± 3.05 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

%timeit np.random.multivariate_normal([2.0],[[4.0]])
47.9 µs ± 167 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

It would be helpful do determine which path,

#1 Clone standard scalar generator with 3 paths including broadcasting
#2 Scalar generator using multivariate normal, excluding broadcasting
#3 Multivariate generator using multivariate normal, excluding broadcasting

is the preferred one.

Kevin
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20170908/0dc93e18/attachment.html>