On Tue, May 12, 2015 at 1:17 AM, Stefan Otte <stefan.otte@gmail.com> wrote:

indeed I was looking for the cartesian product.

I timed the two stackoverflow answers and the winner is not quite as clear:

n_elements:    10  cartesian  0.00427 cartesian2  0.00172
n_elements:   100  cartesian  0.02758 cartesian2  0.01044
n_elements:  1000  cartesian  0.97628 cartesian2  1.12145
n_elements:  5000  cartesian 17.14133 cartesian2 31.12241

(This is for two arrays as parameters: np.linspace(0, 1, n_elements))
cartesian2 seems to be slower for bigger.

On my system, the following variation on Pauli's answer is 2-4x faster than his for your test cases:

def cartesian4(arrays, out=None):
    arrays = [np.asarray(x).ravel() for x in arrays]
    dtype = np.result_type(*arrays)

    n = np.prod([arr.size for arr in arrays])
    if out is None:
        out = np.empty((len(arrays), n), dtype=dtype)
        out = out.T

    for j, arr in enumerate(arrays):
        n /= arr.size
        out.shape = (len(arrays), -1, arr.size, n)
        out[j] = arr[np.newaxis, :, np.newaxis]
    out.shape = (len(arrays), -1)

    return out.T

I'd really appreciate if this was be part of numpy. Should I create a pull request?

There hasn't been any opposition, quite the contrary, so yes, I would go ahead an create that PR. I somehow feel this belongs with the set operations, rather than with the indexing ones. Other thoughts?

Also for consideration: should it work on flattened arrays? or should we give it an axis argument, and then "broadcast on the rest", a la generalized ufunc?


( O.o)
( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes de dominación mundial.