[Numpy-discussion] A little about XND
skrah at bytereef.org
Mon Jun 18 15:09:50 EDT 2018
On Mon, Jun 18, 2018 at 12:34:03PM -0400, Marten van Kerkwijk wrote:
> That looks quite nice and expressive. In the context of a discussion we
> have been having about describing `matmul/@` and possibly broadcastable
> dimensions, I think from your description it sounds like one would describe
> `@` with multiple functions (the multiple dispatch we have been (are?)
> considering as well):
> "... * N * M * T, ... * M * P * T -> ... * N * P * T"
> "M * T, ... * M * P * T -> ... * P * T"
> "... * N * M * T, M * T -> ... * N * T"
> "M * T, M * T -> T"
Yes, that's the way, and the outer dimensions (the part matched by the
ellipsis) are always broadcast like in NumPy.
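As a rough illustration of how the four signatures divide the work, here is
a minimal pure-Python sketch that works on shape tuples only. The helper
names broadcast_outer and matmul_shape are hypothetical and are not part of
ndtypes, gumath or NumPy; the sketch only assumes that the part matched by
the ellipsis follows the usual NumPy broadcasting rule.

```python
from itertools import zip_longest


def broadcast_outer(x, y):
    """Broadcast two tuples of outer dimensions with the NumPy rule."""
    out = []
    # Align from the right; a missing dimension behaves like size 1.
    for m, n in zip_longest(reversed(x), reversed(y), fillvalue=1):
        if m != n and 1 not in (m, n):
            raise ValueError(f"cannot broadcast {x} with {y}")
        out.append(n if m == 1 else m)
    return tuple(reversed(out))


def matmul_shape(a, b):
    """Select one of the four kernels by ndim and return the result shape."""
    if len(a) >= 2 and len(b) >= 2:
        # "... * N * M * T, ... * M * P * T -> ... * N * P * T"
        (*oa, n, m), (*ob, m2, p) = a, b
        core = (n, p)
    elif len(a) == 1 and len(b) >= 2:
        # "M * T, ... * M * P * T -> ... * P * T"
        (m,), (*ob, m2, p) = a, b
        oa, core = [], (p,)
    elif len(a) >= 2 and len(b) == 1:
        # "... * N * M * T, M * T -> ... * N * T"
        (*oa, n, m), (m2,) = a, b
        ob, core = [], (n,)
    elif len(a) == 1 and len(b) == 1:
        # "M * T, M * T -> T"
        (m,), (m2,) = a, b
        oa, ob, core = [], [], ()
    else:
        raise TypeError("no kernel accepts a 0-dimensional argument")
    if m != m2:
        raise ValueError(f"M is bound to both {m} and {m2}")
    return broadcast_outer(tuple(oa), tuple(ob)) + core


print(matmul_shape((5, 2, 3), (5, 3, 4)))     # matrix @ matrix
print(matmul_shape((3,), (5, 3, 4)))          # vector @ matrix
print(matmul_shape((5, 2, 3), (3,)))          # matrix @ vector
print(matmul_shape((3,), (3,)))               # vector @ vector
print(matmul_shape((1, 2, 3), (7, 5, 3, 4)))  # outer dims are broadcast
```

Only the outer dimensions are broadcast here; the symbolic dimension M has
to agree exactly in both arguments.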
> Is there a way to describe broadcasting? The sample case we've come up
> with is a function that calculates a weighted mean. This might take
> (values, sigmas) and return (mean, sigma_mean), which would imply a
> signature like:
> "... * N * T, ... * N * T -> ... * T, ... * T"
> But would your signature allow indicating that one could pass in a single
> sigma? I.e., broadcast the second 1 to N if needed?
Actually I came across this today when implementing optimized matching
for binary functions.
I wanted the faster kernel
"... * N * int64, ... * N * int64 -> ... * N * int64"
to also match e.g. the input
"int64, 10 * int64".
The generic datashape spec would forbid this, but perhaps the '?' that
you propose in nep-0020 would offer a way out of this for ndtypes.
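To make the problem concrete, here is a small sketch of a strict matcher for
the two-argument kernel above, again on plain shape tuples. It is only one
possible reading of the generic rule (every argument must supply N itself),
and match_strict is a hypothetical helper, not the ndtypes implementation.

```python
def match_strict(shapes):
    """Strictly match "... * N * T, ... * N * T -> ... * N * T".

    Every argument must carry the inner dimension N, and N must bind to
    the same value everywhere.  Returns the bound N, or None if the
    signature does not match.
    """
    if any(len(s) < 1 for s in shapes):
        return None  # a 0-dimensional argument has no N to bind
    inner = {s[-1] for s in shapes}
    return inner.pop() if len(inner) == 1 else None


print(match_strict([(10,), (10,)]))    # "10 * int64, 10 * int64"
print(match_strict([(3, 10), (10,)]))  # outer dimension goes to the ellipsis
print(match_strict([(), (10,)]))       # "int64, 10 * int64"
```

Under this reading the last call is rejected, which is exactly the input the
faster kernel is supposed to accept.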
It's a bit confusing for datashape, since there is already a question mark
for missing variable dimensions (that have shape==0 in the data).
>>> ndt("var * ?var * int64")
ndt("var * ?var * int64")
This would be the type for e.g. [, None, [1,2,3]].
But for symbolic dimensions (which only match fixed dimensions) perhaps this
"... * ?N * int64, ... * ?N * int64 -> ... * ?N * int64"
or, as in the NEP,
"... * N? * int64, ... * N? * int64 -> ... * N? * int64"
should mean "At least one input has ndim >= 1, broadcast as necessary".
This still means that for the "all ndim==0" case one would need an
additional kernel "int64, int64 -> int64".
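A sketch of how kernel selection could look under that reading follows.
The names match_optional and select_kernel are hypothetical, and treating a
size-1 N as broadcastable (in addition to a missing N) is an assumption
taken from the question above, not something ndtypes is known to do.

```python
OPTIONAL_SIG = "... * N? * int64, ... * N? * int64 -> ... * N? * int64"
SCALAR_SIG = "int64, int64 -> int64"


def match_optional(shapes):
    """Match the "N?" signature: at least one input has ndim >= 1.

    Returns the broadcast value of N, or None if there is no match.
    """
    present = [s[-1] for s in shapes if len(s) >= 1]
    if not present:
        return None  # all ndim == 0: left to the scalar kernel
    # Assumption: a size-1 N is broadcast just like a missing N.
    sizes = {n for n in present if n != 1}
    if len(sizes) > 1:
        return None  # two different sizes cannot be broadcast together
    return sizes.pop() if sizes else 1


def select_kernel(shapes):
    """Pick the scalar kernel for the all-ndim==0 case, else the N? kernel."""
    if all(len(s) == 0 for s in shapes):
        return SCALAR_SIG
    if match_optional(shapes) is not None:
        return OPTIONAL_SIG
    raise TypeError(f"no kernel matches shapes {shapes}")


print(select_kernel([(), (10,)]))  # "int64, 10 * int64"
print(select_kernel([(), ()]))     # "int64, int64"
```

With this, the input "int64, 10 * int64" reaches the N? kernel, while the
all-scalar input still needs the separate kernel.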
> I realize that this is no longer about describing precisely what the
> function doing the calculation expects, but rather what an upper level is
> allowed to do before calling the function (i.e., take a dimension of 1 and
> broadcast it).
Yes, for datashape the problem is that it also allows non-broadcastable
signatures like "N * float64", really the same as "double x" in C.
But the '?' with occasionally one additional kernel for ndim==0 could