# [Numpy-discussion] Funny business with 'is' operator?

Robert Kern robert.kern at gmail.com
Mon Sep 9 08:10:42 EDT 2013

```On Fri, Sep 6, 2013 at 6:50 PM, James Bergstra <bergstrj at iro.umontreal.ca>
wrote:
>
> Thanks, this is exactly what I was looking for. I'll look into what this
Diophantine equation is.

Let's say we have two arrays with shape tuples `shape0` and `shape1`,
stride tuples `stride0` and `stride1` and memory offsets `offset0` and
`offset1`. Without loss of generality, let's assume that itemsize=1 since
it is trivial to convert the general case to itemsize=1. Now, if you will
permit Einstein summation notation, you can generate the memory address of
every item in the first array like so:

index0[i]*stride0[i] + offset0
0 <= i < len(shape0)
0 <= index0[i] < shape0[i]

There exists an overlap between the two arrays iff there exists two tuples
`index0` and `index1` such that

index0[i]*stride0[i] + offset0 = index1[j]*stride0[j] + offset1
0 <= i < len(shape0)
0 <= index0[i] < shape0[i]
0 <= j < len(shape1)
0 <= index1[j] < shape1[j]

This is a bounded linear Diophantine equation. Diophantine because the
variables `index0[i]` and `index1[j]` are integer-valued, linear because
the variables only appear multiplied by a constant, and bounded because
each variable must stay within the size of its corresponding axis.

Unbounded linear Diophantine equations can be solved using an extended
version of the Euclidean GCD algorithm. Bounded linear Diophantine
equations are NP-<mumble> to solve. With the right heuristics, a
branch-and-bound approach could probably work well.

> Also, relatedly, a few months ago Julian Taylor at least wrote what was
there in C, which made it faster, if not better.

The main benefit of that was to make it available at the C level for other
C-implemented routines, IIRC, not speed.

--
Robert Kern
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20130909/370aca46/attachment.html>
```