[Numpy-discussion] aligned / unaligned structured dtype behavior

Frédéric Bastien nouiz at nouiz.org
Fri Mar 8 10:16:43 EST 2013

On Fri, Mar 8, 2013 at 5:22 AM, Francesc Alted <francesc at continuum.io> wrote:
> On 3/7/13 7:26 PM, Frédéric Bastien wrote:
>> I'm surprised that Theano worked with the unaligned input. I added
>> some checks to make this raise an error, as we do not support that!
>> Francesc, can you check whether Theano gives the correct result? It is
>> possible that someone (maybe me) just copies the input to an aligned
>> ndarray when we receive an unaligned one. That could explain why it
>> worked, but my memory tells me that we raise an error.
> It seems to work for me:
> In [10]: f = theano.function([a], a**2)
> In [11]: f(baligned)
> Out[11]: array([ 1.,  1.,  1., ...,  1.,  1.,  1.])
> In [12]: f(bpacked)
> Out[12]: array([ 1.,  1.,  1., ...,  1.,  1.,  1.])
> In [13]: f2 = theano.function([a], a.sum())
> In [14]: f2(baligned)
> Out[14]: array(1000000.0)
> In [15]: f2(bpacked)
> Out[15]: array(1000000.0)

I understand what happens. You declare the symbolic variable like this:

a = theano.tensor.vector()

This creates a symbolic variable with dtype floatX, which is float64 by
default, while baligned and bpacked have dtype int64.
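For reference, arrays like baligned and bpacked from earlier in the
thread can be reconstructed along these lines. This is a sketch, not the
exact code used upthread: the field names are made up, and I am assuming
the classic int64+int8 structured dtype that gives a 9-byte packed
itemsize versus a 16-byte aligned one.

```python
import numpy as np

n = 1000000

# Packed: no padding, itemsize 8 + 1 = 9, so the int64 field view has
# stride 9 and most of its elements sit at unaligned addresses.
packed_dt = np.dtype([('value', np.int64), ('flag', np.int8)])

# Aligned: align=True pads the itemsize to 16, keeping 'value' aligned.
aligned_dt = np.dtype([('value', np.int64), ('flag', np.int8)], align=True)

rec_packed = np.zeros(n, dtype=packed_dt)
rec_packed['value'] = 1
bpacked = rec_packed['value']      # strides (9,), flags.aligned False

rec_aligned = np.zeros(n, dtype=aligned_dt)
rec_aligned['value'] = 1
baligned = rec_aligned['value']    # strides (16,), flags.aligned True

print(bpacked.strides, bpacked.flags.aligned)
print(baligned.strides, baligned.flags.aligned)
```

The (9,) strides in the TypeError below match this packed layout.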

When a Theano function receives an ndarray of the wrong dtype as input,
we try to cast it to the expected dtype and check that we don't lose
precision. As the inputs are only 1s, there is no loss of precision, so
the input is silently accepted and copied. So when we later check the
aligned flag, it passes, because the check runs on the (aligned) copy.
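The cast-and-check step can be sketched in plain NumPy. This is a
simplified model of the behavior described above, not Theano's actual
input-filtering code; the function name is made up:

```python
import numpy as np

def filter_input(data, expected_dtype):
    """Cast to the expected dtype, rejecting inputs that lose precision.

    Precision loss is detected by round-tripping back to the original
    dtype and comparing with the original values.
    """
    data = np.asarray(data)
    converted = data.astype(expected_dtype)
    if not np.array_equal(converted.astype(data.dtype), data):
        raise TypeError("cast to %s would lose precision"
                        % np.dtype(expected_dtype))
    return converted

ones_i64 = np.ones(10, dtype=np.int64)
out = filter_input(ones_i64, np.float64)  # all 1s: lossless, accepted
print(out.dtype)  # float64

# 2**60 + 1 has no exact float64 representation, so this raises:
# filter_input(np.array([2**60 + 1], dtype=np.int64), np.float64)
```

Note that the accepted input comes back as a fresh float64 array, i.e. a
copy, which is exactly why the later alignment check passes.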

If you change the symbolic variable to have an int64 dtype, there won't
be a copy and we will see the error:

a = theano.tensor.lvector()
f = theano.function([a], a ** 2)

TypeError: ('Bad input argument to theano function at index 0(0-based)',
  'The numpy.ndarray object is not aligned. Theano C code does not
  support that.', '', 'object shape', (1000000,), 'object strides', (9,))

If I now time this new function, I get:

In [14]: timeit baligned**2
100 loops, best of 3: 7.5 ms per loop

In [15]: timeit bpacked**2
100 loops, best of 3: 8.25 ms per loop

In [16]: timeit f(baligned)
100 loops, best of 3: 7.36 ms per loop

So in this case the Theano overhead was the copy. It is not the first
time I have seen this. We added the automatic cast to allow passing most
Python ints, lists, and reals as inputs.
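The copy-on-dtype-mismatch behavior is easy to observe directly in plain
NumPy (this just illustrates the copy-vs-no-copy distinction; it is not
Theano code):

```python
import numpy as np

x = np.ones(5, dtype=np.int64)

# Matching dtype: asarray returns the very same object, no copy.
same = np.asarray(x, dtype=np.int64)
print(same is x)  # True

# Mismatched dtype: a new, freshly allocated (and thus aligned) array.
cast = np.asarray(x, dtype=np.float64)
print(np.shares_memory(cast, x))  # False
print(cast.flags.aligned)         # True
```

A freshly allocated array is always aligned, so any input that triggers
this conversion will sail through an alignment check afterwards.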


More information about the NumPy-Discussion mailing list