On Fri, Nov 9, 2012 at 10:20 AM, Nathaniel Smith <njs@pobox.com> wrote:
On Fri, Nov 9, 2012 at 2:45 PM, Frédéric Bastien <nouiz@nouiz.org> wrote:
>
>
>
> On Fri, Nov 9, 2012 at 3:32 AM, Nathaniel Smith <njs@pobox.com> wrote:
>>
>> On Fri, Nov 9, 2012 at 6:18 AM, Nicolas SCHEFFER
>> <scheffer.nicolas@gmail.com> wrote:
>> > Fred,
>> >
>> > Thanks for the advice.
>> > The code will only affect the part in _dotblas.c where gemm is called.
>> > There's tons of check before that make sure both matrices are of ndim 2.
>> > We should check though if we can do these tricks in other parts of the
>> > function.
>> >
>> > Otherwise:
>> > - I've built against ATLAS 3.10
>> > - I'm happy to add a couple more test for C and F-contiguous. I'm not
>> > sure how to get the third type (strided), would you have an example?
>>
>> def with_memory_order(a, order):
>>     assert order in ("C", "F", "discontig")
>>     assert a.ndim == 2
>>     if order in ("C", "F"):
>>         return np.asarray(a, order=order)
>>     else:
>>         buf = np.empty((a.shape[0] * 2, a.shape[1] * 2), dtype=a.dtype)
>>         buf[::2, ::2] = a
>>         # This returns a view onto every other element of 'buf':
>>         result = buf[::2, ::2]
>>         assert not result.flags.c_contiguous and not
>> result.flags.f_contiguous
>>         return result
>>
>> > The following test for instance checks integrity against
>> > multiarray.dot, which I believe is default when not compiled with
>> > BLAS.
>> > Dot is a hard function to test imho, so if anybody has ideas on what
>> > kind of test they'd like to see, please let me know.
>> >
>> > If that's ok I might now be able to:
>> > - Check for more bugs, I need to dig a bit more in the gemm call, make
>> > sure everything is ok.
>> > - Create an issue on github and link to this discussion
>> > - Make a commit in a seperate branch
>> > - Move forward like that.
>> >
>> > ==
>> > import numpy as np
>> > from time import time
>> > from numpy.testing import assert_almost_equal
>> >
>> > def test_dot_regression():
>> >     """ Test numpy dot by comparing with multiarray dot
>> >     """
>> >     np.random.seed(7)
>> >     a = np.random.randn(3, 3)
>> >     b = np.random.randn(3, 2)
>> >     c = np.random.randn(2, 3)
>> >
>> >     _dot = np.core.multiarray.dot
>> >
>> >     assert_almost_equal(np.dot(a, a), _dot(a, a))
>> >     assert_almost_equal(np.dot(b, c), _dot(b, c))
>> >     assert_almost_equal(np.dot(b.T, c.T), _dot(b.T, c.T))
>> >
>> >     assert_almost_equal(np.dot(a.T, a), _dot(a.T, a))
>> >     assert_almost_equal(np.dot(a, a.T), _dot(a, a.T))
>> >     assert_almost_equal(np.dot(a.T, a.T), _dot(a.T, a.T))
>>
>> You should check that the result is C-contiguous in all cases too.
>>
>> for a_order in ("C", "F", "discontig"):
>>     for b_order in ("C", "F", "discontig"):
>>         this_a = with_memory_order(a, a_order)
>>         this_b = with_memory_order(b, b_order)
>>         result = np.dot(this_a, this_b)
>>         assert_almost_equal(result, expected)
>>         assert result.flags.c_contiguous
>>
>> You could also wrap the above in yet another loop to try a few
>> different combinations of a and b matrices (perhaps after sticking the
>> code into a utility function, like run_dot_tests(a, b, expected), so
>> the indentation doesn't get out of hand ;-)). Then you can easily test
>> some of the edge cases, like Nx1 matrices.
>
>
> I agree that tests are needed the for Nx1 and variant cases. I saw blas
> error being raised with some blas version.
>
> You also need to test with the output provided, so there is 3 loops:
>
> for a_order in ("C", "F", "discontig", "neg"):
>     for b_order in ("C", "F", "discontig", "neg"):
>         for c_order in ("C", "F", "discontig", "neg"):
>
> I also added the stride type "neg", I'm not sure if it is needed, but that
> is other corner cases. neg =>  result = buf[::-1, ::-1]
>
> I just looked again at our code and there is another constrain: that the
> strides are multiple of the elemsize. Theano do not support not aligned
> array, but numpy does, so there is a need for test for this. You can make an
> unaligned array like this:
>
> dtype = "b1,f4"
> a = numpy.empty(1e4, dtype=dtype)['f1']

I think you're mixing up two issues here... requiring that the strides
and the itemsize match is part of the requirement for C- and
F-contiguity, the example code you give here produces a non-contiguous
array, so it's already checked for by the function we're talking
about.

However, there can be contiguous arrays that are not aligned, like:

In [25]: a = np.empty(100, dtype="i1")[1:-3].view(np.float32)

In [26]: a.flags.c_contiguous
Out[26]: True

In [27]: a.flags.aligned
Out[27]: False

I suspect the np.dot code that Nicolas is looking at already checks
for the ALIGNED flag and makes a copy if necessary, but it would be
good to have an array like this in the tests to be sure.

You are right, my test wasn't good and your example is good.

Nicolas, I hope I don't de-motivate you with all those details, this need to be done. When dealing with blas lib, the devil is in the detail... and the fact that not all lib accept the same inputs... We hit a few of them. I just try to tell you our experience to make the implementation easier and shorter to stabilize.

Fred