Re: [SciPy-dev] New clapack and flapack

Jan. 19, 2002

      On Sat, 19 Jan 2002, eric wrote:
...
Hey Pearu,
Jeeze.  You work fast.  I haven't even had a chance to grab the new f2py
yet...  I'm guessing I need to update f2py from CVS to try all this out.
I'll try and spend some time with this today, but it may be tomorrow before
I have a chance to take a look.
I made also a tarball available with the latest changes.
...
...
3.
  LU,piv,X,info = sgesv(A,B,rowmajor=1,copy_A=1,copy_B=1)
  LU,piv,info = sgetrf(A,rowmajor=1,copy_A=1)
A few points about naming.
1.
SciPy has settled on all lower case convention for consistency across
interfaces.  So using:
lu,piv,x,info = sgesv(a,b,rowmajor=1,copy_a=1,copy_b=1)
will fit the scheme.
Ok, that's not a problem. It is easy to fix by query-replace in emacs...
Just thought that capitalized arguments will look more like matrices.
...
2.
Second, is "copy_A" a f2py generated name or can we specify it?  I prefer
overwrite = 0 to copy=1 because it is more explicit about what is happening.
It really warns people that they are clobbering data when they choose to set
the flag.  Also, the "A" part of copy_A will be ambiguous during usage.  In
the worst case for the sgesv example, imagine someone does the following:
...
...
...
a = rhs_vector
b = lhs_matrix
lu,piv,x,info = sgesv(b,a,copy_a=1,copy_b=0)
If the person is trying to save their variable "a" here, they didn't do it,
because the "copy_a" is refering to the function's argument "a", not the
variable bound to it which is "b" in this example.  I ran into this
conundrum once when writing an interface generator in the past.
It be better if the names somehow were more descriptive.  For sgetrf, the
following seems fine:
lu,piv,info = sgetrf(a,overwrite=0,rowmajor=1)
For sgesv, the naming isn't quite as obvious.  Maybe,
lu,piv,x,info = sgesv(a,b,overwrite_rhs=0,overwrite_lhs=0)
copy_A is a f2py generated name. overwrite certainly is more suitable to
warn users of an in situ changes (I assume that intent(copy) is not
used for arguments that are not changed in situ by the wrapped function).
However, intent(copy), or should I call it now intent(overwrite), may
have wider applications than wrapping lapack. There overwrite arguments
may be used differently, that is, for one argument this flag is set to 0,
and for the other one, to 1. So, I would like to keep overwrite_<name>
rather than having a global overwrite keyword argument.
...
3.
In the C API's, I think the overwrite argument will be used much more than
the rowmajor flag.  It should come first in the argument list.  In fact, in
the "old" wrappers, rowmajor was just hardcoded to 1 because the flalpack
routines were considered the colummajor functions.  Keeping it is fine, it
just should go to the end.
>>> lu,piv,info = sgetrf(a, overwrite=0, rowmajor=1)
Not sure how hard these changes are -- I expect they might require some
enhancements to f2py.  I'm not opposed to having Python wrappers around the
routines that help with this stuff.  We used to have to do that for the
clapack wrappers (until your additions to f2py's capabilities made this
unnecessary).  If it is easy to get f2py to do this stuff, then, of course,
that is better.
Since overwrite argument is f2py generated and rowmajor is specified in
the routine signature, you see that rowmajor becomes first (f2py
generated arguments are alway appended to the end of argument list). I
agree that rowmajor will be probably rarely used but I don't have any good
ideas how to tell f2py about user-specified argument ordering. I
looked to it and found that the required changes are not just quick hacks.

On the other hand. I don't think that lapack or blas routines will be
often used in interactive work. They will be more like to be used in
nice interfaces such as solve() etc that can have arguments ordered as we
want.
In using f2py wrappers, I would always recommend
...
...
...
lu,piv,info = sgetrf(a, overwrite_a=0)
instead of
...
...
...
lu,piv,info = sgetrf(a, 0)
that is less informative.
...
Comment on testing.
There was a fairly complete set of tests written for the *interfaces* of the
blas routines.  These guys only tested how the interfaces worked and didn't
do much testing of the functions accuracy.  We need both.  I don't remember
if the LAPACK tests were as complete -- I certainly didn't write them, but
Travis O. may have.
Based on the errors we saw a week or two ago in the CLAPACK functions, we
really do need to get some tests implemented for this stuff.  You and I
probably handle writing the interface tests (very tedious, but maybe not as
bad with all your changes to f2py), but we really should find an expert to
put the linear algebra routines through their paces.  We can write some
"smoke tests" though that make sure LU and solvers, etc. work for general
problems.  I'd stick this way up there as one of the most important packages
in SciPy, so we should really try and make sure it works right!
Maybe there is a test set of matrices laying around somewhere to help us out
in all this.  Anyone know of such a thing?  -- perhaps in the LAPACK
distribution itself.
Yes, testing is important. I don't dare to claim that f2py is bug free ;-)
but testing whether the interface works should not be the main goal for
linalg, indeed.

More important is that the interface acts according to its documentation.
I mean here things like whether getrf really returns mathematically
correct LU decomposition (leaving aside rounding errors). If it would
return, for example, a transpose LU (which would be incorrect), then
passing LU to getrs (that may have the same bug), the final result may be
correct.

I looked already the tests in the lapack distribution but they seem rather
difficult to use for us. These tests seem to be "hardcoded" to the Fortran
files and extracting the test data from there is not simple.

Testing linalg stuff is not a small or trivial project.
Any one wants to volunteer?
...
...
Few remarks:
1) The sources to these extension modules are quite big (more than 5000
lines in both). Should we factor these modules to
   cslapack.so cclapack.so cdlapack.so czlapack.so
   fslapack.so fclapack.so fdlapack.so fzlapack.so
(any better name convention?). What do you think?
Well, I think in practice, they'll rarely be needed separately.  Having big
wrappers isn't that big a deal (heck, I've had SWIG wrappers that were 60000
lines...).  I'd vote for keeping them all together unless there is a
compelling reason besides size to split them.  This keeps things a little
simpler (import once, get everything).
That issue is then solved. Good.
...
...
2) How shall I commit all this to the scipy CVS repository? In a
subdirectory of linalg? Eventually it should be in linalg directory, I
think.
We'll since we don't know how to do CVS branches reliably... :| Why don't
you make a directory called linalg2.  We'll work in there until it gets
stable.
Ok.
...
Pearu, this all looking very promising.  Thanks a ton.
Your are very welcome.
	Pearu

Re: [SciPy-dev] New clapack and flapack

Pearu Peterson