From pauldubois at home.com  Thu Jan 20 18:07:52 2000
From: pauldubois at home.com (Paul F. Dubois)
Date: Thu, 20 Jan 2000 15:07:52 -0800
Subject: [Numpy-discussion] test
Message-ID: <NDBBIEFMILBFPMDHJIMFEEAGCCAA.pauldubois@home.com>

This is a test.
Ignore.

Paul


From aurag at crm.umontreal.ca  Sat Jan 22 11:25:36 2000
From: aurag at crm.umontreal.ca (Hassan Aurag)
Date: Sat, 22 Jan 2000 16:25:36 GMT
Subject: [Numpy-discussion] Hello
Message-ID: <20000122.16253600@adam-aurag.penguinpowered.com>

 Hi,

I have just seen with pleasure that numpy is on sourceforge. So 
welcome.

I am maintaining and writing a generic mathematical session manager, 
in which I have a numpy session.

The project, if you haven't noticed it, is at 
http://gmath.sourceforge.net

That said, I'd like to know whether you will always make rpm/tar.gz 
releases available on your site, preferably on the anonymous 
ftp site for the latest release, as in:
ftp://numpy.sourceforge.net/latest/srpms ....

If yes, then I'd be happy to make the app go fetch the packages from 
that site or another instead of making the srpms myself (which means 
possible bugs!).

Thank you in advance.

H. Aurag
aurag at crm.umontreal.ca
aurag at users.sourceforge.net
 

From pearu at ioc.ee  Wed Jan 26 12:00:31 2000
From: pearu at ioc.ee (Pearu Peterson)
Date: Wed, 26 Jan 2000 19:00:31 +0200 (EET)
Subject: [Numpy-discussion] Proclamation: column-wise arrays
Message-ID: <Pine.HPX.4.05.10001261719130.25917-100000@egoist.ioc.ee>

Hi!

Problem:
	Using Fortran routines from Python C/API is "tricky" when
multi-dimensional arrays are passed in.

Cause:
	Arrays in Fortran are stored in column-wise order while arrays in
C are stored in row-wise order.
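
A minimal sketch of the two storage orders, using plain Python lists in
place of Numeric arrays. The helper names row_major_offset and
column_major_offset are made up for this illustration and are not part of
any library. It shows where element (r, c) of the same 2 x 3 matrix lands
in a flat buffer under each convention:

```python
def row_major_offset(index, shape):
    """Flat offset of `index` when the LAST index varies fastest (C order)."""
    offset = 0
    for i, n in zip(index, shape):
        offset = offset * n + i
    return offset


def column_major_offset(index, shape):
    """Flat offset of `index` when the FIRST index varies fastest (Fortran order)."""
    offset = 0
    for i, n in zip(reversed(index), reversed(shape)):
        offset = offset * n + i
    return offset


shape = (2, 3)  # 2 rows, 3 columns
matrix = [[11, 12, 13],
          [21, 22, 23]]

# Fill one flat buffer per convention from the same logical matrix.
c_buffer = [None] * 6
f_buffer = [None] * 6
for r in range(2):
    for c in range(3):
        c_buffer[row_major_offset((r, c), shape)] = matrix[r][c]
        f_buffer[column_major_offset((r, c), shape)] = matrix[r][c]

print(c_buffer)  # [11, 12, 13, 21, 22, 23]
print(f_buffer)  # [11, 21, 12, 22, 13, 23]
```

The two buffers hold the same numbers in a different order, which is why a
buffer filled by C code cannot be handed to a Fortran routine unchanged
under the same shape.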

Standard solutions:
	1) Create a new C array; copy the data from the old one in
column-wise order; pass the new array to Fortran; copy the changed array
back to the old one in row-wise order; deallocate the new array.
	2) Change the storage order of an array in place by element-wise
swapping; pass the array to Fortran; change the storage order back with
element-wise swapping.

Why are the standard solutions not good?
	1) The additional memory allocation is a problem for large arrays,
and the element-wise copying (done twice) is time consuming.
	2) It is good in that no extra memory is needed, but element-wise
swapping (done twice) costs roughly as much as element-wise copying done
four times.

Proclamation:
	Introduce a column-wise array to Numeric Python where data is
stored in column-wise order that can be used specifically for Fortran
routines.

Proposal sketch:
	1) Introduce a new flag `row_order' to the PyArrayObject structure:
row_order == 1  -> the data is stored in row-wise order (default, as it is
		now)
row_order == 0  -> the data is stored in column-wise order
Note that the concept of contiguousness now depends on this flag. 
	2) Introduce new array "constructors" such as PyArray_CW_FromDims,
PyArray_CW_FromDimsAndData, PyArray_CW_ContiguousFromObject,
PyArray_CW_CopyFromObject, PyArray_CW_FromObject, etc. that all return
arrays with row_order=0 and data stored in column-wise order (that is, in
the case of contiguous results; otherwise the strides feature is employed).
	3) For operations between arrays (possibly with different
storage orders) to work correctly, many internal functions of the NumPy
C/API need to be modified.
	4) anything else?

What is the good of this?
	1) There is a large number of very good scientific
tools freely available written in Fortran (Netlib, for instance). And I
don't mean only Fortran 77 codes but also Fortran 90/95 codes.
	2) With Numeric Python arrays whose data is stored in column-wise
order, calling Fortran routines from Python becomes really efficient and
space-saving.
	3) There should be little performance hit if, say, two
arrays with different storage orders are multiplied (compared to the
operations between non-contiguous arrays in the current implementation).
	4) I don't see any reason why older C/API modules would break
because of this change if it is carried out carefully enough. So
backward compatibility should be preserved.
	5) anything else?

What speaks against this?
	1) Lots of work, but with current experience it should not be a
problem.
	2) The size of the code will grow.
	3) I suppose that most people using Numerical Python will not care
about calling Fortran routines from Python. Possible reasons: too "tricky"
or no need. In the first case, the answer is that there are tools such as
PyFort and f2py that solve this problem. In the latter case, there is no
problem:-)
	4) anything else?

I understand that my proposal is quite radical, but given that we want
to use Python for many years to come, using it would be more pleasant
if one cause of (constant) confusion were removed for all that
time.

Best regards,
	Pearu


From Oliphant.Travis at mayo.edu  Wed Jan 26 12:45:46 2000
From: Oliphant.Travis at mayo.edu (Travis Oliphant)
Date: Wed, 26 Jan 2000 11:45:46 -0600 (CST)
Subject: [Numpy-discussion] Re: Proclamation: column-wise arrays
In-Reply-To: <Pine.HPX.4.05.10001261719130.25917-100000@egoist.ioc.ee>
Message-ID: <Pine.LNX.4.10.10001261123050.1022-100000@us2.mayo.edu>

> Proclamation:
> 	Introduce a column-wise array to Numeric Python where data is
> stored in column-wise order that can be used specifically for fortran
> routines.

This is a very interesting proposal that we should consider carefully.   I
seem to recall reading that Jim Hugunin originally had this idea in
mind when he established the concept of contiguousness, etc.

My current thoughts on this issue are that it is of only syntactic value
and seems like a lot of extra code has to be written in order to provide
this "user-friendliness."   I don't see why it is so confusing to
recognize that Fortran just references its arrays "backwards" (or Python
references them backwards --- whatever your preference).  How you index
into an array is an arbitrary decision.  Numerical Python and Fortran just
have opposite conventions.  As long as that is clear, I don't see the
real trouble.  If the Fortran documentation calls for an array of
dimension (M,N,L) you pass it a contiguous Python array of shape (L,N,M)
--- pretty simple.   
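
A small check of the shape-reversal rule, written as a sketch in plain
Python. The two offset helpers are hypothetical names introduced only for
this illustration. It confirms that element (i,j,k) of a column-wise array
of dimension (M,N,L) sits at the same flat offset as element (k,j,i) of a
contiguous row-wise array of shape (L,N,M):

```python
from itertools import product


def fortran_offset(i, j, k, dims):
    """Offset of (i, j, k) in a column-wise array of dimension (M, N, L)."""
    m, n, _ = dims
    return i + m * (j + n * k)


def c_offset(k, j, i, shape):
    """Offset of (k, j, i) in a row-wise array of shape (L, N, M)."""
    _, n, m = shape
    return (k * n + j) * m + i


M, N, L = 2, 3, 4
same_layout = all(
    fortran_offset(i, j, k, (M, N, L)) == c_offset(k, j, i, (L, N, M))
    for i, j, k in product(range(M), range(N), range(L))
)
print(same_layout)  # True
```

No data moves at all: only the order in which the indices are written
differs between the two views of the buffer.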

Perhaps someone could enlighten me as to why this is more than just an
aesthetic problem. Right now, I would prefer that the time spent by
someone to "fix" this "problem" went to expanding the availability of
easy-to-use processing routines for Numerical Python, or improving the
cross-platform plotting capabilities.

I agree that it can be most confusing when you are talking about matrix
math since we are so used to thinking of matrix multiplication as A * B =
C with a shape analysis of:

     M X N * N X L = M X L

If the matrix multiplication code is in Fortran, then it expects to get an
(M,N) array and a (N,L) array and returns an (M,L) array.  But from Python
you would pass it arrays with shape (N,M) and (L,N) and get back an (L,M)
array which can be confusing to our "understanding" of shape analysis in
matrix multiplication:

Python matrix multiplication rule if calling a Fortran routine to do the
multiplication:

     (N,M) (L,N) = (L, M)

I think a Python-only class could solve this problem much more easily than
changing the underlying C-code.  This new Python Fortran-array class would
just make the user think that the shapes were (M,N) and (N,L) and the
output shape was (M,L).  
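
One way such a Python-only class might look, as a rough sketch only. The
class name FortranArray2D, its from_rows constructor, and the matmul
function (a pure-Python stand-in for a Fortran routine) are all invented
for this illustration. The wrapper keeps a flat column-wise buffer but lets
the user think in (M,N) shapes:

```python
class FortranArray2D:
    """Hypothetical wrapper: an (M, N) matrix over a flat column-wise buffer."""

    def __init__(self, shape, data):
        rows, cols = shape
        if len(data) != rows * cols:
            raise ValueError("data length does not match shape")
        self.shape = (rows, cols)
        self.data = list(data)  # element (i, j) lives at i + rows * j

    @classmethod
    def from_rows(cls, rows):
        """Build from a list of rows, storing the values column by column."""
        m, n = len(rows), len(rows[0])
        arr = cls((m, n), [0] * (m * n))
        for i in range(m):
            for j in range(n):
                arr[i, j] = rows[i][j]
        return arr

    def __getitem__(self, index):
        i, j = index
        return self.data[i + self.shape[0] * j]

    def __setitem__(self, index, value):
        i, j = index
        self.data[i + self.shape[0] * j] = value


def matmul(a, b):
    """Stand-in for a Fortran routine: (M, N) times (N, L) gives (M, L)."""
    m, n = a.shape
    n2, l = b.shape
    if n != n2:
        raise ValueError("inner dimensions do not match")
    c = FortranArray2D((m, l), [0] * (m * l))
    for i in range(m):
        for k in range(l):
            c[i, k] = sum(a[i, j] * b[j, k] for j in range(n))
    return c


a = FortranArray2D.from_rows([[1, 2, 3], [4, 5, 6]])    # shape (2, 3)
b = FortranArray2D.from_rows([[1, 0], [0, 1], [1, 1]])  # shape (3, 2)
c = matmul(a, b)
print(a.data)   # [1, 4, 2, 5, 3, 6]
print(c.shape)  # (2, 2)
print([[c[i, k] for k in range(2)] for i in range(2)])  # [[4, 5], [10, 11]]
```

The user-visible shapes follow the familiar (M,N) (N,L) = (M,L) rule while
the buffer stays in the order a Fortran routine expects.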

For future reference, any array-processing codes that somebody writes
should take a strides array as an argument, so that it doesn't matter what
"order" the array is in. 

--Travis
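
The closing advice about strides can be sketched as follows. This is an
illustration in plain Python, not Numeric code: the function matvec is a
made-up name, and the strides here count elements, whereas a C-level
strides array would normally count bytes. The same routine handles either
storage order because it never assumes one:

```python
def matvec(data, shape, strides, x):
    """Compute y = A x for a 2-D array given as a flat buffer plus strides."""
    rows, cols = shape
    s0, s1 = strides  # steps (in elements) along axis 0 and axis 1
    return [
        sum(data[i * s0 + j * s1] * x[j] for j in range(cols))
        for i in range(rows)
    ]


# The same matrix A = [[1, 2, 3], [4, 5, 6]] in both storage orders.
row_major = [1, 2, 3, 4, 5, 6]  # strides (3, 1)
col_major = [1, 4, 2, 5, 3, 6]  # strides (1, 2)
x = [1, 0, -1]

y_row = matvec(row_major, (2, 3), (3, 1), x)
y_col = matvec(col_major, (2, 3), (1, 2), x)
print(y_row)  # [-2, -2]
print(y_col)  # [-2, -2]
```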


From pearu at ioc.ee  Wed Jan 26 13:58:58 2000
From: pearu at ioc.ee (Pearu Peterson)
Date: Wed, 26 Jan 2000 20:58:58 +0200 (EET)
Subject: [Numpy-discussion] Re: Proclamation: column-wise arrays
In-Reply-To: <Pine.LNX.4.10.10001261123050.1022-100000@us2.mayo.edu>
Message-ID: <Pine.HPX.4.05.10001262028070.28130-100000@egoist.ioc.ee>

On Wed, 26 Jan 2000, Travis Oliphant wrote:

> > Proclamation:
> > 	Introduce a column-wise array to Numeric Python where data is
> > stored in column-wise order that can be used specifically for fortran
> > routines.
> 
> This is a very interesting proposal that we should consider carefully.   I
> seem to recall reading that Jim Hugunin originally had this idea in
> mind when he established the concept of contiguousness, etc.
> 
> My current thoughts on this issue are that it is of only syntactic value
> and seems like a lot of extra code has to be written in order to provide
> this "user-friendliness."   I don't see why it is so confusing to
> recognize that Fortran just references its arrays "backwards" (or Python
> references them backwards --- whatever your preference).  How you index
> into an array is an arbitrary decision.  Numerical Python and Fortran just
> have opposite conventions.  As long as that is clear, I don't see the
> real trouble.  If the Fortran documentation calls for an array of
> dimension (M,N,L) you pass it a contiguous Python array of shape (L,N,M)
> --- pretty simple.   
> 
> Perhaps someone could enlighten me as to why this is more than just an
> aesthetic problem. Right now, I would prefer that the time spent by
> someone to "fix" this "problem" went to expanding the availability of
> easy-to-use processing routines for Numerical Python, 

I think that this expansion would be quicker if the Python/Fortran
connection did not introduce this additional question to worry about.

> or improving the
> cross-platform plotting capabilities.
Here I agree with you completely. 

I can see the following problems when two different conventions are mixed:
1) If your application's Python code is larger than "just an example that
demonstrates the correct usage of two different conventions" and it can
call other C/API modules that do calculations in the C convention, then you
need some kind of bookkeeping to track which matrices need to be transposed
and which do not, and where to insert additional code for doing the
transposition. I think this can be done at a lower level and more
efficiently than most ordinary users would do it anyway.
2) Another, but minor, drawback of having two conventions is that if you
have a square matrix that is non-symmetric, then its misuse would be easy
and (maybe) difficult to discover.

On the other hand, I completely understand why my proposal would not be
implemented --- it looks like it needs lots of work and in the short term
the gain would not be visible to most users.


Pearu


From archiver at db.geocrawler.com  Wed Jan 26 17:18:31 2000
From: archiver at db.geocrawler.com (fredrik)
Date: Wed, 26 Jan 2000 14:18:31 -0800
Subject: [Numpy-discussion] install numPy with lapack
Message-ID: <200001262218.OAA19813@www.geocrawler.com>

This message was sent from Geocrawler.com by "fredrik" <fredriks at pinkfloyd.com>
Be sure to reply to that address.

How do I install the latest version of
NumPy if I have LAPACK already installed?
I'm using distutils.




From hanche at math.ntnu.no  Fri Jan 28 21:37:07 2000
From: hanche at math.ntnu.no (Harald Hanche-Olsen)
Date: Sat, 29 Jan 2000 03:37:07 +0100
Subject: [Numpy-discussion] A proposal for dot (or inner)
Message-ID: <20000129033707V.hanche@math.ntnu.no>

I am having some problems relating to the current function dot
(identical to matrixmultiply, though I haven't seen the equivalence in
any documentation).  Here is what the docs say:

  dot(a,b) 

  Will return the dot product of a and b. This is equivalent to matrix
  multiply for 2d arrays (without the transpose).  Somebody who does
  more linear algebra really needs to do this function right some day!

Or the builtin doc string:

  >>> print Numeric.dot.__doc__
  dot(a,b) returns matrix-multiplication between a and b.  The product-sum
      is over the last dimension of a and the second-to-last dimension of b.

First, this is misleading.  It seems to me to indicate that b must
have rank at least 2, which experiments indicate is not necessary.
Instead, the rule appears to be to use the only axis of b if b has
rank 1, and otherwise to use the second-to-last one.

Frankly, I think this convention is ill motivated, hard to remember,
and even harder to justify.  As a mathematician, I can see only one
reasonable default choice: One should sum over the last index of a,
and the first index of b.  Using the Einstein summation convention
[*], that would mean that

   dot(a,b)[j,k,...,m,n,...] = a[j,k,...,i] * b[i,m,n,...]

[*] that is, summing over repeated indices -- i in this example


This would of course yield the current behaviour in the important
cases where the rank of b is 1 or 2.

But we could do better than this:  Why not leave the choice up to the
user?  We could allow an optional third parameter which should be a
pair of indices, indicating the axes to be summed over.  The default
value of this parameter would be (-1, 0).  Returning to my example
above, the user could then easily compute, for example,

  dot(a,b,(1,2))[j,k,...,m,n,...] = a[j,i,k,...] * b[m,n,i,...]

while the current behaviour of dot would correspond to the new
behaviour of dot(a,b,(-1,-2)) whenever b has rank at least 2.

Actually, there is probably a lot of code out there that uses the
current behaviour of dot.  So I would propose leaving dot untouched,
and introducing inner instead, with the behaviour I outlined above.
We could even allow any number of pairs of axes to be summed over, for
example

  inner(a,b,(1,2),(2,0))[k,l,...,m,n,...] = a[k,i,j,l,...] * b[j,m,i,n,...]

With this notation, one can for example write the Hilbert-Schmidt
inner product of two real 2x2 matrices (the sum of a[i,j]b[j,i] over
all i and j) as inner(a,b,(0,1),(1,0)).
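
A pure-Python sketch of how the proposed inner could behave, working on
nested lists rather than Numeric arrays. The helpers shape_of and get are
invented for this illustration, and the code is only one possible reading
of the proposal: each pair (p, q) sums axis p of a against axis q of b, and
the result carries the remaining axes of a followed by those of b.

```python
from itertools import product


def shape_of(a):
    """Shape of a rectangular nested list."""
    shape = []
    while isinstance(a, list):
        shape.append(len(a))
        a = a[0]
    return tuple(shape)


def get(a, index):
    """Element of nested list `a` at the given index sequence."""
    for i in index:
        a = a[i]
    return a


def inner(a, b, *pairs):
    """Sum over each (axis of a, axis of b) pair; the default is (-1, 0)."""
    if not pairs:
        pairs = ((-1, 0),)
    sa, sb = shape_of(a), shape_of(b)
    a_axes = [p % len(sa) for p, _ in pairs]  # normalise negative axes
    b_axes = [q % len(sb) for _, q in pairs]
    for p, q in zip(a_axes, b_axes):
        if sa[p] != sb[q]:
            raise ValueError("summed axes must have equal length")
    free_a = [ax for ax in range(len(sa)) if ax not in a_axes]
    free_b = [ax for ax in range(len(sb)) if ax not in b_axes]
    summed = [range(sa[p]) for p in a_axes]

    def entry(ia, ib):
        # One output element: sum the products over all repeated indices.
        total = 0
        for s in product(*summed):
            idx_a = [None] * len(sa)
            idx_b = [None] * len(sb)
            for ax, i in zip(free_a, ia):
                idx_a[ax] = i
            for ax, i in zip(free_b, ib):
                idx_b[ax] = i
            for p, q, i in zip(a_axes, b_axes, s):
                idx_a[p] = i
                idx_b[q] = i
            total += get(a, idx_a) * get(b, idx_b)
        return total

    def build(prefix, dims):
        # Recursively assemble the nested-list result.
        if not dims:
            return entry(prefix[:len(free_a)], prefix[len(free_a):])
        return [build(prefix + (i,), dims[1:]) for i in range(dims[0])]

    return build((), [sa[ax] for ax in free_a] + [sb[ax] for ax in free_b])


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
print(inner(a, b))                  # [[19, 22], [43, 50]]
print(inner(a, [1, 1]))             # [3, 7]
print(inner(a, b, (0, 1), (1, 0)))  # 69
```

The last call sums a[i,j]*b[j,i] over all i and j, giving a scalar because
no free axes remain.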

If my proposal is accepted, the documentation should probably declare
dot (and its alias matrixmultiply?) as deprecated and due to disappear
in the future, with a pointer to its replacement inner.  In the
meantime, dot could in fact be replaced by a simple wrapper to inner:

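def dot(a,b):
    if len(b.shape) > 1:
        # rank >= 2: keep the current rule (second-to-last axis of b)
        return inner(a,b,(-1,-2))
    else:
        # rank 1: the default pair (-1, 0) uses the only axis of b
        return inner(a,b)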

(with the proper embellishments to allow it to be used with python
sequences, of course).

- Harald


From hanche at math.ntnu.no  Fri Jan 28 21:38:01 2000
From: hanche at math.ntnu.no (Harald Hanche-Olsen)
Date: Sat, 29 Jan 2000 03:38:01 +0100
Subject: [Numpy-discussion] trace does not behave as advertised on arrays of rank > 2
Message-ID: <20000129033801C.hanche@math.ntnu.no>

>>> print Numeric.trace.__doc__
trace(a,offset=0, axis1=0, axis2=1) returns the sum along diagonals
    (defined by the last two dimenions) of the array.

For arrays of rank 2, trace does what you expect, but for arrays of
larger rank, it appears to simply sum along each of the two given
axes.  A simple experiment follows:

>>> B
array([[[       1,       10],
        [     100,     1000]],
       [[   10000,   100000],
        [ 1000000, 10000000]]])

>>> # What I thought trace(B) would be:
>>> B[0,0,0]+B[1,1,0], B[0,0,1]+B[1,1,1]
(1000001, 10000010)

>>> # But that is not what numpy thinks:
>>> Numeric.trace(B)
array([   10001, 10001000])

>>> # Instead, it must be computing it as follows:
>>> B[0,0,0]+B[1,0,0], B[0,1,1]+B[1,1,1]
(10001, 10001000)

That is, trace(B) is the vector C, given by C[i]=sum(B[j,i,i]: j=0,...).
A bit more experimentation reveals that trace ignores its fourth
argument, consistent with the above result:

>>> Numeric.trace(B,0,0,1)
array([   10001, 10001000])
>>> Numeric.trace(B,0,0,2)
array([   10001, 10001000])
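
For comparison, a pure-Python sketch of the behaviour expected above,
working on nested lists. The helpers shape_of and get are invented for this
illustration, and applying the offset to axis2 is an assumption of this
sketch rather than something the docstring settles. The function sums
a[..., i, ..., i+offset, ...] over i, with i on axis1 and i+offset on
axis2, and keeps all other axes:

```python
def shape_of(a):
    """Shape of a rectangular nested list."""
    shape = []
    while isinstance(a, list):
        shape.append(len(a))
        a = a[0]
    return tuple(shape)


def get(a, index):
    """Element of nested list `a` at the given index sequence."""
    for i in index:
        a = a[i]
    return a


def trace(a, offset=0, axis1=0, axis2=1):
    """Sum along the diagonal of (axis1, axis2); other axes are kept."""
    shape = shape_of(a)
    rest = [ax for ax in range(len(shape)) if ax not in (axis1, axis2)]
    # Diagonal positions that stay inside both axes.
    diag = [i for i in range(shape[axis1]) if 0 <= i + offset < shape[axis2]]

    def entry(free):
        total = 0
        for i in diag:
            idx = [None] * len(shape)
            idx[axis1], idx[axis2] = i, i + offset
            for ax, k in zip(rest, free):
                idx[ax] = k
            total += get(a, idx)
        return total

    def build(prefix, dims):
        # Recursively assemble the nested-list result over the kept axes.
        if not dims:
            return entry(prefix)
        return [build(prefix + (k,), dims[1:]) for k in range(dims[0])]

    return build((), [shape[ax] for ax in rest])


B = [[[1, 10], [100, 1000]],
     [[10000, 100000], [1000000, 10000000]]]

print(trace(B))           # [1000001, 10000010]
print(trace(B, 0, 0, 2))  # [100001, 10000100]
```

The first result matches the hand computation B[0,0,k]+B[1,1,k], and the
second shows that the fourth argument changes the answer in this sketch.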

Evidently, trace is going to need a rewrite.  It might perhaps also
benefit from further optional arguments in groups of three, e.g.,

  trace(A, p, 0, 3, q, 1, 2)[k,l,...] = A[i+p,j+q,j,i,k,l,...]

with summing over repeated indices (i, j) ala Einstein.

- Harald

