[Numpy-discussion] fixing diag() for matrices

Fri Jul 28 18:41:07 EDT 2006

Here's my attempt at summarizing the diag-discussion.

The symptom is that currently transforming something like the vector
a b c
into the diagonal matrix
a 0 0
0 b 0
0 0 c

is not directly possible if you're working with numpy-matrices (i.e. the
vector is of type matrix and has shape (n,1) or (1,n)). Blindly doing a
diag(vector) just gives you the first element of the vector. Instead you
need to do something like mat(diag(vector.A.squeeze).

Proposals that were put forward:

1) Modify the existing diag() to treat not only 1d-arrays as vectors,
but also numpy-matrices of shapes (1,n) and (n,1), and therefore have
diag() turn them into a diagonal matrix with all their elements.

This would be my preferred solution. Robert is against this, because it
is not 100% backwards compatible and because it requires the
diag-function to differentiate between the input types.

2) Deprecate the use of diag which is overloaded with making diagonal
matrices as well as getting diagonals. Instead, use the existing
.diagonal() for getting a diagonal, and introduce a new make_diag()
function which could easily work for numpy-arrays and numpy-matrices alike.

This is a blend of Christopher's and Robert's comments. As robert noted,
the problem is that diag() will probably be more attractive than
make_diag() for numpy-array users, and so make_diag() would effectively
only be used by numpy-matrix users, a split which would not be nice. An
effectively equivalent solution would then be a new method .make_diag()
for numpy-matrices, which was suggested by Robert (if I understood
correctly).

3) Add a new function diag() to numpy.matlib implementing the
workaround, something like:
def diag(vector):
	return mat(N.diag(vector.A.squeeze)
(not sure if all namespace issues are avoided like that, or if it's
efficient)
That way, numpy.matlib.diag() could be used with numpy-matrices just
like numpy.diag() with numpy-arrays. Side effect: this even turns a
1d-array into a numpy-matrix, which is useful e.g. for dealing with
eigenvalues (which are 1d-arrays even for numpy-matrix input).

The third solution is probably the way of least resistance. However,
afaik this would be the first time that numpy needs different functions
to perform essentially the same operation. Since Robert already cited
the obligatory zen phrase, what about this special casing?

Anyway, any of those solutions is better than the status quo imo. I'd be
happy if one of them could be implemented. When I get back from my
vacation and in case nothing has been changed, I'm going to file a
ticket so that it doesn't get forgotten.

Thanks for your patience and have a good time!
Sven

Robert Kern schrieb:
> Christopher Barker wrote:
>> Keith Goodman wrote:
>>> diag(NxN matrix) should return a Nx1 matrix
>>> diag(Nx1 or 1xN matrix) should return a NxN matrix
>> This is the key problem: extracting the diagonal of a matrix and 
>> creating a matrix from a diagonal are two different operations: 
>> overloading one function to do both was a bad idea to begin with.
>>
>> Maybe we should just keep diag() as is is for backward compatibility 
>> (deprecated), and make:
>>
>> get_diag() and make_diag() instead.
>>
>> Then it would be unambiguous what you wanted with:
>>
>> make_diag(<Nx1array>)
>>
>> You can call them something else, but you get the idea.
> 
> As .diagonal() is a method on both arrays and matrices, we don't need a separate 
> get_diag().
> 
> In the abstract, I heartily approve. However, I'm not sure that deprecation in 
> favor of a clumsier name is going to be effective in practice. diag([a, b, c]) 
> is just too tempting and will remain an attractive nuisance. It matches 
> mathematical notation ever so nicely.
>