[SciPy-User] code for multidimensional scaling?

denis denis-bz-gg at t-online.de
Sun Jan 9 10:54:31 EST 2011


Brian, folks,
  here's a simple mdscale in 10 lines, using np.linalg.eigh: fast
enough for N up to 1000 or so.
What are your N, dim, p ?

(Two threads, one on MDS and another on Min-cost Path, got merged here
in google groups ??)

cheers
  -- denis

# Multidim scaling using np.linalg.eigh
from __future__ import division
import numpy as np

__date__ = "2011-01-09 Jan denis"

def mdscale_eigh( D, p=2 ):
    """ in: D pairwise distances, |Xi - Xj|
        out: M pvecs N x p  with |Mi - Mj| ~ D (mod flip, rotate)
        uses np.linalg.eigh
    """
    D2 = D**2
    av = D2.mean( axis=0 )
    B = -.5 * ((D2 - av) .T - av + D2.mean())
    evals, evecs = np.linalg.eigh( B )  # few biggest evals / evecs ?
    pvals = evals[-p:] ** .5
    pvecs = evecs[:,-p:]
    M = pvecs * pvals
    return M - M.mean( axis=0 )

#...............................................................................
if __name__ == "__main__":
    import sys
    import scipy.spatial.distance as ssdist  # $scipy/spatial/
distance.py

    N = 1000
    dim = 3
    p = 2
    plot = 0
    seed = 1
    exec "\n".join( sys.argv[1:] )  # run this.py N=1000  dim=3  p=2:
19 sec
    np.set_printoptions( 1, threshold=100, edgeitems=5,
suppress=True )
    np.random.seed(seed)

    print "\nN=%d  dim=%d  p=%d" % (N, dim, p)
    X = np.random.uniform( -10, 10, size=(N,dim) )
    X -= X.mean( axis=0 )
    D = ssdist.squareform( ssdist.pdist( X, "euclidean" ))

    def pa( A, name="" ):
        """ print A summary: mean histo etc. """
        histo = np.histogram( A, 5 )
        print "%s av %.2g  histo %s edges %s" % (
            name, A.mean(), histo[0], histo[1])
        # print A.round().astype(int)
    pa( D, "D" )

#...............................................................................
    M = mdscale_eigh( D, p )
    M -= M.mean( axis=0 )  # ?
    print "M eigenvectors:\n", M.T.round().astype(int)
    mdist = ssdist.squareform( ssdist.pdist( M, "euclidean" ))
    pa( mdist, "mdist" )


On Jan 7, 12:30 pm, Brian Murphy <brian.mur... at unitn.it> wrote:
> Hi,
>
> I'm new to the list, so I hope my question is appropriate. I'm looking
> for code that implements multi-dimensional scaling (e.g. like Matlab's
> mdscale command) in Python.



More information about the SciPy-User mailing list