[SciPy-user] k means
eric jones
eric at enthought.com
Sun Sep 22 15:58:08 EDT 2002
The expensive part of kmeans is the underlying vq algorithm. It is very
parallelizable.
The kmeans algorithm lives in scipy/cluster/vq.py. The C++ version of
the vq algorithm lives in scipy/cluster/src/vq.h. Its outer loop is a
function template that looks like:
    template<class T>
    void tvq(T* obs, T* code_book, int Nobs, int Ncodes, int Nfeatures,
             int* codes, T* lowest_dist)
    {
        int i;
        for (i = 0; i < Nobs; i++)
        {
            tvq_obs<T>(&(obs[i*Nfeatures]), code_book, Ncodes, Nfeatures,
                       codes[i], lowest_dist[i]);
        }
    }
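Each iteration handles one observation and writes only codes[i] and
lowest_dist[i], so parallelizing this loop with MPI (or any similar
mechanism) is probably a good first cut.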
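As a rough sketch of what that first cut could look like, the code below
splits the observations into contiguous chunks, one per worker, and runs
the same loop on each chunk. It is only an illustration under assumptions:
nearest_code is a hypothetical stand-in for tvq_obs (squared Euclidean
distance is assumed), and chunk_range, vq_chunk and vq_partitioned are
names made up for this example, not part of vq.h. The "ranks" are simulated
by an ordinary loop so the sketch builds without an MPI installation. In a
real MPI program each process would call vq_chunk only for its own rank,
and the results could then be collected with something like MPI_Gatherv.

```cpp
#include <cassert>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical stand-in for tvq_obs: find the code_book row nearest to one
// observation, assuming squared Euclidean distance.
template <class T>
void nearest_code(const T* obs, const T* code_book, int Ncodes, int Nfeatures,
                  int& code, T& lowest_dist)
{
    code = -1;
    lowest_dist = std::numeric_limits<T>::max();
    for (int c = 0; c < Ncodes; c++) {
        T dist = 0;
        for (int f = 0; f < Nfeatures; f++) {
            T diff = obs[f] - code_book[c * Nfeatures + f];
            dist += diff * diff;
        }
        if (dist < lowest_dist) {
            lowest_dist = dist;
            code = c;
        }
    }
}

// Half-open range [first, last) of observations owned by `rank` when Nobs
// observations are split as evenly as possible over `nranks` workers.
std::pair<int, int> chunk_range(int Nobs, int nranks, int rank)
{
    assert(nranks > 0 && rank >= 0 && rank < nranks);
    int base = Nobs / nranks;
    int extra = Nobs % nranks;  // the first `extra` ranks get one more
    int first = rank * base + (rank < extra ? rank : extra);
    int last = first + base + (rank < extra ? 1 : 0);
    return std::make_pair(first, last);
}

// Same loop as tvq, restricted to one worker's chunk of observations.
template <class T>
void vq_chunk(const T* obs, const T* code_book, int first, int last,
              int Ncodes, int Nfeatures, int* codes, T* lowest_dist)
{
    for (int i = first; i < last; i++) {
        nearest_code<T>(&obs[i * Nfeatures], code_book, Ncodes, Nfeatures,
                        codes[i], lowest_dist[i]);
    }
}

// Simulate `nranks` workers one after another. Chunks do not overlap, so
// the workers never write to the same element of codes or lowest_dist.
template <class T>
void vq_partitioned(const T* obs, const T* code_book, int Nobs, int Ncodes,
                    int Nfeatures, int nranks, int* codes, T* lowest_dist)
{
    for (int rank = 0; rank < nranks; rank++) {
        std::pair<int, int> r = chunk_range(Nobs, nranks, rank);
        vq_chunk<T>(obs, code_book, r.first, r.second, Ncodes, Nfeatures,
                    codes, lowest_dist);
    }
}
```

With 7 observations and 5 workers, chunk_range hands out chunks of sizes
2, 2, 1, 1, 1. Every worker needs the whole code_book but only its own
slice of obs, which is why the code book would be broadcast and the
observations scattered.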
Good luck with your project,
eric
------
Dear sir,
I am a student doing a Masters in Computer Science. I am doing a project
on parallel computing. For that, my instructor wants me to run the K-Means
algorithm on a cluster of 5 nodes and do some sort of performance
analysis of this distributed architecture. Since you were in search of
the source code, I would like to ask whether you could help me by
providing the algorithm in C or C++ so that it could be implemented in
parallel here. I would be extremely grateful for your help.
M farhan ul haq