# [scikit-learn] CLUSTER ANALYSIS AND THE SEARCH OF A SAMPLE MODE

Ulderico Santarelli ulderico.santarelli at gmail.com
Mon Sep 18 02:54:10 EDT 2023

work
array([[ 5.63011247],
       [-2.31453939],
       [22.23122848],
       [15.37678101]])
np.shape(work)
(4, 1)
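
One possible cause of the reshape error, offered as an assumption because the failing data is not shown: the first dimension of work equals the number of columns kept in datar, so four columns give (4, 1) as above, while a two-column selection would give (2, 1) and reshape(1, 4) would fail. A minimal sketch of that behaviour (block_row_sum is a hypothetical helper, not part of the script below):

```python
import numpy as np

def block_row_sum(block):
    """Sum an (n, p) block over its n rows, the way matmul(block.T, ones) does."""
    n, p = block.shape
    ones = np.ones((n, 1))
    work = np.matmul(block.T, ones)   # shape (p, 1): one entry per column
    return work.reshape(1, p)         # reshape by p instead of a literal 4

four_cols = np.arange(12, dtype=float).reshape(3, 4)   # toy data, 4 variables
two_cols = np.arange(6, dtype=float).reshape(3, 2)     # toy data, 2 variables
shape4 = block_row_sum(four_cols).shape
shape2 = block_row_sum(two_cols).shape
print(shape4, shape2)
```

If that is indeed the cause, either selecting four quantitative columns or replacing the literal 4 with the actual column count should avoid the error.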

*My best regards.*
*Ulderico.*
import numpy as np
import pandas as pd
#dataraw is a DataFrame; keep just the quantitative variables
datar = dataraw.iloc[:,1:5]
#standardize data
means = datar.mean(axis = 0)
stdev = datar.std(axis = 0)
data = (datar-means)/stdev
#CENTRALITY INDEX
scalar = pd.merge(data, data, how = 'cross')
point1 = scalar.loc[:, 'sepal length _x':'petal width _x']
point2 = scalar.loc[:, 'sepal length _y':'petal width _y']
apoint1 = point1.to_numpy(dtype = float)
apoint2 = point2.to_numpy(dtype = float)
delta = (apoint1 - apoint2)
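#force between each pair of points, per coordinate, decaying with distance
force = 0
if delta.any():
    force = np.exp(-abs(delta))
#the sign gives the direction; sign(0) = 0 cancels a point's force on itself
sig = np.sign(delta)
sforce = sig*force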
dsforce = pd.DataFrame(sforce)
#dsforce.to_excel(r'C:\Pyth\dsforce.xlsx')
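#n points, p variables (150 and 4 in this example); taken from the data
#so that the reshape below matches whatever columns were selected
n, p = data.shape
arr = np.ones((n, 1))
sforcet = sforce.T
sum_force = np.zeros((1, p))   #do not use empty arrays
start = 0
end = n
#each block of n consecutive columns holds the forces acting on one point
for i in range(n):
    s_forcet = sforcet[:, start:end]
    work = np.matmul(s_forcet, arr)
    sum_force = np.concatenate((sum_force, work.reshape(1, p)), axis = 0)
    start = end
    end += n
#drop the initial row of zeros
sumforce = sum_force[1:, :]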
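dsumforce = pd.DataFrame(sumforce)
#raw strings keep the backslashes in Windows paths from being read as escapes
dsumforce.to_excel(r'C:\Pyth\sumforce_sqc.xlsx')
sum_force_square = sumforce**2
ssT = np.ones((sum_force_square.shape[1], 1))
#magnitude of the resultant force on each point
T_w_ = np.sqrt(np.matmul(sum_force_square, ssT))
dT_w_ = pd.DataFrame(T_w_)
dT_w_.to_excel(r'C:\Pyth\T_w_.xlsx')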
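
As a sketch only, the same two quantities (sumforce and T_w_) could probably be computed without the cross merge and the explicit loop by using NumPy broadcasting. The name centrality_index is hypothetical, and the random data merely stands in for the four standardized columns; this is a rewrite under those assumptions, not the code attached to the original post.

```python
import numpy as np

def centrality_index(data):
    """Return (sumforce, T_w) for an (n, p) array of standardized data."""
    x = np.asarray(data, dtype=float)
    # delta[i, j, k] = x[i, k] - x[j, k] for every ordered pair of points (i, j)
    delta = x[:, None, :] - x[None, :, :]
    # signed, exponentially decaying force per coordinate;
    # sign(0) = 0 removes the contribution of a point on itself
    sforce = np.sign(delta) * np.exp(-np.abs(delta))
    # resultant force on each point: sum over all other points j
    sumforce = sforce.sum(axis=1)                               # shape (n, p)
    # magnitude of the resultant force on each point
    t_w = np.sqrt((sumforce ** 2).sum(axis=1, keepdims=True))   # shape (n, 1)
    return sumforce, t_w

# Synthetic stand-in data (an assumption): 150 points, 4 variables.
rng = np.random.default_rng(0)
raw = rng.normal(size=(150, 4))
# ddof=1 matches the default of pandas DataFrame.std()
data = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)
sumforce, T_w_ = centrality_index(data)
print(sumforce.shape, T_w_.shape)   # (150, 4) (150, 1)
```

The intermediate array holds n * n * p values, which is small for 150 points but grows quadratically with the sample size.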

> I got interested in your project, but I ran into this error right at the
> start (see attached image).
> The work array cannot be reshaped to (1,4) because it has shape (2,1). Any
> suggestions?
>
> JL
>
>> *I am an old guy who started programming in the seventies of
>> the last century* with ASSEMBLER 360, then FORTRAN, PL1, APL, IBM
>> APPLICATION SYSTEM and, last, the marvelous SAS. Having heard about
>> the "powerful, flexible, functionally complete PYTHON UNIVERSE",
>> encompassing an advanced object-oriented language and a very wide family of
>> packages, I decided to run an exercise on a problem I've been
>> tackling since my youth (have a look at the Bibliography). I completed
>> it in a few days, and I'm attaching my solution to the problem of
>> finding the points in a sample that are "central" in a surrounding
>> topological neighborhood. They are eligible as centroids for a Cluster
>> Analysis after the aggregation of "too near points". The solution is based
>> on the search for potential wells in a suitable potential field, similar to
>> the one all of us studied in high school. Points that are too near one
>> another may therefore lie in the same potential well.
>> No more words, have a look at the attachment.
>> My coding is that of a beginner; I'm sure anybody could find more
>> efficient coding. As a comment: I started studying Python around May 15th,
>> 2023.
>> My best regards.
>> Ulderico Santarelli.
