[scikit-learn] urgent help in scikit-learn

Thu Mar 30 10:04:19 EDT 2017

Hi, Shuchi,

> 1. How can I add data to the data set of the package?

You don’t need to add your dataset to the dataset module to run your analysis. A convenient way to load it into a numpy array would be via pandas. E.g.,

import pandas as pd
df = pd.read_csv('your_data.txt', delimiter=r"\s+")
X = df.values
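
As a self-contained sketch of the same loading step, the five sample rows from your message can be read from an in-memory buffer instead of a file. The io.StringIO buffer is only a stand-in for 'your_data.txt' (my assumption, so the snippet runs without a file on disk); sep is the primary name of the delimiter argument.

```python
import io

import pandas as pd

# The five sample rows quoted in the original message, used here in place
# of 'your_data.txt' so the snippet runs without a file on disk.
sample = io.StringIO(
    "Latitude\tLongitude\n"
    "37.76901\t-122.429299\n"
    "37.76904\t-122.42913\n"
    "37.76878\t-122.429092\n"
    "37.7763\t-122.424249\n"
    "37.77627\t-122.424657\n"
)

# r"\s+" splits on any run of whitespace (tabs or spaces).
df = pd.read_csv(sample, sep=r"\s+")

# Plain NumPy array with one row per point: [latitude, longitude].
X = df.values
print(X.shape)
```

With the real file, passing the path 'your_data.txt' instead of the buffer should behave the same way, provided the file has the same header row.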

> 2. How can I calculate Rand index for my data?

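After you have run the clustering, you can use the “adjusted_rand_score” function. Note that it compares your cluster labels against ground-truth labels, so you need a reference labeling for your data. E.g., see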
http://scikit-learn.org/stable/modules/clustering.html#adjusted-rand-score
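
A minimal sketch of how this could look on the five sample points, under a few assumptions of mine: the labels_true values are made up purely for illustration (you would substitute your own reference labels), the eps and min_samples values are picked by hand for this tiny sample, and DBSCAN is run with its default Euclidean metric directly on degrees, which is only a rough approximation for geographic distances.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.metrics import adjusted_rand_score

# The five sample points (latitude, longitude) from the original message.
X = np.array([
    [37.76901, -122.429299],
    [37.76904, -122.42913],
    [37.76878, -122.429092],
    [37.7763, -122.424249],
    [37.77627, -122.424657],
])

# eps is in degrees here because the default metric is Euclidean;
# both eps and min_samples are hand-picked for this tiny sample.
db = DBSCAN(eps=0.001, min_samples=2)
labels_pred = db.fit_predict(X)

# Hypothetical ground-truth labels, invented for illustration only.
labels_true = [0, 0, 0, 1, 1]

# The score depends only on how points are grouped, not on the
# particular integers used as cluster ids.
ari = adjusted_rand_score(labels_true, labels_pred)
print(labels_pred, ari)
```

Points that DBSCAN treats as noise get the label -1, and adjusted_rand_score would count them as one more group, so you may want to decide how to handle them before scoring.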

> 3. How to use make_blobs command for my data?

The make_blobs command is just a utility function for creating toy datasets; you wouldn’t need it in your case since you already have “real” data.
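
For reference, this is roughly what make_blobs does; the parameter values below are arbitrary choices of mine. It generates synthetic points around random centers and returns them together with the cluster each point was drawn from, which is why it is handy for demos but not something you would apply to existing data.

```python
from sklearn.datasets import make_blobs

# Generate 100 synthetic 2-D points scattered around 3 centers.
# All values here are arbitrary and only for illustration.
X_toy, y_toy = make_blobs(
    n_samples=100, centers=3, n_features=2, random_state=0
)

# X_toy holds the generated coordinates; y_toy holds, for each point,
# the index of the center it was generated from.
print(X_toy.shape, y_toy.shape)
```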

Best,
Sebastian

> On Mar 30, 2017, at 4:51 AM, Shuchi Mala <shuchi.23 at gmail.com> wrote:
> 
> Hi everyone,
> 
> I have data with the following attributes: (Latitude, Longitude). I am clustering it with DBSCAN and have the following questions:
> 
> 1. How can I add data to the data set of the package?
> 2. How can I calculate Rand index for my data?
> 3. How to use make_blobs command for my data?
> 
> Sample of my data is :
> Latitude	Longitude
> 37.76901	-122.429299
> 37.76904	-122.42913
> 37.76878	-122.429092
> 37.7763	-122.424249
> 37.77627	-122.424657
> 
> 
> With Best Regards,
> Shuchi  Mala
> Research Scholar
> Department of Civil Engineering
> MNIT Jaipur