[scikit-learn] urgent help in scikit-learn

Mon Apr 3 23:45:59 EDT 2017

Hi Raschka,

How can I use cross-validation when a different regression model, such as
Poisson regression, is used in place of linear regression?

Kindly help.

With Best Regards,
Shuchi  Mala
Research Scholar
Department of Civil Engineering
MNIT Jaipur
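
One possible approach, sketched under assumptions: recent scikit-learn releases (0.23 and later, which postdate this thread) ship a PoissonRegressor estimator, and it can be passed to cross_val_score in the same way as LinearRegression. The synthetic data, the alpha value, and the choice of mean Poisson deviance as the score are illustrative assumptions, not part of the original question.

```python
import numpy as np
from sklearn.linear_model import PoissonRegressor
from sklearn.model_selection import KFold, cross_val_score

# Synthetic count data (an assumption for illustration only).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, 2))
y = rng.poisson(np.exp(0.5 + 1.0 * X[:, 0] - 0.5 * X[:, 1]))

# Poisson regression with a log link; alpha is a small, arbitrary L2 penalty.
model = PoissonRegressor(alpha=1e-3)

# 5-fold cross-validation, scored by (negated) mean Poisson deviance.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv,
                         scoring="neg_mean_poisson_deviance")

# scikit-learn negates losses so that higher is better; flip the sign back.
mean_deviance = -scores.mean()
print(scores, mean_deviance)
```

Swapping PoissonRegressor for LinearRegression (and the scoring string for, e.g., "neg_mean_squared_error") leaves the cross-validation call itself unchanged.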

On Mon, Apr 3, 2017 at 8:05 PM, Sebastian Raschka <se.raschka at gmail.com>
wrote:

> Don’t get me wrong, but you’d have to either label them manually yourself,
> ask domain experts, or use a platform like Amazon Mechanical Turk (or
> collect them in some other way).
>
> > On Apr 3, 2017, at 7:38 AM, Shuchi Mala <shuchi.23 at gmail.com> wrote:
> >
> > How can I get the ground truth labels of the training examples in my dataset?
> >
> > With Best Regards,
> > Shuchi  Mala
> > Research Scholar
> > Department of Civil Engineering
> > MNIT Jaipur
> >
> >
> > On Fri, Mar 31, 2017 at 8:17 PM, Sebastian Raschka <se.raschka at gmail.com> wrote:
> > Hi, Shuchi,
> >
> > regarding labels_true: you’d only be able to compute the Rand index adjusted for chance if you have the ground truth labels of the training examples in your dataset.
> >
> > The second parameter, labels_pred, takes the predicted cluster labels (indices) that you got from the clustering. E.g.,
> >
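> > from sklearn.cluster import DBSCAN
> > dbscn = DBSCAN()
> > # DBSCAN has no predict(); fit_predict() returns the cluster labels (-1 marks noise)
> > labels_pred = dbscn.fit_predict(X)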
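
A minimal sketch of how the two arguments fit together, assuming a small hypothetical dataset X and hand-assigned ground truth labels (both invented here for illustration). It uses fit_predict, which is how DBSCAN returns cluster labels, and shows that the adjusted Rand index does not depend on how the clusters are numbered.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.metrics import adjusted_rand_score

# Hypothetical toy data: two tight, well-separated groups of points.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0]])

# Hypothetical ground truth labels, e.g. assigned by a domain expert.
# The numbering is deliberately different from what DBSCAN will produce.
labels_true = [1, 1, 1, 0, 0]

# Cluster labels found by DBSCAN (-1 would mark noise points).
labels_pred = DBSCAN(eps=0.5, min_samples=2).fit_predict(X)

# The score compares the groupings, not the label values themselves.
ari = adjusted_rand_score(labels_true, labels_pred)
print(labels_pred, ari)
```

Without ground truth labels, this score cannot be computed at all, which is the point made above.
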
> >
> > Best,
> > Sebastian
> >
> >
> > > On Mar 31, 2017, at 12:02 AM, Shuchi Mala <shuchi.23 at gmail.com> wrote:
> > >
> > > Thank you so much for your quick reply. I have one more question. The statement below is used to calculate the Rand score.
> > >
> > > metrics.adjusted_rand_score(labels_true, labels_pred)
> > > In my case, what will labels_true and labels_pred be, and how do I calculate labels_pred?
> > >
> > > With Best Regards,
> > > Shuchi  Mala
> > > Research Scholar
> > > Department of Civil Engineering
> > > MNIT Jaipur
> > >
> > >
> > > On Thu, Mar 30, 2017 at 8:38 PM, Shane Grigsby <shane.grigsby at colorado.edu> wrote:
> > > Since you're using lat / long coords, you'll also want to convert them to radians and specify 'haversine' as your distance metric; i.e.:
> > >
> > >    coords = np.vstack([lats.ravel(),longs.ravel()]).T
> > >    coords *= np.pi / 180. # to radians
> > >
> > > ...and:
> > >
> > >    db = DBSCAN(eps=0.3, min_samples=10, metric='haversine')
> > >    # replace eps and min_samples as appropriate; with 'haversine', eps is in radians
> > >    db.fit(coords)
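
To make the eps choice concrete, here is a minimal sketch using the five sample coordinates from the original question. With metric='haversine', scikit-learn expects [latitude, longitude] in radians and measures distances in radians on the unit sphere, so a radius in kilometres is divided by the Earth's radius. The 100 m radius, min_samples=2, and the 6371 km mean Earth radius are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Sample rows from the original question, as [latitude, longitude] in degrees.
coords_deg = np.array([
    [37.76901, -122.429299],
    [37.76904, -122.42913],
    [37.76878, -122.429092],
    [37.7763, -122.424249],
    [37.77627, -122.424657],
])

# The haversine metric expects radians, in [lat, lon] order.
coords = np.radians(coords_deg)

# Convert a neighbourhood radius in km to radians (assumed values).
EARTH_RADIUS_KM = 6371.0
eps_km = 0.1  # 100 m, chosen only for illustration

db = DBSCAN(eps=eps_km / EARTH_RADIUS_KM, min_samples=2,
            metric="haversine", algorithm="ball_tree")
labels = db.fit_predict(coords)
print(labels)
```

With these assumed settings, the first three points and the last two should end up in two separate clusters, since the two groups are several hundred metres apart.
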
> > >
> > > Cheers,
> > > Shane
> > >
> > >
> > > On 03/30, Sebastian Raschka wrote:
> > > Hi, Shuchi,
> > >
> > > 1. How can I add data to the data set of the package?
> > >
> > > You don’t need to add your dataset to the dataset module to run your analysis. A convenient way to load it into a numpy array would be via pandas. E.g.,
> > >
> > > import pandas as pd
> > > df = pd.read_csv('your_data.txt', delimiter=r"\s+")
> > > X = df.values
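
As a self-contained illustration of the same call, the sketch below parses the sample rows from the original question out of an in-memory string instead of a file. The io.StringIO wrapper stands in for 'your_data.txt', which is only a placeholder name, and sep is used here as the equivalent of delimiter.

```python
import io

import pandas as pd

# Sample rows copied from the original question (whitespace-separated).
sample = """Latitude        Longitude
37.76901        -122.429299
37.76904        -122.42913
37.76878        -122.429092
37.7763         -122.424249
37.77627        -122.424657
"""

# A regex separator of one or more whitespace characters handles
# both tabs and runs of spaces between the two columns.
df = pd.read_csv(io.StringIO(sample), sep=r"\s+")

# Plain numpy array of shape (n_samples, 2), ready for scikit-learn.
X = df.to_numpy()
print(df.columns.tolist(), X.shape)
```
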
> > >
> > > 2. How can I calculate the Rand index for my data?
> > >
> > > After you have run the clustering, you can use the “adjusted_rand_score” function; e.g., see
> > > http://scikit-learn.org/stable/modules/clustering.html#adjusted-rand-score
> > >
> > > 3. How do I use the make_blobs command for my data?
> > >
> > > The make_blobs command is just a utility function to create toy datasets; you wouldn’t need it in your case since you already have “real” data.
> > >
> > > Best,
> > > Sebastian
> > >
> > >
> > > On Mar 30, 2017, at 4:51 AM, Shuchi Mala <shuchi.23 at gmail.com> wrote:
> > >
> > > Hi everyone,
> > >
> > > I have data with the following attributes: (Latitude, Longitude). I am now clustering my data using DBSCAN. I have the following questions:
> > >
> > > 1. How can I add data to the data set of the package?
> > > 2. How can I calculate the Rand index for my data?
> > > 3. How do I use the make_blobs command for my data?
> > >
> > > A sample of my data:
> > > Latitude        Longitude
> > > 37.76901        -122.429299
> > > 37.76904        -122.42913
> > > 37.76878        -122.429092
> > > 37.7763         -122.424249
> > > 37.77627        -122.424657
> > >
> > >
> > > With Best Regards,
> > > Shuchi  Mala
> > > Research Scholar
> > > Department of Civil Engineering
> > > MNIT Jaipur
> > >
> > > _______________________________________________
> > > scikit-learn mailing list
> > > scikit-learn at python.org
> > > https://mail.python.org/mailman/listinfo/scikit-learn
> > >
> > > --
> > > *PhD candidate & Research Assistant*
> > > *Cooperative Institute for Research in Environmental Sciences (CIRES)*
> > > *University of Colorado at Boulder*