<div dir="ltr">Hi Raschka,<div><br></div><div>How do I use cross-validation when a regression model other than linear regression, such as Poisson regression, is used?</div><div><br></div><div>Kindly help.</div></div><div class="gmail_extra"><br clear="all"><div><div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr"><font color="#000000">With Best Regards,</font><div><font color="#666666">Shuchi  Mala</font></div><div><font color="#666666">Research Scholar</font></div><div><font color="#666666">Department of Civil Engineering</font></div><div><font color="#666666">MNIT Jaipur</font></div><div><br></div></div></div></div></div></div>
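One possible way to set this up, as a minimal sketch rather than a definitive answer: in recent scikit-learn versions (0.23 or later, which is an assumption about the installed version), cross-validation works with a Poisson model the same way as with a linear one, since the estimator is simply passed to cross_val_score. The synthetic data, the alpha value, and the choice of mean Poisson deviance as the score are illustrative assumptions, not part of the original question.

```python
import numpy as np
from sklearn.linear_model import PoissonRegressor
from sklearn.model_selection import KFold, cross_val_score

# Synthetic count data (an assumption for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = rng.poisson(lam=np.exp(0.5 * X[:, 0] - 0.3 * X[:, 1]))

# Poisson GLM with a log link; alpha is a small, arbitrary L2 penalty.
model = PoissonRegressor(alpha=1e-3)

# 5-fold cross-validation, scored by (negated) mean Poisson deviance.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv,
                         scoring="neg_mean_poisson_deviance")

# scikit-learn negates losses so that higher is better; flip the sign back.
mean_deviance = -scores.mean()
print("mean Poisson deviance over folds:", mean_deviance)
```

Swapping PoissonRegressor for LinearRegression (or any other estimator) leaves the cross-validation code unchanged; only the scoring choice would typically differ.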

<br><div class="gmail_quote">On Mon, Apr 3, 2017 at 8:05 PM, Sebastian Raschka <span dir="ltr"><<a href="mailto:se.raschka@gmail.com" target="_blank">se.raschka@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Don’t get me wrong, but you’d have to either label them manually yourself, ask domain experts, or use a platform like Amazon Mechanical Turk (or collect them in some other way).<br>

<div class="HOEnZb"><div class="h5"><br>

> On Apr 3, 2017, at 7:38 AM, Shuchi Mala <<a href="mailto:shuchi.23@gmail.com">shuchi.23@gmail.com</a>> wrote:<br>

><br>

> How can I get the ground truth labels of the training examples in my dataset?<br>

><br>

> With Best Regards,<br>

> Shuchi  Mala<br>

> Research Scholar<br>

> Department of Civil Engineering<br>

> MNIT Jaipur<br>

><br>

><br>

> On Fri, Mar 31, 2017 at 8:17 PM, Sebastian Raschka <<a href="mailto:se.raschka@gmail.com">se.raschka@gmail.com</a>> wrote:<br>

> Hi, Shuchi,<br>

><br>

> regarding labels_true: you’d only be able to compute the Rand index adjusted for chance if you have the ground truth labels of the training examples in your dataset.<br>

><br>

> The second parameter, labels_pred, takes the predicted cluster labels (indices) that you got from the clustering. E.g.,<br>

><br>

> dbscn = DBSCAN()<br>

> labels_pred = dbscn.fit_predict(X)<br>

><br>
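To make the two arguments concrete, here is a small sketch under stated assumptions: the toy points, the hand-assigned labels_true, and the eps and min_samples values are all hypothetical and only illustrate the call pattern. Note that DBSCAN has no separate predict step; fit_predict returns the cluster labels directly.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.metrics import adjusted_rand_score

# Toy data: two well-separated groups of three points (illustrative only).
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])

# Hypothetical ground truth labels, e.g. assigned by hand or by an expert.
labels_true = np.array([1, 1, 1, 0, 0, 0])

# eps and min_samples are arbitrary choices that suit this toy data.
dbscn = DBSCAN(eps=0.5, min_samples=2)
labels_pred = dbscn.fit_predict(X)

# The adjusted Rand index ignores how clusters are numbered, so the
# swapped label names (1/0 vs. 0/1) do not lower the score.
ari = adjusted_rand_score(labels_true, labels_pred)
print(labels_pred, ari)
```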

> Best,<br>

> Sebastian<br>

><br>

><br>

> > On Mar 31, 2017, at 12:02 AM, Shuchi Mala <<a href="mailto:shuchi.23@gmail.com">shuchi.23@gmail.com</a>> wrote:<br>

> ><br>

> > Thank you so much for your quick reply. I have one more question. The statement below is used to calculate the adjusted Rand score.<br>

> ><br>

> > metrics.adjusted_rand_score(<wbr>labels_true, labels_pred)<br>

> >  In my case, what will labels_true and labels_pred be, and how do I calculate labels_pred?<br>

> ><br>

> > With Best Regards,<br>

> > Shuchi  Mala<br>

> > Research Scholar<br>

> > Department of Civil Engineering<br>

> > MNIT Jaipur<br>

> ><br>

> ><br>

> > On Thu, Mar 30, 2017 at 8:38 PM, Shane Grigsby <<a href="mailto:shane.grigsby@colorado.edu">shane.grigsby@colorado.edu</a>> wrote:<br>

> > Since you're using lat / long coords, you'll also want to convert them to radians and specify 'haversine' as your distance metric; i.e. :<br>

> ><br>

> >    coords = np.vstack([lats.ravel(),longs.<wbr>ravel()]).T<br>

> >    coords *= np.pi / 180. # to radians<br>

> ><br>

> > ...and:<br>

> ><br>

> >    db = DBSCAN(eps=0.3, min_samples=10, metric='haversine')<br>

> >    # replace eps and min_samples as appropriate (with haversine, eps is in radians)<br>

> >    db.fit(coords)<br>

> ><br>
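A hedged end-to-end sketch of the haversine suggestion, using the five sample rows posted in this thread: because haversine distances are in radians, one way to choose eps is to divide a distance in kilometres by the Earth's radius. The 0.2 km threshold, min_samples=2, the 6371 km mean radius, and the explicit ball_tree algorithm are assumptions chosen for this tiny sample, not recommended settings.

```python
import numpy as np
from sklearn.cluster import DBSCAN

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (approximation)

# Sample rows from the thread, as [latitude, longitude] in degrees.
coords_deg = np.array([
    [37.76901, -122.429299],
    [37.76904, -122.42913],
    [37.76878, -122.429092],
    [37.7763, -122.424249],
    [37.77627, -122.424657],
])

# The haversine metric expects [lat, lon] in radians.
coords = np.radians(coords_deg)

# Assumed neighbourhood radius of 0.2 km, converted to radians.
eps_km = 0.2
db = DBSCAN(eps=eps_km / EARTH_RADIUS_KM, min_samples=2,
            metric="haversine", algorithm="ball_tree")
labels = db.fit_predict(coords)
print(labels)
```

With these settings the first three points, which lie within a few tens of metres of each other, fall into one cluster and the last two into another.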

> > Cheers,<br>

> > Shane<br>

> ><br>

> ><br>

> > On 03/30, Sebastian Raschka wrote:<br>

> > Hi, Shuchi,<br>

> ><br>

> > 1. How can I add data to the data set of the package?<br>

> ><br>

> > You don’t need to add your dataset to the dataset module to run your analysis. A convenient way to load it into a NumPy array would be via pandas. E.g.,<br>

> ><br>

> > import pandas as pd<br>

> > df = pd.read_csv('your_data.txt', delimiter=r"\s+")<br>

> > X = df.values<br>

> ><br>
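As a self-contained illustration of the loading step, the sketch below reads the sample rows from this thread out of an in-memory string instead of a file; io.StringIO stands in for 'your_data.txt' and is only an assumption made so the example runs without any file on disk.

```python
import io

import pandas as pd

# In-memory stand-in for the whitespace-delimited text file.
raw = io.StringIO(
    "Latitude        Longitude\n"
    "37.76901        -122.429299\n"
    "37.76904        -122.42913\n"
    "37.76878        -122.429092\n"
    "37.7763         -122.424249\n"
    "37.77627        -122.424657\n"
)

# r"\s+" treats any run of spaces or tabs as a single separator.
df = pd.read_csv(raw, sep=r"\s+")

# Plain NumPy array of shape (n_samples, 2), ready for scikit-learn.
X = df.values
print(X.shape)
```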

> > 2. How can I calculate the Rand index for my data?<br>

> ><br>

> > After you have run the clustering, you can use the “adjusted_rand_score” function, e.g., see<br>

> > <a href="http://scikit-learn.org/stable/modules/clustering.html#adjusted-rand-score" rel="noreferrer" target="_blank">http://scikit-learn.org/<wbr>stable/modules/clustering.<wbr>html#adjusted-rand-score</a><br>

> ><br>

> > 3. How do I use the make_blobs command for my data?<br>

> ><br>

> > The make_blobs command is just a utility function to create toy datasets; you wouldn’t need it in your case since you already have “real” data.<br>

> ><br>

> > Best,<br>

> > Sebastian<br>

> ><br>

> ><br>

> > On Mar 30, 2017, at 4:51 AM, Shuchi Mala <<a href="mailto:shuchi.23@gmail.com">shuchi.23@gmail.com</a>> wrote:<br>

> ><br>

> > Hi everyone,<br>

> ><br>

> > I have data with the following attributes: (Latitude, Longitude). I am clustering this data using DBSCAN and have the following questions:<br>

> ><br>

> > 1. How can I add data to the data set of the package?<br>

> > 2. How can I calculate the Rand index for my data?<br>

> > 3. How do I use the make_blobs command for my data?<br>

> ><br>

> > Sample of my data is :<br>

> > Latitude        Longitude<br>

> > 37.76901        -122.429299<br>

> > 37.76904        -122.42913<br>

> > 37.76878        -122.429092<br>

> > 37.7763         -122.424249<br>

> > 37.77627        -122.424657<br>

> ><br>

> ><br>

> > With Best Regards,<br>

> > Shuchi  Mala<br>

> > Research Scholar<br>

> > Department of Civil Engineering<br>

> > MNIT Jaipur<br>

> ><br>

> > ______________________________<wbr>_________________<br>

> > scikit-learn mailing list<br>

> > <a href="mailto:scikit-learn@python.org">scikit-learn@python.org</a><br>

> > <a href="https://mail.python.org/mailman/listinfo/scikit-learn" rel="noreferrer" target="_blank">https://mail.python.org/<wbr>mailman/listinfo/scikit-learn</a><br>

> ><br>


> ><br>

> > --<br>

> > *PhD candidate & Research Assistant*<br>

> > *Cooperative Institute for Research in Environmental Sciences (CIRES)*<br>

> > *University of Colorado at Boulder*<br>

> ><br>


><br>


<br>


</div></div></blockquote></div><br></div>