From niourf at gmail.com Sat Jun 1 10:00:23 2019
From: niourf at gmail.com (Nicolas Hug)
Date: Sat, 1 Jun 2019 10:00:23 -0400
Subject: [scikit-learn] How is linear regression in scikit-learn done?
Do you need train and test split?
In-Reply-To:
References:
Message-ID:
Splitting the data into train and test data is needed with any machine
learning model (not just linear regression with or without least squares).
The idea is that you want to evaluate the performance of your model
(prediction + scoring) on a portion of the data that you did not use for
training.
You'll find more details in the user guide
https://scikit-learn.org/stable/modules/cross_validation.html
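As a concrete illustration (a minimal sketch on made-up synthetic data, not from the thread), evaluating a linear regression on a held-out test set looks like this:

```python
# Minimal sketch on synthetic data: score a LinearRegression on a
# held-out test set rather than on the data used for training.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(0)
X = rng.randn(100, 3)
y = X @ np.array([1.0, 2.0, -1.0]) + 0.1 * rng.randn(100)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

model = LinearRegression().fit(X_train, y_train)
print(model.score(X_test, y_test))  # R^2 on data unseen during training
```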
Nicolas
On 5/31/19 8:54 PM, C W wrote:
> Hello everyone,
>
> I'm new to scikit-learn. I see that many tutorials in scikit-learn
> follow a workflow along the lines of:
> 1) transform the data
> 2) split the data: train, test
> 3) instantiate the sklearn object and fit
> 4) predict and tune parameters
>
> But, linear regression is done in least squares, so I don't think
> train test split is necessary. So, I guess I can just use the entire
> dataset?
>
> Thanks in advance!
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
From tmrsg11 at gmail.com Sat Jun 1 22:42:14 2019
From: tmrsg11 at gmail.com (C W)
Date: Sat, 1 Jun 2019 22:42:14 -0400
Subject: [scikit-learn] How is linear regression in scikit-learn done?
Do you need train and test split?
In-Reply-To:
References:
Message-ID:
Hi Nicolas,
I don't get it.
The coefficients are estimated through OLS. Essentially, you are just
calculating a matrix pseudo-inverse, where
beta = (X^T * X)^(-1) * X^T * y
Splitting the data does not improve the model. It only works in something
like LASSO, where you have a tuning parameter.
Holding out some data will make the regression estimates worse off.
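That closed-form estimate can be checked numerically; a sketch on toy data (with fit_intercept=False so both sides solve exactly the same problem):

```python
# Sketch: the OLS estimate via the normal equations, compared against
# scikit-learn's LinearRegression on the same (synthetic) data.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.RandomState(0)
X = rng.randn(50, 2)
y = X @ np.array([3.0, -2.0]) + 0.05 * rng.randn(50)

# beta = (X^T X)^(-1) X^T y, using the pseudo-inverse for stability
beta_closed_form = np.linalg.pinv(X.T @ X) @ X.T @ y
beta_sklearn = LinearRegression(fit_intercept=False).fit(X, y).coef_
print(np.allclose(beta_closed_form, beta_sklearn))  # True
```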
Hope to hear from you, thanks!
On Sat, Jun 1, 2019 at 10:04 AM Nicolas Hug wrote:
> Splitting the data into train and test data is needed with any machine
> learning model (not just linear regression with or without least squares).
>
> The idea is that you want to evaluate the performance of your model
> (prediction + scoring) on a portion of the data that you did not use for
> training.
>
> You'll find more details in the user guide
> https://scikit-learn.org/stable/modules/cross_validation.html
>
> Nicolas
>
>
> On 5/31/19 8:54 PM, C W wrote:
>
> Hello everyone,
>
> I'm new to scikit-learn. I see that many tutorials in scikit-learn
> follow a workflow along the lines of:
> 1) transform the data
> 2) split the data: train, test
> 3) instantiate the sklearn object and fit
> 4) predict and tune parameters
>
> But, linear regression is done in least squares, so I don't think train
> test split is necessary. So, I guess I can just use the entire dataset?
>
> Thanks in advance!
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>
From joel.nothman at gmail.com Sun Jun 2 01:11:02 2019
From: joel.nothman at gmail.com (Joel Nothman)
Date: Sun, 2 Jun 2019 15:11:02 +1000
Subject: [scikit-learn] How is linear regression in scikit-learn done?
Do you need train and test split?
In-Reply-To:
References:
Message-ID:
You're right that you don't need to use CV for hyperparameter estimation in
linear regression, but you may want it for model evaluation.
As far as I understand: Holding out a test set is recommended if you aren't
entirely sure that the assumptions of the model are held (gaussian error on
a linear fit; independent and identically distributed samples). The model
evaluation approach in predictive ML, using held-out data, relies only on
the weaker assumption that the metric you have chosen, when applied to the
test set you have held out, forms a reasonable measure of generalised /
real-world performance. (Of course this too is often not held in practice,
but it is the primary assumption, in my opinion, that ML practitioners need
to be careful of.)
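To make the point concrete, here is a sketch on synthetic data (my own toy example, not Joel's) where the training score looks excellent but the held-out score does not:

```python
# Sketch: a flexible model's training score can look great while its
# held-out score is much worse; the held-out score is the one that
# estimates real-world performance.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.RandomState(0)
X = rng.uniform(-1, 1, size=(40, 1))
y = X.ravel() + 0.3 * rng.randn(40)  # truly linear signal plus noise

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A deliberately over-flexible degree-15 polynomial fit
model = make_pipeline(PolynomialFeatures(degree=15), LinearRegression())
model.fit(X_train, y_train)

print("train R^2:", model.score(X_train, y_train))
print("test  R^2:", model.score(X_test, y_test))  # typically much lower
```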
On Sun, 2 Jun 2019 at 12:43, C W wrote:
> Hi Nicholas,
>
> I don't get it.
>
> The coefficients are estimated through OLS. Essentially, you are just
> calculating a matrix pseudo-inverse, where
> beta = (X^T * X)^(-1) * X^T * y
>
> Splitting the data does not improve the model. It only works in something
> like LASSO, where you have a tuning parameter.
>
> Holding out some data will make the regression estimates worse off.
>
> Hope to hear from you, thanks!
>
>
>
> On Sat, Jun 1, 2019 at 10:04 AM Nicolas Hug wrote:
>
>> Splitting the data into train and test data is needed with any machine
>> learning model (not just linear regression with or without least squares).
>>
>> The idea is that you want to evaluate the performance of your model
>> (prediction + scoring) on a portion of the data that you did not use for
>> training.
>>
>> You'll find more details in the user guide
>> https://scikit-learn.org/stable/modules/cross_validation.html
>>
>> Nicolas
>>
>>
>> On 5/31/19 8:54 PM, C W wrote:
>>
>> Hello everyone,
>>
>> I'm new to scikit-learn. I see that many tutorials in scikit-learn
>> follow a workflow along the lines of:
>> 1) transform the data
>> 2) split the data: train, test
>> 3) instantiate the sklearn object and fit
>> 4) predict and tune parameters
>>
>> But, linear regression is done in least squares, so I don't think train
>> test split is necessary. So, I guess I can just use the entire dataset?
>>
>> Thanks in advance!
>>
>> _______________________________________________
>> scikit-learn mailing list
>> scikit-learn at python.org
>> https://mail.python.org/mailman/listinfo/scikit-learn
>>
>> _______________________________________________
>> scikit-learn mailing list
>> scikit-learn at python.org
>> https://mail.python.org/mailman/listinfo/scikit-learn
>>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>
From jbbrown at kuhp.kyoto-u.ac.jp Mon Jun 3 00:19:30 2019
From: jbbrown at kuhp.kyoto-u.ac.jp (Brown J.B.)
Date: Mon, 3 Jun 2019 13:19:30 +0900
Subject: [scikit-learn] How is linear regression in scikit-learn done?
Do you need train and test split?
In-Reply-To:
References:
Message-ID:
>
> As far as I understand: Holding out a test set is recommended if you
> aren't entirely sure that the assumptions of the model are held (gaussian
> error on a linear fit; independent and identically distributed samples).
> The model evaluation approach in predictive ML, using held-out data, relies
> only on the weaker assumption that the metric you have chosen, when applied
> to the test set you have held out, forms a reasonable measure of
> generalised / real-world performance. (Of course this too is often not held
> in practice, but it is the primary assumption, in my opinion, that ML
> practitioners need to be careful of.)
>
Dear CW,
As Joel has said, holding out a test set will help you evaluate the validity
of model assumptions, and his last point (reasonable measure of generalised
performance) is absolutely essential for understanding the capabilities and
limitations of ML.
To add to your checklist of interpreting ML papers properly, be cautious
when interpreting reports of high performance when using 5/10-fold or
Leave-One-Out cross-validation on large datasets, where "large" depends on
the nature of the problem setting.
Results are also highly dependent on the distributions of the underlying
independent variables (e.g., 60000 datapoints all with near-identical
distributions may yield phenomenal performance in cross-validation and be
almost non-predictive in truly unknown/prospective situations).
Even at 500 datapoints, if independent variable distributions look similar
(with similar endpoints), then when each model is trained on 80% of that
data, the remaining 20% will certainly be predictable, and repeating that
five times will yield statistics that seem impressive.
So, again, while problem context completely dictates ML experiment design,
metric selection, and interpretation of outcome, my personal rule of thumb
is to do no more than 2-fold cross-validation (50% train, 50% predict) when
having 100+ datapoints.
Even more extreme, try using 33% for training and 66% for validation (or
even 20/80).
If your model still reports good statistics, then you can believe that the
patterns in the training data extrapolate well to the ones in the external
validation data.
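The stricter split suggested above can be sketched with ShuffleSplit (synthetic data, illustrative only):

```python
# Sketch: train on 33% and validate on 66%, repeated five times,
# instead of the usual 5- or 10-fold cross-validation.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import ShuffleSplit, cross_val_score

rng = np.random.RandomState(0)
X = rng.randn(150, 4)
y = X @ np.array([1.0, -1.0, 0.5, 0.0]) + 0.2 * rng.randn(150)

cv = ShuffleSplit(n_splits=5, train_size=0.33, test_size=0.66,
                  random_state=0)
scores = cross_val_score(LinearRegression(), X, y, cv=cv)
print(scores.mean())  # mean held-out R^2 across the five splits
```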
Hope this helps,
J.B.
From nelle.varoquaux at gmail.com Mon Jun 3 08:20:28 2019
From: nelle.varoquaux at gmail.com (Nelle Varoquaux)
Date: Mon, 3 Jun 2019 08:20:28 -0400
Subject: [scikit-learn] Only a few days left to submit! - 2019 John Hunter
Excellence in Plotting Contest
Message-ID:
Hi everybody,
There are only a few days left to submit to the 2019 John
Hunter Excellence in Plotting Contest! If you're interested in
participating, note that you have until June 8th to prepare your
submission.
In memory of John Hunter, we are pleased to be reviving the SciPy John
Hunter Excellence in Plotting Competition for 2019. This open competition
aims to highlight the importance of data visualization to scientific
progress and showcase the capabilities of open source software.
Participants are invited to submit scientific plots to be judged by a
panel. The winning entries will be announced and displayed at the
conference.
John Hunter's family and NumFOCUS are graciously sponsoring cash prizes for
the winners in the following amounts:
- 1st prize: $1000
- 2nd prize: $750
- 3rd prize: $500
- Entries must be submitted by June 8th to the form at
https://goo.gl/forms/cFTB3FUBrMPfQ7Vz1
- Winners will be announced at SciPy 2019 in Austin, TX.
- Participants do not need to attend the SciPy conference.
- Entries may take the definition of "visualization" rather broadly.
Entries may be, for example, a traditional printed plot, an interactive
visualization for the web, or an animation.
- Source code for the plot must be provided, in the form of Python code
and/or a Jupyter notebook, along with a rendering of the plot in a widely
used format. This may be, for example, PDF for print, standalone HTML and
Javascript for an interactive plot, or MPEG4 for a video. If the original
data can not be shared for reasons of size or licensing, "fake" data may be
substituted, along with an image of the plot using real data.
- Each entry must include a 300-500 word abstract describing the plot and
its importance for a general scientific audience.
- Entries will be judged on their clarity, innovation and aesthetics, but
most importantly for their effectiveness in communicating a real-world
problem. Entrants are encouraged to submit plots that were used during the
course of research or work, rather than merely being hypothetical.
- SciPy reserves the right to display any and all entries, whether
prize-winning or not, at the conference, use in any materials or on its
website, with attribution to the original author(s).
SciPy John Hunter Excellence in Plotting Competition Co-Chairs
Hannah Aizenman
Thomas Caswell
Madicken Munk
Nelle Varoquaux
From t3kcit at gmail.com Mon Jun 3 11:41:17 2019
From: t3kcit at gmail.com (Andreas Mueller)
Date: Mon, 3 Jun 2019 11:41:17 -0400
Subject: [scikit-learn] How is linear regression in scikit-learn done?
Do you need train and test split?
In-Reply-To:
References:
Message-ID: <23948deebbaadf7e0b13559a40f7a372@gmail.com>
This classical paper on statistical practices (Breiman's "two cultures")
might be helpful to understand the different viewpoints:
https://projecteuclid.org/euclid.ss/1009213726
On 6/3/19 12:19 AM, Brown J.B. via scikit-learn wrote:
>
> As far as I understand: Holding out a test set is recommended if
> you aren't entirely sure that the assumptions of the model are
> held (gaussian error on a linear fit; independent and identically
> distributed samples). The model evaluation approach in predictive
> ML, using held-out data, relies only on the weaker assumption that
> the metric you have chosen, when applied to the test set you have
> held out, forms a reasonable measure of generalised / real-world
> performance. (Of course this too is often not held in practice,
> but it is the primary assumption, in my opinion, that ML
> practitioners need to be careful of.)
>
>
> Dear CW,
> As Joel has said, holding out a test set will help you evaluate the
> validity of model assumptions, and his last point (reasonable measure
> of generalised performance) is absolutely essential for understanding
> the capabilities and limitations of ML.
>
> To add to your checklist of interpreting ML papers properly, be
> cautious when interpreting reports of high performance when using
> 5/10-fold or Leave-One-Out cross-validation on large datasets, where
> "large" depends on the nature of the problem setting.
> Results are also highly dependent on the distributions of the
> underlying independent variables (e.g., 60000 datapoints all with
> near-identical distributions may yield phenomenal performance in
> cross-validation and be almost non-predictive in truly unknown/prospective
> situations).
> Even at 500 datapoints, if independent variable distributions look
> similar (with similar endpoints), then when each model is trained on
> 80% of that data, the remaining 20% will certainly be predictable, and
> repeating that five times will yield statistics that seem impressive.
>
> So, again, while problem context completely dictates ML experiment
> design, metric selection, and interpretation of outcome, my personal
> rule of thumb is to do no more than 2-fold cross-validation (50%
> train, 50% predict) when having 100+ datapoints.
> Even more extreme, try using 33% for training and 66% for validation
> (or even 20/80).
> If your model still reports good statistics, then you can believe that
> the patterns in the training data extrapolate well to the ones in the
> external validation data.
>
> Hope this helps,
> J.B.
>
>
>
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
From tmrsg11 at gmail.com Tue Jun 4 20:44:38 2019
From: tmrsg11 at gmail.com (C W)
Date: Tue, 4 Jun 2019 20:44:38 -0400
Subject: [scikit-learn] How is linear regression in scikit-learn done?
Do you need train and test split?
In-Reply-To: <23948deebbaadf7e0b13559a40f7a372@gmail.com>
References:
<23948deebbaadf7e0b13559a40f7a372@gmail.com>
Message-ID:
Thank you all for the replies.
I agree that prediction accuracy is great for evaluating black-box ML
models, especially advanced models like neural networks, or not-so-black
models like LASSO, because they are NP-hard to solve.
Linear regression is not a black box. I view prediction accuracy as
overkill on interpretable models, especially when you can use R-squared,
coefficient significance, etc.
Prediction accuracy also does not tell you which feature is important.
What do you guys think? Thank you!
.
On Mon, Jun 3, 2019 at 11:43 AM Andreas Mueller wrote:
> This classical paper on statistical practices (Breiman's "two cultures")
> might be helpful to understand the different viewpoints:
>
> https://projecteuclid.org/euclid.ss/1009213726
>
>
> On 6/3/19 12:19 AM, Brown J.B. via scikit-learn wrote:
>
> As far as I understand: Holding out a test set is recommended if you
>> aren't entirely sure that the assumptions of the model are held (gaussian
>> error on a linear fit; independent and identically distributed samples).
>> The model evaluation approach in predictive ML, using held-out data, relies
>> only on the weaker assumption that the metric you have chosen, when applied
>> to the test set you have held out, forms a reasonable measure of
>> generalised / real-world performance. (Of course this too is often not held
>> in practice, but it is the primary assumption, in my opinion, that ML
>> practitioners need to be careful of.)
>>
>
> Dear CW,
> As Joel has said, holding out a test set will help you evaluate the
> validity of model assumptions, and his last point (reasonable measure of
> generalised performance) is absolutely essential for understanding the
> capabilities and limitations of ML.
>
> To add to your checklist of interpreting ML papers properly, be cautious
> when interpreting reports of high performance when using 5/10-fold or
> Leave-One-Out cross-validation on large datasets, where "large" depends on
> the nature of the problem setting.
> Results are also highly dependent on the distributions of the underlying
> independent variables (e.g., 60000 datapoints all with near-identical
> distributions may yield phenomenal performance in cross-validation and be
> almost non-predictive in truly unknown/prospective situations).
> Even at 500 datapoints, if independent variable distributions look similar
> (with similar endpoints), then when each model is trained on 80% of that
> data, the remaining 20% will certainly be predictable, and repeating that
> five times will yield statistics that seem impressive.
>
> So, again, while problem context completely dictates ML experiment design,
> metric selection, and interpretation of outcome, my personal rule of thumb
> is to do no more than 2-fold cross-validation (50% train, 50% predict) when
> having 100+ datapoints.
> Even more extreme, try using 33% for training and 66% for validation (or
> even 20/80).
> If your model still reports good statistics, then you can believe that the
> patterns in the training data extrapolate well to the ones in the external
> validation data.
>
> Hope this helps,
> J.B.
>
>
>
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>
From jbbrown at kuhp.kyoto-u.ac.jp Tue Jun 4 21:43:09 2019
From: jbbrown at kuhp.kyoto-u.ac.jp (Brown J.B.)
Date: Wed, 5 Jun 2019 10:43:09 +0900
Subject: [scikit-learn] How is linear regression in scikit-learn done?
Do you need train and test split?
In-Reply-To:
References:
<23948deebbaadf7e0b13559a40f7a372@gmail.com>
Message-ID:
Dear CW,
> Linear regression is not a black box. I view prediction accuracy as
> overkill on interpretable models, especially when you can use R-squared,
> coefficient significance, etc.
>
Following on my previous note about being cautious with cross-validated
evaluation for classification, the same applies for regression.
About 20 years ago, chemoinformatics researchers pointed out the caution
needed with using CV-based R^2 (q^2) as a measure of performance.
"Beware of q2!" Golbraikh and Tropsha, J Mol Graph Modeling (2002) 20:269
https://www.sciencedirect.com/science/article/pii/S1093326301001231
In this article, they propose to measure correlation by using both
known-vs-predicted _and_ predicted-vs-known calculations of the correlation
coefficient, and importantly, that the regression line to fit in both cases
goes through the origin.
The resulting coefficients are checked as a pair, and the authors argue
that only if they are both high can one say that the model is fitting the
data well.
Contrast this to Pearson Product Moment Correlation (R), where the fit of
the line has no requirement to go through the origin of the fit.
I found the paper above to be helpful in filtering for more robust
regression models, and have implemented my own version of their method,
which I use as my first evaluation metric when performing regression
modelling.
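A rough sketch of this idea (my own reading of the approach, not the paper's or J.B.'s actual code): compute the through-origin coefficient of determination in both directions and require both to be high.

```python
# Sketch: through-origin R^2 in both directions, in the spirit of the
# Golbraikh & Tropsha check. Data is synthetic and illustrative only.
import numpy as np

def r2_through_origin(x, y):
    """R^2 of the least-squares line y = k*x forced through the origin."""
    k = np.sum(x * y) / np.sum(x * x)      # through-origin slope
    ss_res = np.sum((y - k * x) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

rng = np.random.RandomState(0)
y_true = rng.uniform(0, 10, size=100)
y_pred = y_true + 0.5 * rng.randn(100)     # a reasonably good "model"

r2_fwd = r2_through_origin(y_true, y_pred)  # known vs predicted
r2_bwd = r2_through_origin(y_pred, y_true)  # predicted vs known
print(r2_fwd, r2_bwd)  # both should be high for a well-behaved model
```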
Hope this provides you some thought.
Prediction accuracy also does not tell you which feature is important.
>
The contributions of the scikit-learn community have yielded a great set of
tools for performing feature weighting separate from model performance
evaluation.
All you need to do is read the documentation and try out some of the
examples, and you should be ready to adapt to your situation.
J.B.
From matthieu.brucher at gmail.com Wed Jun 5 02:43:28 2019
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Wed, 5 Jun 2019 07:43:28 +0100
Subject: [scikit-learn] How is linear regression in scikit-learn done?
Do you need train and test split?
In-Reply-To:
References:
<23948deebbaadf7e0b13559a40f7a372@gmail.com>
Message-ID:
Hi CW,
It's not about the concept of the black box; none of the algorithms in
sklearn are a black box. The question is about model validity. Is linear
regression a valid representation of your data? That's what the train/test
answers. You may think so, but only this process will answer it properly.
Matthieu
On Wed, 5 Jun 2019 at 01:46, C W wrote:
> Thank you all for the replies.
>
> I agree that prediction accuracy is great for evaluating black-box ML
> models, especially advanced models like neural networks, or not-so-black
> models like LASSO, because they are NP-hard to solve.
>
> Linear regression is not a black box. I view prediction accuracy as
> overkill on interpretable models, especially when you can use R-squared,
> coefficient significance, etc.
>
> Prediction accuracy also does not tell you which feature is important.
>
> What do you guys think? Thank you!
>
> .
>
> On Mon, Jun 3, 2019 at 11:43 AM Andreas Mueller wrote:
>
>> This classical paper on statistical practices (Breiman's "two cultures")
>> might be helpful to understand the different viewpoints:
>>
>> https://projecteuclid.org/euclid.ss/1009213726
>>
>>
>> On 6/3/19 12:19 AM, Brown J.B. via scikit-learn wrote:
>>
>> As far as I understand: Holding out a test set is recommended if you
>>> aren't entirely sure that the assumptions of the model are held (gaussian
>>> error on a linear fit; independent and identically distributed samples).
>>> The model evaluation approach in predictive ML, using held-out data, relies
>>> only on the weaker assumption that the metric you have chosen, when applied
>>> to the test set you have held out, forms a reasonable measure of
>>> generalised / real-world performance. (Of course this too is often not held
>>> in practice, but it is the primary assumption, in my opinion, that ML
>>> practitioners need to be careful of.)
>>>
>>
>> Dear CW,
>> As Joel has said, holding out a test set will help you evaluate the
>> validity of model assumptions, and his last point (reasonable measure of
>> generalised performance) is absolutely essential for understanding the
>> capabilities and limitations of ML.
>>
>> To add to your checklist of interpreting ML papers properly, be cautious
>> when interpreting reports of high performance when using 5/10-fold or
>> Leave-One-Out cross-validation on large datasets, where "large" depends on
>> the nature of the problem setting.
>> Results are also highly dependent on the distributions of the underlying
>> independent variables (e.g., 60000 datapoints all with near-identical
>> distributions may yield phenomenal performance in cross-validation and be
>> almost non-predictive in truly unknown/prospective situations).
>> Even at 500 datapoints, if independent variable distributions look
>> similar (with similar endpoints), then when each model is trained on 80% of
>> that data, the remaining 20% will certainly be predictable, and repeating
>> that five times will yield statistics that seem impressive.
>>
>> So, again, while problem context completely dictates ML experiment
>> design, metric selection, and interpretation of outcome, my personal rule
>> of thumb is to do no more than 2-fold cross-validation (50% train, 50%
>> predict) when having 100+ datapoints.
>> Even more extreme, try using 33% for training and 66% for validation (or
>> even 20/80).
>> If your model still reports good statistics, then you can believe that
>> the patterns in the training data extrapolate well to the ones in the
>> external validation data.
>>
>> Hope this helps,
>> J.B.
>>
>>
>>
>>
>> _______________________________________________
>> scikit-learn mailing list
>> scikit-learn at python.org
>> https://mail.python.org/mailman/listinfo/scikit-learn
>>
>> _______________________________________________
>> scikit-learn mailing list
>> scikit-learn at python.org
>> https://mail.python.org/mailman/listinfo/scikit-learn
>>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>

Quantitative researcher, Ph.D.
Blog: http://blog.audio-tk.com/
LinkedIn: http://www.linkedin.com/in/matthieubrucher
From jbbrown at kuhp.kyoto-u.ac.jp Wed Jun 5 03:17:16 2019
From: jbbrown at kuhp.kyoto-u.ac.jp (Brown J.B.)
Date: Wed, 5 Jun 2019 16:17:16 +0900
Subject: [scikit-learn] How is linear regression in scikit-learn done?
Do you need train and test split?
In-Reply-To:
References:
<23948deebbaadf7e0b13559a40f7a372@gmail.com>
Message-ID:
On Wed, 5 Jun 2019 at 10:43, Brown J.B. wrote:
> Contrast this to Pearson Product Moment Correlation (R), where the fit of
> the line has no requirement to go through the origin of the fit.
>
Not sure what I was thinking when I wrote that.
Pardon the mistake; I'm fully aware that Pearson R is a coefficient
merely indicating the direction of a trend.
From matthew.brett at gmail.com Wed Jun 5 05:45:17 2019
From: matthew.brett at gmail.com (Matthew Brett)
Date: Wed, 5 Jun 2019 10:45:17 +0100
Subject: [scikit-learn] How is linear regression in scikit-learn done?
Do you need train and test split?
In-Reply-To:
References:
<23948deebbaadf7e0b13559a40f7a372@gmail.com>
Message-ID:
On Wed, Jun 5, 2019 at 8:18 AM Brown J.B. via scikit-learn
wrote:
>
> On Wed, 5 Jun 2019 at 10:43, Brown J.B. wrote:
>>
>> Contrast this to Pearson Product Moment Correlation (R), where the fit of the line has no requirement to go through the origin of the fit.
>
>
> Not sure what I was thinking when I wrote that.
> Pardon the mistake; I'm fully aware that Pearson R is a coefficient merely indicating the direction of a trend.
Ah - now I'm more confused. r is surely a coefficient, but I
personally find it most useful to think of r as the least-squares
regression slope once the x and y values have been transformed to
standard scores. For that case, the least-squares intercept must be
0.
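A quick numerical check of this view (random data, illustrative only): after converting x and y to standard scores, the least-squares slope equals Pearson's r and the intercept vanishes.

```python
# Sketch: on standardized (z-scored) data, the least-squares slope
# equals Pearson's r and the intercept is 0.
import numpy as np

rng = np.random.RandomState(0)
x = rng.randn(200)
y = 0.7 * x + rng.randn(200)

zx = (x - x.mean()) / x.std()
zy = (y - y.mean()) / y.std()

slope, intercept = np.polyfit(zx, zy, 1)  # degree-1 least-squares fit
r = np.corrcoef(x, y)[0, 1]
print(np.allclose(slope, r), np.allclose(intercept, 0.0))
```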
Cheers,
Matthew
From pahome.chen at mirlab.org Wed Jun 5 06:56:35 2019
From: pahome.chen at mirlab.org (lampahome)
Date: Wed, 5 Jun 2019 18:56:35 +0800
Subject: [scikit-learn] Any way to tune threshold of Birch rather than
GridSearchCV?
Message-ID:
I use Birch to cluster my data and my data is kind of time-series data.
I don't know the actual cluster numbers and need to read large
data (online learning), so I choose Birch rather than MiniBatchKMeans.
When I read it, I found the critical parameters might be branching_factor
and threshold, and threshold will affect my cluster numbers obviously!
Any way to estimate the suitable threshold of Birch? Any paper suggestion
is ok.
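There is no closed-form rule in scikit-learn; one pragmatic sketch (synthetic data, illustrative only, not from the thread) is to scan candidate thresholds and inspect the resulting number of clusters and silhouette score:

```python
# Sketch: scan Birch threshold values and inspect how many subclusters
# result and how well-separated they are (silhouette score).
import numpy as np
from sklearn.cluster import Birch
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=300, centers=4, random_state=0)

results = {}
for threshold in (0.3, 0.5, 1.0, 2.0):
    model = Birch(threshold=threshold, n_clusters=None).fit(X)
    labels = model.labels_
    n = len(np.unique(labels))
    results[threshold] = n
    if n > 1:
        print(threshold, n, round(silhouette_score(X, labels), 3))
    else:
        print(threshold, n)
```

Larger thresholds merge more points into each subcluster, so the cluster count shrinks as the threshold grows.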
thx
From t3kcit at gmail.com Wed Jun 5 09:09:08 2019
From: t3kcit at gmail.com (Andreas Mueller)
Date: Wed, 5 Jun 2019 09:09:08 -0400
Subject: [scikit-learn] How is linear regression in scikit-learn done?
Do you need train and test split?
In-Reply-To:
References:
<23948deebbaadf7e0b13559a40f7a372@gmail.com>
Message-ID:
On 6/4/19 8:44 PM, C W wrote:
> Thank you all for the replies.
>
> I agree that prediction accuracy is great for evaluating black-box ML
> models, especially advanced models like neural networks, or
> not-so-black models like LASSO, because they are NP-hard to solve.
>
> Linear regression is not a black box. I view prediction accuracy as
> overkill on interpretable models, especially when you can use
> R-squared, coefficient significance, etc.
>
> Prediction accuracy also does not tell you which feature is important.
>
> What do you guys think? Thank you!
>
Did you read the paper that I sent? ;)
From pahome.chen at mirlab.org Thu Jun 6 03:05:28 2019
From: pahome.chen at mirlab.org (lampahome)
Date: Thu, 6 Jun 2019 15:05:28 +0800
Subject: [scikit-learn] fit before partial_fit?
Message-ID:
I tried MiniBatchKMeans with two orders:
fit -> partial_fit
partial_fit -> partial_fit
The clustering results are different.
What's their difference?
From ahmetcik at fhi-berlin.mpg.de Thu Jun 6 08:56:59 2019
From: ahmetcik at fhi-berlin.mpg.de (ahmetcik)
Date: Thu, 06 Jun 2019 14:56:59 +0200
Subject: [scikit-learn] Normalization in ridge regression when there is no
intercept
Message-ID: <1704d7dd5f25fd34fa23931406b7b846@fhi-berlin.mpg.de>
Hello everyone,
I have just recognized that when using ridge regression without an
intercept, no normalization is performed even if the argument "normalize"
is set to True. Though it is, of course, no problem to manually
normalize the input matrix X, I have become curious whether there was a
special reason not to normalize the data, e.g. the columns of X scaled
(but not centered to have mean zero) to have unit norm so that their
lengths do not affect the outcome.
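A sketch of the manual workaround mentioned above (toy data, illustrative only; the column scaling is folded back into the coefficients afterwards):

```python
# Sketch: scale each column of X to unit norm before fitting a Ridge
# model without an intercept, then undo the scaling on the coefficients.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.RandomState(0)
X = rng.randn(80, 3) * np.array([1.0, 10.0, 100.0])  # very different scales
y = X @ np.array([0.5, 0.05, 0.005]) + 0.1 * rng.randn(80)

norms = np.linalg.norm(X, axis=0)
X_scaled = X / norms  # each column now has unit norm

model = Ridge(alpha=0.01, fit_intercept=False).fit(X_scaled, y)
coef_original_scale = model.coef_ / norms  # fold scaling back in
print(coef_original_scale)
```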
Thanks in advance!
Emre
From vaggi.federico at gmail.com Thu Jun 6 13:06:39 2019
From: vaggi.federico at gmail.com (federico vaggi)
Date: Thu, 6 Jun 2019 10:06:39 0700
Subject: [scikit-learn] fit before partial_fit?
In-Reply-To:
References:
Message-ID:
k-means isn't a convex problem; unless you freeze the initialization, you
are going to get very different solutions (depending on the dataset) with
different initializations.
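A sketch of the difference on synthetic data: `fit` runs its own initialization plus many mini-batch iterations, while each `partial_fit` call performs a single incremental update, so the two call patterns generally end at different local solutions even with the same random_state.

```python
# Sketch: compare MiniBatchKMeans trained with fit() versus a sequence
# of partial_fit() calls over the same (synthetic) data.
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=500, centers=3, random_state=0)

# One-shot: full mini-batch optimization loop
a = MiniBatchKMeans(n_clusters=3, random_state=0, n_init=1).fit(X)

# Incremental: one update per partial_fit call
b = MiniBatchKMeans(n_clusters=3, random_state=0, n_init=1)
for batch in np.array_split(X, 10):
    b.partial_fit(batch)

print(np.sort(a.cluster_centers_, axis=0))
print(np.sort(b.cluster_centers_, axis=0))
```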
On Thu, Jun 6, 2019 at 12:05 AM lampahome wrote:
> I tried MiniBatchKMeans with two orders:
> fit -> partial_fit
> partial_fit -> partial_fit
>
> The clustering results are different.
>
> What's their difference?
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>
From rth.yurchak at pm.me Fri Jun 7 04:13:46 2019
From: rth.yurchak at pm.me (Roman Yurchak)
Date: Fri, 07 Jun 2019 08:13:46 +0000
Subject: [scikit-learn] Normalization in ridge regression when there is
no intercept
In-Reply-To: <1704d7dd5f25fd34fa23931406b7b846@fhi-berlin.mpg.de>
References: <1704d7dd5f25fd34fa23931406b7b846@fhi-berlin.mpg.de>
Message-ID:
On 06/06/2019 14:56, ahmetcik wrote:
> I have just recognized that when using ridge regression without an
> intercept no normalization is performed even if the argument "normalize"
> is set to True.
It's a known long-standing issue:
https://github.com/scikit-learn/scikit-learn/issues/3020
It would indeed be good to find a solution.

Roman
From adrin.jalali at gmail.com Fri Jun 7 10:50:59 2019
From: adrin.jalali at gmail.com (Adrin)
Date: Fri, 7 Jun 2019 18:50:59 +0400
Subject: [scikit-learn] Google code reviews
In-Reply-To:
References:
Message-ID:
Would we need to nominate PRs for them to review, or would they find them
on their own? Either way, we could use a hand and extra eyes, why not.
On Sat., May 25, 2019, 16:10 Joel Nothman, wrote:
> For some of the larger PRs, this might be helpful. Not going to help where
> the intricacies of the scikit-learn API come into play.
>
> On Sat, 25 May 2019 at 04:17, Andreas Mueller wrote:
>
>> Hi All.
>> What do you think of https://www.pullrequest.com/googleserve/?
>> It's sponsored code reviews. Could be interesting, right?
>>
>> Best,
>> Andy
>> _______________________________________________
>> scikit-learn mailing list
>> scikit-learn at python.org
>> https://mail.python.org/mailman/listinfo/scikit-learn
>>
> _______________________________________________
> scikitlearn mailing list
> scikitlearn at python.org
> https://mail.python.org/mailman/listinfo/scikitlearn
>
From t3kcit at gmail.com Fri Jun 7 11:21:10 2019
From: t3kcit at gmail.com (Andreas Mueller)
Date: Fri, 7 Jun 2019 11:21:10 -0400
Subject: [scikit-learn] Google code reviews
In-Reply-To:
References:
Message-ID: <284ec1fe2aece6b6373649934f06a126@gmail.com>
I think they might actually review the existing code base? But I'm not
entirely sure. We can also nominate PRs, I think.
On 6/7/19 10:50 AM, Adrin wrote:
> Would we need to nominate PRs for them to review, or would they find
> them on their own? Either case, could use a hand and extra eyes, why not
From ericjvandervelden at gmail.com Sat Jun 8 05:34:25 2019
From: ericjvandervelden at gmail.com (Eric J. Van der Velden)
Date: Sat, 8 Jun 2019 11:34:25 +0200
Subject: [scikit-learn] LogisticRegression
Message-ID:
Hello,
I am learning sklearn from my book by Geron. On page 137 he trains the
model of petal widths.
When I implement logistic regression myself, as I learned from my Coursera
course and from my book by Bishop, I find that the cost function is
minimal at the following parameters:
In [6219]: w
Out[6219]:
array([[-21.12563996],
       [ 12.94750716]])
I used Gradient Descent and Newton-Raphson; both give the same answer.
My question is: how can I see after fit() which parameters
LogisticRegression() has found?
One other question: when I read the documentation page,
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression,
I see a different cost function than in the books.
Thanks.
From ericjvandervelden at gmail.com Sat Jun 8 13:56:39 2019
From: ericjvandervelden at gmail.com (Eric J. Van der Velden)
Date: Sat, 8 Jun 2019 19:56:39 +0200
Subject: [scikit-learn] LogisticRegression
In-Reply-To:
References:
Message-ID:
Here I have added what I had programmed.
With sklearn's LogisticRegression(), how can I see the parameters it has
found after .fit(), where the cost is minimal? I use the book by Geron about
scikit-learn and tensorflow, and on page 137 he trains the model of petal
widths. I did the following:
iris=datasets.load_iris()
a1=iris['data'][:,3:]
y=(iris['target']==2).astype(int)
log_reg=LogisticRegression()
log_reg.fit(a1,y)
log_reg.coef_
array([[2.61727777]])
log_reg.intercept_
array([-4.2209364])
I did the logistic regression myself with Gradient Descent and with
Newton-Raphson, as I learned from my Coursera course and from my book by
Bishop, respectively. I used the Gradient Descent method like so:
import numpy as np
from sklearn import datasets
from scipy.special import expit
iris=datasets.load_iris()
a1=iris['data'][:,3:]
A1=np.c_[np.ones((150,1)),a1]
y=(iris['target']==2).astype(int).reshape(-1,1)
lmda=1
def logreg_gd(w):
    z2=A1.dot(w)
    a2=expit(z2)
    delta2=a2-y
    w=w-(lmda/len(a1))*A1.T.dot(delta2)
    return w
w=np.array([[0],[0]])
for i in range(0,100000):
    w=logreg_gd(w)
In [6219]: w
Out[6219]:
array([[-21.12563996],
       [ 12.94750716]])
I used Newton-Raphson like so (see Bishop page 207); here the targets are
called t:
from sklearn import datasets
iris=datasets.load_iris()
a1=iris['data'][:,3:]
A1=np.c_[np.ones(len(a1)),a1]
t=(iris['target']==2).astype(int).reshape(-1,1)
def logreg_nr(w):
    z1=A1.dot(w)
    y=expit(z1)
    R=np.diag((y*(1-y))[:,0])
    H=A1.T.dot(R).dot(A1)
    tmp=A1.dot(w)-np.linalg.inv(R).dot(y-t)
    v=np.linalg.inv(H).dot(A1.T).dot(R).dot(tmp)
    return v
w=np.array([[0],[0]])
for i in range(0,10):
    w=logreg_nr(w)
In [5149]: w
Out[5149]:
array([[-21.12563996],
       [ 12.94750716]])
Notice how much faster Newton-Raphson goes than Gradient Descent. But they
give the same result.
How can I see which parameters LogisticRegression() found? And should I
give LogisticRegression other parameters?
On Sat, Jun 8, 2019 at 11:34 AM Eric J. Van der Velden <
ericjvandervelden at gmail.com> wrote:
From pahome.chen at mirlab.org Sun Jun 9 21:18:28 2019
From: pahome.chen at mirlab.org (lampahome)
Date: Mon, 10 Jun 2019 09:18:28 +0800
Subject: [scikit-learn] Tune parameters when I need to load data segment by
segment?
Message-ID:
As the title says,
I have one huge dataset to load, so I need to train incrementally.
So I load the data segment by segment and train segment by segment, like
MiniBatchKMeans.
In that case, how do I tune parameters? Tune on the first part of the data,
or on every part?
From pahome.chen at mirlab.org Sun Jun 9 22:10:53 2019
From: pahome.chen at mirlab.org (lampahome)
Date: Mon, 10 Jun 2019 10:10:53 +0800
Subject: [scikit-learn] fit before partial_fit ?
In-Reply-To:
References:
Message-ID:
On Fri, Jun 7, 2019 at 1:08 PM, federico vaggi wrote:
> kmeans isn't a convex problem, unless you freeze the initialization, you
> are going to get very different solutions (depending on the dataset) with
> different initializations.
>
>
Nope, I specified random_state=0. You can try it:
>>> x = np.array([[1,2],[2,3]])
>>> y = np.array([[3,4],[4,5],[5,6]])
>>> z = np.append(x,y, axis=0)
>>> from sklearn.cluster import MiniBatchKMeans as MBK
>>> m = MBK(random_state=0, n_clusters=2)
>>> m.fit(x) ; m.labels_
array([1, 0], dtype=int32)    <-- (1a)
>>> m.partial_fit(y) ; m.labels_
array([0, 0, 0], dtype=int32)    <-- (1b)
>>> m = MBK(random_state=0, n_clusters=2)
>>> m.partial_fit(x) ; m.labels_
array([0, 1], dtype=int32)    <-- (2a)
>>> m.partial_fit(y) ; m.labels_
array([1, 1, 1], dtype=int32)    <-- (2b)
(1a), (1b) and (2a), (2b) are all different, especially the members of each
cluster.
I'm just confused about which usage of partial_fit and fit is the
suitable (reasonable?) way to cluster incrementally.
thx
From christian.braune79 at gmail.com Mon Jun 10 00:25:24 2019
From: christian.braune79 at gmail.com (Christian Braune)
Date: Mon, 10 Jun 2019 06:25:24 +0200
Subject: [scikit-learn] fit before partial_fit ?
In-Reply-To:
References:
Message-ID:
The clusters produced by your examples are actually the same (despite the
different labels).
I'd guess that "fit" and "partial_fit" draw a different amount of
random numbers before actually assigning a label to the first (randomly
drawn) sample from "x" (in your code). This is why the labeling is
permuted.
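That guess can be checked directly by comparing the partitions the two estimators induce, rather than the raw label integers. A small sketch using adjusted_rand_score, which is invariant to relabeling (it equals 1.0 exactly when the two label vectors describe the same partition):

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.metrics import adjusted_rand_score

x = np.array([[1, 2], [2, 3]], dtype=float)
y = np.array([[3, 4], [4, 5], [5, 6]], dtype=float)
z = np.vstack([x, y])

m1 = MiniBatchKMeans(n_clusters=2, random_state=0, n_init=1)
m1.fit(x)
m1.partial_fit(y)

m2 = MiniBatchKMeans(n_clusters=2, random_state=0, n_init=1)
m2.partial_fit(x)
m2.partial_fit(y)

# Compare the partitions of the combined data, ignoring label names.
ari = adjusted_rand_score(m1.predict(z), m2.predict(z))
```

If ari comes out as 1.0, the two fits agree up to a permutation of the label integers.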
Best regards
Christian
On Mon, 10 Jun 2019 at 04:12, lampahome wrote:
From alexandre.gramfort at inria.fr Mon Jun 10 03:16:17 2019
From: alexandre.gramfort at inria.fr (Alexandre Gramfort)
Date: Mon, 10 Jun 2019 09:16:17 +0200
Subject: [scikit-learn] Difference in normalization between Lasso and
LogisticRegression + L1
In-Reply-To:
References:
Message-ID:
See https://github.com/scikit-learn/scikit-learn/pulls?utf8=%E2%9C%93&q=is%3Apr+is%3Aclosed+scale_C+
for a historical perspective on this issue.
Alex
On Wed, May 29, 2019 at 11:32 PM Stuart Reynolds
wrote:
>
> I looked into this a while ago. There were differences in which algorithms regularize the intercept, and which ones do not. (I believe liblinear does, lbfgs does not.)
> All of the algorithms disagreed with logistic regression in scipy.
>
>  Stuart
>
> On Wed, May 29, 2019 at 10:50 AM Andreas Mueller wrote:
>>
>> That is not very ideal indeed.
>> I think we just went with what liblinear did, and when saga was introduced kept that behavior.
>> It should probably be scaled as in Lasso, I would imagine?
>>
>>
>> On 5/29/19 1:42 PM, Michael Eickenberg wrote:
>>
>> Hi Jesse,
>>
>> I think there was an effort to compare normalization methods on the data attachment term between Lasso and Ridge regression back in 2012/13, but this might not have been finished or extended to Logistic Regression.
>>
>> If it is not documented well, it could definitely benefit from a documentation update.
>>
>> As for changing it to a more consistent state, that would require adding a keyword argument pertaining to this functionality and, after discussion, possibly changing the default value after some deprecation cycles (though this seems like a dangerous one to change at all imho).
>>
>> Michael
>>
>>
>> On Wed, May 29, 2019 at 10:38 AM Jesse Livezey wrote:
>>>
>>> Hi everyone,
>>>
>>> I noticed recently that in the Lasso implementation (and docs), the MSE term is normalized by the number of samples:
>>> https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Lasso.html
>>>
>>> but for LogisticRegression + L1, the log-loss does not seem to be normalized by the number of samples. One consequence is that the strength of the regularization depends on the number of samples explicitly. For instance, in Lasso, if you tile a dataset N times, you will learn the same coef, but in LogisticRegression, you will learn a different coef.
>>>
>>> Is this the intended behavior of LogisticRegression? I was surprised by this. Either way, it would be helpful to document this more clearly in the LogisticRegression docs (I can make a PR.)
>>> https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html
>>>
>>> Jesse
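Jesse's tiling observation above is easy to reproduce. A sketch (the dataset and the alpha/C values are arbitrary): Lasso divides its squared-error term by n_samples, so duplicating every sample leaves the optimum unchanged, while LogisticRegression's penalized log-loss sums over samples, so tiling effectively weakens the regularization.

```python
import numpy as np
from sklearn.linear_model import Lasso, LogisticRegression

rng = np.random.RandomState(0)
X = rng.randn(200, 5)
w_true = np.array([1.5, -2.0, 0.0, 0.0, 3.0])
y_reg = X @ w_true + rng.randn(200)
y_clf = (y_reg > 0).astype(int)

# Tile the dataset 10 times: every sample simply appears 10 times.
Xt = np.tile(X, (10, 1))
y_reg_t, y_clf_t = np.tile(y_reg, 10), np.tile(y_clf, 10)

lasso = Lasso(alpha=0.1).fit(X, y_reg).coef_
lasso_t = Lasso(alpha=0.1).fit(Xt, y_reg_t).coef_
# same coefficients: the MSE term is normalized by n_samples

lr = LogisticRegression(C=0.1, max_iter=1000).fit(X, y_clf).coef_
lr_t = LogisticRegression(C=0.1, max_iter=1000).fit(Xt, y_clf_t).coef_
# different coefficients: the loss sum grew 10x against a fixed penalty
```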
From pahome.chen at mirlab.org Mon Jun 10 06:58:01 2019
From: pahome.chen at mirlab.org (lampahome)
Date: Mon, 10 Jun 2019 18:58:01 +0800
Subject: [scikit-learn] How to tune parameters when using partial_fit
Message-ID:
As the title says,
I try to cluster a huge dataset, but I don't know how to tune parameters
when clustering.
If it were a small dataset, I could use GridSearchCV, but how do I do it
with huge data?
thx
From t3kcit at gmail.com Mon Jun 10 13:23:21 2019
From: t3kcit at gmail.com (Andreas Mueller)
Date: Mon, 10 Jun 2019 13:23:21 -0400
Subject: [scikit-learn] How to tune parameters when using partial_fit
In-Reply-To:
References:
Message-ID: <4b70d51d29ddfcc3822c6d6074344935@gmail.com>
There's no built-in way to do that with scikit-learn right now, sorry.
On 6/10/19 6:58 AM, lampahome wrote:
From ahowe42 at gmail.com Tue Jun 11 04:07:54 2019
From: ahowe42 at gmail.com (Andrew Howe)
Date: Tue, 11 Jun 2019 09:07:54 +0100
Subject: [scikit-learn] LogisticRegression
In-Reply-To:
References:
Message-ID:
The coef_ attribute of the LogisticRegression object stores the parameters.
Andrew
<~~~~~~~~~~~~~~~~~~~~~~~~~~~>
J. Andrew Howe, PhD
LinkedIn Profile
ResearchGate Profile
Open Researcher and Contributor ID (ORCID)
Github Profile
Personal Website
I live to learn, so I can learn to live. - me
<~~~~~~~~~~~~~~~~~~~~~~~~~~~>
On Sat, Jun 8, 2019 at 6:58 PM Eric J. Van der Velden <
ericjvandervelden at gmail.com> wrote:
From pahome.chen at mirlab.org Tue Jun 11 04:38:07 2019
From: pahome.chen at mirlab.org (lampahome)
Date: Tue, 11 Jun 2019 16:38:07 +0800
Subject: [scikit-learn] How to tune parameters when using partial_fit
In-Reply-To: <4b70d51d29ddfcc3822c6d6074344935@gmail.com>
References:
<4b70d51d29ddfcc3822c6d6074344935@gmail.com>
Message-ID:
I know there's no built-in way to tune parameters batch by batch.
I'm curious whether there is any suitable/general way to tune parameters
batch by batch,
because the distribution is not easy to know when the dataset is too large
to load into memory.
From jbbrown at kuhp.kyoto-u.ac.jp Tue Jun 11 05:34:07 2019
From: jbbrown at kuhp.kyoto-u.ac.jp (Brown J.B.)
Date: Tue, 11 Jun 2019 18:34:07 +0900
Subject: [scikit-learn] How to tune parameters when using partial_fit
In-Reply-To:
References:
<4b70d51d29ddfcc3822c6d6074344935@gmail.com>
Message-ID:
>
> I'm curious about is there any suitable/general way to tune parameters
> batch by batch?
> Because the distribution is not easy to know when the dataset is too large
> to load into memory.
>
Repeated subsampling to estimate a distribution is one alternative.
Not guaranteed to match the global distribution, but you should get a
reasonable estimate with enough repetitions.
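A rough sketch of that idea (the file name and on-disk .npy layout are assumptions for illustration, not anything scikit-learn provides): each repetition reads only a small random batch from disk, so memory use stays bounded.

```python
import numpy as np

def subsample_estimate(path, n_repeats=50, batch=1000, seed=0):
    """Estimate per-feature mean/spread from repeated random batches."""
    X = np.load(path, mmap_mode='r')   # maps the file lazily, low memory
    rng = np.random.RandomState(seed)
    estimates = []
    for _ in range(n_repeats):
        idx = rng.choice(X.shape[0], size=batch, replace=False)
        estimates.append(np.asarray(X[idx]).mean(axis=0))
    est = np.vstack(estimates)
    # The mean over repetitions approximates the global statistic; the
    # spread across repetitions indicates how trustworthy the estimate is.
    return est.mean(axis=0), est.std(axis=0)
```

The same loop structure can drive hyperparameter search: evaluate each candidate setting on several random batches with a partial_fit estimator and keep the setting that scores best on average.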
From ericjvandervelden at gmail.com Tue Jun 11 11:47:09 2019
From: ericjvandervelden at gmail.com (Eric J. Van der Velden)
Date: Tue, 11 Jun 2019 17:47:09 +0200
Subject: [scikit-learn] LogisticRegression
In-Reply-To:
References:
Message-ID:
Hi Nicolas, Andrew,
Thanks!
I found out that it is the regularization term. Sklearn always has that
term. When I program logistic regression with that term too, with
\lambda=1, I get exactly the same answer as sklearn when I look at the
parameters you gave me.
The question is why sklearn always has that term in logistic regression. If
you have enough data, do you need a regularization term?
On Tue, 11 Jun 2019 at 10:08, Andrew Howe wrote:
From t3kcit at gmail.com Tue Jun 11 14:47:57 2019
From: t3kcit at gmail.com (Andreas Mueller)
Date: Tue, 11 Jun 2019 14:47:57 -0400
Subject: [scikit-learn] LogisticRegression
In-Reply-To:
References:
Message-ID: <295e5a02def76d854a170874e191328e@gmail.com>
On 6/11/19 11:47 AM, Eric J. Van der Velden wrote:
> Hi Nicolas, Andrew,
>
> Thanks!
>
> I found out that it is the regularization term. Sklearn always has
> that term. When I program logistic regression with that term too, with
> \lambda=1, I get exactly the same answer as sklearn, when I look at
> the parameters you gave me.
>
> Question is why sklearn always has that term in logistic regression.
> If you have enough data, do you need a regularization term?
It's equivalent to setting C to a high value.
We now allow penalty='none' in LogisticRegression, see
https://github.com/scikit-learn/scikit-learn/pull/12860
I opened an issue on improving the docs:
https://github.com/scikit-learn/scikit-learn/issues/14070
feel free to make suggestions there.
There's more discussion here as well:
https://github.com/scikit-learn/scikit-learn/issues/6738
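The large-C equivalence is easy to see on the petal-width example from this thread. A sketch using C=1e6 to approximate "no penalty" (this works across scikit-learn versions, whereas the spelling of the no-penalty option has changed over time, from the string 'none' to Python's None):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

iris = load_iris()
X = iris['data'][:, 3:]                    # petal width only
y = (iris['target'] == 2).astype(int)

reg = LogisticRegression(C=1.0, max_iter=1000).fit(X, y)    # default L2
unreg = LogisticRegression(C=1e6, max_iter=1000).fit(X, y)  # ~no penalty

# With the penalty effectively gone, the weights are free to grow toward
# the unregularized maximum-likelihood solution.
print(reg.coef_, reg.intercept_)
print(unreg.coef_, unreg.intercept_)
```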
From ericjvandervelden at gmail.com Wed Jun 12 00:18:47 2019
From: ericjvandervelden at gmail.com (Eric J. Van der Velden)
Date: Wed, 12 Jun 2019 06:18:47 +0200
Subject: [scikit-learn] LogisticRegression
In-Reply-To: <295e5a02def76d854a170874e191328e@gmail.com>
References:
<295e5a02def76d854a170874e191328e@gmail.com>
Message-ID:
Thanks!
On Tue, 11 Jun 2019 at 20:48, Andreas Mueller wrote:
>
>
> On 6/11/19 11:47 AM, Eric J. Van der Velden wrote:
>
> Hi Nicolas, Andrew,
>
> Thanks!
>
> I found out that it is the regularization term. Sklearn always has that
> term. When I program logistic regression with that term too, with
> \lambda=1, I get exactly the same answer as sklearn, when I look at the
> parameters you gave me.
>
> Question is why sklearn always has that term in logistic regression. If
> you have enough data, do you need a regularization term?
>
> It's equivalent to setting C to a high value.
> We now allow penalty='none' in logisticregression, see
> https://github.com/scikitlearn/scikitlearn/pull/12860
>
> I opened an issue on improving the docs:
> https://github.com/scikitlearn/scikitlearn/issues/14070
>
> feel free to make suggestions there.
>
> There's more discussion here as well:
> https://github.com/scikitlearn/scikitlearn/issues/6738
>
>
>
> Op di 11 jun. 2019 10:08 schreef Andrew Howe :
>
>> The coef_ attribute of the LogisticRegression object stores the
>> parameters.
>>
>> Andrew
>>
>> <~~~~~~~~~~~~~~~~~~~~~~~~~~~>
>> J. Andrew Howe, PhD
>> LinkedIn Profile
>> ResearchGate Profile
>> Open Researcher and Contributor ID (ORCID)
>>
>> Github Profile
>> Personal Website
>> I live to learn, so I can learn to live.  me
>> <~~~~~~~~~~~~~~~~~~~~~~~~~~~>
>>
>>
>> On Sat, Jun 8, 2019 at 6:58 PM Eric J. Van der Velden <
>> ericjvandervelden at gmail.com> wrote:
>>
>>> Here I have added what I had programmed.
>>>
>>> With sklearn's LogisticRegression(), how can I see the parameters it has
>>> found after .fit() where the cost is minimal? I use the book of Geron about
>>> scikitlearn and tensorflow and on page 137 he trains the model of petal
>>> widths. I did the following:
>>>
>>> iris=datasets.load_iris()
>>> a1=iris['data'][:,3:]
>>> y=(iris['target']==2).astype(int)
>>> log_reg=LogisticRegression()
>>> log_reg.fit(a1,y)
>>>
>>> log_reg.coef_
>>> array([[2.61727777]])
>>> log_reg.intercept_
>>> array([4.2209364])
>>>
>>>
>>> I did the logistic regression myself with Gradient Descent or
>>> NewtonRaphson as I learned from my Coursera course and respectively from
>>> my book of Bishop. I used the Gradient Descent method like so:
>>>
>>> from sklearn import datasets
>>> iris=datasets.load_iris()
>>> a1=iris['data'][:,3:]
>>> A1=np.c_[np.ones((150,1)),a1]
>>> y=(iris['target']==2).astype(int).reshape(1,1)
>>> lmda=1
>>>
>>> from scipy.special import expit
>>>
>>> def logreg_gd(w):
>>> z2=A1.dot(w)
>>> a2=expit(z2)
>>> delta2=a2y
>>> w=w(lmda/len(a1))*A1.T.dot(delta2)
>>> return w
>>>
>>> w=np.array([[0],[0]])
>>> for i in range(0,100000):
>>> w=logreg_gd(w)
>>>
>>> In [6219]: w
>>> Out[6219]:
>>> array([[21.12563996],
>>> [ 12.94750716]])
>>>
>>> I used NewtonRaphson like so, see Bishop page 207,
>>>
>>> import numpy as np
>>> from sklearn import datasets
>>> from scipy.special import expit
>>>
>>> iris=datasets.load_iris()
>>> a1=iris['data'][:,3:]
>>> A1=np.c_[np.ones(len(a1)),a1]
>>> t=(iris['target']==2).astype(int).reshape(-1,1)
>>>
>>> def logreg_nr(w):
>>>     z1=A1.dot(w)
>>>     y=expit(z1)
>>>     R=np.diag((y*(1-y))[:,0])
>>>     H=A1.T.dot(R).dot(A1)
>>>     tmp=A1.dot(w)-np.linalg.inv(R).dot(y-t)
>>>     v=np.linalg.inv(H).dot(A1.T).dot(R).dot(tmp)
>>>     return v
>>>
>>> w=np.array([[0],[0]])
>>> for i in range(0,10):
>>>     w=logreg_nr(w)
>>>
>>> In [5149]: w
>>> Out[5149]:
>>> array([[-21.12563996],
>>> [ 12.94750716]])
>>>
>>> Notice how much faster Newton-Raphson converges than Gradient Descent
>>> (10 iterations versus 100,000). But they give the same result.
>>>
>>> How can I see which parameters LogisticRegression() found? And should
>>> I give LogisticRegression other parameters?
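A likely source of the discrepancy between these hand-rolled fits and LogisticRegression: scikit-learn applies L2 regularization by default (C=1.0), while the Gradient Descent and Newton-Raphson code above is unregularized. A minimal sketch of checking this (the exact numbers depend on the scikit-learn version and solver, so treat any printed values as approximate):

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LogisticRegression

iris = datasets.load_iris()
a1 = iris['data'][:, 3:]               # petal width only
y = (iris['target'] == 2).astype(int)  # Iris-Virginica vs. rest

# Default fit: L2-penalized with C=1.0
log_reg = LogisticRegression(solver='lbfgs')
log_reg.fit(a1, y)

# A very large C makes the penalty negligible, so the coefficients should
# approach the unregularized Gradient Descent / Newton-Raphson solution
log_reg_unreg = LogisticRegression(C=1e10, solver='lbfgs', max_iter=10000)
log_reg_unreg.fit(a1, y)

print(log_reg.coef_, log_reg.intercept_)              # regularized
print(log_reg_unreg.coef_, log_reg_unreg.intercept_)  # near-unregularized
```

With the penalty shrunk away, coef_ and intercept_ should sit close to the w found by the manual solvers; with the default penalty they are pulled toward zero.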
>>>
>>> On Sat, Jun 8, 2019 at 11:34 AM Eric J. Van der Velden <
>>> ericjvandervelden at gmail.com> wrote:
>>>
>>>> Hello,
>>>>
>>>> I am learning sklearn from Geron's book. On page 137 he fits a model on
>>>> petal widths.
>>>>
>>>> When I implement logistic regression myself, as I learned from my
>>>> Coursera course or from Bishop's book, I find that the following
>>>> parameters minimize the cost function:
>>>>
>>>> In [6219]: w
>>>> Out[6219]:
>>>> array([[-21.12563996],
>>>> [ 12.94750716]])
>>>>
>>>> I used Gradient Descent and Newton-Raphson; both give the same answer.
>>>>
>>>> My question is: how can I see after fit() which parameters
>>>> LogisticRegression() has found?
>>>>
>>>> One other question: when I read the documentation page,
>>>> https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression,
>>>> I see a different cost function than the one in the books.
>>>>
>>>> Thanks.
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>> scikit-learn mailing list
>>> scikit-learn at python.org
>>> https://mail.python.org/mailman/listinfo/scikit-learn
>>>
>> _______________________________________________
>> scikit-learn mailing list
>> scikit-learn at python.org
>> https://mail.python.org/mailman/listinfo/scikit-learn
>>
>
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>
 next part 
An HTML attachment was scrubbed...
URL:
From tmrsg11 at gmail.com Wed Jun 12 14:36:42 2019
From: tmrsg11 at gmail.com (C W)
Date: Wed, 12 Jun 2019 14:36:42 -0400
Subject: [scikit-learn] How is linear regression in scikit-learn done?
 Do you need train and test split?
In-Reply-To:
References:
 <23948deebbaadf7e0b13559a40f7a372@gmail.com>
Message-ID:
Thank you both for the paper references.
@ Andreas,
What is your take? And what are you implying?
The Breiman (2001) paper contrasts the black-box and statistical approaches.
I call them black box vs. open box. He advocates the black box in the paper.
Black box:
y <- nature <- x
Open box:
y <- linear regression <- x
Decision trees and neural nets are black-box models. They require large
amounts of data to train, and skip the part where one tries to understand
nature.
Because it is a black box, you can't open it up to see what's inside. Linear
regression is a very simple model that you can use to approximate nature,
but the key thing is that you need to know how the data are generated.
@ Brown,
I know nothing about molecular modeling. The "Beware of q2!" paper you
linked raises some interesting points. As far as I can see, in sklearn
linear regression, score is R^2.
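For what it's worth, LinearRegression.score() is indeed the coefficient of determination R^2, the same number sklearn.metrics.r2_score computes; a quick sketch on toy data:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

rng = np.random.RandomState(0)
X = rng.rand(50, 2)
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.randn(50)

model = LinearRegression().fit(X, y)

# score() returns R^2 on the given data, identical to r2_score
r2_from_score = model.score(X, y)
r2_from_metric = r2_score(y, model.predict(X))
print(r2_from_score, r2_from_metric)
```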
On Wed, Jun 5, 2019 at 9:11 AM Andreas Mueller wrote:
>
> On 6/4/19 8:44 PM, C W wrote:
> > Thank you all for the replies.
> >
> > I agree that prediction accuracy is great for evaluating black-box ML
> > models. Especially advanced models like neural networks, or
> > not-so-black models like LASSO, because they are NP-hard to solve.
> >
> > Linear regression is not a black box. I view prediction accuracy as an
> > overkill on interpretable models. Especially when you can use
> > R-squared, coefficient significance, etc.
> >
> > Prediction accuracy also does not tell you which feature is important.
> >
> > What do you guys think? Thank you!
> >
> Did you read the paper that I sent? ;)
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>
 next part 
An HTML attachment was scrubbed...
URL:
From t3kcit at gmail.com Thu Jun 13 10:41:39 2019
From: t3kcit at gmail.com (Andreas Mueller)
Date: Thu, 13 Jun 2019 10:41:39 -0400
Subject: [scikit-learn] How is linear regression in scikit-learn done?
 Do you need train and test split?
In-Reply-To:
References:
 <23948deebbaadf7e0b13559a40f7a372@gmail.com>
Message-ID: <5ca8e766a243fddba366a6cb1a95bd2b@gmail.com>
He doesn't only talk about black box vs. statistical; he talks about
model-based vs. prediction-based.
He says that if you validate predictions, you don't need to
(necessarily) worry about model misspecification.
A linear regression model can be misspecified, and it can be overfit.
Just fitting the model will not inform you whether either of these is
the case.
Because the model is simple and well understood, there are ways to check
for model misspecification and overfitting.
A train-test split doesn't exactly tell you whether the model is
misspecified (errors could be non-normal and prediction could still be
good), but it gives you an idea of whether the model is "useful".
Basically: you need to validate whatever you did. There are model-based
approaches and there are prediction-based approaches.
Prediction-based approaches are always applicable; model-based
approaches are usually more limited and harder to do (but if you find a
good model, you get a model of the process, which is great!). But you
need to pick at least one of the two approaches.
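The prediction-based route can be as little as cross-validated R^2 for a linear model; a sketch on synthetic data (not any dataset from this thread):

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic regression problem as a stand-in
X, y = make_regression(n_samples=200, n_features=5, noise=10.0,
                       random_state=0)

# 5-fold cross-validated R^2: estimates whether the model is "useful"
# for prediction, without assuming the usual error model holds
scores = cross_val_score(LinearRegression(), X, y, cv=5, scoring='r2')
print(scores.mean(), scores.std())
```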
On 6/12/19 2:36 PM, C W wrote:
> Thank you both for the paper references.
>
> @ Andreas,
> What is your take? And what are you implying?
>
> The Breiman (2001) paper contrasts the black-box and statistical
> approaches. I call them black box vs. open box. He advocates the black box
> in the paper.
> Black box:
> y <- nature <- x
>
> Open box:
> y <- linear regression <- x
>
> Decision trees and neural nets are black-box models. They require large
> amounts of data to train, and skip the part where one tries to
> understand nature.
>
> Because it is a black box, you can't open it up to see what's inside.
> Linear regression is a very simple model that you can use to
> approximate nature, but the key thing is that you need to know how the
> data are generated.
>
> @ Brown,
> I know nothing about molecular modeling. The "Beware of q2!" paper you
> linked raises some interesting points. As far as I can see, in
> sklearn linear regression, score is R^2.
>
On Wed, Jun 5, 2019 at 9:11 AM Andreas Mueller wrote:
>
>
> On 6/4/19 8:44 PM, C W wrote:
> > Thank you all for the replies.
> >
> > I agree that prediction accuracy is great for evaluating black-box ML
> > models. Especially advanced models like neural networks, or
> > not-so-black models like LASSO, because they are NP-hard to solve.
> >
> > Linear regression is not a black box. I view prediction accuracy as an
> > overkill on interpretable models. Especially when you can use
> > R-squared, coefficient significance, etc.
> >
> > Prediction accuracy also does not tell you which feature is important.
> >
> > What do you guys think? Thank you!
> >
> Did you read the paper that I sent? ;)
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
 next part 
An HTML attachment was scrubbed...
URL:
From np.dong572 at gmail.com Thu Jun 13 11:03:48 2019
From: np.dong572 at gmail.com (Naiping Dong)
Date: Thu, 13 Jun 2019 23:03:48 +0800
Subject: [scikit-learn] Concatenate posterior probabilities of different
 datasets obtained from different models
Message-ID:
Hi all,
I have several small datasets, each composed of two classes. The
posterior probabilities of the different datasets are predicted by
different models, constructed either from different estimators that have
the "predict_proba" attribute or from the same algorithm trained on
different training data. I wonder whether there is a method to
concatenate these probabilities into a single array, so that I can do
some inference from a much larger number of probabilities.
Thanks in advance.
Best regards,
Elkan
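Assuming every model exposes predict_proba and the datasets share the same two classes in the same column order (check each clf.classes_), one sketch is simply to stack the per-dataset probability arrays with NumPy (the estimators and data below are hypothetical stand-ins):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

# Two hypothetical small two-class datasets, each with its own model
fitted = []
for seed, Model in [(0, LogisticRegression), (1, RandomForestClassifier)]:
    X, y = make_classification(n_samples=40, random_state=seed)
    fitted.append((Model().fit(X, y), X))

# predict_proba returns shape (n_samples, 2); vstack concatenates the rows
all_proba = np.vstack([clf.predict_proba(X) for clf, X in fitted])
print(all_proba.shape)
```

If the class order differs between models, reorder columns via clf.classes_ before stacking, otherwise the concatenated columns will not refer to the same class.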
 next part 
An HTML attachment was scrubbed...
URL:
From wendley at ufc.br Mon Jun 17 09:27:27 2019
From: wendley at ufc.br (Wendley Silva)
Date: Mon, 17 Jun 2019 10:27:27 -0300
Subject: [scikit-learn] How use get_depth
Message-ID:
Hi all,
I tried several ways to use the get_depth() method of
DecisionTreeRegressor, but I always get the same error:
self.clf.*get_depth()*
AttributeError: *'DecisionTreeRegressor' object has no attribute
'get_depth'*
I searched the internet and found no solution. Any idea how to use it
correctly?
*Description of get_depth():*
https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeRegressor.html
Thanks in advance.
Best,
*Wendley S. Silva*
Universidade Federal do Ceará - Brasil
+55 (88) 3695.4608
wendley at ufc.br
www.ec.ufc.br/wendley
Rua Cel. Estanislau Frota, 563, Centro, Sobral-CE, Brasil - CEP 62.010-560
 next part 
An HTML attachment was scrubbed...
URL:
From jbbrown at kuhp.kyoto-u.ac.jp Mon Jun 17 09:41:08 2019
From: jbbrown at kuhp.kyoto-u.ac.jp (Brown J.B.)
Date: Mon, 17 Jun 2019 22:41:08 +0900
Subject: [scikit-learn] How use get_depth
In-Reply-To:
References:
Message-ID:
Perhaps you mean:
DecisionTreeRegressor.tree_.max_depth , where DecisionTreeRegressor.tree_
is available after calling fit() ?
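A sketch of that workaround on a toy regressor (tree_ is only populated after fit(), and this works on releases that predate get_depth()):

```python
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=50, n_features=3, random_state=0)
reg = DecisionTreeRegressor(max_depth=4, random_state=0)
reg.fit(X, y)

# The fitted Tree object exposes the depth directly
depth = reg.tree_.max_depth
print(depth)
```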
On Mon, Jun 17, 2019 at 22:29 Wendley Silva wrote:
> Hi all,
>
> I tried several ways to use the get_depth() method of
> DecisionTreeRegressor, but I always get the same error:
>
> self.clf.*get_depth()*
> AttributeError: *'DecisionTreeRegressor' object has no attribute
> 'get_depth'*
>
> I researched the internet and found no solution. Any idea how to use it
> correctly?
>
> *Description of get_depth():*
>
> https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeRegressor.html
>
> Thanks in advance.
>
> Best,
> *Wendley S. Silva*
> Universidade Federal do Ceará - Brasil
>
> +55 (88) 3695.4608
> wendley at ufc.br
> www.ec.ufc.br/wendley
> Rua Cel. Estanislau Frota, 563, Centro, Sobral-CE, Brasil - CEP 62.010-560
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>
 next part 
An HTML attachment was scrubbed...
URL:
From adrin.jalali at gmail.com Mon Jun 17 09:43:35 2019
From: adrin.jalali at gmail.com (Adrin)
Date: Mon, 17 Jun 2019 15:43:35 +0200
Subject: [scikit-learn] How use get_depth
In-Reply-To:
References:
Message-ID:
The method was added in the latest release; you probably need to update
the package, and then you should have it.
On Mon., Jun. 17, 2019, 15:42 Brown J.B. via scikit-learn, <
scikit-learn at python.org> wrote:
> Perhaps you mean:
> DecisionTreeRegressor.tree_.max_depth , where DecisionTreeRegressor.tree_
> is available after calling fit() ?
>
>
> 2019?6?17?(?) 22:29 Wendley Silva