[scikit-learn] LogisticRegression

Tue Jun 11 14:47:57 EDT 2019

On 6/11/19 11:47 AM, Eric J. Van der Velden wrote:
> Hi Nicolas, Andrew,
>
> Thanks!
>
> I found out that it is the regularization term. Sklearn always has 
> that term. When I program logistic regression with that term too, with 
> \lambda=1, I get exactly the same answer as sklearn, when I look at 
> the parameters you gave me.
>
> Question is why sklearn always has that term in logistic regression. 
> If you have enough data, do you need a regularization term?
Leaving out the regularization term is equivalent to setting C to a high value.
We now allow penalty='none' in LogisticRegression, see 
https://github.com/scikit-learn/scikit-learn/pull/12860
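
A minimal sketch of the "high C" route on the petal-width example from this thread. The variable names and the particular values of C, tol, and max_iter are my own choices for illustration, not recommended settings. penalty='none' from the PR above is the other route; its exact spelling may differ between releases, so the sketch sticks to a large C.

```python
from sklearn import datasets
from sklearn.linear_model import LogisticRegression

iris = datasets.load_iris()
X = iris["data"][:, 3:]                  # petal width only
y = (iris["target"] == 2).astype(int)    # 1 for Iris virginica, else 0

# Default settings: L2 penalty with C=1.
default_model = LogisticRegression().fit(X, y)

# Very large C makes the penalty negligible (assumed values, chosen for illustration).
weak_reg_model = LogisticRegression(C=1e10, tol=1e-8, max_iter=10000).fit(X, y)

print("default C=1:", default_model.coef_, default_model.intercept_)
print("C=1e10:     ", weak_reg_model.coef_, weak_reg_model.intercept_)
```

The second fit should land much closer to the hand-written gradient descent and Newton-Raphson results quoted below than the default fit does.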

I opened an issue on improving the docs:
https://github.com/scikit-learn/scikit-learn/issues/14070

Feel free to make suggestions there.
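
On the cost function question further down the thread, here is a sketch of how the objective on the documentation page relates to the textbook form, as I understand it. The symbols t_i, z_i, and \lambda are my notation, not the docs'.

```latex
% scikit-learn user guide (L2 penalty), labels y_i \in \{-1, +1\}:
\min_{w,c}\; \frac{1}{2} w^\top w + C \sum_{i=1}^{n} \log\bigl(1 + \exp(-y_i (x_i^\top w + c))\bigr)

% Textbook form, targets t_i \in \{0, 1\}, z_i = x_i^\top w + c, \sigma the logistic sigmoid:
\min_{w,c}\; -\sum_{i=1}^{n} \bigl[ t_i \ln \sigma(z_i) + (1 - t_i) \ln(1 - \sigma(z_i)) \bigr] + \frac{\lambda}{2} w^\top w

% With y_i = 2 t_i - 1 the per-sample losses coincide:
\log\bigl(1 + \exp(-y_i z_i)\bigr) = -\bigl[ t_i \ln \sigma(z_i) + (1 - t_i) \ln(1 - \sigma(z_i)) \bigr]
```

Dividing the first objective by C gives the second with \lambda = 1/C, so the default C=1 corresponds to \lambda=1. Whether the intercept c is also penalized depends on the solver (liblinear does penalize it, through intercept_scaling), so small differences between solvers are to be expected.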

There's more discussion here as well:
https://github.com/scikit-learn/scikit-learn/issues/6738
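
For anyone who wants to see the effect of the regularization term in a hand-written fit, here is a rough sketch of Newton-Raphson with an L2 penalty added. The function name newton_logreg and the parameter lam are hypothetical, and penalizing the intercept together with the slope is an assumption on my part (it mirrors what liblinear does), not something the quoted code does.

```python
import numpy as np
from scipy.special import expit
from sklearn import datasets

iris = datasets.load_iris()
X = iris["data"][:, 3:]                          # petal width only
A = np.c_[np.ones(len(X)), X]                    # column of ones for the intercept
t = (iris["target"] == 2).astype(float).reshape(-1, 1)

def newton_logreg(A, t, lam, n_iter=25):
    """Newton-Raphson on cross-entropy + (lam/2)*||w||^2, intercept included in w."""
    w = np.zeros((A.shape[1], 1))
    for _ in range(n_iter):
        p = expit(A @ w)                          # predicted probabilities
        grad = A.T @ (p - t) + lam * w            # gradient of the penalized loss
        r = (p * (1 - p)).ravel()                 # diagonal of the weight matrix R
        H = (A.T * r) @ A + lam * np.eye(A.shape[1])  # Hessian
        w = w - np.linalg.solve(H, grad)          # Newton step
    return w

w_plain = newton_logreg(A, t, lam=0.0)           # no regularization
w_l2 = newton_logreg(A, t, lam=1.0)              # lambda = 1, i.e. C = 1
print("lam=0:", w_plain.ravel())
print("lam=1:", w_l2.ravel())
```

With lam=0 this is the same iteration as the Newton-Raphson code quoted below, written as a gradient/Hessian step. With lam=1 the coefficients shrink toward zero, which is the behaviour Eric describes.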

>
> On Tue, Jun 11, 2019 at 10:08, Andrew Howe <ahowe42 at gmail.com 
> <mailto:ahowe42 at gmail.com>> wrote:
>
>     The coef_ attribute of the LogisticRegression object stores the
>     parameters.
>
>     Andrew
>
>
>     On Sat, Jun 8, 2019 at 6:58 PM Eric J. Van der Velden
>     <ericjvandervelden at gmail.com <mailto:ericjvandervelden at gmail.com>>
>     wrote:
>
>         Here I have added what I had programmed.
>
>         With sklearn's LogisticRegression(), how can I see the
>         parameters it has found after .fit(), where the cost is
>         minimal? I am using Géron's book about scikit-learn and
>         TensorFlow, and on page 137 he trains the model on petal
>         widths. I did the following:
>
>             from sklearn import datasets
>             from sklearn.linear_model import LogisticRegression
>
>             iris=datasets.load_iris()
>             a1=iris['data'][:,3:]
>             y=(iris['target']==2).astype(int)
>             log_reg=LogisticRegression()
>             log_reg.fit(a1,y)
>
>             log_reg.coef_
>             array([[2.61727777]])
>             log_reg.intercept_
>             array([-4.2209364])
>
>
>         I implemented logistic regression myself with gradient
>         descent, as I learned from my Coursera course, and with
>         Newton-Raphson, as described in Bishop's book. I used
>         gradient descent like so:
>
>             import numpy as np
>             from scipy.special import expit
>             from sklearn import datasets
>
>             iris=datasets.load_iris()
>             a1=iris['data'][:,3:]
>             A1=np.c_[np.ones((150,1)),a1]  # column of ones for the intercept
>             y=(iris['target']==2).astype(int).reshape(-1,1)
>             lmda=1  # used as the step size below
>
>             def logreg_gd(w):
>               z2=A1.dot(w)
>               a2=expit(z2)
>               delta2=a2-y
>               # one gradient step on the mean cross-entropy loss
>               w=w-(lmda/len(a1))*A1.T.dot(delta2)
>               return w
>             w=np.array([[0.0],[0.0]])
>             for i in range(0,100000):
>               w=logreg_gd(w)
>
>             In [6219]: w
>             Out[6219]:
>             array([[-21.12563996],
>                    [ 12.94750716]])
>
>         I used Newton-Raphson like so (see Bishop, page 207):
>
>             import numpy as np
>             from scipy.special import expit
>             from sklearn import datasets
>
>             iris=datasets.load_iris()
>             a1=iris['data'][:,3:]
>             A1=np.c_[np.ones(len(a1)),a1]
>             # t holds the 0/1 targets (Bishop's notation); y is the prediction
>             t=(iris['target']==2).astype(int).reshape(-1,1)
>             def logreg_nr(w):
>               z1=A1.dot(w)
>               y=expit(z1)
>               R=np.diag((y*(1-y))[:,0])
>               H=A1.T.dot(R).dot(A1)
>               tmp=A1.dot(w)-np.linalg.inv(R).dot(y-t)
>               v=np.linalg.inv(H).dot(A1.T).dot(R).dot(tmp)
>               return v
>
>             w=np.array([[0.0],[0.0]])
>             for i in range(0,10):
>               w=logreg_nr(w)
>
>             In [5149]: w
>             Out[5149]:
>             array([[-21.12563996],
>                    [ 12.94750716]])
>
>         Notice how much faster Newton-Raphson converges than gradient
>         descent: 10 iterations instead of 100000. Both give the same
>         result.
>
>         How can I see which parameters LogisticRegression() found? And
>         should I give LogisticRegression other parameters?
>
>         On Sat, Jun 8, 2019 at 11:34 AM Eric J. Van der Velden
>         <ericjvandervelden at gmail.com
>         <mailto:ericjvandervelden at gmail.com>> wrote:
>
>             Hello,
>
>             I am learning sklearn from Géron's book. On page 137
>             he trains the model on petal widths.
>
>             When I implement logistic regression myself, as I learned
>             from my Coursera course and from Bishop's book, I find
>             the following parameters where the cost function is
>             minimal:
>
>             In [6219]: w
>             Out[6219]:
>             array([[-21.12563996],
>                    [ 12.94750716]])
>
>             I used gradient descent and Newton-Raphson; both give the
>             same answer.
>
>             My question is: how can I see after fit() which parameters
>             LogisticRegression() has found?
>
>             One other question: on the documentation page,
>             https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression,
>             I see a different cost function than the one in the books.
>
>             Thanks.
>
>
>
>         _______________________________________________
>         scikit-learn mailing list
>         scikit-learn at python.org <mailto:scikit-learn at python.org>
>         https://mail.python.org/mailman/listinfo/scikit-learn
