[scikit-learn] LogisticRegression
Eric J. Van der Velden
ericjvandervelden at gmail.com
Tue Jun 11 11:47:09 EDT 2019
Hi Nicolas, Andrew,
Thanks!
I found out that it is the regularization term: scikit-learn always
includes one by default. When I add that term to my own logistic
regression, with \lambda=1, I get exactly the same answer as sklearn when
I look at the parameters you pointed me to.

The question is why sklearn always regularizes logistic regression. If
you have enough data, do you still need a regularization term?
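For reference, the documentation page I linked in my first mail writes the
binary L2-penalized problem as

  min_{w,c}  (1/2) w^T w + C * sum_i log(1 + exp(-y_i (x_i^T w + c))),

with labels y_i in {-1, +1}, so C acts as an inverse regularization
strength (roughly 1/\lambda): the smaller C, the stronger the penalty. If
that is right, an effectively unregularized fit should reproduce my
hand-rolled parameters. A minimal sketch to check this (assuming a very
large C is enough to make the penalty negligible; on scikit-learn >= 0.21,
penalty='none' should behave the same):

from sklearn import datasets
from sklearn.linear_model import LogisticRegression

iris = datasets.load_iris()
a1 = iris['data'][:, 3:]
y = (iris['target'] == 2).astype(int)

# C is the inverse regularization strength; a huge C makes the L2
# penalty negligible, so the fit should approach the unregularized
# optimum, roughly intercept -21.13 and coefficient 12.95.
log_reg = LogisticRegression(C=1e10, solver='lbfgs', max_iter=10000)
log_reg.fit(a1, y)
print(log_reg.intercept_, log_reg.coef_)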
On Tue, 11 Jun 2019, 10:08 Andrew Howe <ahowe42 at gmail.com> wrote:
> The coef_ attribute of the LogisticRegression object stores the parameters.
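> For example (a quick sketch; get_params() lists the constructor
> hyperparameters, which are separate from the fitted weights):
>
> log_reg.coef_, log_reg.intercept_   # learned weights and bias
> log_reg.get_params()                # hyperparameters: C, penalty, solver, ...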
>
> Andrew
>
> <~~~~~~~~~~~~~~~~~~~~~~~~~~~>
> J. Andrew Howe, PhD
> LinkedIn Profile <http://www.linkedin.com/in/ahowe42>
> ResearchGate Profile <http://www.researchgate.net/profile/John_Howe12/>
> Open Researcher and Contributor ID (ORCID)
> <http://orcid.org/0000-0002-3553-1990>
> Github Profile <http://github.com/ahowe42>
> Personal Website <http://www.andrewhowe.com>
> I live to learn, so I can learn to live. - me
> <~~~~~~~~~~~~~~~~~~~~~~~~~~~>
>
>
> On Sat, Jun 8, 2019 at 6:58 PM Eric J. Van der Velden <
> ericjvandervelden at gmail.com> wrote:
>
>> Here I have added the code I wrote.
>>
>> With sklearn's LogisticRegression(), how can I see the parameters it
>> has found after .fit(), i.e. where the cost is minimal? I am using
>> Geron's book on scikit-learn and TensorFlow; on page 137 he trains a
>> model on the iris petal widths. I did the following:
>>
>> from sklearn import datasets
>> from sklearn.linear_model import LogisticRegression
>>
>> iris = datasets.load_iris()
>> a1 = iris['data'][:, 3:]                  # petal width only
>> y = (iris['target'] == 2).astype(int)     # 1 if Iris-Virginica
>> log_reg = LogisticRegression()            # default: L2 penalty, C=1.0
>> log_reg.fit(a1, y)
>>
>> log_reg.coef_
>> array([[2.61727777]])
>> log_reg.intercept_
>> array([-4.2209364])
>>
>>
>> I implemented logistic regression myself, with Gradient Descent and
>> with Newton-Raphson, as I learned from my Coursera course and from
>> Bishop's book respectively. The Gradient Descent version looks like
>> this:
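>>
>> (The function below implements the plain batch gradient step on the
>> cross-entropy cost, w <- w - (eta/m) * A^T (sigmoid(A w) - y); note
>> that my variable lmda plays the role of the learning rate eta here,
>> not of a regularization strength.)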
>>
>> import numpy as np
>> from scipy.special import expit           # numerically stable sigmoid
>> from sklearn import datasets
>>
>> iris = datasets.load_iris()
>> a1 = iris['data'][:, 3:]
>> A1 = np.c_[np.ones((150, 1)), a1]         # prepend a bias column
>> y = (iris['target'] == 2).astype(int).reshape(-1, 1)
>> lmda = 1                                  # learning rate (step size)
>>
>> def logreg_gd(w):
>>     z2 = A1.dot(w)
>>     a2 = expit(z2)                        # predicted probabilities
>>     delta2 = a2 - y                       # prediction errors
>>     w = w - (lmda / len(a1)) * A1.T.dot(delta2)   # batch gradient step
>>     return w
>>
>> w = np.zeros((2, 1))
>> for i in range(100000):
>>     w = logreg_gd(w)
>>
>> In [6219]: w
>> Out[6219]:
>> array([[-21.12563996],
>> [ 12.94750716]])
>>
>> I implemented Newton-Raphson like this (see Bishop, page 207):
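>>
>> (For reference, Bishop's IRLS update (eqs. 4.99-4.100) is
>> w_new = (A^T R A)^{-1} A^T R z with z = A w - R^{-1} (y - t) and
>> R = diag(y_i (1 - y_i)), where y = sigmoid(A w) are the predictions
>> and t are the targets; that is what logreg_nr below computes.)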
>>
>> import numpy as np
>> from scipy.special import expit
>> from sklearn import datasets
>>
>> iris = datasets.load_iris()
>> a1 = iris['data'][:, 3:]
>> A1 = np.c_[np.ones(len(a1)), a1]
>> t = (iris['target'] == 2).astype(int).reshape(-1, 1)   # targets (Bishop's t)
>>
>> def logreg_nr(w):
>>     z1 = A1.dot(w)
>>     y = expit(z1)                          # predictions (Bishop's y)
>>     R = np.diag((y * (1 - y))[:, 0])       # weighting matrix R, eq. 4.98
>>     H = A1.T.dot(R).dot(A1)                # Hessian of the cross-entropy
>>     z = A1.dot(w) - np.linalg.inv(R).dot(y - t)        # working response, eq. 4.100
>>     w_new = np.linalg.inv(H).dot(A1.T).dot(R).dot(z)   # IRLS update, eq. 4.99
>>     return w_new
>>
>> w = np.zeros((2, 1))
>> for i in range(10):
>>     w = logreg_nr(w)
>>
>> In [5149]: w
>> Out[5149]:
>> array([[-21.12563996],
>> [ 12.94750716]])
>>
>> Notice how much faster Newton-Raphson converges than Gradient Descent:
>> 10 iterations instead of 100,000, because Newton's method uses the
>> curvature (the Hessian), while plain Gradient Descent takes fixed small
>> steps. Both end up at the same result.
>>
>> How can I see which parameters LogisticRegression() found? And should I
>> give LogisticRegression other parameters?
>>
>> On Sat, Jun 8, 2019 at 11:34 AM Eric J. Van der Velden <
>> ericjvandervelden at gmail.com> wrote:
>>
>>> Hello,
>>>
>>> I am learning sklearn from Geron's book. On page 137 he trains a
>>> model on petal widths.
>>>
>>> When I implement logistic regression myself, as I learned from my
>>> Coursera course and from Bishop's book, I find the following
>>> parameters at the minimum of the cost function:
>>>
>>> In [6219]: w
>>> Out[6219]:
>>> array([[-21.12563996],
>>> [ 12.94750716]])
>>>
>>> I used Gradient Descent and Newton-Raphson; both give the same answer.
>>>
>>> My question is: how can I see after fit() which parameters
>>> LogisticRegression() has found?
>>>
>>> One other question: on the documentation page,
>>> https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression,
>>> I see a different cost function than the one in the books.
>>>
>>> Thanks.