[scikit-learn] scikit-learn Digest, Vol 43, Issue 25
Brown J.B.
jbbrown at kuhp.kyoto-u.ac.jp
Sun Oct 13 06:40:11 EDT 2019
Please show respect and refinement when addressing the contributors and users
of scikit-learn.
Gael's statement is perfect -- complexity does not imply better prediction.
The choice of estimator (and algorithm) depends on the structure of the
model desired for the data presented.
Estimator superiority cannot be proven in a context- and/or data-agnostic
fashion.
J.B.
On Sun, 13 Oct 2019 at 6:13, Mike Smith <javaeurusd at gmail.com> wrote:
> "Second, complexity does not imply better prediction."
>
> Complexity doesn't imply prediction? Perhaps you're having a translation
> error.
>
> On Sat, Oct 12, 2019 at 2:04 PM <scikit-learn-request at python.org> wrote:
>
>> Send scikit-learn mailing list submissions to
>> scikit-learn at python.org
>>
>> To subscribe or unsubscribe via the World Wide Web, visit
>> https://mail.python.org/mailman/listinfo/scikit-learn
>> or, via email, send a message with subject or body 'help' to
>> scikit-learn-request at python.org
>>
>> You can reach the person managing the list at
>> scikit-learn-owner at python.org
>>
>> When replying, please edit your Subject line so it is more specific
>> than "Re: Contents of scikit-learn digest..."
>>
>>
>> Today's Topics:
>>
>> 1. Re: scikit-learn Digest, Vol 43, Issue 24 (Mike Smith)
>>
>>
>> ----------------------------------------------------------------------
>>
>> Message: 1
>> Date: Sat, 12 Oct 2019 14:04:12 -0700
>> From: Mike Smith <javaeurusd at gmail.com>
>> To: scikit-learn at python.org
>> Subject: Re: [scikit-learn] scikit-learn Digest, Vol 43, Issue 24
>> Message-ID:
>> <CAEWZffD-hNviFkyxuM8CgDR3XSWOyn=
>> 4LRy2NJvjwvVr4RgobQ at mail.gmail.com>
>> Content-Type: text/plain; charset="utf-8"
>>
>> "... > If I should expect good results on a PC, scikit says that needing
>> > GPU power is obsolete, since certain scikit models that are not designed
>> > for GPU perform better (than ML designed for GPU), for that reason. Is
>> > this true?"
>>
>> "Where do you see this written? I think that you are looking for overly
>> simple stories that are not true."
>>
>> Gael, see the below from the scikit-learn FAQ. You can also find this
>> yourself at the main FAQ:
>>
>> [image: 2019-10-12 14_00_05-Frequently Asked Questions - scikit-learn
>> 0.21.3 documentation.png]
>>
>>
>> On Sat, Oct 12, 2019 at 9:03 AM <scikit-learn-request at python.org> wrote:
>>
>> >
>> >
>> > Today's Topics:
>> >
>> > 1. Re: Is scikit-learn implying neural nets are the best
>> > regressor? (Gael Varoquaux)
>> >
>> >
>> > ----------------------------------------------------------------------
>> >
>> > Message: 1
>> > Date: Fri, 11 Oct 2019 13:34:33 -0400
>> > From: Gael Varoquaux <gael.varoquaux at normalesup.org>
>> > To: Scikit-learn mailing list <scikit-learn at python.org>
>> > Subject: Re: [scikit-learn] Is scikit-learn implying neural nets are
>> > the best regressor?
>> > Message-ID: <20191011173433.bbywiqnwjjpvsi4r at phare.normalesup.org>
>> > Content-Type: text/plain; charset=iso-8859-1
>> >
>> > On Fri, Oct 11, 2019 at 10:10:32AM -0700, Mike Smith wrote:
>> > > In other words, according to that arrangement, is scikit-learn
>> implying
>> > that
>> > > section 1.17 is the best regressor out of the listed, 1.1 to 1.17?
>> >
>> > No.
>> >
>> > First, they are not ordered by complexity (Naive Bayes is
>> > arguably simpler than Gaussian Processes). Second, complexity does not
>> > imply better prediction.
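A quick way to see this point for yourself (a sketch on synthetic data; `GaussianNB` and `MLPClassifier` stand in for a "simple" and a "complex" estimator, and which one wins depends entirely on the data, which is exactly the point):

```python
import warnings
from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier

# Small, noisy dataset: label noise and few informative features.
X, y = make_classification(n_samples=300, n_features=20, n_informative=3,
                           flip_y=0.2, random_state=0)

warnings.simplefilter("ignore", ConvergenceWarning)
simple = cross_val_score(GaussianNB(), X, y, cv=5).mean()
complex_ = cross_val_score(MLPClassifier(max_iter=300, random_state=0),
                           X, y, cv=5).mean()
print(f"GaussianNB: {simple:.3f}  MLP: {complex_:.3f}")
```

Neither score is guaranteed to be higher; rerunning with different `random_state` or noise levels can flip the ranking.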
>> >
>> > > If I should expect good results on a PC, scikit says that needing GPU
>> > > power is obsolete, since certain scikit models that are not designed
>> > > for GPU perform better (than ML designed for GPU), for that reason. Is
>> > > this true?
>> >
>> > Where do you see this written? I think that you are looking for overly
>> > simple stories that are not true.
>> >
>> > > How much hardware is a practical expectation for running the best
>> > > scikit models and getting the best results?
>> >
>> > This is too vague a question for which there is no answer.
>> >
>> > Gaël
>> >
>> > > On Fri, Oct 11, 2019 at 9:02 AM <scikit-learn-request at python.org>
>> wrote:
>> >
>> >
>> >
>> > > Today's Topics:
>> >
>> > >    1. Re: logistic regression results are not stable between
>> > >       solvers (Andreas Mueller)
>> >
>> >
>> > >
>> > ----------------------------------------------------------------------
>> >
>> > > Message: 1
>> > > Date: Fri, 11 Oct 2019 15:42:58 +0200
>> > > From: Andreas Mueller <t3kcit at gmail.com>
>> > > To: scikit-learn at python.org
>> > > Subject: Re: [scikit-learn] logistic regression results are not
>> > >         stable between solvers
>> > > Message-ID: <d55949d6-3355-f892-f6b3-030edf1c7947 at gmail.com>
>> > > Content-Type: text/plain; charset="utf-8"; Format="flowed"
>> >
>> >
>> >
>> > > On 10/10/19 1:14 PM, Benoît Presles wrote:
>> >
>> > > > Thanks for your answers.
>> >
>> > > > On my real data, I do not have so many samples. I have a bit more
>> > > > than 200 samples in total, and I would also like to get some results
>> > > > with unpenalized logistic regression.
>> > > > What do you suggest? Should I switch to the lbfgs solver?
>> > > Yes.
>> > > > Am I sure that with this solver I will not have any convergence
>> > > > issue and will always get a good result? Indeed, I did not get any
>> > > > convergence warning with saga, so I thought everything was fine. I
>> > > > noticed some issues only when I decided to test several solvers.
>> > > > Without comparing the results across solvers, how can I be sure that
>> > > > the optimisation goes well? Shouldn't scikit-learn warn the user
>> > > > somehow if it is not the case?
>> > > We should attempt to warn in the SAGA solver if it doesn't converge.
>> > > That it doesn't raise a convergence warning should probably be
>> > > considered a bug.
>> > > It uses the maximum weight change as a stopping criterion right now.
>> > > We could probably compute the dual objective once at the end to see
>> > > if we converged, right? Or is that not possible with SAGA? If not, we
>> > > might want to caution that no convergence warning will be raised.
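In the meantime, one check a user can run themselves (a sketch on synthetic data; the tiny `max_iter` is deliberate, to force the failure mode): a `ConvergenceWarning` does fire when the iteration budget is exhausted, and `n_iter_` hitting `max_iter` is a strong hint the solver stopped early. The subtler failure discussed above, where the stopping criterion declares convergence too soon, would not be caught this way.

```python
import warnings
from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

clf = LogisticRegression(solver="saga", max_iter=5)  # deliberately tiny budget
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always", ConvergenceWarning)
    clf.fit(X, y)

warned = any(issubclass(w.category, ConvergenceWarning) for w in caught)
stopped_early = clf.n_iter_[0] >= 5
print(warned, stopped_early)
```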
>> >
>> >
>> > > > Lastly, I was using saga because I also wanted to do some feature
>> > > > selection by using the l1 penalty, which is not supported by lbfgs...
>> > > You can use liblinear then.
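A sketch of that suggestion (synthetic data; `C=0.1` is an arbitrary strength chosen for illustration, not a recommendation):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# 20 features, only 5 informative: a natural feature-selection setup.
X, y = make_classification(n_samples=200, n_features=20, n_informative=5,
                           random_state=0)

clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
clf.fit(X, y)

# The L1 penalty drives some coefficients exactly to zero; the surviving
# (nonzero) columns are the selected features.
selected = np.flatnonzero(clf.coef_.ravel())
print(len(selected), "features selected out of", X.shape[1])
```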
>> >
>> >
>> >
>> > > > Best regards,
>> > > > Ben
>> >
>> >
>> > > > On 09/10/2019 at 23:39, Guillaume Lemaître wrote:
>> > > >> Oops, I did not see Roman's answer. Sorry about that. It comes
>> > > >> back to the same conclusion :)
>> >
>> > > >> On Wed, 9 Oct 2019 at 23:37, Guillaume Lemaître
>> > > >> <g.lemaitre58 at gmail.com> wrote:
>> >
>> > > >>     Hmm, actually increasing to 10000 samples solves the
>> > > >>     convergence issue.
>> > > >>     SAGA is most probably not designed to work with such a small
>> > > >>     sample size.
>> >
>> > > >>     On Wed, 9 Oct 2019 at 23:36, Guillaume Lemaître
>> > > >>     <g.lemaitre58 at gmail.com> wrote:
>> >
>> > > >>         I slightly changed the benchmark so that it uses a
>> > > >>         pipeline and plotted the coefficients:
>> > > >>
>> > > >>         https://gist.github.com/glemaitre/8fcc24bdfc7dc38ca0c09c56e26b9386
>> >
>> > > >>         I only see one of the 10 splits where SAGA is not
>> > > >>         converging; otherwise the coefficients look very close (I
>> > > >>         don't attach the figure here, but they can be plotted using
>> > > >>         the snippet).
>> > > >>         So apart from this second split, the other differences seem
>> > > >>         to be numerical instability.
>> >
>> > > >>         Where I have some concern is the convergence rate of
>> > > >>         SAGA, but I have no intuition as to whether this is normal
>> > > >>         or not.
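The cross-solver comparison being described can be sketched like this (synthetic data; the sample size and tolerance are illustrative choices, not recommended thresholds):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Enough samples that both solvers should reach the same optimum.
X, y = make_classification(n_samples=10_000, n_features=10, random_state=0)

coefs = {}
for solver in ("lbfgs", "saga"):
    pipe = make_pipeline(
        StandardScaler(),
        LogisticRegression(solver=solver, C=1.0, max_iter=10_000),
    )
    pipe.fit(X, y)
    coefs[solver] = pipe[-1].coef_.ravel()

max_gap = np.abs(coefs["lbfgs"] - coefs["saga"]).max()
print(f"max |coef difference| between solvers: {max_gap:.2e}")
```

With scaled inputs and a moderate penalty, any remaining gap should be down at the level of the solvers' tolerances.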
>> >
>> > > >>         On Wed, 9 Oct 2019 at 23:22, Roman Yurchak
>> > > >>         <rth.yurchak at gmail.com> wrote:
>> >
>> > > >>             Ben,
>> >
>> > > >>             I can confirm your results with penalty='none' and
>> > > >>             C=1e9. In both cases, you are running a mostly
>> > > >>             unpenalized logistic regression. Usually that's less
>> > > >>             numerically stable than with a small regularization,
>> > > >>             depending on the data collinearity.
>> >
>> > > >>             Running that same code with
>> > > >>               - a larger penalty (smaller C values)
>> > > >>               - or a larger number of samples
>> > > >>             yields for me the same coefficients (up to some
>> > > >>             tolerance).
>> >
>> > > >>             You can also see that SAGA convergence is not good by
>> > > >>             the fact that it needs 196000 epochs/iterations to
>> > > >>             converge.
>> >
>> > > >>             Actually, I have often seen convergence issues with
>> > > >>             SAG on small datasets (in unit tests), not fully sure
>> > > >>             why.
>> > > >>
>> > > >>             --
>> > > >>             Roman
>> >
>> > > >>             On 09/10/2019 22:10, serafim loukas wrote:
>> > > >>             > The predictions across solvers are exactly the same
>> > > >>             > when I run the code.
>> > > >>             > I am using version 0.21.3. What is yours?
>> > > >>             >
>> > > >>             > In [13]: import sklearn
>> > > >>             >
>> > > >>             > In [14]: sklearn.__version__
>> > > >>             > Out[14]: '0.21.3'
>> > > >>             >
>> > > >>             > Serafeim
>> > > >>? ? ? ? ? ? ?>
>> > > >>? ? ? ? ? ? ?>
>> > > >>? ? ? ? ? ? ?>
>> > > >>             >> On 9 Oct 2019, at 21:44, Benoît Presles
>> > > >>             >> <benoit.presles at u-bourgogne.fr> wrote:
>> > > >>             >>
>> > > >>             >> (y_pred_lbfgs==y_pred_saga).all() == False
>> > > >>? ? ? ? ? ? ?>
>> > > >>? ? ? ? ? ? ?>
>> > > >>             > _______________________________________________
>> > > >>             > scikit-learn mailing list
>> > > >>             > scikit-learn at python.org
>> > > >>             > https://mail.python.org/mailman/listinfo/scikit-learn
>> >
>> >
>> >
>> >
>> > > >>         --
>> > > >>         Guillaume Lemaitre
>> > > >>         Scikit-learn @ Inria Foundation
>> > > >>         https://glemaitre.github.io/
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> > > ------------------------------
>> >
>> > > Subject: Digest Footer
>> >
>> > > _______________________________________________
>> > > scikit-learn mailing list
>> > > scikit-learn at python.org
>> > > https://mail.python.org/mailman/listinfo/scikit-learn
>> >
>> >
>> > > ------------------------------
>> >
>> > > End of scikit-learn Digest, Vol 43, Issue 21
>> > > ********************************************
>> >
>> >
>> >
>> >
>> > --
>> > Gael Varoquaux
>> > Research Director, INRIA; Visiting professor, McGill
>> > http://gael-varoquaux.info
>> > http://twitter.com/GaelVaroquaux
>> >
>> >
>> > ------------------------------
>> >
>> >
>> >
>> > ------------------------------
>> >
>> > End of scikit-learn Digest, Vol 43, Issue 24
>> > ********************************************
>> >
>> -------------- next part --------------
>> A non-text attachment was scrubbed...
>> Name: 2019-10-12 14_00_05-Frequently Asked Questions - scikit-learn
>> 0.21.3 documentation.png
>> Type: image/png
>> Size: 26245 bytes
>> Desc: not available
>> URL: <
>> http://mail.python.org/pipermail/scikit-learn/attachments/20191012/6959d075/attachment.png
>> >
>>
>> ------------------------------
>>
>>
>>
>> ------------------------------
>>
>> End of scikit-learn Digest, Vol 43, Issue 25
>> ********************************************
>>
>