[scikit-learn] scikit-learn Digest, Vol 43, Issue 25
Brown J.B.
jbbrown at kuhp.kyoto-u.ac.jp
Sun Oct 13 06:40:11 EDT 2019
Please show respect and refinement when addressing the contributors and users
of scikit-learn.
Gael's statement is perfect -- complexity does not imply better prediction.
The choice of estimator (and algorithm) depends on the structure of the
model desired for the data presented.
Estimator superiority cannot be proven in a context- and/or data-agnostic
fashion.
J.B.
On Sun, 13 Oct 2019 at 6:13, Mike Smith <javaeurusd at gmail.com> wrote:
> "Second, complexity does not imply better prediction."
>
> Complexity doesn't imply prediction? Perhaps you're having a translation
> error.
>
> On Sat, Oct 12, 2019 at 2:04 PM <scikit-learn-request at python.org> wrote:
>
>> Send scikit-learn mailing list submissions to
>> scikit-learn at python.org
>>
>> To subscribe or unsubscribe via the World Wide Web, visit
>> https://mail.python.org/mailman/listinfo/scikit-learn
>> or, via email, send a message with subject or body 'help' to
>> scikit-learn-request at python.org
>>
>> You can reach the person managing the list at
>> scikit-learn-owner at python.org
>>
>> When replying, please edit your Subject line so it is more specific
>> than "Re: Contents of scikit-learn digest..."
>>
>>
>> Today's Topics:
>>
>> 1. Re: scikit-learn Digest, Vol 43, Issue 24 (Mike Smith)
>>
>>
>> ----------------------------------------------------------------------
>>
>> Message: 1
>> Date: Sat, 12 Oct 2019 14:04:12 -0700
>> From: Mike Smith <javaeurusd at gmail.com>
>> To: scikit-learn at python.org
>> Subject: Re: [scikit-learn] scikit-learn Digest, Vol 43, Issue 24
>> Message-ID:
>> <CAEWZffD-hNviFkyxuM8CgDR3XSWOyn=
>> 4LRy2NJvjwvVr4RgobQ at mail.gmail.com>
>> Content-Type: text/plain; charset="utf-8"
>>
>> "... > If I should expect good results on a PC, scikit says that needing
>> > GPU power is obsolete, since certain scikit models that are not designed
>> > for GPU perform better (than ML designed for GPU), for that reason. Is
>> > this true?"
>>
>> "Where do you see this written? I think that you are looking for overly
>> simple stories that are not true."
>>
>> Gael, see the below from the scikit-learn FAQ. You can also find this
>> yourself at the main FAQ:
>>
>> [image: 2019-10-12 14_00_05-Frequently Asked Questions - scikit-learn
>> 0.21.3 documentation.png]
>>
>>
>> On Sat, Oct 12, 2019 at 9:03 AM <scikit-learn-request at python.org> wrote:
>>
>> >
>> >
>> > Today's Topics:
>> >
>> > 1. Re: Is scikit-learn implying neural nets are the best
>> > regressor? (Gael Varoquaux)
>> >
>> >
>> > ----------------------------------------------------------------------
>> >
>> > Message: 1
>> > Date: Fri, 11 Oct 2019 13:34:33 -0400
>> > From: Gael Varoquaux <gael.varoquaux at normalesup.org>
>> > To: Scikit-learn mailing list <scikit-learn at python.org>
>> > Subject: Re: [scikit-learn] Is scikit-learn implying neural nets are
>> > the best regressor?
>> > Message-ID: <20191011173433.bbywiqnwjjpvsi4r at phare.normalesup.org>
>> > Content-Type: text/plain; charset=iso-8859-1
>> >
>> > On Fri, Oct 11, 2019 at 10:10:32AM -0700, Mike Smith wrote:
>> > > In other words, according to that arrangement, is scikit-learn
>> implying
>> > that
>> > > section 1.17 is the best regressor out of the listed, 1.1 to 1.17?
>> >
>> > No.
>> >
>> > First, they are not ordered by complexity (Naive Bayes is
>> > arguably simpler than Gaussian Processes). Second, complexity does not
>> > imply better prediction.
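A quick way to see this point for yourself (a sketch on synthetic data; `GaussianNB` and `MLPClassifier` stand in for a "simple" and a "complex" estimator, and which one wins depends entirely on the data, which is exactly the point):

```python
import warnings
from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier

# Small, noisy dataset: label noise and few informative features.
X, y = make_classification(n_samples=300, n_features=20, n_informative=3,
                           flip_y=0.2, random_state=0)

warnings.simplefilter("ignore", ConvergenceWarning)
simple = cross_val_score(GaussianNB(), X, y, cv=5).mean()
complex_ = cross_val_score(MLPClassifier(max_iter=300, random_state=0),
                           X, y, cv=5).mean()
print(f"GaussianNB: {simple:.3f}  MLP: {complex_:.3f}")
```

Neither score is guaranteed to be higher; rerunning with different `random_state` or noise levels can flip the ranking.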
>> >
>> > > If I should expect good results on a PC, scikit says that needing GPU
>> > > power is obsolete, since certain scikit models that are not designed
>> > > for GPU perform better (than ML designed for GPU), for that reason. Is
>> > > this true?
>> >
>> > Where do you see this written? I think that you are looking for overly
>> > simple stories that are not true.
>> >
>> > > How much hardware is a practical expectation for running the best
>> > > scikit models and getting the best results?
>> >
>> > This is too vague a question for which there is no answer.
>> >
>> > Gaël
>> >
>> > > On Fri, Oct 11, 2019 at 9:02 AM <scikit-learn-request at python.org>
>> wrote:
>> >
>> >
>> >
>> > > Today's Topics:
>> >
>> > >    1. Re: logistic regression results are not stable between
>> > >       solvers (Andreas Mueller)
>> >
>> >
>> > >
>> > ----------------------------------------------------------------------
>> >
>> > > Message: 1
>> > > Date: Fri, 11 Oct 2019 15:42:58 +0200
>> > > From: Andreas Mueller <t3kcit at gmail.com>
>> > > To: scikit-learn at python.org
>> > > Subject: Re: [scikit-learn] logistic regression results are not
>> > >         stable between solvers
>> > > Message-ID: <d55949d6-3355-f892-f6b3-030edf1c7947 at gmail.com>
>> > > Content-Type: text/plain; charset="utf-8"; Format="flowed"
>> >
>> >
>> >
>> > > On 10/10/19 1:14 PM, Benoît Presles wrote:
>> >
>> > > > Thanks for your answers.
>> >
>> > > > On my real data, I do not have so many samples. I have a bit more
>> > > > than 200 samples in total, and I would also like to get some results
>> > > > with unpenalized logistic regression.
>> > > > What do you suggest? Should I switch to the lbfgs solver?
>> > > Yes.
>> > > > Am I sure that with this solver I will not have any convergence
>> > > > issue and will always get a good result? Indeed, I did not get any
>> > > > convergence warning with saga, so I thought everything was fine. I
>> > > > noticed some issues only when I decided to test several solvers.
>> > > > Without comparing the results across solvers, how can I be sure that
>> > > > the optimisation goes well? Shouldn't scikit-learn warn the user
>> > > > somehow if it is not the case?
>> > > We should attempt to warn in the SAGA solver if it doesn't converge.
>> > > That it doesn't raise a convergence warning should probably be
>> > > considered a bug.
>> > > It uses the maximum weight change as a stopping criterion right now.
>> > > We could probably compute the dual objective once at the end to see
>> > > if we converged, right? Or is that not possible with SAGA? If not, we
>> > > might want to caution that no convergence warning will be raised.
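In the meantime, one check a user can run themselves (a sketch on synthetic data; the tiny `max_iter` is deliberate, to force the failure mode): a `ConvergenceWarning` does fire when the iteration budget is exhausted, and `n_iter_` hitting `max_iter` is a strong hint the solver stopped early. The subtler failure discussed above, where the stopping criterion declares convergence too soon, would not be caught this way.

```python
import warnings
from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

clf = LogisticRegression(solver="saga", max_iter=5)  # deliberately tiny budget
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always", ConvergenceWarning)
    clf.fit(X, y)

warned = any(issubclass(w.category, ConvergenceWarning) for w in caught)
stopped_early = clf.n_iter_[0] >= 5
print(warned, stopped_early)
```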
>> >
>> >
>> > > > Lastly, I was using saga because I also wanted to do some feature
>> > > > selection by using the l1 penalty, which is not supported by lbfgs...
>> > > You can use liblinear then.
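A sketch of that suggestion (synthetic data; `C=0.1` is an arbitrary strength chosen for illustration, not a recommendation):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# 20 features, only 5 informative: a natural feature-selection setup.
X, y = make_classification(n_samples=200, n_features=20, n_informative=5,
                           random_state=0)

clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
clf.fit(X, y)

# The L1 penalty drives some coefficients exactly to zero; the surviving
# (nonzero) columns are the selected features.
selected = np.flatnonzero(clf.coef_.ravel())
print(len(selected), "features selected out of", X.shape[1])
```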
>> >
>> >
>> >
>> > > > Best regards,
>> > > > Ben
>> >
>> >
>> > > > On 09/10/2019 at 23:39, Guillaume Lemaître wrote:
>> > > >> Oops, I did not see Roman's answer. Sorry about that. It comes
>> > > >> back to the same conclusion :)
>> >
>> > > >> On Wed, 9 Oct 2019 at 23:37, Guillaume Lemaître
>> > > >> <g.lemaitre58 at gmail.com> wrote:
>> >
>> > > >>     Hmm, actually increasing to 10000 samples solves the
>> > > >>     convergence issue.
>> > > >>     SAGA is most probably not designed to work with such a small
>> > > >>     sample size.
>> >
>> > > >>     On Wed, 9 Oct 2019 at 23:36, Guillaume Lemaître
>> > > >>     <g.lemaitre58 at gmail.com> wrote:
>> >
>> > > >>         I slightly changed the benchmark so that it uses a
>> > > >>         pipeline and plotted the coefficients:
>> > > >>
>> > > >>         https://gist.github.com/glemaitre/8fcc24bdfc7dc38ca0c09c56e26b9386
>> >
>> > > >>         I only see one of the 10 splits where SAGA is not
>> > > >>         converging; otherwise the coefficients look very close (I
>> > > >>         don't attach the figure here, but they can be plotted using
>> > > >>         the snippet).
>> > > >>         So apart from this second split, the other differences seem
>> > > >>         to be numerical instability.
>> >
>> > > >>         Where I have some concern is the convergence rate of
>> > > >>         SAGA, but I have no intuition as to whether this is normal
>> > > >>         or not.
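The cross-solver comparison being described can be sketched like this (synthetic data; the sample size and tolerance are illustrative choices, not recommended thresholds):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Enough samples that both solvers should reach the same optimum.
X, y = make_classification(n_samples=10_000, n_features=10, random_state=0)

coefs = {}
for solver in ("lbfgs", "saga"):
    pipe = make_pipeline(
        StandardScaler(),
        LogisticRegression(solver=solver, C=1.0, max_iter=10_000),
    )
    pipe.fit(X, y)
    coefs[solver] = pipe[-1].coef_.ravel()

max_gap = np.abs(coefs["lbfgs"] - coefs["saga"]).max()
print(f"max |coef difference| between solvers: {max_gap:.2e}")
```

With scaled inputs and a moderate penalty, any remaining gap should be down at the level of the solvers' tolerances.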
>> >
>> > > >>         On Wed, 9 Oct 2019 at 23:22, Roman Yurchak
>> > > >>         <rth.yurchak at gmail.com> wrote:
>> >
>> > > >>             Ben,
>> >
>> > > >>             I can confirm your results with penalty='none' and
>> > > >>             C=1e9. In both cases, you are running a mostly
>> > > >>             unpenalized logistic regression. Usually that's less
>> > > >>             numerically stable than with a small regularization,
>> > > >>             depending on the data collinearity.
>> >
>> > > >>             Running that same code with
>> > > >>               - a larger penalty (smaller C values)
>> > > >>               - or a larger number of samples
>> > > >>             yields for me the same coefficients (up to some
>> > > >>             tolerance).
>> >
>> > > >>             You can also see that SAGA convergence is not good by
>> > > >>             the fact that it needs 196000 epochs/iterations to
>> > > >>             converge.
>> >
>> > > >>             Actually, I have often seen convergence issues with
>> > > >>             SAG on small datasets (in unit tests), not fully sure
>> > > >>             why.
>> > > >>
>> > > >>             --
>> > > >>             Roman
>> >
>> > > >>             On 09/10/2019 22:10, serafim loukas wrote:
>> > > >>             > The predictions across solvers are exactly the same
>> > > >>             > when I run the code.
>> > > >>             > I am using version 0.21.3. What is yours?
>> > > >>             >
>> > > >>             > In [13]: import sklearn
>> > > >>             >
>> > > >>             > In [14]: sklearn.__version__
>> > > >>             > Out[14]: '0.21.3'
>> > > >>             >
>> > > >>             > Serafeim
>> > > >>? ? ? ? ? ? ?>
>> > > >>? ? ? ? ? ? ?>
>> > > >>? ? ? ? ? ? ?>
>> > > >>             >> On 9 Oct 2019, at 21:44, Benoît Presles
>> > > >>             >> <benoit.presles at u-bourgogne.fr> wrote:
>> > > >>             >>
>> > > >>             >> (y_pred_lbfgs==y_pred_saga).all() == False
>> > > >>? ? ? ? ? ? ?>
>> > > >>? ? ? ? ? ? ?>
>> > > >>             > _______________________________________________
>> > > >>             > scikit-learn mailing list
>> > > >>             > scikit-learn at python.org
>> > > >>             > https://mail.python.org/mailman/listinfo/scikit-learn
>> >
>> >
>> >
>> >
>> > > >>         --
>> > > >>         Guillaume Lemaitre
>> > > >>         Scikit-learn @ Inria Foundation
>> > > >>         https://glemaitre.github.io/
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> > > ------------------------------
>> >
>> > > Subject: Digest Footer
>> >
>> > > _______________________________________________
>> > > scikit-learn mailing list
>> > > scikit-learn at python.org
>> > > https://mail.python.org/mailman/listinfo/scikit-learn
>> >
>> >
>> > > ------------------------------
>> >
>> > > End of scikit-learn Digest, Vol 43, Issue 21
>> > > ********************************************
>> >
>> >
>> >
>> >
>> > --
>> > Gael Varoquaux
>> > Research Director, INRIA; Visiting professor, McGill
>> > http://gael-varoquaux.info
>> > http://twitter.com/GaelVaroquaux
>> >
>> >
>> > ------------------------------
>> >
>> >
>> >
>> > ------------------------------
>> >
>> > End of scikit-learn Digest, Vol 43, Issue 24
>> > ********************************************
>> >
>> -------------- next part --------------
>> A non-text attachment was scrubbed...
>> Name: 2019-10-12 14_00_05-Frequently Asked Questions - scikit-learn
>> 0.21.3 documentation.png
>> Type: image/png
>> Size: 26245 bytes
>> Desc: not available
>> URL: <
>> http://mail.python.org/pipermail/scikit-learn/attachments/20191012/6959d075/attachment.png
>> >
>>
>> ------------------------------
>>
>>
>>
>> ------------------------------
>>
>> End of scikit-learn Digest, Vol 43, Issue 25
>> ********************************************
>>
>