From g.lemaitre58 at gmail.com Wed Dec 1 08:31:52 2021
From: g.lemaitre58 at gmail.com (Guillaume Lemaître)
Date: Wed, 1 Dec 2021 14:31:52 +0100
Subject: [scikit-learn] scikit-learn office hours on Monday Dec. 6 2021
Message-ID:

Hi all,

Some of us will be online on the scikit-learn Discord next Monday at 10:00 PT / 13:00 ET / 18:00 UTC / 19:00 CET for about an hour or so.

First-time and occasional contributors are welcome to join us on Discord using this invitation link: https://discord.gg/YyYRXMju

The focus of these office hour sessions is to answer questions about contributing to scikit-learn. We can also split into breakout audio/text channels and do pair programming or live reviewing of forgotten pull requests with screen sharing.

We can also try to assist you in crafting minimal reproduction cases for bug reports to get a higher likelihood of resolution (e.g. https://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports).

Please note, our Code of Conduct applies: https://github.com/scikit-learn/scikit-learn/blob/main/CODE_OF_CONDUCT.md

See you soon on Discord!

--
Guillaume Lemaitre
Scikit-learn @ Inria Foundation
https://glemaitre.github.io/

From thomasjpfan at gmail.com Wed Dec 1 09:22:40 2021
From: thomasjpfan at gmail.com (Thomas J. Fan)
Date: Wed, 1 Dec 2021 09:22:40 -0500
Subject: [scikit-learn] scikit-learn monthly developer meeting: Monday January 3rd 2022
Message-ID:

Dear all,

The scikit-learn developer monthly meeting will take place on Monday January 3rd at 22:00 UTC.

- Video call link: https://meet.google.com/ews-uszu-djs
- Meeting notes / agenda: https://hackmd.io/0yokz72CTZSny8y3Re648Q
- Local times: https://www.timeanddate.com/worldclock/meetingdetails.html?year=2022&month=1&day=3&hour=22&min=0&sec=0&p1=1440&p2=240&p3=248&p4=195&p5=179&p6=224

The goal of this meeting is to discuss ongoing development topics for the project.
Everybody is welcome. As usual, please follow the code of conduct of the project: https://github.com/scikit-learn/scikit-learn/blob/main/CODE_OF_CONDUCT.md

Regards,
Thomas

From norbert at preining.info Wed Dec 8 23:57:33 2021
From: norbert at preining.info (Norbert Preining)
Date: Thu, 9 Dec 2021 13:57:33 +0900
Subject: [scikit-learn] scikit-learn 1 - pytest - multiprocessing Pool - hangs?
Message-ID:

Dear all,

I am trying to track down a strange behaviour in one of our (Fujitsu) libraries that we are planning to open source. In preparation for that, I am trying to bring it into a state where it works with scikit-learn >= 1.

Some of our tests fail when running in parallel mode, but they only fail when running under pytest, NOT when running under python.

The library code contains

    def fit(self, X, y=None):
        ...
        p = multiprocessing.Pool()
        ret = _reduce(
            p.map(....))

What happens is that with scikit-learn 1(.0.1), the code hangs forever. I also adjusted the code so that the pool is not defined in the fit function but in the __init__ function and saved into self, but that didn't help either.

When interrupted, pytest gives:

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! KeyboardInterrupt !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
/home/norbert/.pyenv/versions/3.9.6/lib/python3.9/threading.py:312: KeyboardInterrupt
(to show a full traceback on KeyboardInterrupt use --full-trace)
================================================ 1 passed, 2 warnings in 273.84s (0:04:33) =================================================
Exception ignored in:
Traceback (most recent call last):
  File "/home/norbert/.pyenv/versions/3.9.6/lib/python3.9/multiprocessing/pool.py", line 268, in __del__
    self._change_notifier.put(None)
  File "/home/norbert/.pyenv/versions/3.9.6/lib/python3.9/multiprocessing/queues.py", line 378, in put
    self._writer.send_bytes(obj)
  File "/home/norbert/.pyenv/versions/3.9.6/lib/python3.9/multiprocessing/connection.py", line 205, in send_bytes
    self._send_bytes(m[offset:offset + size])
  File "/home/norbert/.pyenv/versions/3.9.6/lib/python3.9/multiprocessing/connection.py", line 416, in _send_bytes
    self._send(header + buf)
  File "/home/norbert/.pyenv/versions/3.9.6/lib/python3.9/multiprocessing/connection.py", line 373, in _send
    n = write(self._handle, buf)

When running under python testfile.py, however, all goes well.

I have tested the following combinations:
* scikit-learn 0.23.*, python 3.8 and python 3.9 => works
* scikit-learn 0.24.*, python 3.8 and python 3.9 => works
* scikit-learn 1.0.1, python 3.8 and python 3.9 => fails

I don't really understand where scikit-learn comes into play here, so I wanted to ask whether someone here has an idea.

Thanks for any suggestion

Norbert

--
PREINING Norbert https://www.preining.info
Fujitsu Research + IFMGA Guide + TU Wien + TeX Live + Debian Dev
GPG: 0x860CDC13 fp: F7D8 A928 26E3 16A1 9FA0 ACF0 6CAC A448 860C DC13

From olivier.grisel at ensta.org Thu Dec 9 04:05:35 2021
From: olivier.grisel at ensta.org (Olivier Grisel)
Date: Thu, 9 Dec 2021 10:05:35 +0100
Subject: [scikit-learn] scikit-learn 1 - pytest - multiprocessing Pool - hangs?
In-Reply-To:
References:
Message-ID:

Maybe you can try to use faulthandler.dump_traceback_later
https://docs.python.org/3/library/faulthandler.html#faulthandler.dump_traceback_later
to get a traceback of all the threads of the main process.

But the fact that you are using the default `p = multiprocessing.Pool()` makes me think that it might be related to the lack of fork-safety of the OpenMP runtime library of GCC (libgomp) [1].

There are several ways to check this:

- print the output of threadpoolctl.threadpool_info() before calling the code that freezes to confirm (or not) that the libgomp runtime has been loaded before creating the MP Pool.

- use a multiprocessing Pool with a forkserver context instead of the default fork context: multiprocessing.get_context("forkserver").Pool()

- alternatively, use loky.get_reusable_executor() instead of multiprocessing.Pool() (with a slightly different API)

- alternatively, use joblib, which uses loky internally, with an even more different API.

- alternatively, recompile scikit-learn from source with clang instead of gcc so as to link scikit-learn to llvm-openmp instead of gcc's libgomp runtime. llvm-openmp is fork-safe.

- alternatively, install scikit-learn from conda-forge (conda install -c conda-forge scikit-learn), as the conda-forge distribution relinks all OpenMP compiled extensions of its packaged libraries to llvm-openmp transparently at install time, even if they were built with GCC (maybe we should do that for our linux wheels).

[1] https://gcc.gnu.org/legacy-ml/gcc-patches/2014-02/msg00979.html

If that does not work or you need more help, please feel free to open an issue with a minimal reproducer and ping me on gitter or discord.

On Thu, 9 Dec 2021 at 05:59, Norbert Preining wrote:
>
> Dear all,
>
> I am trying to track down a strange behaviour in one of our (Fujitsu)
> library we are planning to open source. In preparation for that, I am
> trying to bring it into a state that it works with scikit-learn >= 1.
>
> But, some of our tests fail when running in parallel mode. But they
> only fail when running under pytest, but NOT when running under python.
>
> The library code contains
>
>     def fit(self, X, y=None):
>         ...
>         p = multiprocessing.Pool()
>         ret = _reduce(
>             p.map(....))
>
> Now what happens is that with scikit-learn 1(.0.1), the code hangs
> forever. I adjusted the code also so that the pool definition is not in
> the fit function, but in the __init__ function, and saved into self, but
> that didn't help either.
>
> When interrupted, pytest gives:
>
> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! KeyboardInterrupt !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
> /home/norbert/.pyenv/versions/3.9.6/lib/python3.9/threading.py:312: KeyboardInterrupt
> (to show a full traceback on KeyboardInterrupt use --full-trace)
> ================================================ 1 passed, 2 warnings in 273.84s (0:04:33) =================================================
> Exception ignored in:
> Traceback (most recent call last):
>   File "/home/norbert/.pyenv/versions/3.9.6/lib/python3.9/multiprocessing/pool.py", line 268, in __del__
>     self._change_notifier.put(None)
>   File "/home/norbert/.pyenv/versions/3.9.6/lib/python3.9/multiprocessing/queues.py", line 378, in put
>     self._writer.send_bytes(obj)
>   File "/home/norbert/.pyenv/versions/3.9.6/lib/python3.9/multiprocessing/connection.py", line 205, in send_bytes
>     self._send_bytes(m[offset:offset + size])
>   File "/home/norbert/.pyenv/versions/3.9.6/lib/python3.9/multiprocessing/connection.py", line 416, in _send_bytes
>     self._send(header + buf)
>   File "/home/norbert/.pyenv/versions/3.9.6/lib/python3.9/multiprocessing/connection.py", line 373, in _send
>     n = write(self._handle, buf)
>
> While when running under python testfile.py all goes well.
>
> I have tested the following combinations:
> * scikit-learn 0.23.*, python 3.8 and python 3.9 => works
> * scikit-learn 0.24.*, python 3.8 and python 3.9 => works
> * scikit-learn 1.0.1, python 3.8 and python 3.9 => fails
>
> I don't really understand where scikit-learn comes into the play here,
> so I wanted to ask whether someone here has an idea.
>
> Thanks for any suggestion
>
> Norbert
>
> --
> PREINING Norbert https://www.preining.info
> Fujitsu Research + IFMGA Guide + TU Wien + TeX Live + Debian Dev
> GPG: 0x860CDC13 fp: F7D8 A928 26E3 16A1 9FA0 ACF0 6CAC A448 860C DC13
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn

--
Olivier

From thomasjpfan at gmail.com Thu Dec 9 17:22:46 2021
From: thomasjpfan at gmail.com (Thomas J. Fan)
Date: Thu, 9 Dec 2021 17:22:46 -0500
Subject: [scikit-learn] scikit-learn Triage-Focused Development Meeting: Friday December 10 2021
Message-ID:

Hi all,

Our triage-focused development meeting will be on Friday, December 10, 16:30 UTC.

- Discord invite: https://discord.gg/92NYvrPSgU
- Meeting notes / agenda: https://hackmd.io/C_qEdGapRm2V0kHLx8OcQw?both
- Local times: https://www.timeanddate.com/worldclock/meetingdetails.html?year=2021&month=12&day=10&hour=16&min=30&sec=0&p1=1440&p2=240&p3=248&p4=195&p5=179&p6=224&iv=1800

Everyone is welcome to join us in prioritizing and discussing issues or pull requests.

Please note, our Code of Conduct applies: https://github.com/scikit-learn/scikit-learn/blob/main/CODE_OF_CONDUCT.md

See you soon on Discord!

Regards,
Thomas

From norbert at preining.info Fri Dec 10 00:16:55 2021
From: norbert at preining.info (Norbert Preining)
Date: Fri, 10 Dec 2021 14:16:55 +0900
Subject: [scikit-learn] scikit-learn 1 - pytest - multiprocessing Pool - hangs?
In-Reply-To:
References:
Message-ID:

Hi Olivier,

thanks a lot, I will try the various options and see what I can do. If and when I understand more, I will report back.

Thanks again for the detailed explanation and hints, much appreciated.

Best

Norbert

On Thu, 09 Dec 2021, Olivier Grisel wrote:
> Maybe you can try to use faulthandler.dump_traceback_later
> https://docs.python.org/3/library/faulthandler.html#faulthandler.dump_traceback_later
> to get a traceback of all the threads of the main process.
[...]

--
PREINING Norbert https://www.preining.info
Fujitsu Research + IFMGA Guide + TU Wien + TeX Live + Debian Dev
GPG: 0x860CDC13 fp: F7D8 A928 26E3 16A1 9FA0 ACF0 6CAC A448 860C DC13

From reshama.stat at gmail.com Tue Dec 14 12:48:48 2021
From: reshama.stat at gmail.com (Reshama Shaikh)
Date: Tue, 14 Dec 2021 12:48:48 -0500
Subject: [scikit-learn] [Data Umbrella] AFME (Africa & Middle East) scikit-learn open source sprint (scikit-learn)
In-Reply-To:
References:
Message-ID:

Hi Adrin,
Thanks!

Can we send the below text over to NumFOCUS for their next monthly newsletter? If you prefer me to send the note directly to Arliss, I can do that.

===
The Data Umbrella Africa & Middle East (AFME2) *scikit-learn* online sprint was held on October 23, 2021, and the event report is now available. 40 participants joined from 17 countries, and 57% were returning contributors. Check out the report for informative plots.
===

Reshama Shaikh
she/her
Blog | Twitter | LinkedIn | GitHub
Data Umbrella
NYC PyLadies

On Mon, Nov 22, 2021 at 5:36 PM Adrin wrote:

> Thanks Reshama,
>
> That's a really nice report!
>
> On Mon, Nov 22, 2021 at 12:01 PM Reshama Shaikh
> wrote:
>
>> Hello,
>> The report from the Data Umbrella Africa & Middle East sprint is here
>> [a].
>>
>> SUMMARY
>> - 40 people joined
>> - 17 countries represented
>> - 57% were returning contributors
>>
>> There are a lot of good plots in the report.
>> This is one of the first
>> times I've examined attrition more closely, related to gender and country.
>>
>> Thanks to everyone on the Data Umbrella and scikit-learn teams for their
>> assistance in making this happen!
>>
>> [a]:
>> https://blog.dataumbrella.org/data-umbrella-afme2-2021-scikit-learn-sprint-report
>>
>> Best,
>> Reshama
>> ---
>> Reshama Shaikh
>> she/her
>> Blog | Twitter | LinkedIn | GitHub
>>
>> Data Umbrella
>> NYC PyLadies
>>
>> On Mon, Oct 11, 2021 at 8:00 AM Reshama Shaikh
>> wrote:
>>
>>> Hello,
>>> At this time, we have a few spots open for the upcoming October 23
>>> online scikit-learn sprint organized by Data Umbrella.
>>>
>>> If you reside outside of the Africa and Middle East region, you are now
>>> able to apply.
>>> https://afme2021rc.dataumbrella.org/home
>>>
>>> Note 1: we offer a stipend of $10 USD to cover the cost of internet
>>> access, and you can indicate such on your application.
>>>
>>> Note 2: if you need a translator, please indicate so on your
>>> application.
>>>
>>> Key Notes:
>>> a) There is a pre-sprint event on Saturday October 16 from 5-6pm EAT.
>>> This pre-sprint event is *optional* and an opportunity to answer any
>>> questions in general and aid in setting up your virtual environment.
>>>
>>> b) Sprint is on *Saturday, October 23 at 5pm - 9pm EAT (East Africa
>>> Time)* on our Discord server.
>>>
>>> c) There is a post-sprint event on Saturday November 23 from 5-6pm
>>> EAT. This post-sprint event is *optional* and an opportunity to ask the
>>> core devs questions on open pull requests.
>>>
>>> d) There is 3-4 hours of pre-work for the sprint. Here is the
>>> checklist: https://afme2021rc.dataumbrella.org/about/prep-work
>>>
>>> Please feel free to send any questions to me off the mailing list.
>>>
>>> Best,
>>> Reshama
>>> Reshama Shaikh
>>> she/her
>>> Blog | Twitter | LinkedIn | GitHub
>>>
>>> Data Umbrella
>>> NYC PyLadies
>>>
>>> On Sat, Sep 25, 2021 at 5:05 PM Reshama Shaikh
>>> wrote:
>>>
>>>> Hello,
>>>>
>>>> Data Umbrella is organizing a scikit-learn sprint for this October 23,
>>>> with a focus on **Africa and the Middle East**. This event is free.
>>>>
>>>> A sprint is a 4-hour hands-on hackathon where we work on beginner
>>>> issues in the scikit-learn GitHub repository. Participants will be paired
>>>> with another person. There will be core contributors available to answer
>>>> any questions.
>>>>
>>>> Event website is: https://afme2021rc.dataumbrella.org
>>>> We encourage folks to read the website and then complete the
>>>> application.
>>>>
>>>> The event can be shared in these ways:
>>>> - Retweet: https://twitter.com/DataUmbrella/status/1435972074842034184
>>>> - Share post on LinkedIn:
>>>> https://www.linkedin.com/feed/update/urn:li:activity:6841738994305294336/
>>>>
>>>> Please feel free to contact me if you have any questions.
>>>>
>>>> Cheers,
>>>> Reshama Shaikh
>>>> she/her
>>>>
>>>> _______________________________________________
>> scikit-learn mailing list
>> scikit-learn at python.org
>> https://mail.python.org/mailman/listinfo/scikit-learn
>>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>

From g.lemaitre58 at gmail.com Fri Dec 17 16:53:16 2021
From: g.lemaitre58 at gmail.com (Guillaume Lemaître)
Date: Fri, 17 Dec 2021 22:53:16 +0100
Subject: [scikit-learn] scikit-learn office hours on Monday Dec.
20, 2021
Message-ID: <6DD3B04D-D6D9-471A-B9CD-3B0DCACEB8F4@gmail.com>

Hi all,

Some of us will be online on the scikit-learn Discord next Monday at 10:00 PT / 13:00 ET / 18:00 UTC / 19:00 CET for about an hour or so.

First-time and occasional contributors are welcome to join us on Discord using this invitation link: https://discord.gg/N8dGHPpq

The focus of these office hour sessions is to answer questions about contributing to scikit-learn. We can also split into breakout audio/text channels and do pair programming or live reviewing of forgotten pull requests with screen sharing.

We can also try to assist you in crafting minimal reproduction cases for bug reports to get a higher likelihood of resolution (e.g. https://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports).

Please note, our Code of Conduct applies: https://github.com/scikit-learn/scikit-learn/blob/main/CODE_OF_CONDUCT.md

See you soon on Discord!

--
Guillaume Lemaitre
Scikit-learn @ Inria Foundation
https://glemaitre.github.io/

From reshama.stat at gmail.com Tue Dec 21 11:34:36 2021
From: reshama.stat at gmail.com (Reshama Shaikh)
Date: Tue, 21 Dec 2021 11:34:36 -0500
Subject: [scikit-learn] Community Office Hours
Message-ID:

Hello,

*scikit-learn: Community Office Hours*

Beginning January 11, 2022, the scikit-learn team will be holding bi-weekly (every two weeks) office hours on Mondays.

There is also a link to a public calendar which can be added manually:
https://calendar.google.com/calendar/u/0/embed?src=social.scikitlearn@gmail.com&ctz=America/New_York

DATE: biweekly (every two weeks on Mondays)
TIME: 10:00 PT / 13:00 ET / 18:00 UTC / 19:00 CET
DURATION: 1 hour
WHERE: Discord (https://discord.gg/N8dGHPpq)

ABOUT
First-time, occasional and regular contributors are welcome to join us on Discord.

The focus of these office hour sessions is to answer questions about contributing to scikit-learn.
We can also split into breakout audio/text channels and do pair programming or live reviewing of stalled pull requests with screen sharing.

We can also try to assist you in crafting minimal reproduction cases for bug reports to get a higher likelihood of resolution (a).

Please note, our Code of Conduct applies (b).

(a) https://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports
(b) https://github.com/scikit-learn/scikit-learn/blob/main/CODE_OF_CONDUCT.md

Cheers,
Reshama
---
Reshama Shaikh
she/her
Blog | Twitter | LinkedIn | GitHub
Data Umbrella
NYC PyLadies

From g.lemaitre58 at gmail.com Sat Dec 25 15:34:02 2021
From: g.lemaitre58 at gmail.com (Guillaume Lemaître)
Date: Sat, 25 Dec 2021 21:34:02 +0100
Subject: [scikit-learn] [ANN] scikit-learn 1.0.2 is online!
Message-ID:

scikit-learn 1.0.2 is out on pypi.org and conda-forge!

This is a small maintenance release that fixes a couple of regressions. Binaries and wheels are available for Python 3.10.

https://scikit-learn.org/dev/whats_new/v1.0.html#version-1-0-2

You can upgrade with pip as usual:

    pip install -U scikit-learn

The conda-forge builds will be available shortly, which you can then install using:

    conda install -c conda-forge scikit-learn

Thanks again to all the contributors.

On behalf of the scikit-learn maintainer team.
--
Guillaume Lemaitre
Scikit-learn @ Inria Foundation
https://glemaitre.github.io/

From mh.nwafu at gmail.com Sun Dec 26 22:16:20 2021
From: mh.nwafu at gmail.com (Haylee Miller)
Date: Mon, 27 Dec 2021 11:16:20 +0800
Subject: [scikit-learn] Fwd: There is a problem with using "r2" to calculate cross_val_score and GridSearchCV scores
In-Reply-To:
References:
Message-ID:

I don't know if the email was successfully sent last time, so I am sending it again now. I'm sorry to disturb you.
---------- Forwarded message ---------
From: Haylee Miller
Date: 24 Dec 2021, 21:17
Subject: There is a problem with using "r2" to calculate cross_val_score and GridSearchCV scores
To:

Dear sklearn developers,

First of all, thank you for developing this module, it is very useful. However, recently we found a small problem in the use of cross_val_score and GridSearchCV.
Using scoring='r2' to calculate the cross_val_score and GridSearchCV scores gives results that are inconsistent with those calculated using metrics.r2_score.

[image: 5.png]

Following the principle of k-fold cross-validation, we performed a manual 3-fold cross-validation, and there was a big gap between our scores and the result of cross_val_score.
Below are the code and results of our manual verification process.

[image: 1.png]
[image: 2.png]
[image: 3.png][image: 4.png]

Theoretically, the three values in results 1-3 should be similar to the three values in cross_val_score 1 and cross_val_score 2.
However, only the first value in cross_val_score 1 and cross_val_score 2 is close to results 1-3 in the figures.
Why is this so? We look forward to your reply.
Finally, Merry Christmas!
Best wishes,
Ma Hui

From g.lemaitre58 at gmail.com Mon Dec 27 03:31:59 2021
From: g.lemaitre58 at gmail.com (g.lemaitre58 at gmail.com)
Date: Mon, 27 Dec 2021 09:31:59 +0100
Subject: [scikit-learn] Fwd: There is a problem with using "r2" to calculate cross_val_score and GridSearchCV scores
In-Reply-To:
References:
Message-ID: <68943D06-8BC3-476F-B505-001931FAC7F6@gmail.com>

I am not surprised to observe these variations.

First, the oob score tends to overestimate the statistical performance of the model.

Then, the last score is an evaluation without cross-validation: you trained a single model on the full x1 and tested it on x3. In cross-validation, you evaluate 3 models, each trained on a different subset of x1. Each of them therefore sees less data, and I would expect the last score to be potentially better.

The remaining variation is across the folds in the first score. This can happen when you use a KFold that does not shuffle the data and there is a structure in the order of the data. Shuffling the data will break this structure and, most probably, make it easier to predict without this variation. What is important, however, is to know whether this structure is supposed to exist or not. If it is, then shuffling should not be done, and the original estimate is the one you should look at. An example of such wrong shuffling would be shuffling a time series: you break the ordering by shuffling, while you certainly want to split taking this time structure into account.

Sent from my iPhone

> On 27 Dec 2021, at 04:18, Haylee Miller wrote:
>
> I don't know if the email was successfully sent last time. I send it again now. I'm sorry to disturb you.
>
> ---------- Forwarded message ---------
> From: Haylee Miller
> Date: 24 Dec 2021, 21:17
> Subject: There is a problem with using "r2" to calculate cross_val_score and GridSearchCV scores
> To:
>
> Dear sklearn developers,
> First of all, thank you for developing this module, it is very useful. However, recently we found a small problem in the use of cross_val_score and GridSearchCV.
> Using scoring='r2' to calculate the cross_val_score and GridSearchCV scores is inconsistent with the result calculated using metrics.r2_score.
>
> According to the principle of k-fold cross-validation, we performed manual 3-fold cross-validation and there was a big gap between the score and the result of cross_val_score.
> Below is the code and results of our manual verification process.
>
> Theoretically, the three values in results 1-3 should be similar to the three values in cross_val_score 1 and cross_val_score 2.
> However, only the first value in cross_val_score 1 and cross_val_score 2 is close to the result 1-3 in figures.
> Why is this so, looking forward to your reply?
> Finally, Merry Christmas!
> Best wishes,
> Ma Hui
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
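
Note on the "r2" thread above: the original code and data were only attached as images, so the following is a minimal sketch on synthetic data, not a reproduction of the poster's experiment. It assumes a random forest regressor and rows sorted by the feature value (both are illustrative choices, as are all variable names). It tries to show two things: that cross_val_score with scoring="r2" agrees with a manual loop using metrics.r2_score when the same splits are used, and that an unshuffled KFold on ordered data can give very different fold scores from a shuffled one.

```python
import numpy as np
from sklearn.base import clone
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import KFold, cross_val_score

# Hypothetical data: rows are sorted by X, so the row order carries structure.
rng = np.random.RandomState(0)
X = np.sort(rng.uniform(0, 10, size=300)).reshape(-1, 1)
y = 2.0 * X.ravel() + rng.normal(scale=0.5, size=300)

model = RandomForestRegressor(n_estimators=50, random_state=0)


def manual_r2(estimator, X, y, cv):
    """Refit a clone on each training split and score the held-out split with r2_score."""
    scores = []
    for train_idx, test_idx in cv.split(X):
        fitted = clone(estimator).fit(X[train_idx], y[train_idx])
        scores.append(r2_score(y[test_idx], fitted.predict(X[test_idx])))
    return np.array(scores)


cv_ordered = KFold(n_splits=3)  # contiguous blocks of rows, no shuffling
cv_shuffled = KFold(n_splits=3, shuffle=True, random_state=0)

scores_ordered = cross_val_score(model, X, y, cv=cv_ordered, scoring="r2")
scores_shuffled = cross_val_score(model, X, y, cv=cv_shuffled, scoring="r2")
manual_ordered = manual_r2(model, X, y, cv_ordered)

print("cross_val_score, unshuffled KFold:", scores_ordered)
print("manual r2_score, unshuffled KFold:", manual_ordered)
print("cross_val_score, shuffled KFold:  ", scores_shuffled)
```

With contiguous folds, each held-out block covers a range of X that the forest never saw during training, so the fold scores can be poor and uneven. Whether shuffling is appropriate depends on whether the ordering is meaningful, as the reply above points out for time series.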