From mahmood.nt at gmail.com  Tue Mar  1 06:34:28 2022
From: mahmood.nt at gmail.com (Mahmood Naderan)
Date: Tue, 1 Mar 2022 12:34:28 +0100
Subject: [scikit-learn] Does KernelDensity works for a vector with two
 elements?
Message-ID: <CADa2P2US7DdYrr=+sRiPRTW0N5r1Cksre6QjAYQ7gXuvETLsHg@mail.gmail.com>

Hi,
I would like to use KernelDensity for some vectors. If the length of
vectors is greater than 2, there is no problem.
However, for the following example, it seems that the density
estimation doesn't work properly.

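import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import argrelextrema
from sklearn.neighbors import KernelDensity

v = [2.46415e+07, 1.23208e+07]
a = np.array(v).reshape(-1, 1)
kde = KernelDensity(kernel='gaussian', bandwidth=1).fit(a)
s = np.linspace(a.min(), a.max())  # 50 evenly spaced points by default
e = kde.score_samples(s.reshape(-1, 1))  # log-density at each grid point
plt.plot(s, e)
mi = argrelextrema(e, np.less)[0]  # indices of strict local minima
print("Minima:", s[mi])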


In the end, s[mi] is empty, yet the plot clearly shows a minimum in the
gap between the two values.
Is there any restriction or caveat I should know about when using
KernelDensity?

Regards,
Mahmood
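
One possible cause, offered as an assumption rather than a confirmed
diagnosis: argrelextrema with np.less requires a strict inequality on both
sides, so a flat minimum that spans two grid points is not reported. With two
data points and an even number of grid points (np.linspace defaults to 50),
the two middle grid points are mirror images of each other and can tie. The
sketch below reproduces that effect in plain Python with toy values (0 and
10, not the data from the message); the helper names are hypothetical.

```python
import math

def log_kde_two_points(x, lo, hi, bandwidth):
    """Unnormalised log of a two-point Gaussian KDE, evaluated in log space."""
    a = -0.5 * ((x - lo) / bandwidth) ** 2
    b = -0.5 * ((x - hi) / bandwidth) ** 2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

def strict_local_minima(values):
    """Indices strictly below both neighbours (strict '<' on each side)."""
    return [i for i in range(1, len(values) - 1)
            if values[i] < values[i - 1] and values[i] < values[i + 1]]

def grid(lo, hi, n):
    """n evenly spaced points from lo to hi, endpoints included."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]

lo, hi = 0.0, 10.0  # toy values, chosen so the grid is exact in floating point
even = [log_kde_two_points(x, lo, hi, 1.0) for x in grid(lo, hi, 6)]
odd = [log_kde_two_points(x, lo, hi, 1.0) for x in grid(lo, hi, 5)]

print(strict_local_minima(even))  # [] : the two middle values tie
print(strict_local_minima(odd))   # [2] : the midpoint is on the grid
```

If this is what happens with the real data, using an odd number of grid
points (for example num=51) would put the midpoint on the grid.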

From rdslater at gmail.com  Tue Mar  8 20:02:21 2022
From: rdslater at gmail.com (Robert Slater)
Date: Tue, 8 Mar 2022 19:02:21 -0600
Subject: [scikit-learn] SGD Early Stopping
Message-ID: <CAMt686YXQ7qxaDnJRkMh1-u98wHaMi_YZh-4koA9iJ7hkAG=UA@mail.gmail.com>

We have something we are not understanding.

from sklearn.linear_model import SGDClassifier

clf2 = SGDClassifier(loss='log', penalty='l2', shuffle=True,
                     max_iter=10, tol=.00001,
                     early_stopping=True, validation_fraction=0.2,
                     n_iter_no_change=2, verbose=0, random_state=1)

clf2.fit(X_train,y_train)
clf2.n_iter_

The result of the last line is ALWAYS n_iter_no_change + 1 (in this case 3;
if we set n_iter_no_change=10, it ends at 11). No matter how I try to slow
things down, early stopping appears to kick in at epoch 1. We've played with
the learning rate, tolerance, etc. to try to make sure our problem isn't
being solved in 1 epoch (which does seem dubious).

I even ran this manually and scored the accuracy (along with enabling
warm_start=True, and max_iter=1)

for i in range(5):
    clf2.fit(X_train, y_train)
    p = clf2.predict(X_test)
    # accuracy_score expects (y_true, y_pred)
    print(accuracy_score(y_test, p))

0.9748226138704509
0.987182421606775
0.9881742580300603
0.9879453727016099
0.991760128175784

So it seems there is some accuracy improvement to be had, however small.
We're stumped as to what is going on and could use some wiser minds to
explain this behavior.
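
For reference, the documented rule is that training stops when the validation
score has not improved by at least tol for n_iter_no_change consecutive
epochs. The sketch below is a simplified plain-Python model of that rule, not
scikit-learn's actual code; the function name and the validation scores are
made up for illustration. It shows that a validation score that is already
flat after the first epoch yields exactly n_iter_no_change + 1 epochs.

```python
def epochs_until_stop(val_scores, tol, n_iter_no_change):
    """Return the epoch (1-based) at which the modelled early stopping fires.

    Assumption: an epoch counts as "no improvement" when its validation
    score is below best_so_far + tol.
    """
    best = float("-inf")
    stale = 0
    for epoch, score in enumerate(val_scores, start=1):
        if score < best + tol:
            stale += 1      # did not beat the best score by at least tol
        else:
            stale = 0       # improved enough, reset the counter
        best = max(best, score)
        if stale >= n_iter_no_change:
            return epoch
    return len(val_scores)  # ran out of epochs without stopping early

# Hypothetical validation scores, flat from the first epoch onward.
print(epochs_until_stop([0.99] * 12, tol=1e-5, n_iter_no_change=2))   # 3
print(epochs_until_stop([0.99] * 12, tol=1e-5, n_iter_no_change=10))  # 11

# Hypothetical scores that keep improving for a few epochs first.
print(epochs_until_stop([0.90, 0.95, 0.97, 0.98, 0.98, 0.98],
                        tol=1e-5, n_iter_no_change=2))                # 6
```

Note that the score being monitored is computed on the held-out
validation_fraction of the training data, not on X_test, so the test
accuracies printed above do not by themselves show what early stopping saw.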

From reshama.stat at gmail.com  Sun Mar 13 09:15:45 2022
From: reshama.stat at gmail.com (Reshama Shaikh)
Date: Sun, 13 Mar 2022 09:15:45 -0400
Subject: [scikit-learn] scikit-learn: March 2022 Updates: blog(s) + more
Message-ID: <CAKPCsugWcMBAZUcP1g5xk=PJ2-ktxJicxCnCoas5aMBQdyssfA@mail.gmail.com>

Hello,


Read on for March 2022 updates from scikit-learn.


[1] scikit-learn has a * BLOG * !

[2] Blog: Performance in scikit-learn

[3] Blog: Three Components for Reviewing a Pull Request

[4] Calendar: Office Hours and Meetings

[5] scikit-learn is on social media: support us & get updates by following
us!

[6] scikit-learn MOOC is ongoing and you can still sign up!

[7] NumFOCUS Open Source Research Project


[1]

We would like to share a significant milestone in scikit-learn:  a BLOG!

https://blog.scikit-learn.org


This blog was created using open source tools such as Jekyll and GitHub
Pages.

Lauren Burke, the blog creator, will be sharing how to get started creating
a blog and hosting it for free.  Sign-up is here for the March 22 event:

https://www.meetup.com/data-umbrella/events/284042132/


>Jekyll is a static site generator that can be used to create a custom
website simply, efficiently, and free of charge. In this session, you
will learn to set up a Jekyll-based website and blog, install a basic
theme, add customizations, and host it via GitHub Pages.


[2]

Maintainer Julien Jerphanion has written a few articles on performance in
scikit-learn.

>scikit-learn has been around for more than 10 years. Yet, scikit-learn has
some room to maneuver when it comes to performance. This series of blog
posts aims to explain the on-going work of the scikit-learn developers to
boost the library's performance.

https://blog.scikit-learn.org/technical/performances/


[3]

Maintainer Thomas Fan has presented on etiquette in open source (video +
slides + blog), "Three Components for Reviewing a Pull Request":

   1. The mechanics of code review on GitHub.
   2. The social aspects of code review and how to effectively give feedback.
   3. The technical aspects of reviewing a pull request.


Blog:  https://blog.scikit-learn.org/community/pull-request/

Video:  https://youtu.be/dyxS9KKCNzA


[4]

There is a handy calendar available on the blog website with meetings
(Community Office Hours, Triage Team Meeting, Monthly Developer Meetings).

https://blog.scikit-learn.org/calendar/


[5]

Connect with scikit-learn on Social Media by following us on your favorite
platforms! Feel free to share with others in your community including
colleagues and students.


[a] Twitter: https://twitter.com/scikit_learn

[b] Twitter (commits): https://twitter.com/sklearn_commits

[c] LinkedIn: https://www.linkedin.com/company/scikit-learn/

[d] YouTube:
https://www.youtube.com/channel/UCJosFjYm0ZYVUARxuOZqnnw/playlists

[e] Facebook: https://www.facebook.com/scikitlearnofficial/

[f] Instagram: https://www.instagram.com/scikitlearnofficial/   (photos
worth seeing here!)

[g] TikTok: https://www.tiktok.com/@scikit.learn


[6]

MOOC

The second edition of the scikit-learn MOOC, Machine Learning in Python
with scikit-learn, is open (registration
<https://www.fun-mooc.fr/en/courses/machine-learning-python-scikit-learn/>).
This *free* online course runs from February 15 to May 17, 2022.  It is
beginner-friendly, and a strong technical background is not required.
Learners should have some familiarity with numpy, pandas, matplotlib. A
certificate will be issued upon completion. There is still time to join
this course.


[7]

NumFOCUS Open Source Research Project


scikit-learn is working with NumFOCUS <http://numfocus.org> on a research
project
<https://numfocus.org/diversity-inclusion-disc/a-pivotal-time-in-numfocuss-project-aimed-dei-efforts?eType=EmailBlastContent&eId=f41a86c3-60d4-4cf9-86cf-58eb49dc968c>
funded by the Gordon & Betty Moore Foundation <https://www.moore.org/> to
understand the barriers to participation that contributors, particularly
those from historically underrepresented groups, face in the open-source
software community. The research team would like to talk to new
contributors, project developers and maintainers, and those who have
contributed in the past about their experiences joining and contributing to
scikit-learn.


Interested in sharing your experiences?

Please complete this brief "Participant Interest" form
<https://numfocus.typeform.com/to/WBWVJSqe>, which contains additional
information on the research goals, privacy, and confidentiality
considerations. Your participation will be valuable to the growth and
sustainability of diverse and inclusive open-source software communities.
Accepted participants will participate in a 30-minute interview with a
research team member.


Cheers,

Reshama & Lauren

https://scikit-learn.org/dev/about.html

From thomasjpfan at gmail.com  Wed Mar 23 18:53:33 2022
From: thomasjpfan at gmail.com (Thomas J. Fan)
Date: Wed, 23 Mar 2022 18:53:33 -0400
Subject: [scikit-learn] scikit-learn monthly developer meeting: Monday March
 28 2022
Message-ID: <CAK3g5Ab6GvODwcNYV5CuC8Z1CtVoaNc_H5a4nTicnvAV=vxVfw@mail.gmail.com>

Dear all,

The scikit-learn developer monthly meeting will take place on Monday March
28 at 20:00 UTC.

- Video call link: https://meet.google.com/ews-uszu-djs
- Meeting notes / agenda: https://hackmd.io/0yokz72CTZSny8y3Re648Q?both
- Local times:
https://www.timeanddate.com/worldclock/meetingdetails.html?year=2022&month=3&day=28&hour=20&min=0&sec=0&p1=1440&p2=240&p3=248&p4=195&p5=179&p6=224

The goal of this meeting is to discuss ongoing development topics for the
project. Everybody is welcome.

As usual, please follow the code of conduct of the project:
https://github.com/scikit-learn/scikit-learn/blob/main/CODE_OF_CONDUCT.md

Regards,
Thomas

From kalaichelvan at wisc.edu  Wed Mar 23 23:51:46 2022
From: kalaichelvan at wisc.edu (Kab Kalaichelvan)
Date: Thu, 24 Mar 2022 03:51:46 +0000
Subject: [scikit-learn] Nystroem kernel approximation on precomputed kernel
 matrix
Message-ID: <SN7PR06MB726401FF000C813EDC151DFCBB199@SN7PR06MB7264.namprd06.prod.outlook.com>

Dear scikit-learn


My name is Kabiltan Kalaichelvan and I am an undergraduate researcher
working under Professor Dane Morgan at the University of Wisconsin-Madison.
I am wondering whether there are any code examples of Nystroem reducing the
dimensionality of a precomputed kernel matrix (i.e. passing 'precomputed'
as the kernel argument). Thank you for your time.


Best regards,
Kabiltan
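
As a point of comparison only, the standard Nystroem construction can be
written directly against a precomputed kernel matrix with NumPy. This is a
hand-rolled sketch, not scikit-learn's Nystroem class, and it says nothing
about how that class handles kernel='precomputed'. The function name, the
eps cutoff, the random landmark choice, and the toy RBF matrix are all
assumptions made for illustration.

```python
import numpy as np

def nystroem_features(K, landmark_idx, eps=1e-12):
    """Map a precomputed PSD kernel matrix K (n x n) to features (n x m)."""
    K_nm = K[:, landmark_idx]                     # all samples vs. landmarks
    K_mm = K[np.ix_(landmark_idx, landmark_idx)]  # landmarks vs. landmarks
    # Inverse square root of K_mm from its eigendecomposition,
    # dropping eigenvalues that are numerically zero.
    w, V = np.linalg.eigh(K_mm)
    keep = w > eps
    inv_sqrt = (V[:, keep] / np.sqrt(w[keep])) @ V[:, keep].T
    return K_nm @ inv_sqrt

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))                       # toy data, only used to build K
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
K = np.exp(-0.5 * sq_dists)                       # stand-in for the precomputed matrix

idx = rng.choice(len(K), size=4, replace=False)   # hypothetical landmark selection
Z = nystroem_features(K, idx)
print(Z.shape)                                    # (8, 4)
```

Z @ Z.T then approximates K, and it reproduces K exactly (up to rounding) on
the rows and columns of the chosen landmarks.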