[scikit-learn] Is there any official position on PEP484/mypy?

Andreas Mueller t3kcit at gmail.com
Wed Jul 27 17:08:13 EDT 2016

Hi Daniel.
This hasn't been brought up before so there is no "official position".
I am generally in favor, though I'm not sure how doable it is.
We are generally pretty generous in accepting all kinds of inputs, and 
many of our options can have different types: (None, int, float, string, 
nd-array) is relatively common as a type for an option.
As we still support Python 2.6, we would need to use type comments or external stub files.
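
To make the comment-based style concrete, here is a minimal sketch using PEP 484 type comments, which stay valid syntax on Python 2. The function name `resolve_max_features` and the alias `MaxFeatures` are hypothetical and only for illustration; they are not scikit-learn code. The nd-array case is left out so the sketch only needs the `typing` module.

```python
import math
from typing import Union

# Hypothetical alias for an option that accepts several types
# (nd-array omitted so that only the typing module is needed).
MaxFeatures = Union[None, int, float, str]


def resolve_max_features(max_features, n_features):
    # type: (MaxFeatures, int) -> int
    """Turn a flexible option value into a concrete feature count."""
    if max_features is None:
        # None means "use every feature".
        return n_features
    if isinstance(max_features, str):
        if max_features == "sqrt":
            return max(1, int(math.sqrt(n_features)))
        raise ValueError("unknown max_features: %r" % (max_features,))
    if isinstance(max_features, int):
        # An int is taken as an absolute count.
        return max_features
    # Remaining case: a float, read as a fraction of the features.
    return max(1, int(max_features * n_features))
```

With external files, the same signature would instead go into a separate `.pyi` stub as a single line: `def resolve_max_features(max_features: MaxFeatures, n_features: int) -> int: ...`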

As a user, you are probably most interested in the outputs, right? The 
types returned by scikit-learn could probably be auto-generated.
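
One way the auto-generation could work, purely as a sketch and not an existing scikit-learn or mypy feature, is to record the types a function actually returns while the test suite runs and then print a stub-style line. All names below (`record_return_type`, `stub_line`, `predict_label`) are hypothetical.

```python
import functools
from collections import defaultdict

# Maps function name -> set of return type names seen at runtime.
observed_returns = defaultdict(set)


def record_return_type(func):
    """Decorator that logs the type of every value ``func`` returns."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        observed_returns[func.__name__].add(type(result).__name__)
        return result
    return wrapper


def stub_line(name):
    """Render the observed return types as a stub-style signature."""
    types = sorted(observed_returns[name])
    if not types:
        returns = "Any"  # never called, so nothing was observed
    elif len(types) == 1:
        returns = types[0]
    else:
        returns = "Union[%s]" % ", ".join(types)
    return "def %s(*args, **kwargs) -> %s: ..." % (name, returns)


@record_return_type
def predict_label(score):
    # Hypothetical stand-in for a library function.
    return "pos" if score > 0.5 else "neg"
```

After calling `predict_label` at least once, `stub_line("predict_label")` gives a line with `-> str`. Argument types are not recorded here, so the parameters stay as `*args, **kwargs`.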

I'm curious to see what others think.
I'd be surprised if anyone is willing to invest a large amount of time 
on this, though if you guys want to contribute,
we might be able to work something out.


On 07/27/2016 03:17 PM, Daniel Moisset wrote:
> Hi,
> [If you're also on the numpy mailing list and get a similar version of 
> the message, I apologise for that]
> I work at Machinalis, where we use a lot of scikit-learn (and the pydata 
> stack in general). Recently we've also been getting involved with 
> mypy, which is a tool to type check (not at runtime, think of it as a 
> linter) annotated Python code (the way of annotating Python types has 
> been recently standardized in PEP 484).
> As part of that involvement we've started creating type annotations 
> for the Python libraries we use most, which include both numpy and 
> scikit-learn. Mypy provides a way to specify types with annotations in 
> separate files in case you don't have control over a library, so we 
> have created an initial proof of concept for numpy at [1], and we are 
> actively improving it. You can find some additional information about 
> it and some problems we've found along the way in this blog post [2]. We 
> were planning to also start some work on scikit-learn (which has a 
> much larger surface area than numpy, so probably focusing on small 
> parts for now); we had to start with numpy anyway given that SKL 
> depends on it.
> What I wanted to ask is whether the people involved in the SKL project are 
> aware of PEP484 annotations and whether you have some interest in starting 
> to use them. The main benefit is that annotations serve as clear (and 
> automatically testable) documentation for users; secondary 
> benefits are that users discover bugs more quickly and that some IDEs 
> (like PyCharm) are starting to use this information for smart editor 
> features (autocompletion, online checking, refactoring tools); 
> eventually tools like Jupyter could take advantage of these 
> annotations too. And the cost of writing and including them 
> is relatively low.
> We're doing the work anyway, but contributing our typespecs back could 
> make it easier for users to benefit from this, and for us to maintain 
> it and keep it in sync with future releases.
> If you've never heard about PEP484 or mypy (it happens a lot) I'll be 
> happy to clarify anything about it that might help understand this 
> situation.
> Thanks!
> D.
> [1] https://github.com/machinalis/mypy-data
> [2] http://www.machinalis.com/blog/writing-type-stubs-for-numpy/
> -- 
> Daniel F. Moisset - UK Country Manager
> www.machinalis.com <http://www.machinalis.com>
> Skype: @dmoisset
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
