[scikit-learn] Is there any official position on PEP484/mypy?

Joel Nothman joel.nothman at gmail.com
Tue Aug 2 10:06:02 EDT 2016

I certainly see the benefit, and think we would also benefit from finding
test coverage holes with respect to input types.

But I think without ndarray/sparse matrix type support, we're not going to
be able to annotate most of our code in sufficient detail.

On 2 August 2016 at 23:34, Daniel Moisset <dmoisset at machinalis.com> wrote:

> A couple of things I forgot to mention:
> * One relevant consequence is that, to add annotations on the code,
> scikit-learn should depend on the "typing"[1] module which contains some of
> the basic names imported and used in annotations. It's a stdlib module in
> Python 3.5, but the PyPI package backports it to Python 2.7 and newer (I'm
> not sure how it works with Python 2.6, which might be an issue)
> * As an example of the kind of bugs that mypy can find, someone here
> already found a documentation bug in the sklearn.svm.SVC() initializer; the
> "kernel" parameter is described as "string"[2], when it's actually a
> "string or callable" (which can be read in the "small print" description of
> the argument). That kind of slip would be automatically prevented if the
> type were declared as an annotation and checked by mypy on the CI. The
> annotation would also show the signature of the callable directly, without
> looking up additional documentation on kernel functions or digging into
> the source.
> [1] https://pypi.python.org/pypi/typing
> [2]
> http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html#sklearn.svm.SVC
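
To make the kernel example above concrete, here is a minimal sketch of how
"string or callable" could be written as an annotation. The names `Kernel`,
`make_svc_params` and `linear_kernel` are hypothetical and not part of
scikit-learn, and `Any` is only a stand-in for array types, on the
assumption that no annotations for them are available.

```python
from typing import Any, Callable, Dict, Union

# Hypothetical alias: a kernel is either a name such as "rbf" or a
# callable taking two array-like arguments and returning a Gram matrix.
# Any stands in for array types, which are assumed to be unannotated.
Kernel = Union[str, Callable[[Any, Any], Any]]


def make_svc_params(kernel: Kernel = "rbf") -> Dict[str, Any]:
    """Hypothetical helper, not part of scikit-learn."""
    if not (isinstance(kernel, str) or callable(kernel)):
        raise TypeError("kernel must be a string or a callable")
    return {"kernel": kernel}


def linear_kernel(X: Any, Y: Any) -> Any:
    # Plain-Python dot products, only to keep the sketch dependency-free.
    return [[sum(a * b for a, b in zip(x, y)) for y in Y] for x in X]
```

With an annotation like this, a docstring claiming "string" only would
disagree with a signature that a checker can verify.
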
> On Mon, Aug 1, 2016 at 5:15 PM, Daniel Moisset <dmoisset at machinalis.com>
> wrote:
>> On Fri, Jul 29, 2016 at 8:57 PM, Gael Varoquaux <
>> gael.varoquaux at normalesup.org> wrote:
>>> Can you summarize once again in very simple terms what would be the big
>>> benefits?
>> Benefits for regular scikit-learn users
>> 1. Reliable information on method signatures in a standardized way
>> ("reliable" in the sense of "automatically verified")
>> 2. Better integration with tools supporting PEP-484 (editors,
>> documentation tools). This is a small set now, but I expect it to grow (and
>> it's also a chicken-and-egg problem; support has to start somewhere)
>> Benefits for scikit-learn users also using mypy and/or PEP-484 (probably
>> not a large set, but I know a few people :) )
>> 0. Same as the rest of the users
>> 1. Early detection of errors in own code while writing code based on SKL
>> 2. Making own code more readable/explicit by annotating functions that
>> receive/return SKL types (and verifying those annotations)
>> Benefits for scikit-learn developers
>> 1. Some extra checks that changes keep internal consistency
>> 2. (Future) possible simplification of typing information in docstrings,
>> which the annotations would make redundant (this would require updating
>> doc generators)
>> Regarding the cost for contributing, a scenario where you get a CI error
>> due to mypy would arise because:
>> * the change in the code somehow changed the existing accepted/returned
>> types, which is a change in the API and should actually be verified
>> * the change in the code extended the signature of an existing function
>> (what Andreas mentioned); in this situation it's similar to a PR that adds
>> an argument and doesn't update the docstring (only that this is
>> automatically caught).
>> WRT the second issue, the error here might be confusing when using the
>> "one line" syntax because arguments may "misalign" with their signatures.
>> The multiline version (or the python3-only form) is safer in that sense (in
>> fact, adding an argument there will not produce a CI problem because it's
>> unannotated and assumed to be "any type").
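
For reference, the three annotation forms mentioned above look roughly like
this, following the PEP 484 type comment syntax. The function names and the
arithmetic are made up for illustration; only the placement of the type
information matters.

```python
from typing import Optional


# "One line" form: the comment lists every argument type in order, so
# adding a parameter without updating it makes the types misalign.
def scale_one_line(x, factor, offset=None):
    # type: (float, float, Optional[float]) -> float
    return x * factor + (offset or 0.0)


# Multiline form: each argument carries its own comment; an argument
# added later without a comment is simply left unannotated.
def scale_multi_line(x,            # type: float
                     factor,       # type: float
                     offset=None,  # type: Optional[float]
                     ):
    # type: (...) -> float
    return x * factor + (offset or 0.0)


# Python 3 only form: annotations are part of the signature itself.
def scale_py3(x: float, factor: float,
              offset: Optional[float] = None) -> float:
    return x * factor + (offset or 0.0)
```

The first two forms are plain comments at runtime, which is what lets them
work on Python 2.7 as well.
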
>> Adding new modules/methods without annotations wouldn't produce an
>> error, just an incompleteness in the annotations.
>> A possible source of problems like the one you mention is that the
>> implementation of the annotated methods will be checked, and sometimes
>> you'll get a warning about a local variable if mypy can't infer its type
>> (it happens sometimes when assigning an empty list to a local, where mypy
>> knows that it's a list but doesn't know the element type). But in that case
>> I think the message you get is very obvious.
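
The empty-list case described above can be sketched as follows. The function
`collect_even` is a made-up example; the point is the type comment on the
assignment, which gives mypy the element type in cases where it cannot
infer one by itself.

```python
from typing import List


def collect_even(values):
    # type: (List[int]) -> List[int]
    # A bare `evens = []` tells mypy this is a list but not what it
    # holds; the type comment supplies the element type explicitly.
    evens = []  # type: List[int]
    for v in values:
        if v % 2 == 0:
            evens.append(v)
    return evens
```

The fix is local to the one assignment and does not change runtime
behaviour.
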
>> --
>> Daniel F. Moisset - UK Country Manager
>> www.machinalis.com
>> Skype: @dmoisset