[Numpy-discussion] Put type annotations in NumPy proper?
Juan Nunez-Iglesias
jni at fastmail.com
Tue Mar 24 23:56:17 EDT 2020
I'd like to offer a +1 from skimage's perspective (and napari's!) for having NumPy types directly in the repo. We have been wanting to start using type annotations, but the lack of types in NumPy proper, together with uncertainty about whether numpy-stubs was "officially supported" and in sync with NumPy itself, has held us back.
The __array_function__ problem is a major source of confusion for those of us unfamiliar with typing theory. Protocols would seem to solve this but they were only recently accepted and are only available by default in Python 3.8+:
https://www.python.org/dev/peps/pep-0544/
https://mypy.readthedocs.io/en/stable/protocols.html
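As a rough sketch of what such a Protocol could look like, the class below describes "anything that implements __array_function__" structurally. The names SupportsArrayFunction, DuckArray and describe are made up for illustration and are not part of NumPy or numpy-stubs; the import assumes Python 3.8+ (earlier versions would need typing_extensions).

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class SupportsArrayFunction(Protocol):
    """Hypothetical protocol: any object defining __array_function__ (NEP 18)."""

    def __array_function__(self, func: Any, types: Any, args: Any, kwargs: Any) -> Any:
        ...


class DuckArray:
    """Illustrative duck array; it does not inherit from the protocol."""

    def __array_function__(self, func, types, args, kwargs):
        # Defer every NumPy function; a real implementation would dispatch here.
        return NotImplemented


def describe(x: SupportsArrayFunction) -> str:
    # A static checker accepts any structurally matching object for x.
    return type(x).__name__


name = describe(DuckArray())
# runtime_checkable only tests that the method exists, not its signature.
matches = isinstance(DuckArray(), SupportsArrayFunction)
```

DuckArray matches purely by having the method, which is the appeal for duck arrays that cannot or should not subclass np.ndarray.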
I'd like to avoid driving this discussion off-topic, so it would be great if there were a "typing interest group" or similar list where we could discuss these issues. One thing that we would love to do in skimage is distinguish different kinds of arrays using NewType, e.g.:
Image = NewType('Image', np.ndarray)
Coordinates = NewType('Coordinates', np.ndarray)
def find_maxima(image: Image) -> Coordinates:
    ...
def gaussian_filter(image: Image, sigma: float) -> Image:
    ...
then
find_maxima(gaussian_filter(image, num))
validates, but
gaussian_filter(find_maxima(image), num)
does not, even though the arguments are all NumPy arrays. However, the above cannot be combined with an array Protocol:
https://www.python.org/dev/peps/pep-0544/#newtype-and-type-aliases
I'd love to understand the reasons why, and whether this decision can be reversed, but I am out of my depth. Again, this is probably something for a different thread, but I just wanted to flag it as something to discuss as we develop a typing framework to suit the entire SciPy ecosystem. Also if someone has resources for "type theory for beginners", both in Python and more generally, they would be appreciated!
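For anyone who wants to try the NewType idea, here is a minimal runnable sketch. The function bodies are placeholders I made up for illustration (the "filter" is not a real Gaussian filter), and it assumes a NumPy version whose ndarray is visible to the type checker. At runtime NewType is just an identity function, so the distinction exists only for a static checker such as mypy.

```python
from typing import NewType

import numpy as np

Image = NewType("Image", np.ndarray)
Coordinates = NewType("Coordinates", np.ndarray)


def find_maxima(image: Image) -> Coordinates:
    # Placeholder body: row/column index of every pixel equal to the global maximum.
    return Coordinates(np.argwhere(image == image.max()))


def gaussian_filter(image: Image, sigma: float) -> Image:
    # Placeholder body: returns a float copy and ignores sigma (no real smoothing).
    return Image(image.astype(float))


raw = np.array([[0, 1], [5, 2]])
image = Image(raw)  # identity at runtime: the very same ndarray object comes back

coords = find_maxima(gaussian_filter(image, 1.0))

# The reversed call below would still run at runtime, but a static checker
# is expected to flag it because Coordinates is not Image:
# gaussian_filter(find_maxima(image), 1.0)
```

Since the wrappers cost nothing at runtime, the only way to catch the reversed call is to actually run a type checker over the code.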
Thanks Stephan for raising this issue!
Juan.
On Tue, 24 Mar 2020, at 5:42 PM, Joshua Wilson wrote:
> > That is, is this an all-or-nothing thing where as soon as we start, numpy-stubs becomes unusable?
>
> Until NumPy is made PEP 561 compatible by adding a `py.typed` file,
> type checkers will ignore the types in the repo, so in theory you can
> avoid the all or nothing. In practice it's maybe trickier because
> currently people can use the stubs, but they won't be able to use the
> types in the repo until the PEP 561 switch is flipped. So e.g.
> currently SciPy pulls the stubs from `numpy-stubs` master, allowing
> for a short
>
> find place where NumPy stubs are lacking -> improve stubs -> improve SciPy types
>
> loop. If all development moves into the main repo then SciPy is
> blocked on it becoming PEP 561 compatible before moving forward. But,
> you could complain that I put the cart before the horse with
> introducing typing in the SciPy repo before the NumPy types were more
> resolved, and that's probably a fair complaint.
>
> > Anyone interested in taking the lead on this?
>
> Not that I am a core developer or anything, but I am interested in
> helping to improve typing in NumPy.
>
> On Tue, Mar 24, 2020 at 11:15 AM Eric Wieser
> <wieser.eric+numpy at gmail.com> wrote:
> >
> > > Putting
> > > aside ndarray, as more challenging, even annotations for numpy functions
> > > and method parameters with built-in types would help, as a start.
> >
> > This is a good idea in principle, but one thing concerns me.
> >
> > If we add type annotations to numpy, does it become an error to have numpy-stubs installed?
> > That is, is this an all-or-nothing thing where as soon as we start, numpy-stubs becomes unusable?
> >
> > Eric
> >
> > On Tue, 24 Mar 2020 at 17:28, Roman Yurchak <rth.yurchak at gmail.com> wrote:
> >>
> >> Thanks for re-starting this discussion, Stephan! I think there is
> >> definitely significant interest in this topic:
> >> https://github.com/numpy/numpy/issues/7370 is the issue with the largest
> >> number of user likes in the issue tracker (FWIW).
> >>
> >> Having them in numpy, as opposed to a separate numpy-stubs repository
> >> would indeed be ideal from a user perspective. When looking into it in
> >> the past, I was never sure how well in sync numpy-stubs was. Putting
> >> aside ndarray, as more challenging, even annotations for numpy functions
> >> and method parameters with built-in types would help, as a start.
> >>
> >> To add to the previously listed projects that would benefit from this,
> >> we are currently considering starting to use some (minimal) type
> >> annotations in scikit-learn.
> >>
> >> --
> >> Roman Yurchak
> >>
> >> On 24/03/2020 18:00, Stephan Hoyer wrote:
> >> > When we started numpy-stubs [1] a few years ago, putting type
> >> > annotations in NumPy itself seemed premature. We still supported Python
> >> > 2, which meant that we would need to use awkward comments for type
> >> > annotations.
> >> >
> >> > Over the past few years, using type annotations has become increasingly
> >> > popular, even in the scientific Python stack. For example, off-hand I
> >> > know that at least SciPy, pandas and xarray have at least part of their
> >> > APIs type annotated. Even without annotations for shapes or dtypes, it
> >> > would be valuable to have near complete annotations for NumPy, the
> >> > project at the bottom of the scientific stack.
> >> >
> >> > Unfortunately, numpy-stubs never really took off. I can think of a few
> >> > reasons for that:
> >> > 1. Missing high level guidance on how to write type annotations,
> >> > particularly for how (or if) to annotate particularly dynamic parts of
> >> > NumPy (e.g., consider __array_function__), and whether we should
> >> > prioritize strictness or faithfulness [2].
> >> > 2. We didn't have a good experience for new contributors. Due to the
> >> > relatively low level of interest in the project, when a contributor
> >> > would occasionally drop in, I often didn't even notice their PR for a
> >> > few weeks.
> >> > 3. Developing type annotations separately from the main codebase makes
> >> > them a little harder to keep in sync. This means that type annotations
> >> > couldn't serve their typical purpose of self-documenting code. Part of
> >> > this may be necessary for NumPy (due to our use of C extensions), but
> >> > large parts of NumPy's user facing APIs are written in Python. We no
> >> > longer support Python 2, so at least we no longer need to worry about
> >> > putting annotations in comments.
> >> >
> >> > We eventually could probably use a formal NEP (or several) on how we
> >> > want to use type annotations in NumPy, but I think a good first step
> >> > would be to think about how to start moving the annotations from
> >> > numpy-stubs into numpy proper.
> >> >
> >> > Any thoughts? Anyone interested in taking the lead on this?
> >> >
> >> > Cheers,
> >> > Stephan
> >> >
> >> > [1] https://github.com/numpy/numpy-stubs
> >> > [2] https://github.com/numpy/numpy-stubs/issues/12
> >> >
> >> > _______________________________________________
> >> > NumPy-Discussion mailing list
> >> > NumPy-Discussion at python.org
> >> > https://mail.python.org/mailman/listinfo/numpy-discussion