RFC: decouple doctesting from refguide-check

Hi,
Code examples in the SciPy and NumPy documentation are doctested with modified doctesting machinery which understands floating point, NumPy formatting and some details of our documentation and API.
Our modified doctesting machinery is buried in the refguide-check utility, which also does several unrelated things, all of which are tightly coupled to each other and to the libraries themselves. It has sometimes been described as hard to understand, work with or extend because of this tight coupling and the lack of a dedicated test suite. The NumPy and SciPy versions of the utility are both vendored, and the NumPy version has diverged somewhat from the SciPy version.
Following a discussion in https://github.com/numpy/numpy/issues/21070 I did a small experiment to decouple the doctesting into a separate package, so that it's easier to consolidate the two versions. Plus, a separate repo is generally easier to maintain, configure, and possibly extend or adapt to other projects. The work-in-progress result is here: https://github.com/ev-br/scpdt
It can currently run the full doctesting of the SciPy API documentation (docstrings of objects); see https://github.com/ev-br/scpdt/pull/33 (the log of a test run with warnings turned into errors is in the GH Actions: https://github.com/ev-br/scpdt/runs/6743881766?check_suite_focus=true). Note that it shows, among other things, a couple of deprecation warnings our docs have accumulated :-).
The API of the tool closely follows that of the standard library doctest module and provides (nearly) drop-in replacements for doctest checking, parsing, finding and running. Various configuration options for our modifications are collected into a single bag object which is passed around internally. This way, it's user-configurable all the way from plain standard doctest module behavior to what refguide-check does now.
There are a couple of wrinkles to iron out; overall it does what refguide-check does already. One missing bit is doctesting rst or other text files, but it's coming soon.
The current plan is to:
- verify that the standalone version does not miss things checked by refguide-check;
- plumb it through the SciPy dev interface and rip out the doctesting utilities bundled with refguide-check;
- sync the changes that the NumPy version of refguide-check accumulated over time;
- make sure it correctly tests the NumPy docs, too;
- document the internals better (there is currently only a README file).
If anyone is interested in joining me to work on these, great, the more the merrier :-).
I think it could make sense to move the tool's repository to the scipy GitHub org (or maybe even the numpy org?). I'm offering to maintain it regardless of the location. Thoughts?
Cheers,
Evgeni
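As an illustration of the float-aware checking and the single configuration object described in the message above, here is a minimal sketch built only on the standard library `doctest` module. It is not scpdt's code: `CheckerConfig`, `TolerantChecker`, `run_docstring` and the `rtol`/`atol` fields are hypothetical names chosen for this example, and the number-matching regex is a deliberate simplification.

```python
import doctest
import re
from dataclasses import dataclass


@dataclass
class CheckerConfig:
    """Hypothetical options 'bag'; the field names are illustrative only."""
    rtol: float = 1e-3  # relative tolerance for numbers in the output
    atol: float = 1e-8  # absolute tolerance for numbers in the output


# Matches integers and decimal floats, with an optional sign and exponent.
_NUMBER = re.compile(r"[-+]?(?:\d+\.\d*|\.\d+|\d+)(?:[eE][-+]?\d+)?")


class TolerantChecker(doctest.OutputChecker):
    """Accept output whose numbers agree with the expected ones within tolerance."""

    def __init__(self, config=None):
        self.config = config or CheckerConfig()

    def check_output(self, want, got, optionflags):
        # Plain doctest comparison first: an exact match always passes.
        if super().check_output(want, got, optionflags):
            return True
        # Everything that is not a number must still match token by token.
        if _NUMBER.sub("#", want).split() != _NUMBER.sub("#", got).split():
            return False
        cfg = self.config
        pairs = zip(_NUMBER.findall(want), _NUMBER.findall(got))
        return all(
            abs(float(w) - float(g)) <= cfg.atol + cfg.rtol * abs(float(w))
            for w, g in pairs
        )


def run_docstring(docstring, config=None, name="example"):
    """Parse and run the examples in a string; return (failed, attempted)."""
    test = doctest.DocTestParser().get_doctest(
        docstring, globs={}, name=name, filename=None, lineno=0
    )
    runner = doctest.DocTestRunner(checker=TolerantChecker(config), verbose=False)
    results = runner.run(test, out=lambda text: None)  # discard failure reports
    return results.failed, results.attempted


DOC = """
>>> 1 / 3
0.3333
>>> [0.1 + 0.2, 2.0]
[0.3, 2.0]
"""

print(run_docstring(DOC))                                     # (0, 2)
print(run_docstring(DOC, CheckerConfig(rtol=0.0, atol=0.0)))  # (2, 2)
```

With the default tolerances both examples pass even though the printed floats differ from the expected text. Setting both tolerances to zero makes the numeric comparison exact, so both examples fail, which is the kind of range a single options object can cover.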

Hi Evgeni,
This is great, thanks for taking care of that!
If it’s a cross-project initiative, why not put this under https://github.com/scientific-python?
Cheers, Pamphile

Hi Evgeni,
Thanks for taking care of this! I agree with Pamphile that it would make sense to host this under https://github.com/scientific-python and am more than happy to give you whatever access you need.
Best regards, Jarrod
On Tue, Jun 7, 2022 at 3:29 AM Pamphile Roy <roy.pamphile@gmail.com> wrote:
Hi Evgeni,
This is great, thanks for taking care of that!
If it’s a cross-project initiative, why not put this under https://github.com/scientific-python?
Cheers, Pamphile

On Tue, Jun 7, 2022 at 8:38 AM Evgeni Burovski <evgeny.burovskiy@gmail.com> wrote:
Following a discussion in https://github.com/numpy/numpy/issues/21070 I did a small experiment to decouple the doctesting into a separate package, so that it's easier to consolidate the two versions. Plus, a separate repo is generally easier to maintain, configure, and possibly extend or adopt to other projects. The work-in-progress result is here: https://github.com/ev-br/scpdt
This looks great! I particularly like the README: it's very clear on the why, how and what. The repository name is perhaps the one thing to tweak; something human-readable like `scipy-doctest` (or whatever "pdt" stands for) would be nice.
The current plan is to:
- verify that the standalone version does not miss things checked by the refguide-check
- plumb it through the SciPy dev interface and rip out the refguide-check bundled doctesting utilities.
I will hurry up with a change to remove dev.py and rename do.py to dev.py; that will avoid the need to make the changes in two places.
- Sync changes that NumPy version of refguide-check accumulated over time
- Make sure it correctly tests the NumPy docs, too.
- Better document the internals, there is currently only a readme file.
If someone's interested to join me working on these, great, the more the merrier :-).
This plan sounds good, thanks for working on it. Maybe one thing to consider: add it as a git submodule to the repo. That will avoid the need to deal with packaging of the separate utility; it would otherwise require releasing on both PyPI and conda-forge and then adding test dependencies. And a git submodule is now easy, and cleaner than copy-vendoring.
Cheers, Ralf

Hi,
tl;dr: here's an RFC for the plan to decouple the doctesting machinery from refguide-check and use pytest as a runner. The plan and several RFC points are below; the original email is the first message in this thread.
Context: to keep examples in the documentation current, we use modified doctesting machinery in the refguide-check tool. The previous iteration was to decouple the machinery into a separate tool with an API compatible with the standard library doctest module. Several people expressed a preference for using pytest as the test runner. Thanks to great work by Sheila Kahwai during her Quansight internship last summer, we now have a pytest plugin layer.
The tool currently lives under https://github.com/ev-br/scpdt, and here's the SciPy PR https://github.com/scipy/scipy/pull/20127 which plumbs it through the `dev.py` interface:
$ python dev.py test -s linalg --doctest
runs the doctests in the linalg module etc.
The plan is to:
- move the tool repository to live under the scipy GitHub organization;
- add it as a git submodule;
- merge the PR, remove the now-duplicate parts of refguide-check;
- keep evolving the tool.
I believe that if the tool lives under the scipy organization, commit and write access will be shared with the whole maintainers team. Either way, I'm happy to continue taking the lead in maintaining it.
Does this make sense, do you see anything I'm missing?
Also there's a question about naming: of the tool itself and of the `dev.py` command:
1. The tool needs renaming: 'scpdt' is awful, and was a quick name for a throw-away experiment. The previous suggestion from Ralf was "scipy-doctest". Over at https://github.com/scipy/scipy/pull/20127#pullrequestreview-1893722684, Pamphile coined "smoke-docs" to parallel "smoke tests".
2. The `dev.py` interface: the current one is `$dev.py test --doctests`, and it runs doctests _only_. Note the difference from `pytest --doctest`, which runs both unit tests and doctests.
So maybe `dev.py test --doctest-only`? Or indeed Pamphile's `dev.py test --smoke-docs`?
What we certainly do not want is to mix doctests and unit tests. These are two different things: testing is through unit tests, and doctests are only an implementation detail of how we keep documentation current.
Thoughts?
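To make the "doctests only" distinction concrete, here is a small sketch using only the standard library `doctest` finder and runner. It is not how `dev.py` or the pytest plugin is implemented; the toy module, `make_toy_module` and `run_doctests_only` are hypothetical names invented for the illustration.

```python
import doctest
import types

# A toy stand-in for a submodule such as `linalg`; everything here is
# hypothetical and exists only to keep the sketch self-contained.
TOY_SOURCE = '''
calls = []

def double(x):
    """Return twice the argument.

    >>> double(2)
    4
    >>> double(1.5)
    3.0
    """
    return 2 * x

def test_double():
    # A unit test: it has no docstring examples, so the doctest finder skips it.
    calls.append("test_double")
    assert double(2) == 4
'''


def make_toy_module(name="toy_linalg"):
    """Build an in-memory module from TOY_SOURCE."""
    module = types.ModuleType(name)
    exec(TOY_SOURCE, module.__dict__)
    return module


def run_doctests_only(module):
    """Run the docstring examples found in `module`, and nothing else."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    for test in finder.find(module):
        runner.run(test, out=lambda text: None)  # discard failure reports
    return runner.failures, runner.tries


toy = make_toy_module()
failed, attempted = run_doctests_only(toy)
print(failed, attempted)  # 0 2
print(toy.calls)          # []
```

Only the two docstring examples are executed; `test_double` is never called, so the unit tests stay a separate run.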

Regarding the naming, “scipy-doctest” sounds straightforward. And how about “dev.py doctest”? That makes it clear that doctests serve a different purpose than unit tests.

Ah, yes. This was decided against in https://github.com/scipy/scipy/pull/19242#discussion_r1327000250

On Fri, Feb 23, 2024 at 11:00 AM Evgeni Burovski <evgeny.burovskiy@gmail.com> wrote:
Ah, yes. This was decided against in https://github.com/scipy/scipy/pull/19242#discussion_r1327000250
I think that comment was just about making it easy to implement, avoiding duplication. Naming wise I'd be happy with anything reasonable - Pamphile's smoke-docs is also good, as is "check-docs" or "docs --verify-examples" or anything like that. Cheers, Ralf

On Thu, Feb 22, 2024 at 2:20 PM Evgeni Burovski <evgeny.burovskiy@gmail.com> wrote:
The plan is to:
- move the tool repository to live under the scipy GitHub organization;
- add it as a git submodule;
- merge the PR, remove the now-duplicate parts of refguide-check;
- keep evolving the tool.
I believe that if the tool lives under the scipy organization, commit and write access will be shared with the whole maintainers team. Either way, I'm happy to continue taking the lead in maintaining it.
Does this make sense, do you see anything I'm missing?
That sounds reasonable to me. Git submodule vs. separate package to install is a bit of a toss-up, either way works. Cheers, Ralf
participants (5)
- Evgeni Burovski
- Jarrod Millman
- Pamphile Roy
- Ralf Gommers
- RR YY