Code samples in docstrings mistaken for doctests
Hi all, Some docstrings have examples of how to use the function that, in their current form, aren't executable code (see numpy.core.defmatrix.bmat for an example). Should these examples have the ">>>" removed from them to avoid being picked up as doctests? Thanks, Alan
2008/6/23 Alan McIntyre wrote:
Some docstrings have examples of how to use the function that, in their current form, aren't executable code (see numpy.core.defmatrix.bmat for an example). Should these examples have the ">>>" removed from them to avoid being picked up as doctests?
The examples written for the random module warrant the same question. First and foremost, the docstrings are there to illustrate to users how to use the code; second, to serve as tests. Example code should run, but I'm not sure whether examples should always be valid doctests. In the `bmat` example, I would remove the '>>>' like you suggested. Regards Stéfan
On Mon, Jun 23, 2008 at 1:03 AM, Stéfan van der Walt wrote:
2008/6/23 Alan McIntyre wrote:
Some docstrings have examples of how to use the function that, in their current form, aren't executable code (see numpy.core.defmatrix.bmat for an example). Should these examples have the ">>>" removed from them to avoid being picked up as doctests?
The examples written for the random module warrant the same question. First and foremost, the docstrings are there to illustrate to users how to use the code; second, to serve as tests.
Example code should run, but I'm not sure whether examples should always be valid doctests.
In the `bmat` example, I would remove the '>>>' like you suggested.
There's also the option of marking them so doctest skips them via

    #doctest: +SKIP

http://docs.python.org/lib/doctest-options.html

Cheers, f
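For illustration, the directive goes in a comment on the doctest line itself; a minimal sketch (the plotting call here is just a stand-in for any example with unwanted side effects):

    >>> import matplotlib.pyplot as plt
    >>> plt.show()  # doctest: +SKIP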
On Mon, Jun 23, 2008 at 2:02 PM, Fernando Perez wrote:
There's also the option of marking them so doctest skips them via
#doctest: +SKIP
For short examples, that seems like a good option, but it seems like you have to have that comment on every line that you want skipped. There are some long examples (like the one in lib/function_base.py:bartlett) that (to me) would look pretty ugly having that comment tacked on to every line. Either way is fine with me in the end, though, so long as it doesn't produce test failures. :)
Mon, 23 Jun 2008 14:17:09 -0400, Alan McIntyre wrote:
On Mon, Jun 23, 2008 at 2:02 PM, Fernando Perez wrote:
There's also the option of marking them so doctest skips them via
#doctest: +SKIP
For short examples, that seems like a good option, but it seems like you have to have that comment on every line that you want skipped. There are some long examples (like the one in lib/function_base.py:bartlett) that (to me) would look pretty ugly having that comment tacked on to every line.
Either way is fine with me in the end, though, so long as it doesn't produce test failures. :)
Can you make the convention chosen for the examples (currently only in the doc wiki, not yet in SVN) work: assuming "import numpy as np" in examples? This would remove the need for those "from numpy import *" lines in the examples that I see were added in r5311. -- Pauli Virtanen
On Mon, Jun 23, 2008 at 2:37 PM, Pauli Virtanen wrote:
Can you make the convention chosen for the examples (currently only in the doc wiki, not yet in SVN) work: assuming "import numpy as np" in examples?
This would remove the need for those "from numpy import *" lines in the examples that I see were added in r5311.
Sure, I'll look at that. It seems like every possible option for importing stuff from numpy is used in doctests (sometimes even in the same module), so having them standardized with that implicit import is much better.
On Mon, Jun 23, 2008 at 3:03 PM, Alan McIntyre wrote:
On Mon, Jun 23, 2008 at 2:37 PM, Pauli Virtanen wrote:
Can you make the convention chosen for the examples (currently only in the doc wiki, not yet in SVN) work: assuming "import numpy as np" in examples?
This would remove the need for those "from numpy import *" lines in the examples that I see were added in r5311.
Sure, I'll look at that. It seems like every possible option for importing stuff from numpy is used in doctests (sometimes even in the same module), so having them standardized with that implicit import is much better.
It turns out it's possible to give all the doctests an implicit "import numpy as np" (and probably any other arbitrary tweaks to their execution context, if need be). Once I can include some of the other doctest tweaks discussed in this thread, and it's checked in, I'll go back and remove all those "import numpy" statements I inserted into the docstrings. Alan
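As a sketch of one way to do this with the standard library alone (the actual hook in numpy's nose-based framework may well differ), doctest's extraglobs argument injects names into every test's namespace:

    import doctest
    import numpy as np
    import numpy.core.defmatrix as mod  # any module with doctests will do

    # Every doctest in `mod` now sees `np` without an explicit import.
    doctest.testmod(mod, extraglobs={'np': np})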
On Mon, Jun 23, 2008 at 2:37 PM, Pauli Virtanen wrote:
Can you make the convention chosen for the examples (currently only in the doc wiki, not yet in SVN) work: assuming "import numpy as np" in examples?
Are there any other implicit imports that we will need? How about for SciPy, since the NumPy test framework will also be handling SciPy tests in 0.7?
On Mon, Jun 23, 2008 at 11:17 AM, Alan McIntyre wrote:
On Mon, Jun 23, 2008 at 2:02 PM, Fernando Perez wrote:
There's also the option of marking them so doctest skips them via
#doctest: +SKIP
For short examples, that seems like a good option, but it seems like you have to have that comment on every line that you want skipped. There are some long examples (like the one in lib/function_base.py:bartlett) that (to me) would look pretty ugly having that comment tacked on to every line.
Ugh. Definitely too ugly if it has to go in every line. From reading the docs I interpreted it as affecting the whole example, which would be far more sensible...
Either way is fine with me in the end, though, so long as it doesn't produce test failures. :)
Yes, but we also want to make these really easy for users to cleanly paste in with minimal effort. I wonder if a decorator could be applied to those functions so that nose would then skip the doctests:

    @skip_doctest
    def foo():
        ...

but the nose doctest plugin isn't exactly a model of modularity, so I'm not sure you want to go down that particular road... The no-prompts option seems to be the cleanest right now. You may want then to instead just use a reST block for those, so that sphinx eventually renders them nicely at least as blocks:

    Plot the window and its frequency response::          ### <<< Added double colon here

        from numpy import clip, log10, array, bartlett    ### <<< Extra indent so this is a reST block
        from scipy.fftpack import fft
        etc...

Just some ideas... Cheers, f
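A minimal sketch of what such a decorator might look like (the attribute name is made up; a custom doctest collector would have to check for it):

    def skip_doctest(func):
        # Tag the function so a doctest collector can skip its docstring.
        func.skip_doctest = True
        return func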
2008/6/23 Fernando Perez wrote:
On Mon, Jun 23, 2008 at 11:17 AM, Alan McIntyre wrote:
On Mon, Jun 23, 2008 at 2:02 PM, Fernando Perez wrote:
There's also the option of marking them so doctest skips them via
#doctest: +SKIP
For short examples, that seems like a good option, but it seems like you have to have that comment on every line that you want skipped. There are some long examples (like the one in lib/function_base.py:bartlett) that (to me) would look pretty ugly having that comment tacked on to every line.
Ugh. Definitely too ugly if it has to go in every line. From reading the docs I interpreted it as affecting the whole example, which would be far more sensible...
Either way is fine with me in the end, though, so long as it doesn't produce test failures. :)
Yes, but we also want to make these really easy for users to cleanly paste in with minimal effort. I wonder if a decorator could be applied to those functions so that nose would then skip the doctests:
@skip_doctest
def foo():
    ...
Another alternative is to replace +SKIP with something like +IGNORE. That way, the statement is still executed, we just don't care about its outcome. If we skip the line entirely, it often affects the rest of the tests later on. See http://aroberge.blogspot.com/2008/06/monkeypatching-doctest.html for an example implementation. Cheers Stéfan
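A rough sketch along the lines of that blog post (+IGNORE is not a standard doctest flag, so the flag and checker below are illustrative only):

    import doctest

    IGNORE = doctest.register_optionflag('IGNORE')

    class IgnoringOutputChecker(doctest.OutputChecker):
        def check_output(self, want, got, optionflags):
            # The statement still runs; we just accept any output when
            # the example is marked "# doctest: +IGNORE".
            if optionflags & IGNORE:
                return True
            return doctest.OutputChecker.check_output(
                self, want, got, optionflags)

    # Used via e.g. doctest.DocTestRunner(checker=IgnoringOutputChecker()).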
On Mon, Jun 23, 2008 at 3:21 PM, Stéfan van der Walt wrote:
Another alternative is to replace +SKIP with something like +IGNORE. That way, the statement is still executed, we just don't care about its outcome. If we skip the line entirely, it often affects the rest of the tests later on.
Ugh. That just seems like a lot of unreadable ugliness to me. If this comment magic is the only way to make that stuff execute properly under doctest, I think I'd rather just skip it in favor of clean, uncluttered, non-doctestable code samples in the docstrings. If the code that's currently in docstrings needs to be retained as test code, I'll gladly take the time to put it into a test_ module where it doesn't get in the way of documentation. I'll defer to the consensus, though.
2008/6/23 Alan McIntyre wrote:
On Mon, Jun 23, 2008 at 3:21 PM, Stéfan van der Walt wrote:
Another alternative is to replace +SKIP with something like +IGNORE. That way, the statement is still executed, we just don't care about its outcome. If we skip the line entirely, it often affects the rest of the tests later on.
Ugh. That just seems like a lot of unreadable ugliness to me. If this comment magic is the only way to make that stuff execute properly under doctest, I think I'd rather just skip it in favor of clean, uncluttered, non-doctestable code samples in the docstrings. If the code that's currently in docstrings needs to be retained as test code, I'll gladly take the time to put it into a test_ module where it doesn't get in the way of documentation. I'll defer to the consensus, though.
I think doctests are valuable: they make it very hard for the documentation to get out of sync with the code, and they make it very easy to write tests, particularly in light of the wiki documentation framework. But I think encrusting examples with weird comments will be a pain for documenters and off-putting to users. Perhaps doctests can be positively marked, in some relatively unobtrusive way? Anne
Mon, 23 Jun 2008 15:53:55 -0400, Anne Archibald wrote:
2008/6/23 Alan McIntyre wrote:
On Mon, Jun 23, 2008 at 3:21 PM, Stéfan van der Walt wrote:
Another alternative is to replace +SKIP with something like +IGNORE. That way, the statement is still executed, we just don't care about its outcome. If we skip the line entirely, it often affects the rest of the tests later on.
Ugh. That just seems like a lot of unreadable ugliness to me. If this comment magic is the only way to make that stuff execute properly under doctest, I think I'd rather just skip it in favor of clean, uncluttered, non-doctestable code samples in the docstrings. If the code that's currently in docstrings needs to be retained as test code, I'll gladly take the time to put it into a test_ module where it doesn't get in the way of documentation. I'll defer to the consensus, though.
I think doctests are valuable: they make it very hard for the documentation to get out of sync with the code, and they make it very easy to write tests, particularly in light of the wiki documentation framework. But I think encrusting examples with weird comments will be a pain for documenters and off-putting to users. Perhaps doctests can be positively marked, in some relatively unobtrusive way?
I also think being able to test that the examples in docstrings run correctly could be valuable, but I'm not sure it makes sense to have this enabled in the default test set. Another idea (in addition to whitelisting): how easy would it be to subclass doctest.DocTestParser so that it would, e.g., automatically +IGNORE any doctest lines containing ">>> plt."? -- Pauli Virtanen
On Mon, Jun 23, 2008 at 4:07 PM, Pauli Virtanen wrote:
Another idea (in addition to whitelisting): how easy would it be to subclass doctest.DocTestParser so that it would, e.g., automatically +IGNORE any doctest lines containing ">>> plt."?
I'll play around with that and see how hard it is to just ignore the ugly bits that currently require all that per-line directive stuff.
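A quick sketch of what such a subclass might look like (the 'plt.' heuristic is the one suggested above; here it simply sets the standard SKIP flag on matching examples):

    import doctest

    class PltSkippingParser(doctest.DocTestParser):
        def parse(self, string, name='<string>'):
            # parse() returns a mix of text chunks and Example objects.
            pieces = doctest.DocTestParser.parse(self, string, name)
            for piece in pieces:
                if (isinstance(piece, doctest.Example)
                        and piece.source.lstrip().startswith('plt.')):
                    piece.options[doctest.SKIP] = True
            return pieces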
On 23 Jun 2008, at 12:37 PM, Alan McIntyre wrote:
Ugh. That just seems like a lot of unreadable ugliness to me. If this comment magic is the only way to make that stuff execute properly under doctest, I think I'd rather just skip it in favor of clean, uncluttered, non-doctestable code samples in the docstrings.
Another perspective: doctests ensure that documentation stays up to date (if the behaviour or interface changes, then tests will fail, indicating that the documentation also needs to be updated). Thus, one can argue that all examples should also be doctests. This generally makes things a little more ugly, but much less ambiguous.

...

Examples:
---------
If A, B, C, and D are appropriately shaped 2-d arrays, then one can produce

    [ A  B ]
    [ C  D ]

using any of these methods:
>>> A, B, C, D = [[1,1]], [[2,2]], [[3,3]], [[4,4]]
>>> np.bmat('A, B; C, D')                  # From a string
matrix([[1, 1, 2, 2],
        [3, 3, 4, 4]])
>>> np.bmat([[A,B],[C,D]])                 # From a nested sequence
matrix([[1, 1, 2, 2],
        [3, 3, 4, 4]])
>>> np.bmat(np.r_[np.c_[A,B],np.c_[C,D]])  # From an array
matrix([[1, 1, 2, 2],
        [3, 3, 4, 4]])
Michael.
2008/6/23 Michael McNeil Forbes wrote:
On 23 Jun 2008, at 12:37 PM, Alan McIntyre wrote:
Ugh. That just seems like a lot of unreadable ugliness to me. If this comment magic is the only way to make that stuff execute properly under doctest, I think I'd rather just skip it in favor of clean, uncluttered, non-doctestable code samples in the docstrings.
Another perspective: doctests ensure that documentation stays up to date (if the behaviour or interface changes, then tests will fail, indicating that the documentation also needs to be updated).
Thus, one can argue that all examples should also be doctests. This generally makes things a little more ugly, but much less ambiguous.
This is a bit awkward. How do you give an example for a random-number generator? Even if you are willing to include a seed in each statement, misleading users into thinking it's necessary, the value returned for a given seed is not necessarily part of the interface a random-number generator agrees to support. I do agree that as many examples as possible should be doctests, but I don't think we should restrict the examples we are allowed to give to only those that can be made to serve as doctests. Anne
On 23 Jun 2008, at 1:28 PM, Anne Archibald wrote:
2008/6/23 Michael McNeil Forbes wrote:
Thus, one can argue that all examples should also be doctests. This generally makes things a little more ugly, but much less ambiguous.
This is a bit awkward. How do you give an example for a random-number generator? Even if you are willing to include a seed in each statement, misleading users into thinking it's necessary, the value returned for a given seed is not necessarily part of the interface a random-number generator agrees to support.
I agree that this can be awkward sometimes, and should certainly not be policy, but one can usually get around this. Instead of printing the result, you can use it, or demonstrate properties:
>>> random_array = np.random.rand(3,4)
>>> random_array.shape
(3, 4)
>>> random_array.max() < 1
True
>>> random_array.min() > 0
True
etc. Michael.
On Mon, Jun 23, 2008 at 15:44, Michael McNeil Forbes wrote:
On 23 Jun 2008, at 1:28 PM, Anne Archibald wrote:
2008/6/23 Michael McNeil Forbes wrote:
Thus, one can argue that all examples should also be doctests. This generally makes things a little more ugly, but much less ambiguous.
This is a bit awkward. How do you give an example for a random-number generator? Even if you are willing to include a seed in each statement, misleading users into thinking it's necessary, the value returned for a given seed is not necessarily part of the interface a random-number generator agrees to support.
I agree that this can be awkward sometimes, and should certainly not be policy, but one can usually get around this. Instead of printing the result, you can use it, or demonstrate properties:
>>> random_array = np.random.rand(3,4)
>>> random_array.shape
(3, 4)
>>> random_array.max() < 1
True
>>> random_array.min() > 0
True
Yes, this makes it doctestable, but you've destroyed the exampleness. It should be policy *not* to do this. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
On Mon, Jun 23, 2008 at 4:51 PM, Robert Kern wrote:
I agree that this can be awkward sometimes, and should certainly not be policy, but one can usually get around this. Instead of printing the result, you can use it, or demonstrate properties:
>>> random_array = np.random.rand(3,4)
>>> random_array.shape
(3, 4)
>>> random_array.max() < 1
True
>>> random_array.min() > 0
True
Yes, this makes it doctestable, but you've destroyed the exampleness. It should be policy *not* to do this.
So it seems we have:
1. Example code that is doctestable
2. Example code that probably can't ever be doctestable (random number stuff, etc.), but is still executable
3. Schematic examples that aren't executable

Personally, I'm in favor of filling out examples of type #3 to make them at least #2, but maybe that's not always practical. I don't think #3 should ever have ">>>" prompts, so it shouldn't ever be picked up by doctest. I suppose I could go for a decorator option to flag #2. If we execute them but don't look at the results, then at least we find out about examples that are broken enough to raise exceptions.
On Mon, Jun 23, 2008 at 4:51 PM, Robert Kern wrote:
>>> random_array = np.random.rand(3,4)
>>> random_array.shape
(3, 4)
>>> random_array.max() < 1
True
>>> random_array.min() > 0
True
Yes, this makes it doctestable, but you've destroyed the exampleness. It should be policy *not* to do this.
Well perhaps... but do you think that

    rand(d0, d1, ..., dn) -> random values

is more exampley than
>>> r = np.random.rand(3,2,4)
>>> r.shape
(3, 2, 4)
?

On 23 Jun 2008, at 2:31 PM, Alan McIntyre wrote:
So it seems we have:
1. Example code that is doctestable
2. Example code that probably can't ever be doctestable (random number stuff, etc.), but is still executable
3. Schematic examples that aren't executable
Personally, I'm in favor of filling out examples of type #3 to make them at least #2, but maybe that's not always practical. I don't think #3 should ever have ">>>" prompts, so it shouldn't ever be picked up by doctest.
I suppose I could go for a decorator option to flag #2. If we execute them but don't look at the results, then at least we find out about examples that are broken enough to raise exceptions.
One can usually do #3 -> #1 or #2 by just leaving bare assignments without printing a result (the user can always execute them and look at the result if they want).
>>> r = np.random.rand(3,2,4)
which is cleaner than adding any flags... Michael.
On Mon, Jun 23, 2008 at 16:51, Michael McNeil Forbes wrote:
On Mon, Jun 23, 2008 at 4:51 PM, Robert Kern wrote:
>>> random_array = np.random.rand(3,4)
>>> random_array.shape
(3, 4)
>>> random_array.max() < 1
True
>>> random_array.min() > 0
True
Yes, this makes it doctestable, but you've destroyed the exampleness. It should be policy *not* to do this.
Well perhaps... but do you think that
rand(d0, d1, ..., dn) -> random values
is more exampley than
>>> r = np.random.rand(3,2,4)
>>> r.shape
(3, 2, 4)
?
No. It wasn't an example. It was a specification of the call signature because it is in an extension module, so the call signature is not available like it is for pure Python functions. Thus, it needs to be given in the docstring. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
2008/6/23 Michael McNeil Forbes wrote:
One can usually do #3 -> #1 or #2 by just leaving bare assignments without printing a result (the user can always execute them and look at the result if they want).
>>> r = np.random.rand(3,2,4)
which is cleaner than adding any flags...
Purposefully reducing the clarity of an example to satisfy some tool is not an option. We might be able to work around this specific case, but there will be others. It should be fairly easy to execute the example code, just to make sure it runs. We can always work out a scheme to test its validity later. One route is to use the same docstring scraper we use for the reference guide, to extract all tests. We can then choose a markup which identifies tests with unpredictable results, and refrain from executing them. In some instances, we can even infer which tests to ignore, e.g. the '>>> plt.' example Pauli mentioned. Regards Stéfan
2008/6/24 Stéfan van der Walt wrote:
It should be fairly easy to execute the example code, just to make sure it runs. We can always work out a scheme to test its validity later.
Mike Hansen just explained to me that the Sage doctest system sets the random seed before executing each test. If we address

a) Random variables
b) Plotting representations, and
c) Endianness

we're probably halfway there. Regards Stéfan
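For numpy, the seed-setting idea might amount to something like this (where exactly the hook would run is an open question, and the function below is hypothetical):

    import numpy as np

    def reset_example_seed():
        # Reseed the global generator before each example so docstrings
        # that use np.random produce reproducible output.
        np.random.seed(1234)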
Stéfan van der Walt wrote:
2008/6/24 Stéfan van der Walt wrote:
It should be fairly easy to execute the example code, just to make sure it runs. We can always work out a scheme to test its validity later.
Hi,
Mike Hansen just explained to me that the Sage doctest system sets the random seed before executing each test. If we address
a) Random variables
we have some small extensions to the doctesting framework that allow us to mark doctests as "#random" so that the result is not checked. Carl Witty wrote some code that makes the random number generator in a lot of the Sage components behave consistently on all supported platforms.
b) Plotting representations and c) Endianness
Yeah, the Sage test suite seems to catch at least one of those in every release cycle. Another thing we just implemented is a "jar of pickles" that lets us verify that there are no cross-platform issues (32 vs. 64 bits and big vs. little endian) as well as no problems with loading pickles from previous releases (see the sketch after this message).
we're probably halfway there.
Regards Stéfan
Cheers, Michael
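A tiny sketch of the jar-of-pickles idea (file names and the sample object are made up):

    import pickle
    import numpy as np

    # On one platform/release, fill the jar:
    with open('jar/arange10.pickle', 'wb') as f:
        pickle.dump(np.arange(10), f)

    # On every other platform/release, check that the jar still opens
    # and round-trips correctly:
    with open('jar/arange10.pickle', 'rb') as f:
        assert (pickle.load(f) == np.arange(10)).all()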
On Mon, Jun 23, 2008 at 4:58 PM, Michael Abshoff wrote:
a) Random variables
we have some small extensions to the doctesting framework that allow us to mark doctests as "#random" so that the result is not checked. Carl Witty wrote some code that makes the random number generator in a lot of the Sage components behave consistently on all supported platforms.
Care to share? (NumPy is BSD, so we can't even look at the Sage code.) Cheers, f
Fernando Perez wrote:
On Mon, Jun 23, 2008 at 4:58 PM, Michael Abshoff wrote:
Hi Fernando,
a) Random variables
we have some small extensions to the doctesting framework that allow us to mark doctests as "#random" so that the result is not checked. Carl Witty wrote some code that makes the random number generator in a lot of the Sage components behave consistently on all supported platforms.
Care to share? (NumPy is BSD, so we can't even look at the Sage code.)
I am not the author, so I need to find out who wrote the code, but I am sure it can be made BSD. We are also working on "doctest+timeit" to hunt for performance regressions, but that one is not ready for prime time yet.
Cheers,
f
Cheers, Michael
On Mon, Jun 23, 2008 at 5:26 PM, Michael Abshoff wrote:
I am not the author, so I need to find out who wrote the code, but I am sure it can be made BSD. We are also working on "doctest+timeit" to hunt for performance regressions, but that one is not ready for prime time yet.
Great, thanks. Cheers, f
On Mon, Jun 23, 2008 at 5:58 PM, Michael Abshoff <michael.abshoff@googlemail.com> wrote:
Stéfan van der Walt wrote:
2008/6/24 Stéfan van der Walt wrote:
It should be fairly easy to execute the example code, just to make sure it runs. We can always work out a scheme to test its validity later.
Hi,
Mike Hansen just explained to me that the Sage doctest system sets the random seed before executing each test. If we address
a) Random variables
we have some small extensions to the doctesting framework that allow us to mark doctests as "#random" so that the result is not checked. Carl Witty wrote some code that makes the random number generator in a lot of the Sage components behave consistently on all supported platforms.
But there is more than one possible random number generator. If you do that, you are tied into one kind of generator and one kind of initialization implementation. Chuck
Charles R Harris wrote:
On Mon, Jun 23, 2008 at 5:58 PM, Michael Abshoff <michael.abshoff@googlemail.com> wrote:
Stéfan van der Walt wrote:
2008/6/24 Stéfan van der Walt <stefan@sun.ac.za>:
It should be fairly easy to execute the example code, just to make sure it runs. We can always work out a scheme to test its validity later.
Hi,
Mike Hansen just explained to me that the Sage doctest system sets the random seed before executing each test. If we address
a) Random variables
we have some small extensions to the doctesting framework that allow us to mark doctests as "#random" so that the result is not checked. Carl Witty wrote some code that makes the random number generator in a lot of the Sage components behave consistently on all supported platforms.
Hi,
But there is more than one possible random number generator. If you do that, you are tied into one kind of generator and one kind of initialization implementation.
Chuck
Correct, but so far Carl has hooked into six of the many random number generators in the various components of Sage. This way we can set a global seed and also more easily reproduce issues with algorithms where randomness plays a role, without being forced to be on the same platform. There are still doctests in Sage where the randomness comes from sources not in randgen (Carl's code), but sooner or later we will get around to all of them. Cheers, Michael
2008/6/23 Michael Abshoff wrote:
Charles R Harris wrote:
On Mon, Jun 23, 2008 at 5:58 PM, Michael Abshoff <michael.abshoff@googlemail.com> wrote:
Stéfan van der Walt wrote:
2008/6/24 Stéfan van der Walt <stefan@sun.ac.za>:
It should be fairly easy to execute the example code, just to make sure it runs. We can always work out a scheme to test its validity later.
Hi,
Mike Hansen just explained to me that the Sage doctest system sets the random seed before executing each test. If we address
a) Random variables
we have some small extensions to the doctesting framework that allow us to mark doctests as "#random" so that the result is not checked. Carl Witty wrote some code that makes the random number generator in a lot of the Sage components behave consistently on all supported platforms.
Hi,
But there is more than one possible random number generator. If you do that, you are tied into one kind of generator and one kind of initialization implementation.
Chuck
Correct, but so far Carl has hooked into six of the many random number generators in the various components of Sage. This way we can set a global seed and also more easily reproduce issues with algorithms where randomness plays a role, without being forced to be on the same platform. There are still doctests in Sage where the randomness comes from sources not in randgen (Carl's code), but sooner or later we will get around to all of them.
Doesn't this mean you can't change your implementation of random number generators (for example, choosing a different way of generating normally-distributed random numbers, or replacing the Mersenne Twister) without causing countless doctests to fail meaninglessly? Anne
On Mon, Jun 23, 2008 at 22:53, Anne Archibald wrote:
2008/6/23 Michael Abshoff wrote:
Correct, but so far Carl has hooked into six of the many random number generators in the various components of Sage. This way we can set a global seed and also more easily reproduce issues with algorithms where randomness plays a role, without being forced to be on the same platform. There are still doctests in Sage where the randomness comes from sources not in randgen (Carl's code), but sooner or later we will get around to all of them.
Doesn't this mean you can't change your implementation of random number generators (for example, choosing a different way of generating normally-distributed random numbers, or replacing the Mersenne Twister) without causing countless doctests to fail meaninglessly?
It's not that bad. After you've verified that your new code works, you regenerate the examples. You check in both at the same time. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
2008/6/23 Stéfan van der Walt wrote:
2008/6/24 Stéfan van der Walt wrote:
It should be fairly easy to execute the example code, just to make sure it runs. We can always work out a scheme to test its validity later.
Mike Hansen just explained to me that the Sage doctest system sets the random seed before executing each test. If we address
a) Random variables
b) Plotting representations, and
c) Endianness
we're probably halfway there.
I agree (though I have reservations about how they are to be addressed). But in the current setting, "halfway there" is still a problem: it seems to me we need, now and later, a way to deal with generic examples that are not doctests. There may not be many of them, and most may be dealt with by falling into categories a, b, and c above, but it is important that we not make it difficult to write new examples even if they can't readily be made into doctests. In particular, we don't want some documenter saying "well, I'd like to write an example, but I don't remember the arcane syntax to keep this from failing as a doctest, so I'm not going to bother." Anne
Mon, 23 Jun 2008 10:03:28 +0200, Stéfan van der Walt wrote:
2008/6/23 Alan McIntyre wrote:
Some docstrings have examples of how to use the function that, in their current form, aren't executable code (see numpy.core.defmatrix.bmat for an example). Should these examples have the ">>>" removed from them to avoid being picked up as doctests?
The examples written for the random module warrant the same question. First and foremost, the docstrings are there to illustrate to users how to use the code; second, to serve as tests.
Example code should run, but I'm not sure whether examples should always be valid doctests.
In the `bmat` example, I would remove the '>>>' like you suggested.
"Schematic" code (such as that currently in numpy.bmat) that doesn't run probably shouldn't be written with >>>, and for it the ReST block quote syntax is also looks OK. But I'm personally not in favor of a distinction between a "doctest" and a "code sample", as the difference is not of interest to the main target audience who reads the docstrings (or the reference documentation generated based on them). As I see it, Numpy has a test architecture that is separate from doctests, so that most of the bonus doctests gives us is ensuring that all of our examples run without errors and produce expected results. It's a bit unfortunate though that the doctest directives are as obtrusive as they are and only apply to a single line. One problem that I see now is quite annoying in the sample codes using matplotlib: matplotlib functions tend to return some objects whose <repr> contains a memory address, which causes the code to fit badly in a doctest. This can be worked around (ELLIPSIS, assigning to a variable), but I don't see a clean way. (I'm not so worried here about plot windows popping up as they can be worked around by monkey-patching matplotlib.show and choosing a non-graphical backend.) Another point related to numpy are blank lines often appearing in array printouts (the text <BLANKLINE> is not a pretty sight in documentation). Also, NORMALIZE_WHITESPACE is useful for reducing the whitespace in array printout. -- Pauli Virtanen
On Mon, Jun 23, 2008 at 3:57 PM, Pauli Virtanen wrote:
"Schematic" code (such as that currently in numpy.bmat) that doesn't run probably shouldn't be written with >>>, and for it the ReST block quote syntax is also looks OK.
But I'm personally not in favor of a distinction between a "doctest" and a "code sample", as the difference is not of interest to the main target audience who reads the docstrings (or the reference documentation generated from them). As I see it, Numpy has a test architecture separate from doctests, so most of the bonus that doctests give us is ensuring that all of our examples run without errors and produce the expected results.
I agree with you, Anne, and Michael that ensuring that the documentation examples run is important. The more I think about it, the more I'd rather have examples that are a bit verbose. In the particular example of bmat, as a new user, I'd honestly rather see those three cases fully coded:
>>> A = np.arange(1,5).reshape(2,2)
>>> B = ...  etc.
>>> F = bmat('A,B;C,D')
>>> F
matrix([[1, 2, 5, 6],
etc.
Participants (9):
- Alan McIntyre
- Anne Archibald
- Charles R Harris
- Fernando Perez
- Michael Abshoff
- Michael McNeil Forbes
- Pauli Virtanen
- Robert Kern
- Stéfan van der Walt