
I find myself wanting to use doctest for some test-driven development, and find myself slightly frustrated. I wonder if others would be interested in seeing the following additional functionality in doctest:

1. Execution context determined by outer-scope doctest definitions.

2. Smart comparisons that will detect output of a non-ordered type (dict/set), lift and recast it, and do a real comparison.

Without #1, "literate testing" becomes awash with re-defining re-used variables, which, generally, also detracts from the exact purpose of the test -- this creates testdoc noise and the docs become less useful. Without #2, "readable docs" nicely co-aligning with "testable docs" tends towards divergence.

Perhaps not enough developers use doctest to care, but I find it one of the more enjoyable ways to develop Python code -- I don't have to remember test cases nor go through the trouble of setting up unittests. AND, it encourages agile development. Another user wrote a while back about even having a built-in test() method. Wouldn't that really encourage agile development? And you wouldn't have to muddy up your code with "if __name__ == "__main__": import doctest, yadda yadda".

Anyway... of course patches welcome, yes... ;^)

mark
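For reference, the "yadda yadda" boilerplate being complained about looks roughly like this -- a minimal sketch, where the `average` function is made up for illustration:

```python
# The usual pattern for making a module's doctests runnable as a
# script: the docstring examples double as tests.

def average(values):
    """Return the arithmetic mean of a sequence of numbers.

    >>> average([1, 2, 3])
    2.0
    """
    return sum(values) / len(values)

if __name__ == "__main__":
    import doctest
    doctest.testmod()
```

Running the module directly then checks every docstring example and is silent on success.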

On Sat, Feb 18, 2012 at 7:57 AM, Mark Janssen <dreamingforward@gmail.com> wrote:
Anyway... of course patches welcome, yes... ;^)
Not really. doctest is for *testing code examples in docs*. If you try to use it for more than that, it's likely to drive you up the wall, so proposals to make it more than it is usually don't get a great reception (doc patches to make its limitations clearer are generally welcome, though).

The stdlib solution for test-driven development is unittest (the vast majority of our own regression suite is written that way - only a small proportion uses doctest). An interesting third-party alternative that has been created recently is behave: http://crate.io/packages/behave/

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

Nick Coghlan wrote:
Really? Not in my experience, although I admit I haven't tried to push the envelope too far. But I haven't had any problem with a literate programming model:

* Use short, self-contained but not necessarily exhaustive examples in the code's docstrings (I don't try to give examples of *every* combination of good and bad data, special cases, etc. in the docstring).

* Write extensive (ideally exhaustive) examples with explanatory text, in a separate text file. I generally do this to describe, explain and test the interface, rather than the implementation, but I see no reason why it wouldn't work for the implementation as well. It would require writing for the next maintainer rather than for a user of the library.

In the external test text file(s), examples don't necessarily need to be self-contained. I have an entire document to create a test environment, if necessary, and can include extra functions, stubs, mocks, etc. as needed, without clashing with the primary purpose of docstrings to be *documentation* first and tests a distant second. If need be, test infrastructure can go into an external module, to be imported, rather than in-place in the doctest file.

In my experience, this works well for algorithmic code that doesn't rely on external resources. If my tests require setting up and tearing down resources, I stick to unittest, which has better setup/teardown support. (It would be hard to have *less* support for setup and teardown than doctest.) But otherwise, I haven't run into any problems with doctest other than the perennial "oops, I forgot to escape my backslashes!".

-- Steven

On Fri, Feb 17, 2012 at 4:57 PM, Mark Janssen <dreamingforward@gmail.com> wrote:
I'm not sure what you mean, but it might be relevant that Sphinx lets you define multiple scopes for doctests. I feel like its approach is the right one, but it isn't reusable in Python docstrings. That said, I think users of doctest have moved away from embedded doctests in docstrings -- it encourages doctests to have way too many "examples" (test cases), which reduces their usefulness as documentation.
2. Smart Comparisons that will detect output of a non-ordered type (dict/set), lift and recast it and do a real comparison.
I think it's better to just always use ast.literal_eval on the output as another form of testing for equivalence. This could break code, but probably not any code worth caring about. (In particular, an example like

>>> print 'r""'
""

would pass in a literal_eval-ing system, but not in some other system.)
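A minimal sketch of the equivalence rule Devin is proposing, assuming a hypothetical `outputs_match` helper (doctest itself has no such hook):

```python
# Compare doctest output by parsed value when both sides are Python
# literals, falling back to plain text comparison otherwise. The
# helper name outputs_match is made up for illustration.
import ast

def outputs_match(expected, actual):
    """Return True if the two output strings are equivalent."""
    try:
        return ast.literal_eval(expected) == ast.literal_eval(actual)
    except (ValueError, SyntaxError):
        # Not parseable as literals; require an exact textual match.
        return expected == actual
```

Under this rule `{1, 2, 3}` matches `{3, 1, 2}` (both parse to the same set), while `123` still fails against `312` (different integers).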
Not exactly... doctest has no maintainer, and so no patches ever get accepted. If you want to improve it, you'll have to fork it. I hope you're that sort of person, because doctest can totally be improved. It suffers a lot from people thinking of what it is rather than what it could be. :(

I've in the past worked a bit on improving doctest in a fork I started. Its primary purpose was originally to add Cram-like "shell doctests" to doctest (see http://pypi.python.org/pypi/cram ), but since then I started working on other bits here and there. The work I've done is available at https://bitbucket.org/devin.jeanpierre/doctest2 (please forgive the presumptuous name -- I'm considering a rename to "lembas".) The reason I've not worked on it recently is that the problems have gotten harder and my time has run short. I would be very open to collaboration or forking, although I also understand that a largeish expansion with redesigned internals created by an overworked student is probably not the greatest place to start.

This is all assuming your intentions are to contribute rather than only suggest. Not that suggestions aren't welcome, I suppose, but maybe not here. doctest is not actively developed or maintained anywhere, as far as I know. (I want to say "except by me", because that'd make me seem all special and so on, but I haven't committed a thing in months.)

Mostly, I feel a bit like this thread could accidentally spawn parallel / duplicated work, so I figured I'd put what I have out here. Please don't take it for more than it is; doctest2 is still a work in progress (and, worse, its source code is in the middle of two feature additions!) I definitely hope you help to make the doctest world better. I think it fills a role that should be filled, and its neglect is unfortunate.

-- Devin

On Sat, Feb 18, 2012 at 10:43 AM, Devin Jeanpierre <jeanpierreda@gmail.com> wrote:
Indeed, my apologies for my earlier crankiness (I should know by now to stay away from mailing lists at crazy hours of the morning).

While it's obviously not the ideal, forking orphaned stdlib modules and publishing new versions on PyPI can be an *excellent* idea. The core development team is generally a fairly conservative bunch, so unless a module has a sufficiently active maintainer who feels entitled to make API design decisions, our default response to proposals is going to be "no". One of the *best* ways to change this is to develop a community around an enhanced version of the module - one of our reasons for switching to a DVCS for our development was to help make it easier for people to extract and merge stdlib updates while maintaining their own versions. Then, when you come to python-ideas to say "Hey, wouldn't this be a good idea?", it's possible to point to the PyPI version and say:

- people have tried this and liked it
- I've been maintaining this for a while now and would continue to do so for the standard library

Some major (current or planned) updates to the Python 3.3 standard library occurred because folks decided the stdlib solutions were not in an acceptable state and set out to improve them (specifically, the packaging package came from the distutils2 fork, which continues as a backport to earlier Python versions, and MRAB's regex module has been approved for addition, although it hasn't actually been incorporated yet). In the past, other major additions like argparse came about that way. A few other stdlib modules have backports on PyPI by their respective stdlib maintainers so we can try out new design concepts *before* committing to supporting them in the standard library.

A published version of doctest2 that was designed to be suitable for eventual incorporation back into doctest itself (i.e. by maintaining backwards compatibility) sounds like it would be quite popular, and would route around the fact that enhancing it isn't high on the priority list for the current core development team.

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Sat, Feb 18, 2012 at 8:54 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Heh, "quite popular". Whenever I mention doctest2, people think of doctest. And apparently people really dislike doctest. The way I try to address the immediate fear response is, "sure, doctest is terrible -- why do you think I'm forking it? ;)"; however, I think popularity would be difficult outside of the existing doctest user base.

P.S., some uninvited advice to would-be forkers:

- Make the starting commit of your repository identical to the original module that you're forking, to make tracking the original module easier.
- On that note, also write down the hg revision of the module that you're forking so that you can find later changes.
- Immediately change the name of your forked module so that unit tests only run against it rather than accidentally testing the original module. (Also, delete the original from your Python to be sure you edited the test cases right too. And, uh, don't forget the pyc.)

Maybe these are obvious to everyone else, but I'd never forked anything before, and so I made all those mistakes. The first dozen or two commits are full of sad things.

-- Devin

Mark Janssen wrote:
Can you give an example of how you would like this to work?
2. Smart Comparisons that will detect output of a non-ordered type (dict/set), lift and recast it and do a real comparison.
I would love to see a doctest directive that accepted differences in output order, e.g. would match {1, 2, 3} and {3, 1, 2}. But I think that's a hard problem to solve in the general case. Should it match 123 and 312? I don't think so. Just coming up with a clear and detailed set of requirements for (e.g.) #doctest:+IGNORE_ORDER may be tricky.

I'd like a #3 as well: an abbreviated way to spell doctest directives, because they invariably push my tests well past the 80 character mark.

-- Steven
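A rough sketch of how such a check could be prototyped today by subclassing doctest's public OutputChecker -- note that +IGNORE_ORDER is not a real doctest directive, and the class name here is made up:

```python
# Hypothetical order-insensitive output checker: fall back to
# comparing parsed Python literals when the textual comparison fails,
# so {1, 2, 3} matches {3, 1, 2} but "123" never matches "312".
import ast
import doctest

class UnorderedChecker(doctest.OutputChecker):
    def check_output(self, want, got, optionflags):
        # Normal textual comparison first (honors ELLIPSIS etc.).
        if super().check_output(want, got, optionflags):
            return True
        # Then try comparing by value.
        try:
            return ast.literal_eval(want.strip()) == ast.literal_eval(got.strip())
        except (ValueError, SyntaxError):
            return False
```

A checker like this can be passed to doctest.DocTestRunner via its `checker` argument; the hard part Steven identifies -- specifying exactly *when* order should be ignored -- is untouched by this sketch.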

On Feb 17, 2012, at 02:57 PM, Mark Janssen wrote:
FWIW, I think doctests are fantastic and I use them all the time. There are IMO a couple of things to keep in mind:

- doctests are documentation first. Specifically, they are testable documentation. What better way to ensure that your documentation is accurate and up-to-date? (And no, I do not generally find skew between the code and the separate-file documentation.)

- I personally dislike docstring doctests, and much prefer separate reST documents. These have several advantages, such as the ability to inject names into doctest globals (use with care though), and the ability to set up the execution context for doctests (see below). The fact that it's so easy to turn these into documentation with Sphinx is a huge win.

Since so many people point this out, let me say that I completely agree that doctests are not a *replacement* for unittests, but they are a fantastic *complement* to unittests. When I TDD, I always start writing the (testable) documentation first, because if I cannot explain the component under test in clearly intelligible English, then I probably don't really understand what it is I'm trying to write.

My doctests usually describe mostly the good path through the API. Occasionally I'll describe error modes if I think those are important for understanding how to use the code. However, for all those fuzzy corner cases, weird behaviors, bug fixes, etc., unittests are much better suited, because ensuring you've fixed these problems and don't regress in the future doesn't help the narrative very much.
1. Execution context determined by outer-scope doctest defintions.
Can you explain this one? For the separate-reST-document style I use, these are almost always driven by a test_documentation.py which ostensibly fits into the unittest framework. It searches for .rst files and builds up DocFileSuites around them. Using this style it is very easy to clean up resources, reset persistent state (e.g. reset the database after every doctest), call setUp and tearDown methods, and even correctly fiddle the __future__ state expected by doctests. I usually put all this in an additional_tests() method, such as: http://bazaar.launchpad.net/~barry/flufl.enum/trunk/view/head:/flufl/enum/te...

So setting up context is as easy as writing a setUp() method and passing that to DocFileSuite. One thing that bums me out about this is that I haven't really made the bulk of additional_tests() very generic. I usually cargo-cult most of this code into every package I write. :(
2. Smart Comparisons that will detect output of a non-ordered type (dict/set), lift and recast it and do a real comparison.
I'm of mixed mind with these. Yes, you must be careful with ordering, but I find it less readable to just sort() some dictionary output, for example. What I've found much more useful is to iterate over the sorted keys of a dictionary and print the key/value pairs. This general pattern has a few advantages, such as the ability to add some filtering to the output if you don't care about everything, and more importantly, the ability to print most string values without their u'' prefix (for better py2/py3 compatibility from the same code base without the use of 2to3).

Nested structures can be more problematic, but I've often found that as the output gets uglier, the narrative suffers, so that's a good time to re-evaluate your documentation!
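The sorted-iteration pattern Barry describes might look like this in a doctest -- the `show` helper is a made-up example:

```python
# Print dict contents in a deterministic order instead of relying on
# the dict's own (unordered) repr in the expected output.

def show(mapping):
    """Print key/value pairs in a stable, doctest-friendly order.

    >>> show({'b': 2, 'a': 1})
    a 1
    b 2
    """
    for key in sorted(mapping):
        print(key, mapping[key])
```

Because each pair gets its own line, it is also easy to filter or reformat values here without touching the expected output for every example.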
I've no doubt that doctests could be improved, but I actually find them quite usable as is, with just a little bit of glue code to get it all hooked up. As I say though, I'm biased against docstring doctests. Cheers, -Barry

On Feb 17, 2012, at 1:57 PM, Mark Janssen wrote:
I find myself wanting to use doctest for some test-driven development, and find myself slightly frustrated
ISTM that you're doing it wrong ;-) Doctests are all about testing documentation, not about unittesting. And because they are very literal (in fact, intentionally stupid with respect to whitespace), doctests are inappropriate for test-driven development. It is *much* easier to test the function by hand and then cut-and-paste the test/result pair into the docstring.

Extending the doctest module to support your style of using it would likely be counter-productive, as that would encourage more people to use the wrong tool for the job -- the doctest style is almost completely at odds with the principles of unittesting (i.e. isolated/independent tests, etc).

My clients tend to use doctests quite a bit (that is what I teach), yet the need for doctest extensions almost never arises when it is being used as designed. I suggest that you try out some other third-party testing packages that are designed to accommodate other testing styles.

Raymond

participants (8)
- Barry Warsaw
- Devin Jeanpierre
- Eric Snow
- Mark Janssen
- Nick Coghlan
- Raymond Hettinger
- Ron Adam
- Steven D'Aprano