Test Class setup and teardown in unittest

Earlier this week on the Testing In Python list, there was a discussion on how to execute a setup and/or teardown for a single test class instead of for each test fixture on the class (see the 'setUp and tearDown behavior' thread). I have had to deal with this situation myself before, and I am obviously not the only one (since I did not initiate the thread). As such, I'd like to propose adding class-level setup and teardown methods to the unittest TestCase class.

Rationale: Test cases can at times require the setup of expensive resources. This is often the case when implementing integration testing. Having to perform this setup for each fixture can be prohibitive for a large number of fixtures and/or for resources that are expensive to set up. For example, I have several hundred integration tests that hit a live database. If I create the connection object for each fixture, most of the tests fail due to the maximum connection limit being reached. As a workaround, I create a connection object once for each test case.

Without such functionality built in, the common idiom runs along these lines:

    class MyTest(TestCase):
        ClassIsSetup = False

        def setUp(self):
            if not self.ClassIsSetup:
                self.setupClass()
                self.__class__.ClassIsSetup = True

While this achieves the desired functionality, it is unclear due to the conditional setup code, and it is also error prone, as the same code segment would need to appear in every TestCase requiring the functionality. Having class-wide setup and teardown methods that implementers of test cases can override would make the code/intent clearer and alleviate the need to implement this machinery by hand when the user should be focusing on writing tests.

I emailed Michael Foord about some of his comments in the TIP thread and to ask if he would be interested in a patch adding this functionality, and I have included his response below. I would like to hear people's comments/suggestions/ideas before I start working on said patch.
Thanks, -Mark

Michael Foord's Email:
=======================================
I would certainly be interested in adding this to unittest. It needs a discussion of the API and the semantics:

* What should the methods be called? setup_class and teardown_class or setupClass and teardownClass? For consistency with existing methods the camelCase should probably be used.
* If setupClass fails, how should the error be reported? The *easiest* way is for the failure to be reported as part of the first test.
* Ditto for teardownClass - again the easiest way is for it to be reported as a failure in the last test.
* If setupClass fails, should all the tests in that class be skipped? I think yes.

Also details like ensuring that even if just a single test method from a class is run, the setupClass and teardownClass are still run. It probably needs to go to python-dev or python-ideas for discussion.

All the best, Michael

On Wed, 2010-01-20 at 19:38 -0500, Mark Roddy wrote:
I think that this is the wrong approach to the problem:
- class scope for such fixtures drives large test classes and reduces flexibility
- doing it the rough way you suggest interacts in non-obvious ways with test order randomisation
- it also interacts with concurrent testing by having shared objects (classes) hold state in a way that is not introspectable by the test running framework.

I'd much much much rather see e.g. testresources integrated into the core, allowing fixtures to be shared across test instances in a way that doesn't prohibit their use with concurrent testing and doesn't make it awkward to do it across multiple classes. I'm happy to make any [reasonable] changes (including license) to testresources to make it includable in the stdlib if that's of interest. -Rob

On Wed, Jan 20, 2010 at 8:37 PM, Robert Collins <robertc@robertcollins.net> wrote: I think the answer to this is good documentation that clearly states that abusing the functionality can lead to shooting oneself in the foot. A runner should still have the ability to keep track of whether a class's setup method has been run, but admittedly I'm not familiar with the internals of any existing randomization system, so I will look into these to see what issues would arise from this change.
A pretty good approach for a complicated setup, but in a simple case where you only need a line or two it seems like a lot of boilerplate to get the job done. Though I'll look into this library a little more, as I am not intimately familiar with it at the moment.
-Rob

On Wed, 2010-01-20 at 22:32 -0500, Mark Roddy wrote:
Yes, setUp and tearDown can be problematic as well. JUnit, which has had setUp and setUpClass for a long time, has recently (http://kentbeck.github.com/junit/doc/ReleaseNotes4.7.html) added a system called Rules which is similar to testresources (though the fine detail is different). The key elements are the same though:
- rules are their own hierarchy
- rules are added to tests, not part of the test-specific code
- the framework takes care of bringing up and taking down rules.
- doing it the rough way you suggest interacts in non obvious ways with test order randomisation
It can be made to work: the non-obviousness is a simple side effect of having dependencies between multiple tests.
I don't think shared state here is good: but setting attributes *on the class* *is* shared state: Its why I don't like setUpClass :).
If you only need a line or two, it's hard to justify setUpClass being used :). Anyhow, it's really not much boilerplate:

    class MyResource(TestResource):
        def make(self, deps):
            # line or two
            return thing

I am considering adding a helper to testresources:

    def simple_resource(maker, cleaner):
        class MyResource(TestResource):
            def make(self, deps):
                return maker(deps)
            def clean(self, resource):
                cleaner(resource)
        return MyResource

which would make the boilerplate smaller still. Cheers, Rob

On 21/01/2010 04:03, Robert Collins wrote:
Right, they *can* be problematic but they can also be extremely useful if not abused. My feeling is that the same is true of setupClass and teardownClass. Replacing them with a more complex mechanism is not necessarily a win, although a more complex mechanism for more complex use cases is justified (although it doesn't necessarily *need* to be in the standard library).
This doesn't show how testresources is used in conjunction with unittest - can you give a minimal *self contained* example please. Thanks Michael
Cheers, Rob
-- http://www.ironpythoninaction.com/ http://www.voidspace.org.uk/blog READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies (”BOGUS AGREEMENTS”) that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.

Mark Roddy <markroddy@...
Yeah, it is a lot of boilerplate when you need only a couple of lines of code. There should be some convenience APIs on top of testresources. That way, you'll get the "nice and easy" along with support for test isolation, randomization and a declarative interface for those who prefer it. jml

On 21/01/2010 01:37, Robert Collins wrote:
Agreed.
- doing it the rough way you suggest interacts in non obvious ways with test order randomisation
Not currently in unittest, so not really an issue.
Again, unittest doesn't do concurrent testing out of the box. How are shared fixtures like this not introspectable?

As I see it, the advantages of setupClass and teardownClass are:
* Simple to implement
* Simple to explain
* A commonly requested feature
* Provided by other test frameworks in and outside of Python

Disadvantages:
* Can encourage a poor style of testing, including monolithic test classes
I'd much much much rather see e.g. testresources integrated into the
How is testresources superior? Can you demonstrate this - in particular, what is the simplest example? The simplest example of setupClass would be:

    class SomeTest(unittest.TestCase):
        def setupClass(self):
            ...  # setup shared fixture
Can you demonstrate how testresources solves these problems?
That's not how contributing code to Python works. You need to provide a signed contributor agreement to the PSF and then you specifically license to the PSF any code you are contributing using one of a few specific licenses. For new modules in the standard library you also need to be willing to maintain the code. All the best, Michael Foord
-Rob

On Thu, 2010-01-21 at 11:42 +0000, Michael Foord wrote:
Done by nearly every extended UI out there. I would hope that messing them up would count as an issue.
testtools includes a concurrent TestResult which buffers activities from different executing test cases and outputs them to a single regular TestResult. Using that, run() on TestSuite can be redefined to start some threads, giving each one a helper TestResult forwarding to the caller-supplied TestResult, and finally treating self.__iter__ as a queue to dispatch to each worker thread.
How are shared fixtures like this not introspectable?
How can you tell if there is shared state when setUpClass is being used?
Mmm. It is commonly requested, but I'm not sure I entirely agree about the simple claims.
Disadvantages:
* Can encourage a poor style of testing, including monolithic test classes
* Can limit concurrency
I'd much much much rather see e.g. testresources integrated into the
How is testresources superior?
It separates concerns, which allows introspection for test runners. This is significantly more flexible, particularly when combined with test parameterisation.
^ this is buggy :) It's buggy because if it really does write to self, you now have what Twisted did, where all the test cases share one test object instance - which very frequently causes confusion and makes isolation between unrelated test state fragile. If it's not writing to self, then you don't know what class object to write to (because you might be in a subclass) - it really should be a class method. And the fact that you have this confusion, as demonstrated here, is precisely why I don't agree about easy to implement and easy to understand.
Yes. Resources are declared in a list, which allows the test framework, when parallelising, to group together tests that use the same resources when partitioning. Compare this with setUpClass, where all tests in an inheritance hierarchy have to be grouped together (after excluding your base TestCase). You ask in another mail for an SSCE: here is one attached to this mail ('foo.py'). Run with 'python -m unittest foo'.
There is other code of mine in the standard library unittest module, done without a contributor agreement, so this is at best inconsistent. I mention license changes because that would be required to include it in the standard library (it's currently under a copyleft license) - and I'm not the sole author. Regardless, whatever process and actions are needed, I will do them, or arrange with other people where needed. -Rob

On Thu, Jan 21, 2010 at 8:47 PM, Robert Collins <robertc@robertcollins.net> wrote:
FWIW I'd like to add that after a number of years of having py.test support setup_class and setup_module, I came to the conclusion some time ago that this "grouping" requirement is bad for functional tests, which usually require more setup management and parametrization than classic unit tests. It's also not easy to have expensive setup shared between test modules, and it's easy to get this wrong and in the end hinder refactorings rather than further them. At least that's my experience, and that of some of my users. It is the reason why I came up with a new method for test fixtures that allows one to directly manipulate setup state in a single place instead of all across test suites. Basically, test functions receive objects ("funcargs") that are created/handled in factories. Loose coupling here allows the same setup to be used across many test modules efficiently. It's obviously different from the testresources approach, although I share a number of related design considerations and goals. cheers, holger

On Thu, Jan 21, 2010 at 2:47 PM, Robert Collins <robertc@robertcollins.net> wrote:
Again, this should be taken into account so that it is easy to accommodate. For in-process parallelization, I think the best approach would be to make it easy to extend to add the locking that would be necessary (threaded extensions must already be doing a decent amount of locking to accomplish this). Multi-process/machine distributed testing will take some more analysis to determine how to accommodate.
Since the discussion thus far has focused almost entirely on the merits of the feature (and not on how to implement) I don't think it's fair to make an assumption either way on how easy or not it would be to implement. I already have most of the implementation mapped out in my head (of which I'm sure there will be issues to be resolved) and could have a first try patch before the end of the weekend, but I thought there should be more input before I go running off and doing so.
This is a definite upside (especially for complicated resources), but it also increases complexity, reducing the ease of understanding the test case by introducing a host of new semantics people will not (at first) be familiar with. The fact that people have frequently requested a class setup/teardown illustrates that it has a semantic they can easily understand (since it's understood without even existing at the moment). For complex resources that require complicated teardown and/or the ability to be invalidated to signal a need to recreate the resource (I believe this is referred to as being 'dirty' in the framework), the additional layer clearly outweighs the loss of ease of understanding, but for simple resources (of which I think the attached is a great example) I don't see this trade-off paying off. So I don't really see this as having to be an either/or situation. I don't think people should have to understand a whole additional API in order to achieve this behavior, and I also wouldn't advocate forcing people to fit a round peg in a square hole for extremely large resources. The way I see it, we don't have two competing implementations but two possible new features: class-oriented setup/teardown functionality (as already found in JUnit and NUnit), and a resource abstraction layer (possibly testresources itself, possibly something inspired by the JUnit Rules previously mentioned, or a combination of the two).

Mark Roddy <markroddy@...> writes:
I agree that this is a common problem, and that the Python community would benefit from a well-known, well-understood and widely applicable solution.
I'd take issue with the argument that this makes the intent clearer. In your motivating example, what you mean is "the test needs a connection", not "I want to do a set up that spans the scope of this class". A better approach would be to _declare_ the dependency in the test and let something else figure out how to best provide the dependency.

Also, setUpClass / tearDownClass is probably a bad idea. We implemented such behaviour in Twisted's testing framework some time ago, and have only recently managed to undo the damage.[1] If you do this in such a way that setUpClass methods are encouraged to set attributes on 'self', then you are compelled to share the TestCase instance between tests. This runs contrary to the design of unittest, and breaks code that relies on a test object representing a single test. It turns out that if you are building extensions to testing frameworks (like any big app does), it helps a lot to have an object per runnable test. In particular, it makes distributing tests across multiple processors & multiple computers much, much easier. It also poses a difficult challenge for test runners that provide features such as running the tests in a random order: it's very hard to know if the class is actually "done" and ready to be torn down. Finally, we found that its use often led to hard-to-debug failures due to test isolation issues.

There are already alternatives for this in the Python unit testing world. zope.testing provides a facility called 'layers' which solves this problem. I don't like it[2], but if we are talking about changing the standard library then we should consult existing practice. Another solution is testresources[3]. It takes a declarative approach and works with all the code that's already in the Python standard library.

I'm not deeply familiar with xUnit implementations in other languages, but the problem isn't unique to Python. I imagine it would be worth doing some research on what NUnit, JUnit etc. do.
Replies below. ...
In Twisted, we called them setUpClass and tearDownClass.
Of course, it actually represents a failure in all of the tests. Another way of doing it is to construct a test-like object representing the entire class and record it as a failure in that.
* Ditto for teardownClass - again the easiest way is for it to be reported as a failure in the last test
Ditto.
* If setupClass fails then should all the tests in that class be skipped? I think yes.
They should all be failed.
That's really really hard, and not a detail at all. See above. I know this is a mostly negative response, but I really do hope it helps. jml [1] http://twistedmatrix.com/trac/ticket/4175 [2] http://code.mumak.net/2009/09/layers-are-terrible.html [3] http://pypi.python.org/pypi/testresources/

On Wed, Jan 20, 2010 at 9:04 PM, Jonathan Lange <jml@mumak.net> wrote:
I like the class setup semantics as they closely follow the existing fixture setup semantics. Having to declare and configure dependencies (while possibly being clearer as far as expressing intent) introduces a completely different set of semantics that has to be grasped in order to be used. I'm going to withhold forming an opinion as to whether or not this presents any meaningful barrier to entry until after I've had a chance to review Rob's resources library.
Thanks, I was not aware of this. Do you have any references as to the particular problems it caused? The ticket only seems to describe it being removed (with much excitement :), but doesn't seem to mention the motivation.
I was actually thinking that a setupClass method would be a classmethod, since the use case is sharing across fixtures within a class. Wouldn't that be enough to cover this issue?
True, and clearly the case for highly focused unit tests. However, for integration tests (or whatever we could call tests that are explicitly designed to work with an expensive resource) it can be cost prohibitive to have this type of isolation (or flat out impossible in the case that I gave). I'll look into the implementation of some of the testing frameworks that support distributed testing and see if there isn't a way that this can be supported in both contexts (is it possible to implement in a way that the case setup would get run in each process/machine?).
My initial (off the top of my head) thinking was to count the number of test fixtures to be run via a class attribute set in the test case constructor; the test runner would then decrement this after each test completes and call the teardown method once the counter reaches zero. This doesn't seem like it would be affected by randomized test ordering, but I'll look into some existing implementations to see how this could be affected. Any obvious issues I'm missing?
Finally, we found that its use often led to hard-to-debug failures due to test isolation issues.
I think there's a distinction between "can lead to bad situations" and "encourages bad situations". The former is almost impossible to avoid without becoming Java :). The latter is much subtler, but can be addressed. Do you have any suggestions for altering the semantics to discourage abuse without reducing flexibility? With a similar feature I use, we have a rule to not use the case setup unless explicitly writing integration tests, though there is no functional way to enforce this, only communicating the idea (via documentation and code reviews).
Thanks, I will look into this and try to enumerate some pros and cons. Are there any specifics about it that you don't like?
Another solution is testresources[3]. It takes a declarative approach and works with all the code that's already in the Python standard library.
Will be looking into this as well as previously stated.
Both JUnit and Nunit have a class setup and teardown functionality: http://www.junit.org/apidocs/org/junit/BeforeClass.html http://www.junit.org/apidocs/org/junit/AfterClass.html http://www.nunit.org/index.php?p=fixtureSetup&r=2.5.3 http://www.nunit.org/index.php?p=fixtureTeardown&r=2.5.3 (The Nunit implementation I found a little confusing as they apparently refer to a TestCase as a Fixture).

On Wed, 2010-01-20 at 22:33 -0500, Mark Roddy wrote:
A data point here is that the setUp/tearDown/cleanUp semantics are not trivial themselves. Here is a self-check (check the source after coming up with answers :P):
- does tearDown run if setUp raises?
- do cleanUps?
- do cleanUps run before or after tearDown?

Also, would you add classCleanUps or something, if you add a class setUp? Clean-ups are _much_ nicer than tearDown, so I'd really want to see that.
It's been trimmed, but here was your example: "For example, I have several hundred integration tests that hit a live database. If I create the connection object for each fixture, most of the tests fail due to the maximum connection limit being reached. As a work around, I create a connection object once for each test case." Database connections can be dirtied - they need to be in a clean state, with no outstanding requests, and outside a transaction, to be valid for the start of a test. I do appreciate that if you have a buggy TCP stack, or are not closing your connections, you can certainly hit connection limits. I'd be inclined to have a pool of connections rather than one connection per class. That would, under ideal circumstances, give you one connection for an entire test run (without concurrency), but also mean that a broken connection would at most affect only one test.
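Rob's pooling suggestion could be sketched like this, using sqlite3 purely as a stand-in for an expensive connection (the pool class and names here are illustrative, not from any particular library):

```python
import sqlite3
import unittest

created = []  # tracks how many real connections were ever opened

def connect():
    # Stand-in for an expensive connection factory.
    created.append(1)
    return sqlite3.connect(":memory:")

class ConnectionPool:
    """Minimal illustrative pool: hand out idle connections, opening a
    new one only when none are available."""
    def __init__(self, factory):
        self._factory = factory
        self._idle = []

    def acquire(self):
        return self._idle.pop() if self._idle else self._factory()

    def release(self, conn):
        # A real pool would verify the connection is clean (no open
        # transaction, no pending requests) before reusing it.
        self._idle.append(conn)

POOL = ConnectionPool(connect)

class PooledTest(unittest.TestCase):
    def setUp(self):
        self.conn = POOL.acquire()

    def tearDown(self):
        POOL.release(self.conn)

    def test_query(self):
        self.assertEqual(self.conn.execute("SELECT 1").fetchone()[0], 1)

# Two full runs of the test class reuse a single underlying connection.
for _ in range(2):
    result = unittest.TestResult()
    unittest.defaultTestLoader.loadTestsFromTestCase(PooledTest).run(result)
    assert result.wasSuccessful()
```

Under ideal circumstances this gives one connection for the whole run, and a broken connection could be discarded in release() so it affects at most one test.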
It's about partitioning the state: e.g. (ugly) deep-copy the class object per thread.
Well, this has a few undesirable facets:
- it's complex - you'll need to honour tests that don't have that attribute set, and tests that already have that attribute set
- it breaks the abstraction of 'a test knows how to run itself'
- you can't set that attribute in test case constructors, because tests can be constructed while the suite is running
- using a global like that will mean that a single class's class-setup stuff would *stay set up* while other classes run, with a randomised order: that's bad because things that are expensive and need to be optimised also tend to be *large*, either in memory or in external resources (like TCP connections in your example).
Finally, we found that its use often led to hard-to-debug failures due to test isolation issues.
Have you seen Rusty Russell's interface usability scale? Having to have a rule strongly suggests that the API is prone to misuse :).
See my other mail for a link to JUnit's Rules. Cheers, Rob

Mark Roddy <markroddy@gmail.com> writes:
I think this is a great idea, I ran into this problem too.
No strong feelings about this, I'm happy as long as some solution gets implemented. -Nikolaus -- »Time flies like an arrow, fruit flies like a Banana.« PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C

Ignoring many of the finer points brought up here, and putting practicality before purity, I think having setUpClass and tearDownClass methods is a great idea. While we're at it I would also recommend adding module-level setUp and tearDown functions -- Google's extension of pyunit implements these and they are often handy for a variety of use cases. -- --Guido van Rossum (python.org/~guido)
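A sketch of what module-level fixtures look like under the setUpModule/tearDownModule spelling (the names nose used and unittest itself later adopted in 2.7; the hooks below only fire when a conforming runner handles the module):

```python
import unittest

module_events = []

# Hypothetical module-level hooks: run once before the first test in
# the module and once after the last, by runners that support them.
def setUpModule():
    module_events.append("start shared server, open pool, etc.")

def tearDownModule():
    module_events.append("release the shared resources")

class FirstTests(unittest.TestCase):
    def test_one(self):
        self.assertTrue(True)

class SecondTests(unittest.TestCase):
    def test_two(self):
        self.assertTrue(True)
```

Run with a module-fixture-aware runner, both test classes would share whatever state setUpModule builds, with tearDownModule guaranteed to run once at the end of the module.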

On Thu, Jan 21, 2010 at 6:09 PM, Guido van Rossum <guido@python.org> wrote:
If going for that, I'd rather like to see those named setup_class/teardown_class and setup_module/teardown_module, as py.test and nose have done for a long time now. When I first went for those I actually did so because I wanted to follow PEP 8... But the stronger argument now is that it would be cool to see some tool/approach convergence, and unittest is the new kid on the setup block there :) cheers, holger

On Thu, Jan 21, 2010 at 12:18 PM, Holger Krekel <holger.krekel@gmail.com> wrote:
Even PEP 8 admits that consistency within a module trumps any global style requirements. It's already setUp and tearDown, and any new methods added should follow that camel-case convention. -- --Guido van Rossum (python.org/~guido)

On Thu, Jan 21, 2010 at 10:24 PM, Guido van Rossum <guido@python.org> wrote:
I see the module consistency argument. It'd still confuse a lot of existing test suites and test writers out there who already use the idiom. The PEP 8 mention was for reference that the naming wasn't chosen arbitrarily. Maybe a more fundamental question is if/how we want to work towards Python test tool convergence on concepts and naming. best, holger

On 21/01/2010 19:12, Brett Cannon wrote:
unittest creates one instance *per test method*. That way tests are isolated from each other in separate instances. __init__ and __del__ are (or would be) run per test method. Michael

On Thu, Jan 21, 2010 at 2:15 PM, Michael Foord <fuzzyman@voidspace.org.uk> wrote:
Also, if you did a teardown (of any type/scope) in __del__ you wouldn't be able to report errors occurring in the method, as __del__ isn't called until the object is no longer referenced (possibly not until the interpreter shuts down). -Mark

On Thu, Jan 21, 2010 at 8:12 PM, Brett Cannon <brett@python.org> wrote:
Test resource finalization needs strict semantics and predictable timing, I'd say - and '__del__' at least doesn't provide that on some Python interpreters. Moreover, unittest (as well as py.test and other test runners) instantiates a fresh instance for each test method execution to prevent hidden dependencies between tests. cheers, holger

On a slight tangent, I'd be happier if we could move away from the setUp/tearDown model and move towards __enter__/__exit__. The existing convention comes from a time before the with statement - extending it further doesn't seem like it would be taking things in the right direction. By allowing test suites to identify context managers that are invoked by the test framework at various points, it allows the setup/teardown code to be cleanly separated from the test classes themselves. E.g. the test framework might promise to do the following for each test:

    with module_cm(test_instance):  # However identified
        with class_cm(test_instance):  # However identified
            with test_instance:  # **
                test_instance.test_method()

(** Assuming the addition of "def __enter__(self): self.setUp()" and "def __exit__(self, *args): self.tearDown()" to unittest.TestCase for backwards compatibility)

Caching of expensive state on the module and class context managers would then be the prerogative of those context managers rather than the responsibility of the test cases themselves.

Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Wed, 2010-01-20 at 19:38 -0500, Mark Roddy wrote:
I think that this is the wrong approach to the problem: - class scope for such fixtures drives large test classes, reduces flexability - doing it the rough way you suggest interacts in non obvious ways with test order randomisation - it also interacts with concurrent testing by having shared objects (classes) hold state in a way that is not introspectable by the test running framework. I'd much much much rather see e.g. testresources integrated into the core allowing fixtures to be shared across test instances in a way that doesn't prohibit their use with concurrent testing, doesn't make it awkward to do it across multiple classes. I'm happy to make any [reasonable] changes (including license) to testresources to make it includable in the stdlib if that's of interest. -Rob

On Wed, Jan 20, 2010 at 8:37 PM, Robert Collins <robertc@robertcollins.net> wrote: think the answer to this is good documentation that clearly states that abusing the functionality can lead to shooting oneself in the foot. the ability to keep track of whether a class's setup method hasn't been implemented, but imitadly I'm not familiar with the internals of any existing randomization system so I will look into these to see what issues that would arise from this change.
A pretty good approach for a complicated setup, but in a simple case where you only need a line or two it seems like a lot of boiler plate to get the job done. Though I'll look into this library a little more as I am not intimately familiar with it at the moment.
-Rob

On Wed, 2010-01-20 at 22:32 -0500, Mark Roddy wrote:
Yes, setUp and tearDown can be problematic as well. Junit which has has setUp and setUpClass a long time has recently (http://kentbeck.github.com/junit/doc/ReleaseNotes4.7.html) added a system called Rules which is similar to testresources (though the fine detail is different). The key elements are the same though: - rules are their own hierarchy - rules are added to tests not part of the test specific code - framework takes care of bringing up and taking down rules.
- doing it the rough way you suggest interacts in non obvious ways with test order randomisation
It can be made to work: the non-obviousness is a simple side effect of having dependencies between multiple tests.
I don't think shared state here is good: but setting attributes *on the class* *is* shared state: Its why I don't like setUpClass :).
If you only need a line or two, its hard to justify setUpClass being used :). Anyhow, its really not much boilerplate: class MyResource(TestResource): def make(self, deps): # line or two return thing I am considering adding a helper to testresources: def simple_resource(maker, cleaner): class MyResource(TestResource): def make(self, deps): return maker(deps) def clean(self, resource): cleaner(resource) return MyResource which would make the boilerplate smaller still. Cheers, Rob

On 21/01/2010 04:03, Robert Collins wrote:
Right, they *can* be problematic but they can also be extremely useful if not abused. My feeling is that the same is true of setupClass and teardownClass. Replacing them with a more complex mechanism is not necessarily a win, although a more complex mechanism for more complex use cases is justified (although it doesn't necessarily *need* to be in the standard library).
This doesn't show how testresources is used in conjunction with unittest - can you give a minimal *self-contained* example please? Thanks Michael
Cheers, Rob
-- http://www.ironpythoninaction.com/ http://www.voidspace.org.uk/blog READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies (”BOGUS AGREEMENTS”) that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.

Mark Roddy <markroddy@...
Yeah, it is a lot of boilerplate when you need only a couple of lines of code. There should be some convenience APIs on top of testresources. That way, you'll get the "nice and easy" along with support for test isolation, randomization and a declarative interface for those who prefer it. jml

On 21/01/2010 01:37, Robert Collins wrote:
Agreed.
- doing it the rough way you suggest interacts in non obvious ways with test order randomisation
Not currently in unittest, so not really an issue.
Again, unittest doesn't do concurrent testing out of the box. How are shared fixtures like this not introspectable? As I see it:

Advantages of setupClass and teardownClass:
* Simple to implement
* Simple to explain
* A commonly requested feature
* Provided by other test frameworks in and outside of Python

Disadvantages:
* Can encourage a poor style of testing, including monolithic test classes
I'd much much much rather see e.g. testresources integrated into the
How is testresources superior? Can you demonstrate this - in particular what is the simplest example? The simplest example of setupClass would be:

    class SomeTest(unittest.TestCase):
        def setupClass(self):
            ...  # setup shared fixture
Can you demonstrate how testresources solves these problems.
That's not how contributing code to Python works. You need to provide a signed contributor agreement to the PSF and then you specifically license to the PSF any code you are contributing using one of a few specific licenses. For new modules in the standard library you also need to be willing to maintain the code. All the best, Michael Foord
-Rob

On Thu, 2010-01-21 at 11:42 +0000, Michael Foord wrote:
Done by nearly every extended UI out there. I would hope that messing them up would count as an issue.
testtools includes a concurrent TestResult which buffers activity from different executing test cases and outputs it to a single regular TestResult. Using that, run() on TestSuite can be redefined to start some threads, giving each one a helper TestResult forwarding to the caller-supplied TestResult, and finally treating self.__iter__ as a queue to dispatch to each worker thread.
How are shared fixtures like this not introspectable?
How can you tell if there is shared state when setUpClass is being used?
Mmm. It is commonly requested, but I'm not sure I entirely agree about the simple claims.
Disadvantages:
* Can encourage a poor style of testing, including monolithic test classes
* Can limit concurrency
I'd much much much rather see e.g. testresources integrated into the
How is testresources superior?
It separates concerns, which allows introspection for test runners. This is significantly more flexible, particularly when combined with test parameterisation.
^ this is buggy :) It's buggy because if it really does write to self, you now have what Twisted did, where all the test cases share one test object instance - which very frequently confuses, and makes isolation between unrelated test state fragile. If it's not writing to self, then you don't know what class object to write to (because you might be in a subclass) - it really should be a class method. And the fact that you have this confusion, as demonstrated here, is precisely why I don't agree about easy to implement and easy to understand.
Yes. Resources are declared in a list, which allows the test framework, when parallelising, to group together tests that use the same resources when partitioning. Compare this with setUpClass, where all tests in an inheritance hierarchy have to be grouped together (after excluding your base TestCase). You ask in another mail for an SSCE: here is one attached to this mail ('foo.py'). Run with 'python -m unittest foo'.
There is other code of mine in the standard library unittest module, done without a contributor agreement: so this is at best inconsistent. I mention license changes because that would be required to include it in the standard library (it's currently under a copyleft license) - and I'm not the sole author. Regardless, whatever process and actions are needed, I will do them, or arrange with other people where needed. -Rob

On Thu, Jan 21, 2010 at 8:47 PM, Robert Collins <robertc@robertcollins.net> wrote:
FWIW I'd like to add that after a number of years of having py.test support setup_class and setup_module, I came to the conclusion some time ago that this "grouping" requirement is bad for functional tests, which usually require more setup management and parametrization than classic unit tests. It's also not easy to share expensive setup between test modules, it's easy to get this wrong, and in the end it hinders refactoring rather than furthering it. At least that's my and some of my users' experience with it, and it is the reason why I came up with a new method for test fixtures that allows setup state to be manipulated directly in a single place instead of all across test suites. Basically, test functions receive objects ("funcargs") that are created/handled in factories. Loose coupling here allows the same setup to be used efficiently across many test modules. Obviously different from the testresources approach, although I share a number of related design considerations and goals. cheers, holger

On Thu, Jan 21, 2010 at 2:47 PM, Robert Collins <robertc@robertcollins.net> wrote:
Again, this should be taken into account so that it is easy to accommodate. For in-process parallelization, I think the best approach would be to make it easy to add the locking that would be necessary (threaded extensions must already be doing a decent amount of locking to accomplish this). Multi-process/machine distributed testing will take some more analysis to determine how to accommodate.
Since the discussion thus far has focused almost entirely on the merits of the feature (and not on how to implement it), I don't think it's fair to make an assumption either way about how easy or not it would be to implement. I already have most of the implementation mapped out in my head (of which I'm sure there will be issues to be resolved) and could have a first-try patch before the end of the weekend, but I thought there should be more input before I go running off and doing so.
This is a definite upside (especially for complicated resources), but it also increases complexity, reducing the ease of understanding the test case by introducing a host of new semantics people will not (at first) be familiar with. The fact that people have frequently requested class setup/teardown illustrates that it has a semantic they can easily understand (since it's understood without even existing at the moment). For complex resources that require complicated teardown and/or the ability to be invalidated to signal a need to recreate the resource (I believe this is referred to as being 'dirty' in the framework), the additional layer clearly outweighs the loss of ease of understanding, but for simple resources (of which I think the attached is a great example) I don't see this trade-off paying off. So I don't really see this as an either/or situation. I don't think people should have to understand a whole additional API in order to achieve this behavior, and I also wouldn't advocate forcing people to fit a round peg in a square hole for extremely large resources. The way I see it, we don't have two competing implementations, but two possible new features: class-oriented setup/teardown functionality (as already found in JUnit and NUnit), and a resource abstraction layer (possibly testresources itself, possibly something inspired by the JUnit Rules previously mentioned, or a combination of the two).

Mark Roddy <markroddy@...> writes:
I agree that this is a common problem, and that the Python community would benefit from a well-known, well-understood and widely applicable solution.
I'd take issue with the argument that this makes the intent clearer. In your motivating example, what you mean is "the test needs a connection" not "I want to do a set up that spans the scope of this class". A better approach would be to _declare_ the dependency in the test and let something else figure out how to best provide the dependency. Also, setUpClass / tearDownClass is probably a bad idea. We implemented such behaviour in Twisted's testing framework some time ago, and have only recently managed to undo the damage.[1] If you do this in such a way that setUpClass methods are encouraged to set attributes on 'self', then you are compelled to share the TestCase instance between tests. This runs contrary to the design of unittest, and breaks code that relies on a test object represent a single test. It turns out that if you are building extensions to testing frameworks (like any big app does), it helps a lot to have an object per runnable test. In particular, it makes distributing tests across multiple processors & multiple computers much, much easier. It also poses a difficult challenge for test runners that provide features such as running the tests in a random order. It's very hard to know if the class is actually "done" and ready to be torn down. Finally, we found that it's use often lead to hard-to-debug failures due to test isolation issues. There are already alternatives for this in the Python unit testing world. zope.testing provides a facility called 'layers' which solves this problem. I don't like it[2], but if we are talking about changing the standard library then we should consult existing practice. Another solution is testresources[3]. It takes a declarative approach and works with all the code that's already in the Python standard library. I'm not deeply familiar with xUnit implementations in other languages, but the problem isn't unique to Python. I imagine it would be worth doing some research on what Nunit, JUnit etc do.
Replies below. ...
In Twisted, we called them setUpClass and tearDownClass.
Of course, it actually represents a failure in all of the tests. Another way of doing it is to construct a test-like object representing the entire class and record it as a failure in that.
* Ditto for teardownClass - again the easiest way is for it to be reported as a failure in the last test
Ditto.
* If setupClass fails then should all the tests in that class be skipped? I think yes.
They should all be failed.
That's really really hard, and not a detail at all. See above. I know this is a mostly negative response, but I really do hope it helps. jml [1] http://twistedmatrix.com/trac/ticket/4175 [2] http://code.mumak.net/2009/09/layers-are-terrible.html [3] http://pypi.python.org/pypi/testresources/

On Wed, Jan 20, 2010 at 9:04 PM, Jonathan Lange <jml@mumak.net> wrote:
I like the class setup semantics as they closely follow the existing fixture setup semantics. Having to declare and configure dependencies (while possibly being clearer as far as expressing intent) introduces a completely different set of semantics that have to be grasped in order to be used. I'm going to withhold forming an opinion as to whether or not this presents any meaningful barrier to entry until after I've had a chance to review Rob's resources library.
Thanks, I was not aware of this. Do you have any references as to the particular problems it caused? The ticket only seems to describe it being removed (with much excitement :), but doesn't seem to mention the motivation.
I was actually thinking that a setupClass method would be a classmethod, since the use case is sharing across fixtures within a class. Wouldn't that be enough to cover this issue?
True, and clearly the case for highly focused unit tests. However, for integration tests (or whatever we could call tests that are explicitly designed to work with an expensive resource) it can be cost prohibitive to have this type of isolation (or flat out impossible in the case that I gave). I'll look into the implementation of some of the testing frameworks that support distributed testing and see if there isn't a way that this can be supported in both contexts (is it possible to implement in a way that the case setup would get run in each process/machine?).
My initial (off the top of my head) thinking was to count the number of test fixtures to be run via a class attribute set in the test case constructor; the test runner would then decrement this after each test completes and call the teardown method once the counter reached zero. This doesn't seem like it would be affected by randomized test ordering, but I'll look into some existing implementations to see how this could be affected. Any obvious issues I'm missing?
Finally, we found that its use often led to hard-to-debug failures due to test isolation issues.
I think there's a distinction between "can lead to bad situations" and "encourages bad situations". The former is almost impossible to avoid without becoming Java :). The latter is much subtler, but can be addressed. Do you have any suggestions for altering the semantics to discourage abuse without reducing flexibility? With a similar feature I use, we have a rule not to use the class setup unless explicitly writing integration tests, though there is no functional way to enforce this, only communicating the idea (via documentation and code reviews).
Thanks, I will look into this and try to enumerate some pros and cons. Are there any specifics about it that you don't like?
Another solution is testresources[3]. It takes a declarative approach and works with all the code that's already in the Python standard library.
Will be looking into this as well as previously stated.
Both JUnit and NUnit have class setup and teardown functionality: http://www.junit.org/apidocs/org/junit/BeforeClass.html http://www.junit.org/apidocs/org/junit/AfterClass.html http://www.nunit.org/index.php?p=fixtureSetup&r=2.5.3 http://www.nunit.org/index.php?p=fixtureTeardown&r=2.5.3 (The NUnit implementation I found a little confusing, as they apparently refer to a TestCase as a Fixture).

On Wed, 2010-01-20 at 22:33 -0500, Mark Roddy wrote:
A data point here is that the setUp/tearDown/cleanUp semantics are not trivial themselves. Here is a self-check (check the source after coming up with answers :P):
- does tearDown run if setUp raises?
- do cleanUps?
- do cleanUps run before or after tearDown?
Also, would you add classCleanUps or something, if you add a class setUp? Clean-ups are _much_ nicer than tearDown, so I'd really want to see that.
It's been trimmed, but here was your example: "For example, I have several hundred integration tests that hit a live database. If I create the connection object for each fixture, most of the tests fail due to the maximum connection limit being reached. As a work around, I create a connection object once for each test case." Database connections can be dirtied - they need to be in a clean state with no outstanding requests, and outside a transaction, to be valid for the start of a test. I do appreciate that if you have a buggy TCP stack, or are not closing your connections, you can certainly hit connection limits. I'd be inclined to have a pool of connections, rather than one connection per class. That would, under ideal circumstances, give you one connection for an entire test run (without concurrency), but also mean that a broken connection would at most affect only one test.
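The pooling suggestion could be sketched like this. FakeConnection and its closed flag stand in for a real driver's connection object and health check; they are assumptions for illustration, not any particular database API:

```python
import unittest
from collections import deque

class ConnectionPool:
    """Hand out connections for reuse, discarding ones that went bad."""
    def __init__(self, factory):
        self._factory = factory  # callable that creates a new connection
        self._idle = deque()

    def acquire(self):
        # Reuse an idle connection if it is still healthy; otherwise
        # drop it and fall through to creating a fresh one.
        while self._idle:
            conn = self._idle.popleft()
            if not conn.closed:
                return conn
        return self._factory()

    def release(self, conn):
        self._idle.append(conn)

class FakeConnection:
    """Stand-in for a real database connection."""
    def __init__(self):
        self.closed = False

pool = ConnectionPool(FakeConnection)

class PooledTest(unittest.TestCase):
    # Each test borrows a connection in setUp and returns it in
    # tearDown, so a whole sequential run shares one connection
    # without any class-level shared state.
    def setUp(self):
        self.conn = pool.acquire()

    def tearDown(self):
        pool.release(self.conn)
```

An ideal run reuses a single connection throughout, and a connection that breaks mid-test is simply not handed out again, affecting only the test that held it.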
It's about partitioning the state: e.g. (ugly) deep-copying the class object per thread.
Well, this has a few undesirable facets:
- it's complex - you'll need to honour tests that don't have that attribute set - and tests that already have that attribute set
- it breaks the abstraction of 'a test knows how to run itself'
- you can't set that attribute in test case constructors, because tests can be constructed while the suite is running
- using a global like that will mean that a single class's setup stuff would *stay set up* while other classes run, with a randomised order: that's bad because things that are expensive and need to be optimised also tend to be *large*, either in memory or external resources (like TCP connections in your example).
Finally, we found that its use often led to hard-to-debug failures due to test isolation issues.
Have you seen Rusty Russell's interface usability scale? Having to have a rule strongly suggests that the API is prone to misuse :).
See my other mail for a link to JUnit's Rules. Cheers, Rob

Mark Roddy <markroddy@gmail.com> writes:
I think this is a great idea, I ran into this problem too.
No strong feelings about this, I'm happy as long as some solution gets implemented. -Nikolaus -- »Time flies like an arrow, fruit flies like a Banana.« PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C

Ignoring many of the finer points brought up here, and putting practicality before purity, I think having setUpClass and tearDownClass methods is a great idea. While we're at it I would also recommend adding module-level setUp and tearDown functions -- Google's extension of pyunit implements these and they are often handy for a variety of use cases. -- --Guido van Rossum (python.org/~guido)

On Thu, Jan 21, 2010 at 6:09 PM, Guido van Rossum <guido@python.org> wrote:
If going for that I'd rather like to see those named setup_class/teardown_class and setup_module/teardown_module, as py.test and nose have done for a long time now. When I first went for those I actually did so because I wanted to follow PEP 8... But the stronger argument now is that it would be cool to see some tool/approach convergence, and unittest is the new kid on the setup block there :) cheers, holger

On Thu, Jan 21, 2010 at 12:18 PM, Holger Krekel <holger.krekel@gmail.com> wrote:
Even PEP 8 admits that consistency within a module trumps any global style requirement. It's already setUp and tearDown, and any new methods added should follow that camelCase convention. -- --Guido van Rossum (python.org/~guido)

On Thu, Jan 21, 2010 at 10:24 PM, Guido van Rossum <guido@python.org> wrote:
I see the module consistency argument. It'd still confuse a lot of existing test suites and test writers out there who already use the idiom. The PEP 8 mention was to note that the naming wasn't chosen arbitrarily. A more fundamental question maybe is if/how we want to work towards Python test tool convergence on concepts and naming. best, holger

On 21/01/2010 19:12, Brett Cannon wrote:
unittest creates one instance *per test method*. That way tests are isolated from each other in separate instances. __init__ and __del__ are (or would be) run per test method. Michael

On Thu, Jan 21, 2010 at 2:15 PM, Michael Foord <fuzzyman@voidspace.org.uk> wrote:
Also, if you did a teardown (of any type/scope) in __del__ you wouldn't be able to report errors occurring in the method, as __del__ isn't called until the object is no longer referenced (possibly not until the interpreter shuts down). -Mark

On Thu, Jan 21, 2010 at 8:12 PM, Brett Cannon <brett@python.org> wrote:
Test resource finalization needs strict semantics and predictable timing, I'd say - and '__del__' doesn't provide that on some Python interpreters. Moreover, unittest (as well as py.test and other test runners) instantiates a fresh instance for each test method execution to prevent hidden dependencies between tests. cheers, holger

On a slight tangent, I'd be happier if we could move away from the setUp/tearDown model and move towards __enter__/__exit__. The existing convention comes from a time before the with statement - extending it further doesn't seem like it would be taking things in the right direction. By allowing test suites to identify context managers that are invoked by the test framework at various points, it allows the setup/teardown code to be cleanly separated from the test classes themselves. E.g. the test framework might promise to do the following for each test:

    with module_cm(test_instance):     # However identified
        with class_cm(test_instance):  # However identified
            with test_instance:        # **
                test_instance.test_method()

(** Assuming the addition of "def __enter__(self): self.setUp()" and "def __exit__(self, *args): self.tearDown()" to unittest.TestCase for backwards compatibility)

Caching of expensive state on the module and class context managers would then be the prerogative of those context managers rather than the responsibility of the test cases themselves. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
participants (9)
- Brett Cannon
- Guido van Rossum
- Holger Krekel
- Jonathan Lange
- Mark Roddy
- Michael Foord
- Nick Coghlan
- Nikolaus Rath
- Robert Collins