[py-dev] utest thoughts

Ian Bicking ianb at colorstudy.com
Sun Oct 3 21:07:17 CEST 2004


holger krekel wrote:
>>Could this be like:
>>
>>data = [(1, 'one'), (2, 'two'), ...]
>>def test_collector():
>>    for args in data:
>>        # Depending on when tests are run, I think leaving out args=args
>>        # could silently make all your tests use the same data :(
>>        yield lambda args=args: test_converter(*args)
>># Should we then do test_collector = test_collector()?
> 
> 
> yes, though 
> 
> a) the collection function should not have a 'test' pre- or postfix 
>    i think.  

Sure.

> b) collectors always yield collectors or Units, not plain functions 
>    so you would actually do 
> 
>       yield py.test.Unit(my_custom_test_function, *args)
> 
> In the current py code (see my other postings) you'll find 
> in py/test/test/data/Collector.py the current way to do custom 
> collectors.  This Collector class is instantiated with 
> the module's location and takes over the collecting 
> process for the module.  No further attempt is made 
> to collect anything from a module which contains a 
> custom "Collector" object. 

I have a problem with APIs that jump in complexity.  In this case you 
can forget about Unit until you start needing to generate multiple tests 
programmatically -- your functions *were* your units until that point. 
If functions are units in one place, they should be units everyplace.
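The late-binding pitfall flagged in the quoted collector snippet is easy to 
demonstrate in isolation.  A minimal, self-contained sketch (data and 
convert are made-up stand-ins, not py.test API):

```python
# Late-binding pitfall: without args=args, every lambda closes over the
# loop variable and sees whatever value it holds when the test finally runs.
data = [(1, 'one'), (2, 'two')]

def convert(n):
    # stand-in for the converter under test
    return {1: 'one', 2: 'two'}[n]

def broken_collector():
    for args in data:
        yield lambda: (convert(args[0]), args[1])            # all share one 'args'

def fixed_collector():
    for args in data:
        yield lambda args=args: (convert(args[0]), args[1])  # bound per test

broken = [test() for test in list(broken_collector())]  # every test sees (2, 'two')
fixed = [test() for test in list(fixed_collector())]    # each test keeps its own args
```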

This is where adaptation is a potentially useful idea.  Instead of 
asking for a specific interface for any particular object, you are only 
asking for an object that can be adapted to a particular interface.  You 
do this adaptation at the last possible moment, so that something like a 
custom collector doesn't produce Units, but merely produces something 
that can be turned into a Unit (or rather, turned into something that 
supports the IUnit interface).

Doing it this way, functions really *are* units, for all purposes. 
Also, you've introduced a decoupled abstraction.  The tests can be any 
kind of object, whatever makes sense for the project -- maybe functions, 
maybe doctests, maybe classes, and so on.  The runner iterates over 
units.  The collector is an adapter from packages to units, and 
recursively it adapts from modules to units, from functions to units, 
etc.  But with adaptation the collector isn't something that the runner 
starts up, rather it is a registered service.

So customization might simply mean that the runner starts by loading 
__init__, or test_init, or something like that.  That module can 
register adapters which are more specialized than the standard adapters, 
or override the standard adapters for a subset of the package.  (How to 
get them to apply to a subset?  I'm not sure, that's not a metaphor I've 
seen with adaptation, though maybe multiple-value adapters apply somehow 
-- I'm still new at using adaptation.)

>>In the case I'm thinking of, where I run the identical set of tests on 
>>different backends, it should be available as a command-line argument. 
>>But if there's a generic command-line argument (like -D) then that could 
>>be used to set arbitrary options (assuming the config file can accept 
>>arbitrary options).
> 
> 
> you already took '-D' for the pdb debugging but I guess we don't need
> "--pdb" as a short option. I took the freedom to rename it from 
> "--usedpdb" by the way. 

Sure.  I just took the option name from Zope, but I don't really care 
what it is.

> Introducing "-D" for passing options/backends to the tests 
> still requires more careful thoughts about test configuration
> (files) in general. I think it should be possible to have 
> test-configuration files per directory, which might modify 
> the collection process and deal with configuration data ("-D") 
> issues. They _may_ also provide a different runner if it 
> turns out to make sense.  Again please note, that py.test's 
> "runner" has simpler responsibilities than unittest.py - based 
> runners. We have the runner, the collectors and the reporter. 

I know, I think I say "runner" to mean "the whole process".

If we are assuming that the tests are run in serial, configuration could 
be done through an initialization hook.  So you just put something in like:

# in test_init.py...
import py.test
def test_setup():
    py.test.options.oldusepdb = py.test.options.usepdb
    py.test.options.usepdb = True

def test_teardown():
    py.test.options.usepdb = py.test.options.oldusepdb
    del py.test.options.oldusepdb


Kind of lame, but maybe functional.

>>>Note, that it should always be possible to run tests of any
>>>application with py-tests by invoking 'py.test APP-DIRECTORY' or 
>>>simply 'py.test' while being in the app directory.  
>>
>>What about "python setup.py test" ?  This would allow for a common way 
>>to invoke tests, regardless of runner; people could use this for their 
>>unittest-based tests as well, or whatever they are using.
> 
> 
> I think it's easier to say 
> 
>     py.test APP-DIRECTORY 
> 
> and i don't want to deal with distutils-hacks if it can be avoided ... 

My only concern is that py.test not be too novel; making test creation 
accessible is a big motivation for me.  I like your idea of encouraging 
bug reports to be done by committing a failing test.  Since other people 
are going to be running their tests using unittest, or unittest with 
custom runners, setup.py is a way to provide a common entry point across 
all Python projects.

>>I think I tried this at one point, but then got bored of trying to 
>>figure out the distutils just to add this one little command.
> 
> 
> ... we seem to share the same view here :-) 
>  
> 
>>In Zope3 they explicitly add doctests to the testing, it isn't just 
>>automatically picked up.
> 
> 
> Argh! I have to say i dislike writing repetitive unnecessary
> "manually-synced" boilerplate code for tests, be they unit,
> doc or any other kind of test. 

Yeah... but it's not *that* bad.  It would probably look like:

from py.test import make_doctest # <-- not a great name
import somemodule

doctest_collector = make_doctest(somemodule)
# or...
doctest_collector = [make_doctest(somemodule.SomeClass),
                      make_doctest(somemodule.SomeClass2)]


Maybe it's not that big a deal, because practically every module is 
going to be imported anyway.  I don't know how much time it takes to 
parse docstrings; heck, you could cache that parsing too, like .pyc 
caches the compilation.  Or you could look for the __test__ variable 
that doctest uses; then you'd have to use __test__ = {} when you didn't 
have any additional tests, but that's relatively little boilerplate... 
though that's probably not a good compromise.
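For what it's worth, the stdlib doctest module already exposes the 
collection machinery, so a make_doctest along these lines could be quite 
thin.  DocTestFinder and DocTestRunner are real doctest classes; 
make_doctest and run_doctests are just the hypothetical names from above:

```python
import doctest
import types

def make_doctest(obj):
    # Collect the doctests attached to a module, class or function.
    return doctest.DocTestFinder(exclude_empty=True).find(obj)

def run_doctests(obj):
    # Run the collected doctests and report (failures, attempts).
    runner = doctest.DocTestRunner(verbose=False)
    for test in make_doctest(obj):
        runner.run(test)
    return runner.failures, runner.tries

# a throwaway module whose docstring contains one passing example
sample = types.ModuleType('sample')
sample.__doc__ = ">>> 1 + 1\n2\n"
```

Caching the parse, as suggested above, would then just be a matter of 
memoizing make_doctest.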

>>In part because the doctests are typically in modules that
>>aren't otherwise inspected for tests (they are inline with
>>the normal code, not in separate test_* modules).  I think
>>there may be a performance issue with inspecting all modules
>>for doctests, and potentially an issue of finding things
>>that look like tests but aren't (though that probably isn't
>>a big problem, since there's ways to exclude docstrings from
>>doctest).
> 
> 
> Hum, a performance problem.  I think importing all test and implementation 
> code of the py lib requires 2 seconds on a file system. As 
> the collection happens iteratively, these two seconds are distributed 
> across the whole testing time.  With "Test-Session" modes it 
> will be less. 
> 
> But if there is a performance problem we may think of ways 
> to exclude files from looking for doctests by means of the
> per-directory py.test file. 
>  
> 
>>>>* Different levels of tests (-a --at-level or --all; default level is 1, 
>>>>which doesn't run all tests).  They have lots of tests, so I'm guessing 
>>>>they like to avoid running tests which are unlikely to fail.
>>>
>>>having different levels of tests seems interesting. I'd like
>>>more of a keyword based approach where all tests are
>>>associated with some keywords and you can select tests by
>>>providing keywords.  Keywords could be automatically
>>>associated from filename and python name components e.g.
>>>['std', 'path', 'local', 'test_listdir'].  You could
>>>additionally associate a 'slow' keyword to some tests (somehow) 
>>>and start 'py.test --exclude="slow"' or put this as a default in
>>>the configuration file. 
>>
>>That would work well, I think.  How might tests be annotated?  On a 
>>module-by-module basis, using some particular symbol 
>>(__test_keywords__)?  Function attributes?  On an ad hoc basis by 
>>customizing the collector?
> 
> 
> It has to be possible to add keywords (i think we should
> only ever deal with adding keywords) on the module, class and 
> method level.  Finding syntax for the module level and the 
> class level boils down to thinking about a good name which 
> lists the additional keywords. 
> 
> For the method level we would have to worry about syntax 
> so i'd like to avoid it altogether.   

Why not just attributes for all of these, e.g.:

# module
__test_keywords__ = ['bug001', 'pkg1']

def test_speed(): ...
test_speed.__test_keywords__ = ['profile']

class SpeedTester:
     __test_keywords__ = ['profile']
     def test_remote(self): ...
     test_remote.__test_keywords__ = ['urllib']


And so on.  Since attributes can be added to anything, it provides a 
pretty easy interface.
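A collector could gather and match those attributes with very little 
machinery.  A sketch, where keywords_of and select are invented names 
and the include/exclude semantics are just one plausible reading of the 
proposed command-line options:

```python
import types

def keywords_of(module, cls, func):
    # Union of the keywords declared at module, class and function level.
    keywords = set()
    for obj in (module, cls, func):
        if obj is not None:
            keywords.update(getattr(obj, '__test_keywords__', ()))
    return keywords

def select(tests, include=(), exclude=()):
    # tests is an iterable of (module, class-or-None, function) triples.
    for module, cls, func in tests:
        keywords = keywords_of(module, cls, func)
        if include and not keywords.intersection(include):
            continue
        if keywords.intersection(exclude):
            continue
        yield func

# demo objects using the attribute scheme above
mod = types.ModuleType('demo')
mod.__test_keywords__ = ['pkg1']

def test_fast():
    pass

def test_remote():
    pass
test_remote.__test_keywords__ = ['urllib', 'slow']

tests = [(mod, None, test_fast), (mod, None, test_remote)]
```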

> I think that with "automatic keywords", derived from 
> the name of the test module, class (if any) and method 
> name we are all set. You can always "add" a keyword by
> extending the testname, avoiding any redundancy. 
> 
> For example, we could introduce "bug" tests: 
> 
>     def test_bug001_pypath_sucks():
>         # ... 
> 
> and then you could run a specific "bug001" test by something like 
> 
>     py.test --include="bug001"

In some ways this seems simple, but I'd expect the keywords to be fairly 
volatile.  I'm not sure that I like that the function names would have 
to be as volatile as the keywords; e.g., if you add a new keyword, will 
you rename half your functions?  And the names also have to have other 
parts to describe them that aren't intended to be keywords, and these 
could be mistaken as keywords by the collector.
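For comparison, the name-derived scheme is trivially cheap to implement, 
which is part of its appeal.  A sketch (the function name is invented):

```python
def automatic_keywords(modname, clsname, funcname):
    # Split module, class and test names on underscores; every fragment
    # becomes a selectable keyword, as in holger's test_bug001 example.
    parts = []
    for name in (modname, clsname, funcname):
        if name:
            parts.extend(name.split('_'))
    return set(part for part in parts if part)
```

The volatility concern remains, though: every fragment of every name 
lands in this set, intended as a keyword or not.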

> note, btw, that i plan to distribute ".svn" directories along with 
> every copy of 'py'. This allows _everyone_ commit access 
> to test_*.py and *_test.py files in the repository! So if 
> someone finds a problem or wants a guarantee from the py lib 
> he can simply contribute it!  Committing to the implementation 
> tree will require registration for an account, though. 

That would be pretty neat.  I wonder, could you allow anonymous commit 
access to a subset of the tree?  I guess that's not too hard with 
Apache.  I'd rather not let anyone willy-nilly commit test cases to the 
core; many perceived bugs aren't real bugs, and the tests might not fit 
project standards.  I think a bugreport/ directory, where each bug is 
reported as a module containing a test case and perhaps a docstring 
explaining expected behavior, would be perfect.  They can be moved into 
the main tests if appropriate, or just hang out there until they are 
reviewed, and you would allow anonymous commits in that directory.

>>It's actually the kind of place where adaptation would be interesting; 
>>objects would be adapted to test cases, where part of the test case API 
>>was a set of keywords.  That would allow for a lot of customization, 
>>while the actual tests could remain fairly simple.  Part of the base of 
>>py.test would be adapters for packages, modules, and functions; the 
>>module adapter looks for the test_* functions, function adapters might 
>>look for function attributes, etc.  There'd be another adapter for 
>>unittest.TestCase and unittest.TestSuite, and so on.  Packages could 
>>create their own adapters for further customization.
> 
> 
> I am not sure i am following.  What would be the distinct advantages
> of introducing this machinery? 

I wrote about it more above, but the advantages would be a unified and 
reasonably transparent way of converting ad hoc or package-specific test 
cases into py.test test cases, and one that was decoupled from the tests 
themselves.  E.g., supporting unittest.TestCase would just be a matter 
of providing an adapter.

>>>   - armin and me want a mechanism by which to include '.c'
>>>     files in the library and seamlessly compiling (and 
>>>     possibly caching) them via distutils-mechanism.  This
>>>     should allow working with a svn-checkout containing 
>>>     c-coded modules without any explicit user interaction 
>>>     and especially without distutils-installing it. 
>>
>>Right, this is what Zope is doing.  It builds the package (but does not 
>>install it) before running the tests (python setup.py build).
> 
> 
> We mean it to be a lot more automatic.  You should not need to run _any_ 
> intermediate command, no matter how simple.   A checkout of the
> py lib should be enough.  And you can simply modify your '.c' 
> file somewhere under the implementation tree and nicely expose 
> any objects implemented via the "package export" runtime mechanism. 

It would be automatic -- Zope's test.py automatically runs "setup.py 
build" (or whatever the internal method call is to do that) before every 
test run, then adjusts the path accordingly so code is loaded out of 
build/.  It should be configurable, since for pure-python packages it's 
not necessary or desired -- it's just a bunch of copying that doesn't 
serve much purpose.

-- 
Ian Bicking  /  ianb at colorstudy.com  / http://blog.ianbicking.org


