[Python-Dev] Cleaning-up the new unittest API

Sat Oct 30 05:14:27 CEST 2010

The API for the unittest module has grown fat (the documentation
is approaching 2,000 lines and 10,000 words, like a small book).
I think we can improve learnability by focusing on the most 
important parts of the API.

I would like to simplify and clean up the API for the unittest module
by de-documenting assertSetEqual(), assertDictEqual(),
assertListEqual(), and assertTupleEqual().

All of those methods are already called automatically by 
assertEqual(), so those methods never need to be called directly.  
Or, if you need to be more explicit about the type checking for 
sequences, the assertSequenceEqual() method will suffice.
Either way, there's no need to expose the four type-specific methods.
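
A rough sketch of that dispatch, assuming a Python 3 unittest (the Demo
class and the failure_message() helper are made up for illustration):

```python
import unittest


class Demo(unittest.TestCase):
    """Hypothetical holder so the assert methods can be called directly."""

    def runTest(self):
        pass


def failure_message(check, *args):
    """Return the assertion's failure text, or None if it passed."""
    try:
        check(*args)
    except AssertionError as exc:
        return str(exc)
    return None


case = Demo()

# assertEqual() hands two lists to the list-specific comparison, so the
# generic and the type-specific call report the same failure text.
generic = failure_message(case.assertEqual, [1, 2], [1, 3])
specific = failure_message(case.assertListEqual, [1, 2], [1, 3])

# A list and a tuple are unequal under assertEqual(), while
# assertSequenceEqual() compares them element by element.
mixed_generic = failure_message(case.assertEqual, [1, 2], (1, 2))
mixed_sequence = failure_message(case.assertSequenceEqual, [1, 2], (1, 2))
```

Here generic and specific should come out identical, which is the sense
in which the type-specific methods are redundant.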

Besides de-documenting those four redundant methods,
I propose that assertItemsEqual() be deprecated just like
its brother assertSameElements().  I haven't found anyone
who accurately guesses what those methods entail based
on their method names ("items" usually implies key/value
pairs elsewhere in the language; nor is it clear whether order is
important, whether the elements need to be hashable,
orderable, or just define equality tests, or whether
duplicates cause the test to fail).

Given the purpose of the unittest module, it's important that
the reader have a crystal clear understanding of what a test
is doing.  Their attention needs to be focused on the subject
of the test, not on questioning the semantics of the test method.

IMO, users are far better off sticking with assertEqual() so they
can be specific about the nature of the test:

   # hashable elements; ignore dups
   assertEqual(set(a), set(b))

   # orderable elements; dups matter, order doesn't
   assertEqual(sorted(a), sorted(b))

   # eq tested elements, dups matter, order matters
   assertEqual(list(a), list(b))

   # hashable keys, eq tested values
   # ignore dups, ignore order
   assertEqual(dict(a), dict(b))

These take just a few more characters than assertSameElements()
and assertItemsEqual(), but they are far clearer about their meaning.
You won't have to second-guess what semantics are hidden
behind the abstraction.
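
To make the differences concrete, here is a small sketch with invented
data (the SpellingDemo class and its values are purely illustrative):

```python
import unittest


class SpellingDemo(unittest.TestCase):
    # Invented data: same distinct values, but a has a duplicate
    # and the two lists are in a different order.
    a = [1, 2, 2, 3]
    b = [3, 2, 1]

    def test_set_ignores_dups_and_order(self):
        self.assertEqual(set(self.a), set(self.b))

    def test_sorted_counts_dups(self):
        # The extra 2 in a makes the sorted lists differ.
        self.assertNotEqual(sorted(self.a), sorted(self.b))

    def test_list_respects_order_and_dups(self):
        self.assertNotEqual(list(self.a), list(self.b))
        self.assertEqual(list(self.a), [1, 2, 2, 3])

    def test_dict_ignores_pair_order(self):
        pairs_a = [("x", 1), ("y", 2)]
        pairs_b = [("y", 2), ("x", 1)]
        self.assertEqual(dict(pairs_a), dict(pairs_b))
```

The same two inputs pass or fail depending on the spelling, and the
spelling itself tells the reader which comparison is intended.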

There are a couple of other problems with the new API, but it is probably
too late to do anything about them.

* elsewhere in Python we spell comparison names with abbreviations
   like eq, ne, lt, le, gt, ge.  In unittest, those are spelled in an awkward,
   not easily remembered manner:  assertLessEqual(a, b), etc.
   Fortunately, it's clear what they mean; however, it's not easy to guess
   their spelling.

* the names for assertRegexpMatches() and assertNotRegexpMatches()
   are deeply misleading since they are implemented in terms of
   re.search(), not re.match().
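
For the second point, the difference is easy to see with the re module
alone (the pattern and text below are invented for illustration):

```python
import re

pattern = r"world"
text = "hello world"

# re.match() only succeeds if the pattern matches at the start of the text.
matched_at_start = re.match(pattern, text)

# re.search() scans the whole text, which is the behavior the
# "Matches" assertions actually have.
found_anywhere = re.search(pattern, text)
```

So a test named "...Matches" passes on this input even though
re.match() itself would not match.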

Raymond

P.S.  I also looked at assertDictContainsSubset(a, b).  It is a bit
over-specialized, but at least it is crystal clear what it does,
and it beats the awkward alternative using dict views:

   assertLessEqual(a.items(), b.items())
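
A minimal sketch of why the dict-view spelling works, assuming Python 3
where items() returns a set-like view (the Demo class and the sample
dicts are hypothetical):

```python
import unittest


class Demo(unittest.TestCase):
    """Hypothetical holder so the assert method can be called directly."""

    def runTest(self):
        pass


small = {"x": 1}
large = {"x": 1, "y": 2}
changed = {"x": 99, "y": 2}

# Items views support set-style comparison: <= means "is a subset of".
is_subset = small.items() <= large.items()

# Same key but a different value, so the (key, value) pair is missing.
is_subset_changed = small.items() <= changed.items()

# The unittest spelling of the same check; passes without raising.
Demo().assertLessEqual(small.items(), large.items())
```

It works, but a reader has to know that <= on items views means
"subset", which is the awkwardness referred to above.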
