[Tutor] duplication in unit tests
Serdar Tumgoren
zstumgoren at gmail.com
Wed Dec 9 00:02:50 CET 2009
Hi everyone,
I'm trying to apply some lessons from the recent list discussions on
unit testing and Test-Driven Development, but I seem to have hit a
sticking point.
As part of my program, I'm planning to create objects that perform
some initial data clean-up, then parse the cleaned data and store it
in a database. Currently I expect to have FileCleaner and Parser
classes. Using the TDD approach, I've so far come up with the following:
class FileCleaner(object):
    def __init__(self, datastring):
        self.source = datastring

    def convertEmDashes(self):
        """Convert unicode emdashes to minus signs"""
        self.datastring = self.source.replace(u'\u2014','-')

    def splitLines(self):
        """Generate and store a list of cleaned, non-empty lines"""
        self.data = [x.strip() for x in
                     self.datastring.strip().split('\n') if x.strip()]
My confusion involves the test code for the above class and its
methods. Because splitLines reads self.datastring, which only
convertEmDashes sets, the only way I can get splitLines to pass its
unit test is by first calling the convertEmDashes method, and then
splitLines.
import unittest

class TestFileCleaner(unittest.TestCase):
    def setUp(self):
        self.sourcestring = u"""This line has an em\u2014dash.\n
So does this \u2014\n."""
        self.cleaner = FileCleaner(self.sourcestring)

    def test_convertEmDashes(self):
        """convertEmDashes should replace em dashes with minus signs"""
        teststring = self.sourcestring.replace(u'\u2014','-')
        self.cleaner.convertEmDashes()
        self.assertEqual(teststring, self.cleaner.datastring)

    def test_splitLines(self):
        """splitLines should create a list of cleaned lines"""
        teststring = self.sourcestring.replace(u'\u2014','-')
        data = [x.strip() for x in teststring.strip().split('\n') if x.strip()]
        # splitLines reads self.datastring, so convertEmDashes must run first
        self.cleaner.convertEmDashes()
        self.cleaner.splitLines()
        self.assertEqual(data, self.cleaner.data)
Basically, I'm duplicating the steps from the first test method in the
second test method (and this duplication will accrue as I add more
"cleaning" methods).
I understand that TestCase's setUp method is called before each test
is run (and therefore the FileCleaner object is created anew), but
this coupling of a test to other methods of the class under test seems
to violate the principle of testing methods in isolation.
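To make the concern concrete: as far as I can tell, the only way to test splitLines without calling convertEmDashes is to set the datastring attribute by hand inside the test. Below is a rough sketch of that idea, not something I'm sure is the right approach. The FileCleaner class is repeated so the snippet runs on its own, and the test class name and sample strings are made up for illustration.

```python
import unittest


class FileCleaner(object):
    """Same class as above, repeated so this sketch runs on its own."""

    def __init__(self, datastring):
        self.source = datastring

    def convertEmDashes(self):
        """Convert unicode emdashes to minus signs"""
        self.datastring = self.source.replace(u'\u2014', '-')

    def splitLines(self):
        """Generate and store a list of cleaned, non-empty lines"""
        self.data = [x.strip() for x in
                     self.datastring.strip().split('\n') if x.strip()]


class TestSplitLinesAlone(unittest.TestCase):
    """Hypothetical test class: exercises splitLines only."""

    def test_splitLines(self):
        """splitLines should create a list of cleaned lines"""
        cleaner = FileCleaner(u"source is not used by this test")
        # Assumption: set the attribute directly instead of calling
        # convertEmDashes, so a bug there cannot break this test.
        cleaner.datastring = u"  first line \n\n second line\n"
        cleaner.splitLines()
        self.assertEqual([u"first line", u"second line"], cleaner.data)


if __name__ == "__main__":
    unittest.main()
```

This removes the call to convertEmDashes, but the test now has to know that splitLines depends on an attribute named datastring, which is a different kind of coupling.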
So my questions -- Am I misunderstanding how to properly write unit
tests for this case? Or perhaps I've structured my program
incorrectly, and that's what this duplication reveals? I suspected,
for instance, that perhaps I should group these methods
(convertEmDashes, splitLines, etc.) into a single larger function or
method.
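For illustration, the grouped version I have in mind might look roughly like the sketch below. The method name clean is hypothetical, and this is only one possible arrangement of the two steps shown earlier.

```python
class FileCleaner(object):
    def __init__(self, datastring):
        self.source = datastring

    def clean(self):
        """Hypothetical single method: convert em dashes, then split lines."""
        # Step 1: replace unicode em dashes with minus signs
        text = self.source.replace(u'\u2014', '-')
        # Step 2: keep only stripped, non-empty lines
        self.data = [x.strip() for x in text.strip().split('\n') if x.strip()]


cleaner = FileCleaner(u"This line has an em\u2014dash.\n\nSo does this \u2014\n")
cleaner.clean()
print(cleaner.data)
```

With everything in one method there is only one thing to test and no ordering between calls to get wrong, at the cost of a method that does two jobs.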
But that approach seems to violate the "best practice" of writing
small methods. As you can tell, I'm a bit at sea on this. Your
guidance is greatly appreciated!!
Regards,
Serdar
ps - recommendations on cleaning up and restructuring code are also welcome!