Reprise: [2,3,4,7] --> "2-4,7"

Peter Hansen peter at
Wed Jun 18 04:59:20 CEST 2003

I followed the original thread with great interest, as I thought 
of using the problem as the focus of a workshop in Test-Driven 
Development which I was planning to hold at work with my group.

(Yes, those of you sick of TDD in general, or especially my comments
about it and the value of unit testing, might like to hit "k" at this
point.  Or stay on board and perhaps find something of interest. :-)

I saved the original messages, but studiously avoided looking at
any of the solutions: I wanted the chance to develop a solution on
my own, using TDD, and to explore any differences which might be
attributable to the use of that approach.

(Note here: I don't yet consider myself extremely proficient at TDD, 
but even so I am much better with it than without.  I wanted the practice, 
sure, but also a chance to compare my results objectively with what 
others produced.)

(Another note: I probably spent substantially longer on this than 
any of the others.  For the record, it took me much of a tedious
Sunday to get my code working.)

I ended up writing 19 unit tests to drive my design, and worked from
a set of four (initially, then five) acceptance tests based on 
the two initial uses cases (the thread's title, and the example 
within), plus a special case (the example reversed), plus the one
failure pointed out by M.Garcia (foo, foo1).  Later, after I finished,
I added another case after noticing a likely failure mode in M.Fletcher's
code.  (Sorry Mike! :-)

Here's the summary of the results.  NOTE: some of the failures are
based solely on the programmer's perfectly valid interpretation of 
ambiguous requirements, but I might have chosen to interpret them
differently in my acceptance tests.  See the explanatory notes.

Programmer      Results (of 5 tests) Notes
--------------- -------------------- --------------------
ACooke          2 failures, 1 error  [3] [4]
Alex            1 error              [3]
ASchmolk        1 failure            [6]
CezaryB         no failures
GYoung          no failures
MFletcher       2 failures           [1] [2]
MGarcia#1       1 failure            [1]
PHansen         no failures
RHettinger      4 failures, 1 error  [3] [4] [5] 
SDaniels        1 error              [3]

[1] When an item with a leading zero (e.g. mx09) is not part of a "span"
    it is unclear whether it should be left as-is or returned as an
    integer with the zero stripped (e.g. mx9) as would happen if it 
    forms a span's endpoint.  The choice is arbitrary: I happened to
    choose "stripped" initially, then changed my mind once I ran 
    GYoung's code, treating the OP's own code as the requirement.
    It's unfair to call this one a real failure.

[2] The "foo0,foo1" case is treated as "foo,foo1".

[3] Raises ValueError when comparing "foo" and "foo1".

[4] Merges non-adjacent spans, contrary to the requirement that the
    "original order of names be preserved".

[5] Returns "2-6" from input of "2,3,4,7".  Returns "foo0-2" from input 
    of "foo0,foo1".  Probably an off-by-one error?

[6] Returns "2-4" from input of "2,3,4,7".  Another off-by-one?

Here's some informal data on the sizes of the programs from PyCount,
purely for the curious.  I've marked with (*) those which had no
failures other than item [1] above.

   lines    code     doc comment   blank  file
      28      22       1       0       5  .\
      14      10       1       0       3  .\
      24      21       1       0       2  .\
      42      36       1       0       5  .\ (*)
      29      23       1       0       5  .\ (*)
      50      39       3       3       5  .\
      46      42       1       0       3  .\ (*)
      64      48       1       7       8  .\ (*)
      29      21       1       2       5  .\
      32      26       1       0       5  .\

One should not, of course, expect much in the way of doc/comment lines
for code posted to the newsgroup.  Speaking empirically, anyway. :-)

This is an early result, from an evening's work wrapping my acceptance
tests around each programmer's code.  I have not yet attempted to 
compare more subjective aspects, such as how readable the code is,
nor how amenable to refactoring.

For the record, since the others have all posted their code for 
public viewing and, unwittingly, this dissection, here is mine:

from __future__ import generators

def areConsecutive(left, right):
        return left[0] == right[0] and int(left[1]) + 1 == int(right[1])
    except (TypeError, ValueError):
        return False

def endpoints(items):
    if items:
        items = iter(items)
        left =
            while 1:
                # have at least one item, try finding a next
                test =
                if areConsecutive(left, test):
                        # we've found a span of at least two, continue checking
                        while 1:
                            right = test
                            test =

                            if areConsecutive(right, test):
                                # still have a span, continue
                                right = test
                                # span has ended: yield it and continue
                                yield left, right
                                left = test
                    except StopIteration:
                        # ran out during span: yield it and we're done
                        yield left, right
                    # yield non-span and try again
                    yield left,
                    left = test

        # handle final case where we have a left but no right
        except StopIteration:
            yield left,

def formatRange(items):
    if len(items) == 2:
        return '%s%s-%s' % (tuple(items[0]) + (items[1][1],))
        return '%s%s' % tuple(items[0])

def split(item):
    import re
    prefix, val = re.match('(\D*)(\d*)', item).groups()
    return prefix, val

def collapse(items):
    items = [split(x) for x in items]
    result = []
    for span in endpoints(items):
    return result

I would really have liked to simplify endpoints(), but I didn't
seem to have any of the inspiration that led some of the more
brilliant among us to such short answers as they produced! :-(

Here are the acceptance test cases.  If there's any interest 
at all, I can post the unit tests which drove my design above.

'''Acceptance tests for TDD seminar problem.'''

# Run as "python programmer" where programmer is, e.g. "phansen".
# Modules under test are stored as "" and so on.

import unittest

# module under test
import sys
    name = 'ranges_' + sys.argv[1]
    sys.argv[1:] = sys.argv[2:]
    name = 'ranges'

moduleUnderTest = __import__(name)
collapse = moduleUnderTest.collapse

class TestCase(unittest.TestCase):
    def test01(self):
        '''simple numbers, from title of posting'''
        self.assertEqual(collapse(['2','3','4','7']), ['2-4','7'])

    def test02(self):
        '''include prefixes, contiguous ranges separated by other elements, non-numeric items'''

    def test03(self):
        '''reversed example: don't form any ranges, don't munge leading zeros either'''

    def test04(self):
        '''case from newsgroup that wasn't handled by one solution'''

    def test05(self):
        '''test for handling 0 as a valid integer: Mike's code appears flawed in this respect'''

if __name__ == '__main__':

-with-apologies-to-anyone-feeling-over-scrutinized-ly y'rs,
 Peter Hansen

More information about the Python-list mailing list