[Tutor] A Demolished Function

Danny Yoo dyoo@hkn.eecs.berkeley.edu
Fri, 16 Nov 2001 18:53:07 -0800 (PST)


Hi everyone,

I thought I'd toss this out to the list.  I'm trying to improve a
"conservatively splitting" function.  Examples speak louder than words,
so here goes:


###
>>> import re
>>> ## Example 1
>>> sir_re = re.compile("sir")
>>> conservativeSplit(sir_re,
...                   "five sir four sir three sir two sir one sir")
['five ', 'sir', ' four ', 'sir', ' three ', 'sir',
 ' two ', 'sir', ' one ', 'sir']
>>> ## Example 2
>>> digit_re = re.compile(r"\d+")
>>> conservativeSplit(digit_re, "5 sir 4 sir 3 sir 2 sir 1 sir")
['5', ' sir ', '4', ' sir ', '3', ' sir ', '2', ' sir ', '1', ' sir']
###


This function is similar to re.split(), but unlike a plain re.split()
call it conserves the delimiter as its own fragment.  This is useful when
the thing I'm using to split my string is itself something I want to keep
an eye on.
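
For comparison, re.split() will also hand back the delimiters when the
pattern is wrapped in a capturing group, although it can leave empty
strings at the ends of the result.  Here is a minimal sketch of that idea;
the helper name 'split_with_group' is made up for illustration, and it
assumes the pattern text has no groups of its own:

```python
import re

def split_with_group(pattern_text, text):
    """Hypothetical helper: split 'text' and keep the delimiters."""
    # Wrapping the pattern in parentheses makes re.split() return
    # the matched delimiters along with the pieces between them.
    grouped = re.compile("(" + pattern_text + ")")
    pieces = grouped.split(text)
    # Drop the empty strings that appear when the text starts or
    # ends with a delimiter.
    return [piece for piece in pieces if piece]

print(split_with_group("sir", "five sir four sir"))
```

Filtering out the empty strings is a choice made here to mimic the
examples above; without it, the result has an extra '' at either end
whenever the text starts or ends with a match.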


Here's what my implementation looks like:

###
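def conservativeSplit(regex, stuff):
    """Split 'stuff' along 'regex' seams, keeping the matched seams.

    Assumes 'regex' never matches the empty string; an empty match
    would keep the loop below from terminating.
    """
    fragments = []
    while True:
        # Look for the next delimiter in what is left of the string.
        match = regex.search(stuff)
        if not match: break
        begin, end = match.span()
        if begin == 0:
            # The delimiter sits at the very front: keep just the delimiter.
            fragments.append(stuff[begin : end])
            stuff = stuff[end :]
        else:
            # Keep the text before the delimiter, then the delimiter itself.
            fragments.append(stuff[0 : begin])
            fragments.append(stuff[begin : end])
            stuff = stuff[end :]
    # Whatever follows the last delimiter is the final fragment.
    if stuff:
        fragments.append(stuff)
    return fragments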
###

After writing this function, though, I still feel tense and apprehensive.
Does anyone see any improvements that would make the function easier to
read?  Any criticism or dissension would be great.  Thanks!
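
For comparison, here is a rough sketch of one possible restructuring that
walks the matches with finditer() instead of repeatedly slicing the
string.  The name 'conservative_split_sketch' is made up for illustration.
It should agree with the examples above for simple patterns, but it scans
the original string once, so patterns with anchors or lookbehind may
behave differently.  Skipping empty matches is an assumption of this
sketch, not something the original function does.

```python
import re

def conservative_split_sketch(regex, text):
    """Split 'text' along 'regex' seams, keeping the matched seams."""
    fragments = []
    position = 0  # index just past the previous delimiter
    for match in regex.finditer(text):
        begin, end = match.span()
        if begin == end:
            continue  # assumption: ignore empty matches entirely
        if begin > position:
            # Text between the previous delimiter and this one.
            fragments.append(text[position:begin])
        fragments.append(text[begin:end])
        position = end
    if position < len(text):
        # Whatever follows the last delimiter.
        fragments.append(text[position:])
    return fragments

print(conservative_split_sketch(re.compile(r"\d+"), "5 sir 4 sir"))
```

The single 'position' counter replaces the two branches on 'begin == 0',
which is the main readability difference from the loop above.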