[Tutor] A Demolished Function
Danny Yoo
dyoo@hkn.eecs.berkeley.edu
Fri, 16 Nov 2001 18:53:07 -0800 (PST)
Hi everyone,
I thought I'd toss this out to the list. I'm trying to improve a
"conservatively splitting" function. Examples speak louder than words,
so here goes:
###
>>> import re
>>> ## Example 1
>>> sir_re = re.compile("sir")
>>> conservativeSplit(sir_re,
...                   "five sir four sir three sir two sir one sir")
['five ', 'sir', ' four ', 'sir', ' three ', 'sir',
 ' two ', 'sir', ' one ', 'sir']
>>> ## Example 2
>>> digit_re = re.compile(r"\d+")
>>> conservativeSplit(digit_re, "5 sir 4 sir 3 sir 2 sir 1 sir")
['5', ' sir ', '4', ' sir ', '3', ' sir ', '2', ' sir ', '1', ' sir']
###
This function is similar to re.split(), but unlike a plain re.split()
call, it conserves the delimiter. That is useful when the thing I'm
splitting on is itself something I want to keep an eye on.
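For comparison, here is a small sketch of how re.split() treats the delimiter. The sample string is just a shortened version of Example 1. With a plain pattern the "sir" pieces are thrown away; with a capturing group around the pattern they are kept, but empty strings can show up at the edges, which the function above does not produce.

```python
import re

text = "five sir four sir"

# Plain split: the "sir" delimiters are discarded.
plain = re.split("sir", text)

# With a capturing group, re.split() also returns the matched
# delimiters, but it leaves an empty string where the text ends
# with a delimiter.
captured = re.split("(sir)", text)

print(plain)
print(captured)
```

So the capturing-group form comes close, and filtering out the empty strings afterwards would be one way to close the gap.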
Here's what my implementation looks like:
###
def conservativeSplit(regex, stuff):
    """Split 'stuff' along 'regex' seams."""
    fragments = []
    while 1:
        match = regex.search(stuff)
        if not match: break
        begin, end = match.span()
        if begin == 0:
            fragments.append(stuff[begin : end])
            stuff = stuff[end :]
        else:
            fragments.append(stuff[0 : begin])
            fragments.append(stuff[begin : end])
            stuff = stuff[end :]
    if stuff:
        fragments.append(stuff)
    return fragments
###
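One possible direction, offered only as a rough sketch and not as a drop-in replacement: walk the matches with finditer() and track a position instead of re-slicing the string each time. The name conservative_split_sketch is made up for illustration, and this assumes a Python whose re module provides finditer(). It also skips zero-length matches (a pattern like "\d*" can match the empty string, and the loop above would never advance past such a match). One caveat: because the string is no longer re-sliced, patterns that depend on the start of the string, such as "^", may behave differently here.

```python
import re

def conservative_split_sketch(regex, stuff):
    """Split 'stuff' along 'regex' seams, keeping the delimiters.

    Hypothetical variant for illustration only.
    """
    fragments = []
    position = 0  # index just past the previous match
    for match in regex.finditer(stuff):
        begin, end = match.span()
        if begin == end:
            # Zero-length match: nothing to keep, so skip it.
            continue
        if begin > position:
            # Text between the previous match and this one.
            fragments.append(stuff[position:begin])
        fragments.append(stuff[begin:end])
        position = end
    if position < len(stuff):
        # Whatever is left after the last match.
        fragments.append(stuff[position:])
    return fragments

print(conservative_split_sketch(re.compile(r"\d+"), "5 sir 4 sir"))
```

The two branches of the original collapse into a single "is there text before this match?" check, which is the part I would expect to read more easily.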
After writing this function, though, I still feel tense and apprehensive.
Does anyone see any improvements one could make to make the function
easier to read? Any criticism or dissension would be great. Thanks!