Getting happier ;-), but wondering if I'm thinking pythonically
Steven Taschuk
staschuk at telusplanet.net
Mon May 26 03:03:54 EDT 2003
Quoth Brian Quinlan:
[...]
> 8. I'd write find_len (semantics changed) something like this:
[...]
> It is a bit shorter, more generic, removes an exception check and defers
> an exception handling decision to a higher level.
All to the good, though there are a few simple bugs in the posted
implementation. Here's a corrected version:
    def find_delimited_end(s, start_delimiter, end_delimiter,
                           opening_count=0):
        for i in range(len(s)):
            c = s[i]
            if c == start_delimiter:
                opening_count += 1
            elif c == end_delimiter:  # fixed
                if opening_count == 1:  # fixed; but see [2] below
                    return i
                opening_count -= 1  # fixed
        raise ValueError('unmatched delimiters')
If there are typically many non-delimiter characters between
delimiters, then a performance improvement is possible by moving
the loops over the string down to C, that is, by letting the
built-in string methods find and count do the scanning:
    def find_delimited_end(s, start_delimiter, end_delimiter,
                           opening_count=0):
        start = 0
        while True:
            end = s.find(end_delimiter, start)
            if end < 0:
                raise ValueError('unmatched delimiters')
            opening_count += s.count(start_delimiter, start, end+1)
            opening_count -= s.count(end_delimiter, start, end+1)
            if opening_count == 0:  # see [2] below
                return end
            start = end+1
On my machine, this is a little slower for '{{}}', but about five
times faster [1] for
    '{s{s}s}'.replace('s', 'abcdefghijklmnopqrstuvwxyz')
even though it makes three traversals over each part of the string.
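
For anyone who wants to repeat the comparison, here is a rough sketch of how the two versions might be timed with the standard timeit module. The names find_end_loop and find_end_builtin are made up here so that both versions can live in one file, and the repetition count is arbitrary. Absolute numbers will depend on the interpreter and machine, so none are claimed.

```python
import timeit


def find_end_loop(s, start_delimiter, end_delimiter, opening_count=0):
    # Character-by-character version (hypothetical name for the first one).
    for i in range(len(s)):
        c = s[i]
        if c == start_delimiter:
            opening_count += 1
        elif c == end_delimiter:
            if opening_count == 1:
                return i
            opening_count -= 1
    raise ValueError('unmatched delimiters')


def find_end_builtin(s, start_delimiter, end_delimiter, opening_count=0):
    # find/count version (hypothetical name for the second one).
    start = 0
    while True:
        end = s.find(end_delimiter, start)
        if end < 0:
            raise ValueError('unmatched delimiters')
        opening_count += s.count(start_delimiter, start, end + 1)
        opening_count -= s.count(end_delimiter, start, end + 1)
        if opening_count == 0:
            return end
        start = end + 1


SAMPLES = [
    '{{}}',
    '{s{s}s}'.replace('s', 'abcdefghijklmnopqrstuvwxyz'),
]

# Time each version on each sample; 2000 repetitions is an arbitrary choice.
for sample in SAMPLES:
    for func in (find_end_loop, find_end_builtin):
        seconds = timeit.timeit(lambda: func(sample, '{', '}'), number=2000)
        print('%-16s len=%3d  %.4f s' % (func.__name__, len(sample), seconds))
```

The two functions should return the same index for any well-formed input, so the timing loop doubles as a crude sanity check.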
[1] Five times faster under 2.2.2; under 2.3b1 the first version
speeds up by a factor of about 1.5, so the gap is reduced.
[2] Neither version detects the erroneous case in which closing
delimiters occur before the first opening delimiter. In context
this happens not to matter, though it really ought to be fixed.
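
One possible way to close the gap described in [2], sketched against the character-by-character version only. The name find_delimited_end_strict and the wording of the new error message are invented for this illustration, and the sketch assumes the two delimiters are distinct single characters.

```python
def find_delimited_end_strict(s, start_delimiter, end_delimiter,
                              opening_count=0):
    """Return the index of the end delimiter that closes the outermost group.

    Hypothetical variant: raises ValueError if an end delimiter shows up
    while no group is open, instead of letting the count go negative.
    """
    for i in range(len(s)):
        c = s[i]
        if c == start_delimiter:
            opening_count += 1
        elif c == end_delimiter:
            if opening_count < 1:
                # Nothing is open, so this end delimiter is stray.
                raise ValueError('closing delimiter before opening delimiter')
            if opening_count == 1:
                return i
            opening_count -= 1
    raise ValueError('unmatched delimiters')
```

A caller that has already consumed an opening delimiter can still pass opening_count=1, in which case the first unnested end delimiter is accepted as before.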
--
Steven Taschuk staschuk at telusplanet.net
Receive them ignorant; dispatch them confused. (Weschler's Teaching Motto)