Getting happier ;-), but wondering if I'm thinking pythonically

Steven Taschuk staschuk at
Mon May 26 03:03:54 EDT 2003

Quoth Brian Quinlan:
> 8. I'd write find_len (semantics changed) something like this:
> It is a bit shorter, more generic, removes an exception check and defers
> an exception handling decision to a higher level. 

All to the good, though there are a few simple bugs in the posted
implementation.  Here's a corrected version:

    def find_delimited_end(s, start_delimiter, end_delimiter,
                           opening_count=0):
        for i in range(len(s)):
            c = s[i]
            if c == start_delimiter:
                opening_count += 1
            elif c == end_delimiter:       # fixed
                if opening_count == 1:     # fixed; but see [2] below
                    return i
                opening_count -= 1         # fixed
        raise ValueError('unmatched delimiters')

If there are typically many non-delimiter characters between
delimiters, then a performance improvement is possible by moving
the loops over the string down to C, that is, by letting the
built-in find and count string methods do the scanning:

    def find_delimited_end(s, start_delimiter, end_delimiter,
                           opening_count=0):
        start = 0
        while True:
            end = s.find(end_delimiter, start)
            if end < 0:
                raise ValueError('unmatched delimiters')
            opening_count += s.count(start_delimiter, start, end+1)
            opening_count -= s.count(end_delimiter, start, end+1)
            if opening_count == 0:     # see [2] below
                return end
            start = end+1

On my machine, this is a little slower for '{{}}', but about five
times faster [1] for
    '{s{s}s}'.replace('s', 'abcdefghijklmnopqrstuvwxyz')
even though it makes three traversals (one find and two counts)
over each part of the string.
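
For anyone who wants to repeat the comparison, here is a rough sketch
using the timeit module. The names find_end_loop and find_end_scan are
made up for this example, and the opening_count=0 default is an
assumption. The numbers will vary with machine and Python version, so
no expected timings are given.

```python
import timeit

def find_end_loop(s, start_delimiter, end_delimiter, opening_count=0):
    # Character-by-character scan in Python.
    # The opening_count=0 default is an assumption for this sketch.
    for i in range(len(s)):
        c = s[i]
        if c == start_delimiter:
            opening_count += 1
        elif c == end_delimiter:
            if opening_count == 1:
                return i
            opening_count -= 1
    raise ValueError('unmatched delimiters')

def find_end_scan(s, start_delimiter, end_delimiter, opening_count=0):
    # Same result, but the scanning is done by str.find and str.count.
    start = 0
    while True:
        end = s.find(end_delimiter, start)
        if end < 0:
            raise ValueError('unmatched delimiters')
        opening_count += s.count(start_delimiter, start, end + 1)
        opening_count -= s.count(end_delimiter, start, end + 1)
        if opening_count == 0:
            return end
        start = end + 1

short = '{{}}'
long_ = '{s{s}s}'.replace('s', 'abcdefghijklmnopqrstuvwxyz')

for label, text in (('short', short), ('long', long_)):
    # Both versions should agree before their speed is compared.
    assert find_end_loop(text, '{', '}') == find_end_scan(text, '{', '}')
    for func in (find_end_loop, find_end_scan):
        seconds = timeit.timeit(lambda: func(text, '{', '}'), number=2000)
        print('%-5s %-14s %.4f s' % (label, func.__name__, seconds))
```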

[1] Five times faster under Python 2.2.2; under 2.3b1 the first
version speeds up by a factor of about 1.5, so the gap is reduced.

[2] Neither version detects the erroneous case in which closing
delimiters occur before the first opening delimiter.  In context
this happens not to matter, though it really ought to be fixed.
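
One possible way to close the gap described in [2], as a minimal
sketch based on the first version: reject a closing delimiter seen
while nothing is open. The name find_delimited_end_strict and the
opening_count=0 default are assumptions of this example, not part of
the posted code.

```python
def find_delimited_end_strict(s, start_delimiter, end_delimiter,
                              opening_count=0):
    # opening_count=0 is an assumed default; a caller that is already
    # inside a delimited region would pass a positive count.
    for i, c in enumerate(s):
        if c == start_delimiter:
            opening_count += 1
        elif c == end_delimiter:
            if opening_count == 0:
                # A closer with nothing open is the case from [2].
                raise ValueError('closing delimiter before opening delimiter')
            opening_count -= 1
            if opening_count == 0:
                return i
    raise ValueError('unmatched delimiters')
```

With this change '}{}' raises ValueError at the first character,
while well-formed input such as '{{}}' gives the same index as before.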

Steven Taschuk                                     staschuk at
Receive them ignorant; dispatch them confused.  (Weschler's Teaching Motto)
