[Python-Dev] Broken strptime in Python 2.3a1 & CVS

Tim Peters tim.one@comcast.net
Tue, 14 Jan 2003 19:06:27 -0500


[Brett Cannon]
> ...
> And to comment on the speed drawback: there is already a partial solution
> to this.  ``_strptime`` has the ability to return the regex it creates to
> parse the data string and then subsequently have the user pass that in
> instead of a format string::

You're carrying restructured text too far <wink>::

I expect it would be better for strptime to maintain its own internal cache
mapping format strings to compiled regexps (as a dict, indexed by format
strings).  Dict lookup is cheap.  In most programs, this dict will remain
empty.  In most of the rest, it will have one entry.  *Some* joker will feed
it an unbounded number of distinct format strings, though, so blow the cache
away if it gets "too big":

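    regexp = cache.get(fmtstring)
    if regexp is None:
        # Cache miss: compile the regexp for this format string.
        regexp = compile_the_regexp(fmtstring)
        if len(cache) > 30:  # whatever
            # Too many distinct format strings: throw the whole cache away.
            cache.clear()
        cache[fmtstring] = regexp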

Then you're robust against all comers (it's also thread-safe).