[Python-Dev] Broken strptime in Python 2.3a1 & CVS
Tim Peters
tim.one@comcast.net
Tue, 14 Jan 2003 19:06:27 -0500
[Brett Cannon]
> ...
> And to comment on the speed drawback: there is already a partial solution
> to this. ``_strptime`` has the ability to return the regex it creates to
> parse the data string and then subsequently have the user pass that in
> instead of a format string::
You're carrying restructured text too far <wink>::
I expect it would be better for strptime to maintain its own internal cache
mapping format strings to compiled regexps (as a dict, indexed by format
strings). Dict lookup is cheap. In most programs, this dict will remain
empty. In most of the rest, it will have one entry. *Some* joker will feed
it an unbounded number of distinct format strings, though, so blow the cache
away if it gets "too big":
    regexp = cache.get(fmtstring)
    if regexp is None:
        regexp = compile_the_regexp(fmtstring)
        if len(cache) > 30:  # whatever
            cache.clear()
        cache[fmtstring] = regexp
Then you're robust against all comers (it's also thread-safe).