How Can I Increase the Speed of a Large Number of Date Conversions

Larry Bates larry.bates at
Fri Jun 8 06:12:37 CEST 2007

James T. Dennis wrote:
> Some Other Guy <bgates at> wrote:
>> vdicarlo wrote:
>>> I am a programming amateur and a Python newbie who needs to convert
>>> about 100,000,000 strings of the form "1999-12-30" into ordinal dates
>>> for sorting, comparison, and calculations. Though my script does a ton
>>> of heavy calculational lifting (for which numpy and psyco are a
>>> blessing) besides converting dates, it still seems to like to linger
>>> in the datetime and time libraries.  (Maybe there's a hot module in
>>> there with a cute little function and an impressive set of
>>> attributes.)
>> ...
>>> dateTuple = time.strptime("2005-12-19", '%Y-%m-%d')
>>>             dateTuple = dateTuple[:3]
>>>             date = datetime.date(dateTuple[0], dateTuple[1], dateTuple[2])
>>>             ratingDateOrd = date.toordinal()
>> There's nothing terribly wrong with that, although strptime() is overkill
>> if you already know the date format.  You could get the date like this:
>>   date = apply(datetime.date, map(int, "2005-12-19".split('-')))
>> But, more importantly... 100,000,000 individual dates would cover 274000
>> years!  Do you really need that much??  You could just precompute a
>> dictionary that maps a date string to the ordinal for the last 50 years
>> or so. That's only 18250 entries, and can be computed in less than a second.
>> Lookups after that will be near instantaneous:
>  For that matter why not memoize the results of each conversion
>  (toss it in a dictionary and precede each conversion with a
>  check like: if this_date in datecache: return datecache[this_date]
>  else: ret=convert(this_date); datecache[this_date]=ret; return ret)
>  (If you don't believe that will help, consider that a memo-ized
>  implementation of a recursive Fibonacci function runs about as quickly
>  as an iterative approach).
Even better do something like (not tested):

try: dateord=datedict[cdate]
except KeyError:
    dateord=datedict[cdate]=datetime.date(*[int(x) for x in cdate.split('-')]).toordinal()

That way you build the cache on the fly and there is no penalty if the
lookup key is already in the cache.
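
For anyone who wants something they can paste and run, here is a rough
sketch of the two caching ideas from this thread: the precomputed table
and the cache built on the fly. The names build_table and to_ordinal are
made up for illustration, and datedict is assumed to be a plain
module-level dict. It assumes every input string is a valid "YYYY-MM-DD"
date.

```python
import datetime


def build_table(start_year, end_year):
    """Precompute {'YYYY-MM-DD': ordinal} for every day from Jan 1 of
    start_year through Dec 31 of end_year (hypothetical helper)."""
    table = {}
    day = datetime.date(start_year, 1, 1)
    last = datetime.date(end_year, 12, 31)
    one_day = datetime.timedelta(days=1)
    while day <= last:
        # isoformat() yields the same 'YYYY-MM-DD' form as the input strings
        table[day.isoformat()] = day.toordinal()
        day += one_day
    return table


# Cache filled on demand; it starts out empty (assumed module-level dict).
datedict = {}


def to_ordinal(cdate):
    """Return the ordinal for a 'YYYY-MM-DD' string, caching each result."""
    try:
        # Fast path: the string has been converted before.
        return datedict[cdate]
    except KeyError:
        # Slow path: parse once, store the result, then return it.
        year, month, day = [int(x) for x in cdate.split('-')]
        ordinal = datedict[cdate] = datetime.date(year, month, day).toordinal()
        return ordinal
```

With 100,000,000 strings drawn from a few thousand distinct dates,
nearly every call to to_ordinal takes the fast path. The precomputed
table avoids the exception handling altogether, but only pays off if the
range of years is known in advance.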
