[Tutor] memoize, lookup, or KIS?
Albert-Jan Roskam
fomcl at yahoo.com
Mon Nov 19 12:02:47 CET 2012
Hi,
I have a function that converts a date value, expressed as the number of seconds since the start of the Gregorian calendar, into a human-readable format (typically an ISO date). So if a record contains x date values and a data set contains y records, the number of function calls is x * y. Imagine a data set of 1M records with dob and enrollment_date in it: the number of function calls is huge (well, 2M).
I was reading about memoize decorators the other day and I realized that this function might benefit from memoizing, or a lookup table. On the other hand, it might complicate the code too much, so it might be better to Keep It Simple (KIS). Is the code below a sound approach? I believe that, in effect, it uses a memoization approach (as it is a slowly growing lookup table).
import datetime

class Test(object):

    def __init__(self):
        self.isoDateLookup = {}
        self.lookupCount = 0

    def spss2strDate(self, gregorianDate, fmt="%Y-%m-%d", recodeSysmisTo=""):
        """ This function converts internal SPSS dates (number of seconds
        since midnight, Oct 14, 1582 (the beginning of the Gregorian calendar))
        to a human-readable format """
        MAXLOOKUP = 10**6
        try:
            if not hasattr(self, "gregorianEpoch"):
                self.gregorianEpoch = datetime.datetime(1582, 10, 14, 0, 0, 0)
            if fmt == "%Y-%m-%d" and len(self.isoDateLookup) <= MAXLOOKUP:
                try:
                    result = self.isoDateLookup[gregorianDate]
                    self.lookupCount += 1
                except KeyError:
                    theDate = self.gregorianEpoch + datetime.timedelta(seconds=gregorianDate)
                    result = datetime.datetime.strftime(theDate, fmt)
                    self.isoDateLookup[gregorianDate] = result
                return result
            else:
                theDate = self.gregorianEpoch + datetime.timedelta(seconds=gregorianDate)
                return datetime.datetime.strftime(theDate, fmt)
        except OverflowError:
            return recodeSysmisTo
        except TypeError:
            return recodeSysmisTo
        except ValueError:
            return recodeSysmisTo

if __name__ == "__main__":
    import random
    t = Test()
    someDate = 11654150400.0
    aDay = 24 * 60 * 60
    random.seed(43210)
    for i in xrange(10**3):
        randDate = random.randint(0, 10**3) * random.choice([aDay, -aDay]) + someDate
        t.spss2strDate(randDate)
    print t.lookupCount
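
For comparison, here is a minimal sketch of the same bounded-lookup idea written with a memoize decorator from the standard library. It assumes Python 3.2 or later (the code above is Python 2), where functools.lru_cache is available. The function name spss_to_iso_date and the constant GREGORIAN_EPOCH are hypothetical names chosen for illustration, and the sketch only covers the default ISO format, using isoformat() instead of strftime. It is one possible way to do this, not necessarily the best one.

```python
import datetime
from functools import lru_cache

# Start of the Gregorian calendar, the same epoch used in the class above.
GREGORIAN_EPOCH = datetime.datetime(1582, 10, 14, 0, 0, 0)


@lru_cache(maxsize=10**6)  # bounded cache, playing the role of MAXLOOKUP
def spss_to_iso_date(gregorian_seconds):
    """Hypothetical helper: convert SPSS seconds to an ISO date string.

    Returns "" for values that cannot be converted, mirroring the
    recodeSysmisTo default in the original method.
    """
    try:
        the_date = GREGORIAN_EPOCH + datetime.timedelta(seconds=gregorian_seconds)
    except (OverflowError, TypeError, ValueError):
        return ""
    # isoformat() always yields a zero-padded YYYY-MM-DD string.
    return the_date.date().isoformat()


if __name__ == "__main__":
    import random

    random.seed(43210)
    some_date = 11654150400.0
    a_day = 24 * 60 * 60
    for _ in range(10**3):
        rand_date = random.randint(0, 10**3) * random.choice([a_day, -a_day]) + some_date
        spss_to_iso_date(rand_date)
    # cache_info().hits corresponds to lookupCount in the class above.
    print(spss_to_iso_date.cache_info())
```

With this approach the cache bookkeeping lives in the decorator, so the conversion function itself stays simple; whether that is worth it still depends on how often the same date values repeat in the data.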
Regards,
Albert-Jan
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a
fresh water system, and public health, what have the Romans ever done for us?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~