[Tutor] memoize, lookup, or KIS?

Albert-Jan Roskam fomcl at yahoo.com
Mon Nov 19 12:02:47 CET 2012


Hi,

I have a function that converts a date value, expressed as the number of seconds since the start of the Gregorian calendar, into a human-readable format (typically an ISO date). So if a record contains x date values and a data set contains y records, the number of function calls is x * y. Imagine a data set of 1M records with dob and enrollment_date in it: the number of function calls is huge (well, 2M).


I was reading about memoize decorators the other day and I realized that this function might benefit from memoizing, or a lookup table. On the other hand, it might complicate the code too much, so it might be better to Keep It Simple (KIS). Is the code below a sound approach? I believe that, in effect, it uses a memoization approach (as it is a slowly growing lookup table).


import datetime

class Test(object):

    def __init__(self):
        self.isoDateLookup = {}
        self.lookupCount = 0
        
    def spss2strDate(self, gregorianDate, fmt="%Y-%m-%d", recodeSysmisTo=""):
        """ This function converts internal SPSS dates (number of seconds
        since midnight, Oct 14, 1582 (the beginning of the Gregorian calendar))
        to a human-readable format """
        MAXLOOKUP = 10**6
        try:
            if not hasattr(self, "gregorianEpoch"):
                self.gregorianEpoch = datetime.datetime(1582, 10, 14, 0, 0, 0)
            if fmt == "%Y-%m-%d" and len(self.isoDateLookup) <= MAXLOOKUP:
                try:
                    result = self.isoDateLookup[gregorianDate]
                    self.lookupCount += 1
                except KeyError:
                    theDate = self.gregorianEpoch + datetime.timedelta(seconds=gregorianDate)
                    result = datetime.datetime.strftime(theDate, fmt)
                    self.isoDateLookup[gregorianDate] = result
                return result
            else:
                theDate = self.gregorianEpoch + datetime.timedelta(seconds=gregorianDate)
                return datetime.datetime.strftime(theDate, fmt)
        except (OverflowError, TypeError, ValueError):
            # out-of-range, non-numeric, or unformattable values are recoded
            return recodeSysmisTo

if __name__ == "__main__":
    import random
    t = Test()
    someDate = 11654150400.0
    aDay = 24 * 60 * 60
    random.seed(43210)
    for i in range(10**3):  # range and print() run on Python 2 and 3 alike
        randDate = random.randint(0, 10**3) * random.choice([aDay, -aDay]) + someDate
        t.spss2strDate(randDate)
    print(t.lookupCount)
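
For comparison, this is roughly what I imagine a decorator-based version would look like. It is only a sketch: memoize, spss2iso and GREGORIAN_EPOCH are names I made up for the illustration, it only handles the ISO format, and it leaves out the error handling. Unlike the class above, it keeps serving values that are already in the table once the table is full; it just stops adding new ones.

```python
import datetime
import functools

# Assumption: same epoch as in the class above (midnight, Oct 14, 1582).
GREGORIAN_EPOCH = datetime.datetime(1582, 10, 14, 0, 0, 0)

def memoize(maxsize=10**6):
    """Hypothetical decorator: cache results keyed on the positional
    arguments, and stop adding entries once maxsize is reached."""
    def decorator(func):
        cache = {}
        @functools.wraps(func)
        def wrapper(*args):
            try:
                return cache[args]          # hit: no conversion needed
            except KeyError:
                result = func(*args)
                if len(cache) < maxsize:    # only grow while there is room
                    cache[args] = result
                return result
        wrapper.cache = cache               # exposed for inspection
        return wrapper
    return decorator

@memoize()
def spss2iso(gregorianDate):
    """Convert an SPSS date (seconds since the epoch above) to an ISO date."""
    theDate = GREGORIAN_EPOCH + datetime.timedelta(seconds=gregorianDate)
    return theDate.strftime("%Y-%m-%d")

if __name__ == "__main__":
    print(spss2iso(11654150400.0))   # computed
    print(spss2iso(11654150400.0))   # served from the cache
    print(len(spss2iso.cache))
```

On Python 3.2 and later, functools.lru_cache(maxsize=...) would presumably do the same job without any hand-written decorator, although it evicts the least recently used entries instead of refusing new ones.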

 
Regards,
Albert-Jan


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a 
fresh water system, and public health, what have the Romans ever done for us?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 