[ python-Bugs-1065388 ] calendar day/month name lookup too slow

SourceForge.net noreply at sourceforge.net
Sat Nov 13 16:26:51 CET 2004


Bugs item #1065388, was opened at 2004-11-12 13:12
Message generated for change (Comment added) made by montanaro
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1065388&group_id=5470

Category: Python Library
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: Guido van Rossum (gvanrossum)
Assigned to: Skip Montanaro (montanaro)
Summary: calendar day/month name lookup too slow

Initial Comment:
The day and month lookups in calendar.py recompute the
entire array of names on each __getitem__ call. This
caused me some embarrassment when a colleague put a
month name lookup in a critical inner loop, and
removing the lookup sped it up by an order of magnitude
(and there was plenty going on in that loop!). More to
the point, something that *looks* like a quick lookup
operation should behave as such.

I understand this is hard to fix, because we have no
way of trapping locale changes -- perhaps adding a
mechanism to do that would be the start for fixing this.

----------------------------------------------------------------------

>Comment By: Skip Montanaro (montanaro)
Date: 2004-11-13 09:26

Message:
Logged In: YES 
user_id=44345

Looks good to me.  I'll check in the change and close out the 
report unless you think this should wait until after 2.4 is released.


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2004-11-12 17:09

Message:
Logged In: YES 
user_id=31435

Guido, see my earlier patch (which is attached).  Looping 
cannot be eliminated because i may be a slice object, in 
which case __getitem__ needs to return a list.  The patch 
avoids looping when i is not a slice object.  It also builds the 
date objects just once.  I believe it's the simplest approach 
of this kind that's strong enough so that test_calendar still 
passes.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2004-11-12 15:47

Message:
Logged In: YES 
user_id=6380

The loop could be eliminated from __getitem__ by writing
this instead:

def __getitem__(self, i):
    return datetime.date(2001, i, 1).strftime(self.format)

and similar for _localized_day.

But we could improve even upon that by creating an instance
variable of 13 date objects like this:

    # in __init__
    self._datelist = [None] + [datetime.date(2001, j+1, 1)
                                      for j in range(12)]

def __getitem__(self, i):
    return self._datelist[i].strftime(self.format)

(I would make this change myself but I want someone to
review this because I recall there were subtle compatibility
issues around this...)

That's still suboptimal because the strftime is done over
and over, but at least it's done only once per __getitem__
call rather than 12x.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2004-11-12 14:00

Message:
Logged In: YES 
user_id=31435

The entire array is recomputed because the index may be a 
slice object that references the entire array.

Can't imagine what "*looks* like a quick lookup" could mean in 
any objective sense.  I assume your colleague was using "[]" 
notation.  Doesn't necessarily look quick to me <wink>.

Any inner loop that expects to run long enough that the 
month may change is necessarily not a critical inner loop <0.5 
wink>.

Maybe the attached patch is good enough.  It creates 
appropriate datetime objects just once, caches their strftime
() methods, and recomputes only as much as a specific 
__getitem__ invocation needs.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1065388&group_id=5470


More information about the Python-bugs-list mailing list