[ python-Bugs-1065388 ] calendar day/month name lookup too slow
SourceForge.net
noreply at sourceforge.net
Sat Nov 13 16:26:51 CET 2004
Bugs item #1065388, was opened at 2004-11-12 13:12
Message generated for change (Comment added) made by montanaro
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1065388&group_id=5470
Category: Python Library
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: Guido van Rossum (gvanrossum)
Assigned to: Skip Montanaro (montanaro)
Summary: calendar day/month name lookup too slow
Initial Comment:
The day and month lookups in calendar.py recompute the
entire array of names on each __getitem__ call. This
caused me some embarrassment when a colleague put a
month name lookup in a critical inner loop, and
removing the lookup sped it up by an order of magnitude
(and there was plenty going on in that loop!). More to
the point, something that *looks* like a quick lookup
operation should behave as such.
I understand this is hard to fix, because we have no
way of trapping locale changes -- perhaps adding a
mechanism to do that would be the start for fixing this.
----------------------------------------------------------------------
>Comment By: Skip Montanaro (montanaro)
Date: 2004-11-13 09:26
Message:
Logged In: YES
user_id=44345
Looks good to me. I'll check in the change and close out the
report unless you think this should wait until after 2.4 is released.
----------------------------------------------------------------------
Comment By: Tim Peters (tim_one)
Date: 2004-11-12 17:09
Message:
Logged In: YES
user_id=31435
Guido, see my earlier patch (which is attached). Looping
cannot be eliminated because i may be a slice object, in
which case __getitem__ needs to return a list. The patch
avoids looping when i is not a slice object. It also builds the
date objects just once. I believe it's the simplest approach
of this kind that's strong enough so that test_calendar still
passes.
----------------------------------------------------------------------
Comment By: Guido van Rossum (gvanrossum)
Date: 2004-11-12 15:47
Message:
Logged In: YES
user_id=6380
The loop could be eliminated from __getitem__ by writing
this instead:
    def __getitem__(self, i):
        return datetime.date(2001, i, 1).strftime(self.format)
and similar for _localized_day.
But we could improve even upon that by creating an instance
variable of 13 date objects like this:
    # in __init__
    self._datelist = [None] + [datetime.date(2001, j+1, 1)
                               for j in range(12)]
    def __getitem__(self, i):
        return self._datelist[i].strftime(self.format)
(I would make this change myself but I want someone to
review this because I recall there were subtle compatibility
issues around this...)
That's still suboptimal because the strftime is done over
and over, but at least it's done only once per __getitem__
call rather than 12x.
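
For illustration only, the second variant above can be written out as a complete class. This is a sketch, not the code that was checked in: the class name MonthNames is hypothetical, and the "%m" format is chosen here only so the output does not depend on the locale. It also shows one compatibility issue raised elsewhere in this thread: the index may be a slice object, which this variant does not handle.

```python
import datetime


class MonthNames:  # hypothetical name, used only for this sketch
    def __init__(self, format):
        self.format = format
        # Build the date objects once; index 0 is unused (months are 1-based).
        self._datelist = [None] + [datetime.date(2001, j + 1, 1)
                                   for j in range(12)]

    def __getitem__(self, i):
        # One strftime call per lookup instead of twelve.
        return self._datelist[i].strftime(self.format)


names = MonthNames("%m")   # "%m" gives the zero-padded month number
print(names[3])            # integer index: a single strftime call

try:
    names[1:4]             # a slice yields a list, which has no strftime
except AttributeError as exc:
    print("slice lookup fails:", exc)
```

An integer index works as intended, but a slice lookup raises AttributeError because the sliced result is a plain list of date objects.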
----------------------------------------------------------------------
Comment By: Tim Peters (tim_one)
Date: 2004-11-12 14:00
Message:
Logged In: YES
user_id=31435
The entire array is recomputed because the index may be a
slice object that references the entire array.
Can't imagine what "*looks* like a quick lookup" could mean in
any objective sense. I assume your colleague was using "[]"
notation. Doesn't necessarily look quick to me <wink>.
Any inner loop that expects to run long enough that the
month may change is necessarily not a critical inner loop <0.5
wink>.
Maybe the attached patch is good enough. It creates
appropriate datetime objects just once, caches their strftime()
methods, and recomputes only as much as a specific
__getitem__ invocation needs.
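
The attached patch is not reproduced in this message. As a rough sketch of the approach described above (date objects created once, their bound strftime methods cached, and formatting done only for the entries a given lookup asks for), something along these lines would work. The class name LocalizedMonth and the empty-string placeholder at index 0 are assumptions made for this sketch, not details taken from the patch.

```python
import datetime


class LocalizedMonth:  # hypothetical name, not taken from the patch
    # Create the 12 date objects once, at class creation time, and keep
    # only their bound strftime methods.
    _months = [datetime.date(2001, m + 1, 1).strftime for m in range(12)]
    # Assumption: index 0 is a placeholder that formats to "".
    _months.insert(0, lambda fmt: "")

    def __init__(self, format):
        self.format = format

    def __getitem__(self, i):
        funcs = self._months[i]
        if isinstance(i, slice):
            # A slice must return a list, so loop over the selected entries.
            return [f(self.format) for f in funcs]
        # A plain index needs exactly one strftime call.
        return funcs(self.format)

    def __len__(self):
        return 13


months = LocalizedMonth("%m")  # "%m" keeps the output locale-independent
print(months[3])
print(months[1:4])
```

Because strftime still runs at lookup time, a locale change between lookups is picked up without any mechanism for trapping it; only the construction of the date objects is cached.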
----------------------------------------------------------------------