PATCH: attribute lookup caching for 2.6
So Jim and PJE finally convinced me to do it the right way. :) Thanks guys - it turned out very nice.

http://bugs.python.org/issue1560
http://spreadsheets.google.com/ccc?key=pHIJrYc_pnIUpTm6QSG2gZg&hl=en_US

It caches type/metatype attribute lookups, including missing attributes. Summary of the bug tracker issue:

- Successful attribute lookups are 20% faster, even for classes with short MROs and (probably most) builtins - haven't tested unsuccessful lookups
- Successful hasattr is 5-10% faster, unsuccessful is 5% faster (less impressive than above, and likely due to overhead - internally, all lookups are the same)
- list.__init__ and list().__init__ are slower, and I can't figure out why (creating instances of subclasses of list will be a little slower, and this may show up in other builtin types)
- I haven't benchmarked type attribute sets (how much do we care?) - it should be quite a bit faster than updating a slot, though
- Caching missing attributes is crucial for good performance
- The CreateNewInstances benchmark uncovered an issue that needs fixing; please see the tracker for details

All kinds of commentary and feedback are most welcome.

Neil
Neil Toronto wrote:
So Jim and PJE finally convinced me to do it the right way. :) Thanks guys - it turned out very nice.
How does this relate to Armin Rigo's method cache patch? (http://bugs.python.org/issue1685986)

Georg
At 10:48 PM 12/5/2007 +0100, Georg Brandl wrote:
Neil Toronto wrote:
So Jim and PJE finally convinced me to do it the right way. :) Thanks guys - it turned out very nice.
How does this relate to Armin Rigo's method cache patch?
Interesting. Armin's approach uses a single global cache of up to 1024 descriptors. That seems a lot simpler than anything I thought of, and might perform better by encouraging the processor to keep the descriptors in cache. It has a lot less pointer indirection, and has a dirt-simple way of invalidating a class' entries when something changes. Was there any reason (aside from the usual lack of volunteers for review) why it didn't go in already?
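The scheme being described can be rendered as a rough Python sketch (names and entry layout are mine; the real patch is C in typeobject.c): one fixed-size global table, with per-class version tags, so invalidating a class is just bumping its tag - its stale slots simply never match again.

```python
# Hypothetical sketch of a global method cache with per-class version tags.

CACHE_SIZE = 1024                     # power of two, as in the patch
_cache = [None] * CACHE_SIZE          # slots hold (version, name, value)
_versions = {}                        # class -> current version tag
_next_version = 0

def _version_of(cls):
    global _next_version
    if cls not in _versions:
        _next_version += 1
        _versions[cls] = _next_version
    return _versions[cls]

def invalidate(cls):
    # "Dirt-simple" invalidation: bump the version; no table scan needed.
    global _next_version
    _next_version += 1
    _versions[cls] = _next_version

def _slot(cls, name):
    # Mix the class version into the name hash so different classes
    # sharing method names tend to land in different slots.
    h = (_version_of(cls) * hash(name)) & 0xFFFFFFFF
    return h >> 22                    # top 10 bits -> index 0..1023

def cached_lookup(cls, name):
    i = _slot(cls, name)
    entry = _cache[i]
    if entry is not None and entry[0] == _version_of(cls) and entry[1] == name:
        return entry[2]               # hit: no MRO walk
    # Miss: do the full MRO walk, then fill the slot.
    for base in cls.__mro__:
        if name in base.__dict__:
            value = base.__dict__[name]
            break
    else:
        raise AttributeError(name)
    _cache[i] = (_version_of(cls), name, value)
    return value
```

Because the table is small and contiguous, it stays hot in the processor's data cache, which is part of the appeal over per-type structures.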
On Dec 5, 2007 5:50 PM, Phillip J. Eby wrote:
At 10:48 PM 12/5/2007 +0100, Georg Brandl wrote:
Neil Toronto wrote:
So Jim and PJE finally convinced me to do it the right way. :) Thanks guys - it turned out very nice.
How does this relate to Armin Rigo's method cache patch?
[...]
Was there any reason (aside from the usual lack of volunteers for review) why it didn't go in already?
I'm not sure-- I think folks were waiting for Raymond H. to evaluate it. I did my part to update Armin's patch against 2.4 to HEAD at the time and put in what cleanups seemed sensible. FWIW, I've been using an interpreter with this patch ever since without problems. -Kevin
On Dec 5, 2007 4:08 PM, Kevin Jacobs wrote:
On Dec 5, 2007 5:50 PM, Phillip J. Eby wrote:
At 10:48 PM 12/5/2007 +0100, Georg Brandl wrote:
Neil Toronto wrote:
So Jim and PJE finally convinced me to do it the right way. :) Thanks guys - it turned out very nice.
How does this relate to Armin Rigo's method cache patch?
Was there any reason (aside from the usual lack of volunteers for review) why it didn't go in already?
I'm not sure-- I think folks were waiting for Raymond H. to evaluate it. I did my part to update Armin's patch against 2.4 to HEAD at the time and put in what cleanups seemed sensible. FWIW, I've been using an interpreter with this patch ever since without problems.
I never even saw that one. I'm hoping Raymond will have another look. -- --Guido van Rossum (home page: http://www.python.org/~guido/)
Hm... rhettinger@ewtllc.com bounced. I wonder what's going on there...
On Dec 5, 2007 4:11 PM, Guido van Rossum wrote:
On Dec 5, 2007 4:08 PM, Kevin Jacobs wrote:
On Dec 5, 2007 5:50 PM, Phillip J. Eby wrote:
At 10:48 PM 12/5/2007 +0100, Georg Brandl wrote:
Neil Toronto wrote:
So Jim and PJE finally convinced me to do it the right way. :) Thanks guys - it turned out very nice.
How does this relate to Armin Rigo's method cache patch?
Was there any reason (aside from the usual lack of volunteers for review) why it didn't go in already?
I'm not sure-- I think folks were waiting for Raymond H. to evaluate it. I did my part to update Armin's patch against 2.4 to HEAD at the time and put in what cleanups seemed sensible. FWIW, I've been using an interpreter with this patch ever since without problems.
I never even saw that one. I'm hoping Raymond will have another look.
-- --Guido van Rossum (home page: http://www.python.org/~guido/)
Phillip J. Eby wrote:
At 10:48 PM 12/5/2007 +0100, Georg Brandl wrote:
Neil Toronto wrote:
So Jim and PJE finally convinced me to do it the right way. :) Thanks guys - it turned out very nice.
How does this relate to Armin Rigo's method cache patch?
Interesting. Armin's approach uses a single global cache of up to 1024 descriptors. That seems a lot simpler than anything I thought of, and might perform better by encouraging the processor to keep the descriptors in cache. It has a lot less pointer indirection, and has a dirt-simple way of invalidating a class' entries when something changes.
Hey, I took out all my extra pointer indirection. :p FWIW, I like it. Though the hash should really incorporate the hash of the type name as well as the attribute's so that sometype.method calling othertype.method doesn't invalidate the cache. Locality makes the global cache work, but locality also often means re-using the same names. Neil
At 07:43 PM 12/5/2007 -0700, Neil Toronto wrote:
Phillip J. Eby wrote:
At 10:48 PM 12/5/2007 +0100, Georg Brandl wrote:
Neil Toronto wrote:
So Jim and PJE finally convinced me to do it the right way. :) Thanks guys - it turned out very nice.
How does this relate to Armin Rigo's method cache patch?
Interesting. Armin's approach uses a single global cache of up to 1024 descriptors. That seems a lot simpler than anything I thought of, and might perform better by encouraging the processor to keep the descriptors in cache. It has a lot less pointer indirection, and has a dirt-simple way of invalidating a class' entries when something changes.
Hey, I took out all my extra pointer indirection. :p
FWIW, I like it. Though the hash should really incorporate the hash of the type name as well as the attribute's so that sometype.method calling othertype.method doesn't invalidate the cache. Locality makes the global cache work, but locality also often means re-using the same names.
Look at the patch more closely. The hash function uses a version number times the method name's hash. "Version" numbers are assigned one per class, so unless there are 2**32 classes in the system, they are uniquely numbered. The multiplication and use of the high bits should tend to spread the hash locations around and avoid same-name collisions. Of course, it's still always possible to have pathological cases, but even these shouldn't be much slower than the way things work now.
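The point about same-name collisions can be illustrated with a toy version of the slot computation (numbers and the helper name are mine, not the patch's): multiplying the per-class version by the name's hash and keeping the high bits means the same name, looked up through classes with different version tags, generally maps to different slots, so one class's lookups don't evict another's.

```python
def slot(version, name, bits=10):
    # Multiply the class version tag by the name's hash, truncate to
    # 32 bits, and keep the high bits as the cache index.
    h = (version * hash(name)) & 0xFFFFFFFF
    return h >> (32 - bits)           # e.g. bits=10 -> index 0..1023

# Two classes with distinct version tags looking up the same name
# usually land in different slots:
a = slot(1, "method")
b = slot(2, "method")
```

(Note that str hashes are salted per interpreter run in modern Pythons, so the concrete indices vary between runs; only the spreading behavior matters here.)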
Phillip J. Eby wrote:
At 07:43 PM 12/5/2007 -0700, Neil Toronto wrote:
FWIW, I like it. Though the hash should really incorporate the hash of the type name as well as the attribute's so that sometype.method calling othertype.method doesn't invalidate the cache. Locality makes the global cache work, but locality also often means re-using the same names.
Look at the patch more closely. The hash function uses a version number times the method name's hash. "Version" numbers are assigned one per class, so unless there are 2**32 classes in the system, they are uniquely numbered. The multiplication and use of the high bits should tend to spread the hash locations around and avoid same-name collisions.
Good grief - how did I miss that? I plead parentheses. They threw me off.

So I've applied Armin's patch to 2.6 (it was nearly clean) and am playing with it. cls.name lookups are 15-20% faster than mine, and inst.name lookups are 5-10% faster. His is also winning on hasattr calls (succeeding and failing) on classes, but mine is winning on hasattr calls on instances. I want to poke at it a bit to find out why.

On pybench, his is faster at BuiltinMethodLookups, significantly faster at CreateNewInstances, and a bit faster at almost everything else. BuiltinFunctionCalls is slower - slower than stock - so it might need poking there, too.

In all, it's a lovely piece of work.

Neil
On Dec 6, 2007 1:35 AM, Neil Toronto wrote:
So I've applied Armin's patch to 2.6 (it was nearly clean) and am playing with it. cls.name lookups are 15-20% faster than mine, and inst.name lookups are 5-10% faster. His is also winning on hasattr calls (succeeding and failing) on classes, but mine is winning on hasattr calls on instances. I want to poke at it a bit to find out why.
I hope folks have noticed that I did some significant cleanup and forward porting some months ago at http://bugs.python.org/issue1700288 -Kevin
Kevin Jacobs wrote:
On Dec 6, 2007 1:35 AM, Neil Toronto <ntoronto@cs.byu.edu> wrote:
So I've applied Armin's patch to 2.6 (it was nearly clean) and am playing with it. cls.name lookups are 15-20% faster than mine, and inst.name lookups are 5-10% faster. His is also winning on hasattr calls (succeeding and failing) on classes, but mine is winning on hasattr calls on instances. I want to poke at it a bit to find out why.
I hope folks have noticed that I've done some significant cleanup and forward porting some months ago at
Excellent. I tried this one as well. It guards cache access with PyString_CheckExact (as it should) rather than asserting PyString_Check, plus a few other cleanups. It runs nearly as fast as Armin's, still faster than mine, and much faster than stock.

Neil
participants (5)
- Georg Brandl
- Guido van Rossum
- Kevin Jacobs <jacobs@bioinformed.com>
- Neil Toronto
- Phillip J. Eby