default of returning None hurts performance?
Food for thought, as noticed by a coworker who has been profiling some hot code to optimize a library...

If a function does not have a return statement, we return None. Ironically, this makes the foo2 function below faster than the bar2 function, at least as measured using bytecode size:

Python 2.6.2 (r262:71600, Jul 24 2009, 17:29:21)
[GCC 4.2.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
  3           9 LOAD_FAST                1 (y)
             12 RETURN_VALUE
Gregory P. Smith <greg <at> krypto.org> writes:
> food for thought as noticed by a coworker who has been profiling some hot code
> to optimize a library... If a function does not have a return statement we
> return None. Ironically this makes the foo2 function below faster than the
> bar2 function at least as measured using bytecode size

I would be surprised if this "bytecode size" difference made a significant difference in runtimes, given that function call cost should dwarf the cumulative cost of POP_TOP and LOAD_CONST (two of the simplest opcodes you could find).

Did your coworker run any timings instead of basing his assumptions on bytecode size?

Regards

Antoine.
On Mon, Aug 31, 2009 at 2:20 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
The attached sample code repeatably shows that it makes a difference, though it's really not much of one (2-3%).

I was just wondering if a bytecode for a superinstruction of the common sequence:

  6 POP_TOP
  7 LOAD_CONST               0 (None)
 10 RETURN_VALUE

might be worth it.
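The attachment mentioned above is not part of this archive copy. As a rough stand-in, here is a minimal timing sketch; the function names (`callee`, `with_return`, `without_return`, `best_time`) are hypothetical and are not the original foo2/bar2 code. It assumes Python 3 and the standard `timeit` module, and any difference it reports will depend heavily on the interpreter version and machine.

```python
import timeit


def callee():
    """Trivial function whose result is either returned or discarded."""
    return 1


def with_return():
    # The call result is returned directly.
    return callee()


def without_return():
    # The call result is discarded, then the implicit None is returned.
    callee()


def best_time(func, number=200000, repeat=5):
    """Return the best (minimum) wall-clock time over several timing runs."""
    return min(timeit.repeat(func, number=number, repeat=repeat))


if __name__ == "__main__":
    t_with = best_time(with_return)
    t_without = best_time(without_return)
    print("with explicit return:    %.4f s" % t_with)
    print("without return statement: %.4f s" % t_without)
```

Taking the minimum over several repeats is the usual way to reduce noise from other processes; a difference of a few percent can easily be within run-to-run variation.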
On Mon, Aug 31, 2009 at 3:07 PM, Gregory P. Smith <greg@krypto.org> wrote:
I doubt it. You'd save a bit of stack manipulation, but since this will only appear at the end of a function, I'd be skeptical that this would make any macrobenchmarks (statistically) significantly faster.

Collin Winter
Raymond Hettinger wrote:
I fail to understand this crude logic. How often is the inner loop really going to solely call C code? Any call to Python in an inner loop is going to suffer this penalty on the order of the number of loop iterations?

-Scott

--
Scott Dial
scott@scottdial.com
scodial@cs.indiana.edu
On Tue, 1 Sep 2009 05:51:49 pm Scott Dial wrote:
Most functions don't suffer this penalty. Consider the following two functions:

def g(x):
    return x()

def h(x):
    x()

Now disassemble:

The first doesn't suffer any such default "return None" penalty, and so won't gain any performance benefit from optimizing it. It is only the subset of functions which don't explicitly return anything which will see any potential benefit. Let me call such functions "procedures" to avoid confusion with those functions which won't see any benefit.

While procedures may see some benefit, it's a trivial amount, probably not worth the extra complexity. According to Gregory's tests, the difference is approximately 2% on a trivial do-nothing function. According to my tests on my PC, I might hope to save somewhat less than 0.1 microsecond per procedure call as an absolute saving. As a relative saving though, it will most likely be insignificant: for comparison's sake, urllib2.Request('http://example.com') takes around 150μs on my machine, and math.sin(1.1) around 90μs.

For any procedure which does non-trivial amounts of work, saving 0.1μs is insignificant, no matter how many times it is called inside a loop.

--
Steven D'Aprano
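The disassembly output referred to in this message did not survive in this archive copy. A minimal sketch for regenerating it with the standard `dis` module follows; the helper `opnames` is a hypothetical name introduced here, and the sketch assumes Python 3.4 or later for `dis.get_instructions`. Opcode names and offsets differ between CPython versions, so the output will not match the 2009 listing exactly.

```python
import dis


def g(x):
    return x()


def h(x):
    x()


def opnames(func):
    """Return the opcode names of a function, in order."""
    return [ins.opname for ins in dis.get_instructions(func)]


# Print the full disassembly of both functions for side-by-side comparison.
dis.dis(g)
dis.dis(h)

# The tail of h shows the extra work of discarding the call result and
# returning None; g returns the call result directly.
print(opnames(g))
print(opnames(h))
```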
Gregory P. Smith <greg <at> krypto.org> writes:
> I was just wondering if a bytecode for a superinstruction of the common
> sequence:

I think superinstructions in general would be a good thing to experiment with, as wpython showed. Direct addressing (via a pseudo register file combining locals and constants) would eliminate many bookkeeping-related opcodes in common bytecode.

Regards

Antoine.
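To make the superinstruction idea concrete, here is a toy stack-machine sketch. It is not CPython's evaluation loop: the `run` function, the tuple-based instruction format, and the fused `POP_RETURN_NONE` opcode are all hypothetical names invented for illustration. It only shows that fusing the three-instruction tail into one opcode reduces the number of dispatches.

```python
def run(code, consts):
    """Interpret a list of (opname, arg) pairs.

    Returns (result, dispatch_count) so the two encodings can be compared.
    """
    stack = []
    dispatches = 0
    for opname, arg in code:
        dispatches += 1
        if opname == "LOAD_CONST":
            stack.append(consts[arg])
        elif opname == "POP_TOP":
            stack.pop()
        elif opname == "RETURN_VALUE":
            return stack.pop(), dispatches
        elif opname == "POP_RETURN_NONE":
            # Hypothetical superinstruction: POP_TOP + LOAD_CONST None + RETURN_VALUE.
            stack.pop()
            return None, dispatches
        else:
            raise ValueError("unknown opcode: %r" % (opname,))
    raise RuntimeError("fell off the end of the bytecode")


consts = (None, 42)

# The first LOAD_CONST stands in for a call that leaves a result on the stack.
plain = [("LOAD_CONST", 1), ("POP_TOP", None), ("LOAD_CONST", 0), ("RETURN_VALUE", None)]
fused = [("LOAD_CONST", 1), ("POP_RETURN_NONE", None)]

print(run(plain, consts))  # (None, 4)
print(run(fused, consts))  # (None, 2)
```

Whether saving two dispatches at the end of a function is measurable in a real interpreter is exactly what the thread disputes.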
Antoine Pitrou wrote:
> Did your coworker run any timings instead of basing his assumptions on bytecode size?

In any case, what are you suggesting -- that the last value returned by a function call in the body should be the default return value? I don't think the unpredictability that would introduce would be a good idea.

--
Greg
On Tuesday, 1 September 2009 at 15:09 +0200, Xavier Morel wrote:
> "We" are not Erlang, Smalltalk, OCaml or Haskell either, sadly.

Well, feel free to prefer an unreadable language if you want :)

Having implicit return values is certainly not something which follows Python's design principles. Even C abandoned the idea.

In any case, this discussion is off-topic for this thread. If you want to discuss the topic further, you can post to python-list or python-ideas (it will most certainly be shot down anyway).
On 1 Sep 2009, at 15:25, Antoine Pitrou wrote:
> On Tuesday, 1 September 2009 at 15:09 +0200, Xavier Morel wrote:

...like the Python community never lifts features from such languages, so obviously they do (some at least) things right.

> it will most certainly be shot down anyway

Yep, so there's not much point in bringing it up there.
On Tuesday, 1 September 2009 at 15:09 +0200, Xavier Morel wrote:
> "We" are not Erlang, Smalltalk, OCaml or Haskell either, sadly.

IIRC, the default return value of a Smalltalk method is self, not the last thing evaluated.

(And no, that's not going to happen in Python either -- the BDFL has rejected similar suggestions on previous occasions.)

--
Greg
Xavier Morel wrote:
> Methods yes (and that's one of the few Smalltalk design "features" I consider truly dumb, considering it has message cascading)

Cascading is something different -- it's for sending multiple messages to the *same* receiver. It's not dumb to have both.

--
Greg
On 3 Sep 2009, at 23:33, Greg Ewing wrote:
> Xavier Morel wrote:

I know what cascading is for. The issue is that with message cascading + the "yourself" message, you *never* need to chain on self (you can just cascade and -- if you end up needing the instance to drop down at the end of the cascade -- send `yourself`).

Chaining on self is completely redundant in Smalltalk, as the purpose of this pattern is *also* to send a sequence of messages to the same receiver (something message cascading already handles & guarantees).

Therefore defaulting methods to self-chaining is very dumb and serves no purpose whatsoever. It doesn't make the language easier to use, less verbose or more practical. It just wastes return values.
On 1 Sep 2009, at 02:01 , Greg Ewing wrote:
It couldn't work in Python because statements aren't expressions, therefore I think

def foo():
    if cond:
        3
    else:
        4

would break (given if:else: doesn't return a value, the function couldn't have a return value), but in languages where everything is an expression (where if:else: does return a value) there's nothing unpredictable about it.
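For reference, a small sketch of how Python actually treats the function above today. The `cond` parameter and the `foo_expr` variant are additions made here for illustration: the bare constants are evaluated and discarded, so the function returns None, while a conditional expression combined with an explicit `return` yields the value.

```python
def foo(cond):
    # The if statement produces no value; 3 and 4 are bare expression
    # statements that are evaluated and thrown away.
    if cond:
        3
    else:
        4


def foo_expr(cond):
    # A conditional *expression* does have a value, which must still be
    # returned explicitly.
    return 3 if cond else 4


print(foo(True))       # None
print(foo_expr(True))  # 3
print(foo_expr(False)) # 4
```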
Xavier Morel wrote:
> I fail to grasp the unpredictability of "the last expression evaluated in the body of a function is its return value".

It's unpredictable in the sense that if you're writing a function that's not intended to return a value, you're not thinking about what the last call you make in the function returns, so to a first approximation it's just some random value.

I often write code that makes use of the fact that falling off the end of a function returns None. This has been a documented part of the Python language from the beginning, and changing it would break a lot of code for no good reason.

--
Greg
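A short sketch of the kind of code that relies on falling off the end of a function returning None. The function `find_first_even` is a hypothetical example, not code from the thread.

```python
def find_first_even(numbers):
    """Return the first even number, or None if there is none."""
    for n in numbers:
        if n % 2 == 0:
            return n
    # No return statement here: falling off the end returns None,
    # which callers use as the "not found" signal.


result = find_first_even([1, 3, 5])
if result is None:
    print("no even number found")

print(find_first_even([1, 4, 6]))  # 4
```

If the default return value were the last expression evaluated, callers testing `is None` could no longer depend on this pattern.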
Greg Ewing wrote:
It also means adding a debugging message, assertion, or otherwise side-effect free statement can change the return value of the function. Not cool.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
---------------------------------------------------------------
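The hazard described here can be simulated, under loud assumptions, by rewriting a function's final expression statement into a `return` with the standard `ast` module. This emulates the hypothetical "last expression is the return value" rule; it is not how Python behaves. The helper name `build_with_implicit_return` and the sample sources are invented for this sketch.

```python
import ast


def build_with_implicit_return(source, name):
    """Compile `source` under the HYPOTHETICAL rule that a function's final
    expression statement becomes its return value, and return function `name`."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and isinstance(node.body[-1], ast.Expr):
            last = node.body[-1]
            # Replace the trailing expression statement with `return <expr>`.
            node.body[-1] = ast.copy_location(ast.Return(value=last.value), last)
    ast.fix_missing_locations(tree)
    namespace = {}
    exec(compile(tree, "<sketch>", "exec"), namespace)
    return namespace[name]


PLAIN = """
def total(items):
    sum(items)
"""

WITH_DEBUG = """
def total(items):
    sum(items)
    print("debug: done")
"""

total = build_with_implicit_return(PLAIN, "total")
total_debug = build_with_implicit_return(WITH_DEBUG, "total")

print(total([1, 2, 3]))        # 6 under the hypothetical rule
print(total_debug([1, 2, 3]))  # None: the added print call is now "the result"
```

Appending one debugging line silently changes what the caller receives, which is the unpredictability objected to above.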
participants (14)
- Antoine Pitrou
- Benjamin Peterson
- Collin Winter
- Greg Ewing
- Gregory P. Smith
- Jake McGuire
- Kristján Valur Jónsson
- Nick Coghlan
- Raymond Hettinger
- Scott Dial
- Scott Dial
- Steven D'Aprano
- Xavier Morel
- Xavier Morel