default of returning None hurts performance?
food for thought as noticed by a coworker who has been profiling some hot code to optimize a library... If a function does not have a return statement we return None. Ironically this makes the foo2 function below faster than the bar2 function at least as measured using bytecode size: Python 2.6.2 (r262:71600, Jul 24 2009, 17:29:21) [GCC 4.2.2] on linux2 Type "help", "copyright", "credits" or "license" for more information.
import dis def foo(x): ... y = x() ... return y ... def foo2(x): ... return x() ... def bar(x): ... y = x() ... def bar2(x): ... x() ... dis.dis(foo) 2 0 LOAD_FAST 0 (x) 3 CALL_FUNCTION 0 6 STORE_FAST 1 (y)
3 9 LOAD_FAST 1 (y) 12 RETURN_VALUE
dis.dis(foo2) 2 0 LOAD_FAST 0 (x) 3 CALL_FUNCTION 0 6 RETURN_VALUE dis.dis(bar) 2 0 LOAD_FAST 0 (x) 3 CALL_FUNCTION 0 6 STORE_FAST 1 (y) 9 LOAD_CONST 0 (None) 12 RETURN_VALUE dis.dis(bar2) 2 0 LOAD_FAST 0 (x) 3 CALL_FUNCTION 0 6 POP_TOP 7 LOAD_CONST 0 (None) 10 RETURN_VALUE
Gregory P. Smith <greg <at> krypto.org> writes:
food for thought as noticed by a coworker who has been profiling some hot code
to optimize a library...If a function does not have a return statement we return None. Ironically this makes the foo2 function below faster than the bar2 function at least as measured using bytecode size I would be surprised if this "bytecode size" difference made a significant difference in runtimes, given that function call cost should dwarf the cumulated cost of POP_TOP and LOAD_CONST (two of the simplest opcodes you could find). Did your coworker run any timings instead of basing his assumptions on bytecode size? Regards Antoine.
On Mon, Aug 31, 2009 at 2:20 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
Gregory P. Smith <greg <at> krypto.org> writes:
food for thought as noticed by a coworker who has been profiling some hot
code to optimize a library...If a function does not have a return statement we return None. Ironically this makes the foo2 function below faster than the bar2 function at least as measured using bytecode size
I would be surprised if this "bytecode size" difference made a significant difference in runtimes, given that function call cost should dwarf the cumulated cost of POP_TOP and LOAD_CONST (two of the simplest opcodes you could find).
the attached sample code repeatably shows that it makes a difference though its really not much of one (2-3%). I was just wondering if a bytecode for a superinstruction of the common sequence: 6 POP_TOP 7 LOAD_CONST 0 (None) 10 RETURN_VALUE might be worth it.
On Mon, Aug 31, 2009 at 3:07 PM, Gregory P. Smith<greg@krypto.org> wrote:
On Mon, Aug 31, 2009 at 2:20 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
Gregory P. Smith <greg <at> krypto.org> writes:
food for thought as noticed by a coworker who has been profiling some hot code
to optimize a library...If a function does not have a return statement we return None. Ironically this makes the foo2 function below faster than the bar2 function at least as measured using bytecode size
I would be surprised if this "bytecode size" difference made a significant difference in runtimes, given that function call cost should dwarf the cumulated cost of POP_TOP and LOAD_CONST (two of the simplest opcodes you could find).
the attached sample code repeatably shows that it makes a difference though its really not much of one (2-3%).
I was just wondering if a bytecode for a superinstruction of the common sequence:
6 POP_TOP 7 LOAD_CONST 0 (None) 10 RETURN_VALUE
might be worth it.
I doubt it. You'd save a bit of stack manipulation, but since this will only appear at the end of a function, I'd be skeptical that this would make any macrobenchmarks (statistically) significantly faster. Collin Winter
I was just wondering if a bytecode for a superinstruction of the common sequence:
6 POP_TOP 7 LOAD_CONST 0 (None) 10 RETURN_VALUE
might be worth it.
[Collin Winter]
I doubt it. You'd save a bit of stack manipulation, but since this will only appear at the end of a function, I'd be skeptical that this would make any macrobenchmarks (statistically) significantly faster.
I concur with Collin. And since it appears only at the end of a function, the optimization doesn't help inner-loops in a function (where most of the time usually spent). Raymond
Raymond Hettinger wrote:
I was just wondering if a bytecode for a superinstruction of the common sequence:
6 POP_TOP 7 LOAD_CONST 0 (None) 10 RETURN_VALUE
might be worth it.
[Collin Winter]
I doubt it. You'd save a bit of stack manipulation, but since this will only appear at the end of a function, I'd be skeptical that this would make any macrobenchmarks (statistically) significantly faster.
I concur with Collin. And since it appears only at the end of a function, the optimization doesn't help inner-loops in a function (where most of the time usually spent).
I fail to understand this crude logic. How often is the inner-loop really going to solely call C code? Any call to Python in an inner-loop is going to suffer this penalty on the order of the number of loop iterations)? -Scott -- Scott Dial scott@scottdial.com scodial@cs.indiana.edu
On Tue, 1 Sep 2009 05:51:49 pm Scott Dial wrote:
Raymond Hettinger wrote:
I was just wondering if a bytecode for a superinstruction of the common sequence:
6 POP_TOP 7 LOAD_CONST 0 (None) 10 RETURN_VALUE
might be worth it.
[Collin Winter]
I doubt it. You'd save a bit of stack manipulation, but since this will only appear at the end of a function, I'd be skeptical that this would make any macrobenchmarks (statistically) significantly faster.
I concur with Collin. And since it appears only at the end of a function, the optimization doesn't help inner-loops in a function (where most of the time usually spent).
I fail to understand this crude logic. How often is the inner-loop really going to solely call C code? Any call to Python in an inner-loop is going to suffer this penalty on the order of the number of loop iterations)?
Most functions don't suffer this penalty. Consider the following two functions: def g(x): return x() def h(x): x() Now disassemble:
dis.dis(g) 2 0 LOAD_FAST 0 (x) 3 CALL_FUNCTION 0 6 RETURN_VALUE dis.dis(h) 2 0 LOAD_FAST 0 (x) 3 CALL_FUNCTION 0 6 POP_TOP 7 LOAD_CONST 0 (None) 10 RETURN_VALUE
The first doesn't suffer any such default "return None" penalty, and so won't gain any performance benefit from optimizing it. It is only the subset of functions which don't explicitly return anything which will see any potential benefit. Let me call such functions "procedures" to avoid confusion with those functions which won't see any benefit. While procedures may see some benefit, it's a trivial amount, probably not worth the extra complexity. According to Gregory's tests, the difference is approximately 2% on a trivial do-nothing function. According to my tests on my PC, I might hope to save somewhat less than 0.1 microsecond per procedure call as an absolute saving. As a relative saving though, it will most likely be insignificant: for comparison's sake, urllib2.Request('http://example.com') takes around 150μs on my machine, and math.sin(1.1) around 90μs. For any procedure which does non-trivial amounts of work, saving 0.1μs is insignificant, no matter how many times it is called inside a loop. -- Steven D'Aprano
Raymond Hettinger wrote:
I was just wondering if a bytecode for a superinstruction of the common sequence:
6 POP_TOP 7 LOAD_CONST 0 (None) 10 RETURN_VALUE
might be worth it.
[Collin Winter]
I doubt it. You'd save a bit of stack manipulation, but since this will only appear at the end of a function, I'd be skeptical that this would make any macrobenchmarks (statistically) significantly faster.
I concur with Collin. And since it appears only at the end of a function, the optimization doesn't help inner-loops in a function (where most of the time usually spent).
I fail to understand this crude logic. How often is the inner-loop really going to solely call C code? Any call to Python in an inner-loop is going to suffer this penalty on the order of the number of loop iterations)? -Scott -- Scott Dial scott@scottdial.com scodial@cs.indiana.edu
Gregory P. Smith <greg <at> krypto.org> writes:
I was just wondering if a bytecode for a superinstruction of the common
sequence:
6 POP_TOP 7 LOAD_CONST 0 (None) 10 RETURN_VALUE might be worth it.
I think superinstructions in general would be a good thing to experiment, as wpython showed. Direct addressing (via a pseudo register file combining locals and constants) would eliminate many bookkeeping-related opcodes in common bytecode. Regards Antoine.
Antoine Pitrou wrote:
Did your coworker run any timings instead of basing his assumptions on bytecode size?
In any case, what are you suggesting -- that the last value returned by a function call in the body should be the default return value? I don't think the unpredictability that would introduce would be a good idea. -- Greg
On Mon, Aug 31, 2009 at 5:01 PM, Greg Ewing <greg.ewing@canterbury.ac.nz>wrote:
Antoine Pitrou wrote:
Did your coworker run any timings instead of basing his assumptions on
bytecode size?
In any case, what are you suggesting -- that the last value returned by a function call in the body should be the default return value?
I don't think the unpredictability that would introduce would be a good idea.
gads no. we're not shell or perl! don't do that :)
On 1 Sep 2009, at 03:03 , Gregory P. Smith wrote:
On Mon, Aug 31, 2009 at 5:01 PM, Greg Ewing <greg.ewing@canterbury.ac.nz
wrote:
Antoine Pitrou wrote:
Did your coworker run any timings instead of basing his assumptions on
bytecode size?
In any case, what are you suggesting -- that the last value returned by a function call in the body should be the default return value?
I don't think the unpredictability that would introduce would be a good idea.
gads no. we're not shell or perl! don't do that :) "We" are not Erlang, Smalltalk, OCaml or Haskell either, sadly.
Le mardi 01 septembre 2009 à 15:09 +0200, Xavier Morel a écrit :
"We" are not Erlang, Smalltalk, OCaml or Haskell either, sadly.
Well, feel free to prefer an unreadable language if you want :) Having implicit return values is certainly not something which follows Python's design principles. Even C abandoned the idea. In any case, this discussion is off-topic for this thread. If you want to discuss the topic further, you can post to python-list or python-ideas (it will most certainly be shot down anyway).
"We" are not Erlang, Smalltalk, OCaml or Haskell either, sadly.
Well, feel free to prefer an unreadable language if you want :) Smalltalk or Haskell are hardly inherently unreadable. And it's not
On 1 Sep 2009, at 15:25 , Antoine Pitrou wrote: Le mardi 01 septembre 2009 à 15:09 +0200, Xavier Morel a écrit : like the Python community never lifts features from such languages, so obviously they do (some at least) things right.
it will most certainly be shot down anyway Yep, so there's not much point in bringing it up there.
Le mardi 01 septembre 2009 à 15:09 +0200, Xavier Morel a écrit :
"We" are not Erlang, Smalltalk, OCaml or Haskell either, sadly.
IIRC, the default return value of a Smalltalk method is self, not the last thing evaluated. (And no, that's not going to happen in Python either -- the BDFL has rejected similar suggestions on previous occasions.) -- Greg
On 2 Sep 2009, at 00:10 , Greg Ewing wrote: Le mardi 01 septembre 2009 à 15:09 +0200, Xavier Morel a écrit :
"We" are not Erlang, Smalltalk, OCaml or Haskell either, sadly.
IIRC, the default return value of a Smalltalk method is self, not the last thing evaluated. Methods yes (and that's one of the few Smalltalk design "features" I consider truly dumb, considering it has message cascading), but I was referring to blocks rather than methods and the return value of blocks is the last evaluated expression.
(And no, that's not going to happen in Python either -- the BDFL has rejected similar suggestions on previous occasions.) I know.
Xavier Morel wrote:
Methods yes (and that's one of the few Smalltalk design "features" I consider truly dumb, considering it has message cascading)
Cascading is something different -- it's for sending multiple messages to the *same* receiver. It's not dumb to have both. -- Greg
On 3 Sep 2009, at 23:33 , Greg Ewing wrote: Xavier Morel wrote:
Methods yes (and that's one of the few Smalltalk design "features" I consider truly dumb, considering it has message cascading)
Cascading is something different -- it's for sending multiple messages to the *same* receiver. It's not dumb to have both.
I know what cascading is for. The issue is that with message cascading + the "yourself" message, you *never* need to chain on self (you can just cascade and -- if you end up needing the instance to drop down at the end of the cascade -- send `yourself`). Chaining on self is completely redundant in smalltalk as the purpose of this pattern is *also* to send a sequence of messages to the same receiver (something message cascading already handles & guarantees). Therefore defaulting method to self-chaining is very dumb and serves no purpose whatsoever. It doesn't make the language easier to use, less verbose or more practical. It just wastes return values.
On 1 Sep 2009, at 02:01 , Greg Ewing wrote:
I don't think the unpredictability that would introduce would be a good idea. I fail to grasp the unpredictability of "the last expression evaluated in the body of a function is its return value".
It couldn't work in Python because statements aren't expressions, therefore I think def foo(): if cond: 3 else: 4 would break (given if:else: doesn't return a value, the function couldn't have a return value), but in languages where everything is an expression (where if:else: does return a value) there's nothing unpredictable about it.
Xavier Morel wrote:
I fail to grasp the unpredictability of "the last expression evaluated in the body of a function is its return value".
It's unpredictable in the sense that if you're writing a function that's not intended to return a value, you're not thinking about what the last call you make in the function returns, so to a first approximation it's just some random value. I often write code that makes use of the fact that falling off the end of a function returns None. This has been a documented part of the Python language from the beginning, and changing it would break a lot of code for no good reason. -- Greg
Greg Ewing wrote:
Xavier Morel wrote:
I fail to grasp the unpredictability of "the last expression evaluated in the body of a function is its return value".
It's unpredictable in the sense that if you're writing a function that's not intended to return a value, you're not thinking about what the last call you make in the function returns, so to a first approximation it's just some random value.
I often write code that makes use of the fact that falling off the end of a function returns None. This has been a documented part of the Python language from the beginning, and changing it would break a lot of code for no good reason.
It also means adding a debugging message, assertion, or otherwise side-effect free statement can change the return value of the function. Not cool. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
Not advocating a change, merely pointing out that it's how Ruby works. K
-----Original Message----- From: python-dev-bounces+kristjan=ccpgames.com@python.org [mailto:python-dev-bounces+kristjan=ccpgames.com@python.org] On Behalf Of Nick Coghlan Sent: 2. september 2009 11:12 To: Greg Ewing Cc: python-dev@python.org Subject: Re: [Python-Dev] default of returning None hurts performance?
It also means adding a debugging message, assertion, or otherwise side-effect free statement can change the return value of the function. Not cool.
On Mon, Aug 31, 2009 at 5:01 PM, Greg Ewing <greg.ewing@canterbury.ac.nz>wrote:
Antoine Pitrou wrote:
Did your coworker run any timings instead of basing his assumptions on
bytecode size?
In any case, what are you suggesting -- that the last value returned by a function call in the body should be the default return value?
I don't think the unpredictability that would introduce would be a good idea.
Never mind the fact that there is a lot of Python code out there that assumes that functions without an explicit return value, return None. This is pretty fundamental to the Python API. -jake
participants (14)
-
Antoine Pitrou
-
Benjamin Peterson
-
Collin Winter
-
Greg Ewing
-
Gregory P. Smith
-
Jake McGuire
-
Kristján Valur Jónsson
-
Nick Coghlan
-
Raymond Hettinger
-
Scott Dial
-
Scott Dial
-
Steven D'Aprano
-
Xavier Morel
-
Xavier Morel