are NULL checks in Objects/abstract.c really needed?

[This was sent to python-list, but I'm re-posting here as told by Skip.]

Hello, I had a quick look at Objects/abstract.c in 2.2.2's source. Almost every function there checks whether the objects it's passed are not NULL; if they are, a SystemError exception is raised. Since I've never come across such an exception, I commented out those checks. The resulting Python binary did 6.5% more pystones on average (the numbers are below). My question is: are those checks really necessary in a non-debug Python build?

The pystone results:

BEFORE:
$ for (( i = 0; i <= 5; i++ )); do ./pystone.py; done
Pystone(1.1) time for 10000 passes = 0.6
This machine benchmarks at 16666.7 pystones/second
Pystone(1.1) time for 10000 passes = 0.56
This machine benchmarks at 17857.1 pystones/second
Pystone(1.1) time for 10000 passes = 0.58
This machine benchmarks at 17241.4 pystones/second
Pystone(1.1) time for 10000 passes = 0.57
This machine benchmarks at 17543.9 pystones/second
Pystone(1.1) time for 10000 passes = 0.57
This machine benchmarks at 17543.9 pystones/second

AFTER:
$ for (( i = 0; i <= 5; i++ )); do ./pystone.py; done
Pystone(1.1) time for 10000 passes = 0.54
This machine benchmarks at 18518.5 pystones/second
Pystone(1.1) time for 10000 passes = 0.57
This machine benchmarks at 17543.9 pystones/second
Pystone(1.1) time for 10000 passes = 0.55
This machine benchmarks at 18181.8 pystones/second
Pystone(1.1) time for 10000 passes = 0.52
This machine benchmarks at 19230.8 pystones/second
Pystone(1.1) time for 10000 passes = 0.52
This machine benchmarks at 19230.8 pystones/second
Pystone(1.1) time for 10000 passes = 0.54

-- fuf (fuf@mageo.cz)
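For readers without the source at hand, the checks in question follow roughly this pattern (a self-contained sketch, not the actual abstract.c code: the Obj struct and both function names here are stand-ins for CPython's PyObject and its static null_error() helper, which sets SystemError and returns NULL):

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in for PyObject; the field is irrelevant to the sketch. */
typedef struct { int refcnt; } Obj;

/* Stand-in for abstract.c's null_error(): in CPython this calls
   PyErr_SetString(PyExc_SystemError, "null argument to internal
   routine") and returns NULL. */
static Obj *null_error(void)
{
    fprintf(stderr, "SystemError: null argument to internal routine\n");
    return NULL;
}

/* The defensive pattern under discussion: every entry point
   verifies its argument before doing any real work. */
Obj *object_op(Obj *o)
{
    if (o == NULL)              /* the check being benchmarked */
        return null_error();
    return o;                   /* real work would go here */
}
```

Commenting the checks out saves one branch per call, which is what the pystone difference above is measuring.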

Michal Vitecek <fuf@mageo.cz> writes:
There are a number of bits of stupidly defensive programming in Python... personally, I'd like to see the back of them.
the resulting python binary did 6.5% more pystones on average (the numbers are below).
Wow! Can we persuade you to try CVS HEAD?
my question is: are those checks really necessary in non-debug python build?
This is the tricky bit, of course. I don't think so, but it's hard to be sure. OTOH, it could be the easiest 5% speed up ever... Cheers, M. -- This makes it possible to pass complex object hierarchies to a C coder who thinks computer science has made no worthwhile advancements since the invention of the pointer. -- Gordon McMillan, 30 Jul 1998

Michael Hudson <mwh@python.net> writes:
Actually, I've now tried it, and saw a pystone increase of more like 0.1%. Are you sure the abstract.c changes are the only difference between the two binaries? Cheers, M. -- I've reinvented the idea of variables and types as in a programming language, something I do on every project. -- Greg Ward, September 1998

Michael Hudson wrote:
Wow! Can we persuade you to try CVS HEAD?
okay - I did as you said, and the speed-up is only 2.1%, so it's probably not worth it. Here come the numbers:

BEFORE:
$ for (( i = 0; i <= 5; i++ )); do ./python Lib/test/pystone.py; done
Pystone(1.1) time for 50000 passes = 1.97
This machine benchmarks at 25380.7 pystones/second
Pystone(1.1) time for 50000 passes = 1.92
This machine benchmarks at 26041.7 pystones/second
Pystone(1.1) time for 50000 passes = 1.96
This machine benchmarks at 25510.2 pystones/second
Pystone(1.1) time for 50000 passes = 1.97
This machine benchmarks at 25380.7 pystones/second
Pystone(1.1) time for 50000 passes = 1.96
This machine benchmarks at 25510.2 pystones/second
Pystone(1.1) time for 50000 passes = 1.96
This machine benchmarks at 25510.2 pystones/second

AFTER:
$ for (( i = 0; i <= 5; i++ )); do ./python Lib/test/pystone.py; done
Pystone(1.1) time for 50000 passes = 1.95
This machine benchmarks at 25641 pystones/second
Pystone(1.1) time for 50000 passes = 1.93
This machine benchmarks at 25906.7 pystones/second
Pystone(1.1) time for 50000 passes = 1.91
This machine benchmarks at 26178 pystones/second
Pystone(1.1) time for 50000 passes = 1.92
This machine benchmarks at 26041.7 pystones/second
Pystone(1.1) time for 50000 passes = 1.89
This machine benchmarks at 26455 pystones/second
Pystone(1.1) time for 50000 passes = 1.89
This machine benchmarks at 26455 pystones/second

-- fuf (fuf@mageo.cz)

Michal Vitecek <fuf@mageo.cz> writes:
I didn't say "*two* point one", I said "*nought* point one"!:

BEFORE:
$ for i in 1 2 3 4 5; do ./python- ../Lib/test/pystone.py; done
Pystone(1.1) time for 50000 passes = 3.39
This machine benchmarks at 14749.3 pystones/second
Pystone(1.1) time for 50000 passes = 3.39
This machine benchmarks at 14749.3 pystones/second
Pystone(1.1) time for 50000 passes = 3.38
This machine benchmarks at 14792.9 pystones/second
Pystone(1.1) time for 50000 passes = 3.37
This machine benchmarks at 14836.8 pystones/second
Pystone(1.1) time for 50000 passes = 3.39
This machine benchmarks at 14749.3 pystones/second

AFTER:
$ for i in 1 2 3 4 5; do ./python ../Lib/test/pystone.py; done
Pystone(1.1) time for 50000 passes = 3.38
This machine benchmarks at 14792.9 pystones/second
Pystone(1.1) time for 50000 passes = 3.38
This machine benchmarks at 14792.9 pystones/second
Pystone(1.1) time for 50000 passes = 3.38
This machine benchmarks at 14792.9 pystones/second
Pystone(1.1) time for 50000 passes = 3.38
This machine benchmarks at 14792.9 pystones/second
Pystone(1.1) time for 50000 passes = 3.4
This machine benchmarks at 14705.9 pystones/second

If it was a 2% gain, I'd say go for it (though Guido isn't so sure, it seems). What compiler/platform are you using?

Cheers,
M.

--
languages shape the way we think, or don't. -- Erik Naggum, comp.lang.lisp

Michal, Can you post your changes to abstract.c as a patch on SourceForge? That would allow multiple people to mull it over and all be sure they are working from the same code base. If Michael Hudson and Guido reported substantially different speedups than you, perhaps you were doing something they weren't. Skip

Michal Vitecek wrote:
I'd suggest instead taking a look at making more use of the available Python macros in the interpreter. Things like PyInt_AsLong() can often be written as PyInt_AS_LONG(), because there's a type check only a few lines above the call.

--
Marc-Andre Lemburg
eGenix.com
Professional Python Software directly from the Source (#1, Mar 13 2003)
Python/Zope Products & Consulting ... http://www.egenix.com/ mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
Python UK 2003, Oxford: 19 days left EuroPython 2003, Charleroi, Belgium: 103 days left
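The difference Marc-Andre points to can be sketched like this (hypothetical stand-in type and names, not the real Python 2.x intobject code): the checked function validates its argument and signals failure, while the macro trusts the caller and reads the value slot directly, which is safe when a type check was already done a few lines above.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for PyIntObject: a tag plus the value slot. */
typedef struct { int is_int; long ob_ival; } IntObj;

/* Macro variant, mimicking PyInt_AS_LONG(): reads the slot
   directly, no checks of any kind. */
#define INT_AS_LONG(op) (((IntObj *)(op))->ob_ival)

/* Function variant, mimicking PyInt_AsLong(): validates first.
   CPython would set an exception on failure; this sketch just
   returns -1. */
long int_as_long(IntObj *op)
{
    if (op == NULL || !op->is_int)
        return -1;
    return op->ob_ival;
}
```

Replacing the function with the macro after an explicit type check trades redundant validation for speed, which is the optimization being suggested.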

Unfortunately, this is part of the safety net for poor extension writers, and I'm not sure we can drop it. Given that Pystone is so regular, it's probably just one or two of the functions you changed that make the difference. If you can figure out which ones, perhaps you could inline just those (in the switch in ceval.c) and get the same effect. Anyway, I only get a 1% speedup. --Guido van Rossum (home page: http://www.python.org/~guido/)

Of course. My thought was that either one would come to the attention of the extension writer before the extension goes out. But then, if the code in question never got exercised, it would crash in the hands of a user.

Raymond Hettinger

#################################################################
#################################################################
#################################################################
#####
#####
#####
#################################################################
#################################################################
#################################################################

On Thu, 2003-03-13 at 15:16, Raymond Hettinger wrote:
That's right. We should expect that some number of bugs in extension code are going to be found by end users. An end user is better able to cope with a SystemError than a core file. Long running servers have a different reason to prefer SystemError. A Zope process allows untrusted code to call some extension module, believing it is safe. A bug is found in the extension. If the bug tickles an assert(), Zope crashes. If the bug raises an exception, Zope catches it and continues.
Your funky sig is back :-). Jeremy

Raymond> Can we get most of the same benefit by using an assert() rather
Raymond> than NULL-->SystemError?

Jeremy> No. assert() causes the program to fail. SystemError() raises
Jeremy> an exception and lets the program keep going. Those are vastly
Jeremy> different effects.

It's not clear to me that you'd see any benefit anyway. The checking code currently looks like this:

    if (o == NULL)
        return null_error();

If you changed it to use assert you'd have

    assert(o != NULL);

which expands to

    ((o != NULL) ? 0 : __assert(...));

In the common case you still test for either o==NULL or o!=NULL. Unless one test is terrifically faster than the other (and you executed it a helluva lot), you wouldn't gain anything except the loss of the possibility (however slim) that you might be able to recover.

Still, for people whose only desire is speed and who are willing to sacrifice checks to get it, perhaps we should have a --without-null-checks configure flag. ;-)

I bet if you were ruthless in eliminating checks (especially in ceval.c) you would see an easily measurable speedup.

Skip
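For what it's worth, the hypothetical --without-null-checks build could be sketched along these lines (none of these names exist in CPython; this only illustrates compiling the guards away behind a configure-defined symbol):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: a --without-null-checks configure flag would define
   PY_WITHOUT_NULL_CHECKS, turning every guard into a no-op. */
#ifdef PY_WITHOUT_NULL_CHECKS
#define NULL_GUARD(o, errval) ((void)0)
#else
#define NULL_GUARD(o, errval) \
    do { if ((o) == NULL) return (errval); } while (0)
#endif

/* Example user of the guard.  CPython would raise SystemError via
   null_error() instead of returning a sentinel. */
long checked_length(const char *s)
{
    NULL_GUARD(s, -1L);
    long n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}
```

In the default build the guard behaves like today's checks; configured --without-null-checks, the branch disappears entirely at compile time.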

[Skip Montanaro]
In the release build, Python arranges to #define the preprocessor NDEBUG symbol, which in turn causes assert() to expand to nothing (or maybe to (void)0, or something like that, depending on the compiler). That's standard ANSI C behavior for assert(). IOW, asserts cost nothing in a release build -- and don't do anything in a release build either.
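Tim's point can be demonstrated without rebuilding Python: the two expansions of assert() differ in whether the argument expression is evaluated at all. A small sketch follows, with hand-written stand-ins for the two forms (since a single translation unit can only pick one NDEBUG setting); the counter shows that the release-style form never even looks at its argument:

```c
#include <assert.h>   /* NDEBUG is not defined here, so assert() is live */

/* Stand-ins for the two expansions of assert(expr): the debug form
   evaluates the expression, the release form -- what you get when
   NDEBUG is defined before <assert.h> is included -- expands to a
   no-op, per standard ANSI C. */
#define DEBUG_ASSERT(expr)   assert(expr)
#define RELEASE_ASSERT(expr) ((void)0)

/* Counts evaluations of its argument, to make the difference visible. */
static int evaluations = 0;

static int note_eval(int v)
{
    evaluations++;
    return v;
}
```

Calling DEBUG_ASSERT(note_eval(1)) bumps the counter; RELEASE_ASSERT(note_eval(1)) compiles to nothing, so the counter stays put. That is why asserts cost nothing in a release build, and also why they catch nothing there.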

Tim> In the release build, Python arranges to #define the preprocessor
Tim> NDEBUG symbol, which in turn causes assert() to expand to nothing

Yeah, I forgot about that. Okay, so the analysis was flawed.

You didn't comment on the --without-null-checks option. ;-)

Skip

participants (8)
- Guido van Rossum
- Jeremy Hylton
- M.-A. Lemburg
- Michael Hudson
- Michal Vitecek
- Raymond Hettinger
- Skip Montanaro
- Tim Peters