[Patches] [ python-Patches-918462 ] simple

SourceForge.net noreply at sourceforge.net
Tue Mar 23 02:26:51 EST 2004


Patches item #918462, was opened at 2004-03-17 20:50
Message generated for change (Comment added) made by rhettinger
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=918462&group_id=5470

Category: Core (C code)
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: Skip Montanaro (montanaro)
Assigned to: Raymond Hettinger (rhettinger)
Summary: simple 

Initial Comment:
All this "is" vs "==" discussion led me to look at ceval.c.  
The attached patch seems to speed up "is" and "is not" 
comparisons - saving a function call to do a simple pointer 
comparison for non-integer arguments.

The test suite passes, but it's been quite a while since I 
messed around with the interpreter code, so I thought I 
ought to have another pair of eyeballs check it out...
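
For readers less familiar with the distinction the patch relies on, here is a minimal Python sketch of "is" versus "==". The variable names are made up for illustration. The last line leans on the CPython detail that id() returns the object's address, which is why "is" can be answered with a bare pointer comparison:

```python
a = [1, 2, 3]
b = [1, 2, 3]   # equal contents, but a separate object
c = a           # a second name for the same object

print(a == b)   # equality: dispatches to the type's comparison code
print(a is b)   # identity: are these the very same object?
print(a is c)
print((a is c) == (id(a) == id(c)))  # identity agrees with comparing addresses
```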


----------------------------------------------------------------------

>Comment By: Raymond Hettinger (rhettinger)
Date: 2004-03-23 02:26

Message:
Logged In: YES 
user_id=80475

I'm pretty sure that this is a false optimization because
the time saved in the function call is being offset by the
extra unpredictable branch for the other tests.

Even if those others are losing 1% while either "is" or
"is not" gains 10%, the comparisons are not apt.  The total
time for rich compares is so long that 1% of it represents much
more real time than 1% of an is/is not test.  Also, the
results need to be considered in aggregate with real times
(not percentages) and appropriate frequency weighting (if
known).  For example:

    IS     occurs 100 times     saving  9 microsec each time
    ISNOT  occurs  70 times     saving  9 microsec each time
    EQ     occurs 700 times     costing 4 microsec each time
    NE     occurs  50 times     costing 4 microsec each time
    LT     occurs 100 times     costing 4 microsec each time
    -->  weighted result: about 1.8 microsec lost per comparison

Of course, this can't be done exactly or even inexactly, but
it shows that the percentages can't be considered out of the
context of dynamic usage frequency, aggregations of all the
operators, and real time.
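
The arithmetic behind that example can be checked with a short Python sketch. The counts and per-operation costs are the hypothetical figures from the table above, not measurements, and the names (profile, net_microsec, per_op) are made up for illustration:

```python
# Hypothetical usage profile from the example above:
# (operator, occurrences, microseconds gained per occurrence; negative = lost)
profile = [
    ("IS",    100,  9),
    ("ISNOT",  70,  9),
    ("EQ",    700, -4),
    ("NE",     50, -4),
    ("LT",    100, -4),
]

total_ops = sum(count for _, count, _ in profile)
net_microsec = sum(count * gain for _, count, gain in profile)
per_op = net_microsec / total_ops

print("net change: %d microsec over %d comparisons" % (net_microsec, total_ops))
print("weighted average: %.1f microsec per comparison" % per_op)
```

With these figures the net is a loss of 1870 microseconds over 1020 comparisons, roughly 1.8 microseconds per comparison, even though "is" and "is not" each got faster.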

If something like this patch needs to go in, consider making
the branches predictable:

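slow_compare:
      /* "is" and "is not" need only a pointer comparison */
      if (oparg == PyCmp_IS) {
              x = (v == w) ? Py_True : Py_False;
              Py_INCREF(x);
      } else if (oparg == PyCmp_IS_NOT) {
              x = (v != w) ? Py_True : Py_False;
              Py_INCREF(x);
      } else {
              /* everything else goes through the general comparison */
              x = cmp_outcome(oparg, v, w);
      }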
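
The control flow of that fragment can be modeled in Python to make the trade-off visible. This is only a sketch under stated assumptions: compare_patched and RICH are hypothetical names, cmp_outcome here is a stand-in rather than the real C helper, and the values 8 and 9 are the constants that appear in the assembly listing quoted elsewhere in this thread.

```python
import operator

# Opcode arguments for "is" and "is not" (8 and 9 in the quoted assembly).
PyCmp_IS, PyCmp_IS_NOT = 8, 9

# Partial, illustrative table for rich comparisons (Py_LT=0, Py_EQ=2, Py_GT=4).
RICH = {0: operator.lt, 2: operator.eq, 4: operator.gt}

def cmp_outcome(oparg, v, w):
    """Stand-in for the general helper: one call for every operator."""
    if oparg == PyCmp_IS:
        return v is w
    if oparg == PyCmp_IS_NOT:
        return v is not w
    return RICH[oparg](v, w)

def compare_patched(oparg, v, w):
    """Model of the fragment above: test for the identity cases first."""
    if oparg == PyCmp_IS:
        return v is w                    # pointer comparison, no call
    elif oparg == PyCmp_IS_NOT:
        return v is not w
    else:
        return cmp_outcome(oparg, v, w)  # reached only after two failed tests

a, b = [1], [1]
print(compare_patched(PyCmp_IS, a, a))   # identity on the same object
print(compare_patched(PyCmp_IS, a, b))   # equal but distinct objects
print(compare_patched(2, a, b))          # "==" falls through to cmp_outcome
```

The model cannot show branch-prediction effects, but it does show the structural point: "is" and "is not" skip a call, while every other operator pays for two extra tests before reaching the general path.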

Also, when it comes to micro-optimizations that are compiler
sensitive, the Intel timing tests should be built with the
compiler actually used to build the distribution (no sense
convincing ourselves of an optimization that doesn't occur
on the real distribution).


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2004-03-23 00:15

Message:
Logged In: YES 
user_id=31435

When you introduce a new branch, and time it in isolation,
HW may have enough resources to optimize for both branch
targets simultaneously.  Run a ton of other stuff too, though,
and then it can start to lose.  Still, for detailed answers about
anything at this level, you need to use a HW simulator --
modern processors are intractably complex, and the
user-visible programming model supplied by Pentium in particular is
multiple layers removed from bottom-line reality now, so much
so that Intel doesn't even try to supply "instruction timings"
anymore (they depend in complex ways on the internal states
of resources that aren't visible in the programming model).

----------------------------------------------------------------------

Comment By: Skip Montanaro (montanaro)
Date: 2004-03-22 16:53

Message:
Logged In: YES 
user_id=44345

I reran the test on a Linux system today and got similar results.  
I'm pasting them here mostly as documentation.  I'm still a bit 
confused why the == and > tests should show improvement, but 
they often do on both platforms.  Any ideas?  Looking at the 
generated assembly code, GCC inserts basically the same four 
instructions on both the Intel and PowerPC platforms:

	cmpl	$8, -40(%ebp)
	je	.L580
	cmpl	$9, -40(%ebp)
	je	.L583

on Intel or

	cmpwi cr7,r24,8
	beq- cr7,L622
	cmpwi cr7,r24,9
	beq- cr7,L625

on PowerPC.

I also tried pystone.  I see performance hits on both Linux and 
Mac OSX:

    Fastest of ten runs (pystones/second; higher is better)

                    patched                 unpatched
    Linux           37878.8                 38167.9
    Mac OSX         13888.9                 14124.3

Oh well...  It was a thought.
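
To put the pystone figures on a common scale, the relative slowdown can be worked out as below. The numbers are the fastest-of-ten results quoted above; the helper name slowdown_percent is made up for this sketch:

```python
# Fastest-of-ten pystone results from the table above (pystones/second).
results = {
    "Linux":   {"patched": 37878.8, "unpatched": 38167.9},
    "Mac OSX": {"patched": 13888.9, "unpatched": 14124.3},
}

def slowdown_percent(patched, unpatched):
    """Throughput lost by the patched build, as a percentage of unpatched."""
    return (unpatched - patched) / unpatched * 100.0

for platform, r in results.items():
    print("%-8s %.2f%% slower" % (platform, slowdown_percent(**r)))
```

That works out to a slowdown of under 1% on Linux and under 2% on Mac OSX for the patched build.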

Test output on Linux:


s = 'abc'
operation          before        after        delta         %chg
---------          ------        -----        -----         ----
s is 'abc'          0.116        0.103        0.013        -11.2
s == 'abc'          0.145        0.141        0.004         -2.8
s > 'abc'           0.140        0.142       -0.002          1.4
s is 4              0.139        0.121        0.018        -12.9
s == 4              0.271        0.293       -0.022          8.1
s > 4               0.276        0.273        0.003         -1.1
s is -1001          0.126        0.120        0.006         -4.8
s == -1001          0.270        0.272       -0.002          0.7
s > -1001           0.282        0.275        0.007         -2.5
s is 34.7           0.133        0.119        0.014        -10.5
s == 34.7           0.352        0.343        0.009         -2.6
s > 34.7            0.340        0.344       -0.004          1.2
s is 'a b c'        0.135        0.118        0.017        -12.6
s == 'a b c'        0.159        0.157        0.002         -1.3
s > 'a b c'         0.200        0.201       -0.001          0.5
s is True           0.177        0.170        0.007         -4.0
s == True           0.316        0.318       -0.002          0.6
s > True            0.321        0.321        0.000          0.0

s = 4
operation          before        after        delta         %chg
---------          ------        -----        -----         ----
s is 'abc'          0.143        0.120        0.023        -16.1
s == 'abc'          0.266        0.285       -0.019          7.1
s > 'abc'           0.270        0.276       -0.006          2.2
s is 4              0.175        0.103        0.072        -41.1
s == 4              0.105        0.105        0.000          0.0
s > 4               0.106        0.107       -0.001          0.9
s is -1001          0.119        0.119        0.000          0.0
s == -1001          0.119        0.119        0.000          0.0
s > -1001           0.121        0.178       -0.057         47.1
s is 34.7           0.127        0.129       -0.002          1.6
s == 34.7           0.201        0.195        0.006         -3.0
s > 34.7            0.193        0.197       -0.004          2.1
s is 'a b c'        0.212        0.125        0.087        -41.0
s == 'a b c'        0.268        0.271       -0.003          1.1
s > 'a b c'         0.269        0.276       -0.007          2.6
s is True           0.196        0.160        0.036        -18.4
s == True           0.239        0.258       -0.019          7.9
s > True            0.265        0.237        0.028        -10.6

s = None
operation          before        after        delta         %chg
---------          ------        -----        -----         ----
s is 'abc'          0.120        0.109        0.011         -9.2
s == 'abc'          0.203        0.204       -0.001          0.5
s > 'abc'           0.206        0.206        0.000          0.0
s is 4              0.119        0.110        0.009         -7.6
s == 4              0.217        0.214        0.003         -1.4
s > 4               0.214        0.220       -0.006          2.8
s is -1001          0.120        0.107        0.013        -10.8
s == -1001          0.207        0.207        0.000          0.0
s > -1001           0.207        0.214       -0.007          3.4
s is 34.7           0.122        0.112        0.010         -8.2
s == 34.7           0.274        0.270        0.004         -1.5
s > 34.7            0.272        0.271        0.001         -0.4
s is 'a b c'        0.148        0.128        0.020        -13.5
s == 'a b c'        0.240        0.242       -0.002          0.8
s > 'a b c'         0.206        0.210       -0.004          1.9
s is True           0.162        0.153        0.009         -5.6
s == True           0.267        0.262        0.005         -1.9
s > True            0.284        0.258        0.026         -9.2

s = -1000
operation          before        after        delta         %chg
---------          ------        -----        -----         ----
s is 'abc'          0.218        0.128        0.090        -41.3
s == 'abc'          0.274        0.275       -0.001          0.4
s > 'abc'           0.264        0.301       -0.037         14.0
s is 4              0.125        0.120        0.005         -4.0
s == 4              0.123        0.122        0.001         -0.8
s > 4               0.119        0.121       -0.002          1.7
s is -1001          0.123        0.123        0.000          0.0
s == -1001          0.132        0.123        0.009         -6.8
s > -1001           0.121        0.121        0.000          0.0
s is 34.7           0.130        0.215       -0.085         65.4
s == 34.7           0.199        0.197        0.002         -1.0
s > 34.7            0.194        0.236       -0.042         21.6
s is 'a b c'        0.158        0.140        0.018        -11.4
s == 'a b c'        0.294        0.293        0.001         -0.3
s > 'a b c'         0.302        0.300        0.002         -0.7
s is True           0.190        0.161        0.029        -15.3
s == True           0.234        0.232        0.002         -0.9
s > True            0.238        0.234        0.004         -1.7

s = 34.2
operation          before        after        delta         %chg
---------          ------        -----        -----         ----
s is 'abc'          0.133        0.120        0.013         -9.8
s == 'abc'          0.338        0.330        0.008         -2.4
s > 'abc'           0.350        0.338        0.012         -3.4
s is 4              0.126        0.121        0.005         -4.0
s == 4              0.194        0.197       -0.003          1.5
s > 4               0.193        0.196       -0.003          1.6
s is -1001          0.132        0.120        0.012         -9.1
s == -1001          0.293        0.193        0.100        -34.1
s > -1001           0.196        0.190        0.006         -3.1
s is 34.7           0.117        0.105        0.012        -10.3
s == 34.7           0.153        0.153        0.000          0.0
s > 34.7            0.156        0.155        0.001         -0.6
s is 'a b c'        0.152        0.138        0.014         -9.2
s == 'a b c'        0.360        0.398       -0.038         10.6
s > 'a b c'         0.334        0.354       -0.020          6.0
s is True           0.171        0.174       -0.003          1.8
s == True           0.248        0.254       -0.006          2.4
s > True            0.247        0.244        0.003         -1.2

s = 'a b c'
operation          before        after        delta         %chg
---------          ------        -----        -----         ----
s is 'abc'          0.137        0.117        0.020        -14.6
s == 'abc'          0.157        0.158       -0.001          0.6
s > 'abc'           0.204        0.201        0.003         -1.5
s is 4              0.131        0.119        0.012         -9.2
s == 4              0.269        0.272       -0.003          1.1
s > 4               0.277        0.277        0.000          0.0
s is -1001          0.153        0.146        0.007         -4.6
s == -1001          0.299        0.294        0.005         -1.7
s > -1001           0.299        0.302       -0.003          1.0
s is 34.7           0.153        0.146        0.007         -4.6
s == 34.7           0.374        0.368        0.006         -1.6
s > 34.7            0.342        0.336        0.006         -1.8
s is 'a b c'        0.140        0.118        0.022        -15.7
s == 'a b c'        0.150        0.158       -0.008          5.3
s > 'a b c'         0.160        0.156        0.004         -2.5
s is True           0.193        0.194       -0.001          0.5
s == True           0.345        0.338        0.007         -2.0
s > True            0.318        0.319       -0.001          0.3

s = object()
operation          before        after        delta         %chg
---------          ------        -----        -----         ----
s is 'abc'          0.158        0.143        0.015         -9.5
s == 'abc'          0.298        0.294        0.004         -1.3
s > 'abc'           0.288        0.292       -0.004          1.4
s is 4              0.129        0.121        0.008         -6.2
s == 4              0.249        0.250       -0.001          0.4
s > 4               0.248        0.249       -0.001          0.4
s is -1001          0.151        0.152       -0.001          0.7
s == -1001          0.271        0.266        0.005         -1.8
s > -1001           0.284        0.271        0.013         -4.6
s is 34.7           0.152        0.140        0.012         -7.9
s == 34.7           0.364        0.385       -0.021          5.8
s > 34.7            0.429        0.392        0.037         -8.6
s is 'a b c'        0.152        0.138        0.014         -9.2
s == 'a b c'        0.300        0.297        0.003         -1.0
s > 'a b c'         0.288        0.285        0.003         -1.0
s is True           0.192        0.184        0.008         -4.2
s == True           0.325        0.329       -0.004          1.2
s > True            0.324        0.322        0.002         -0.6

s = []
operation          before        after        delta         %chg
---------          ------        -----        -----         ----
s is 'abc'          0.126        0.121        0.005         -4.0
s == 'abc'          0.266        0.285       -0.019          7.1
s > 'abc'           0.273        0.271        0.002         -0.7
s is 4              0.125        0.119        0.006         -4.8
s == 4              0.269        0.269        0.000          0.0
s > 4               0.268        0.274       -0.006          2.2
s is -1001          0.133        0.121        0.012         -9.0
s == -1001          0.269        0.291       -0.022          8.2
s > -1001           0.271        0.269        0.002         -0.7
s is 34.7           0.132        0.124        0.008         -6.1
s == 34.7           0.332        0.362       -0.030          9.0
s > 34.7            0.339        0.336        0.003         -0.9
s is 'a b c'        0.125        0.119        0.006         -4.8
s == 'a b c'        0.268        0.291       -0.023          8.6
s > 'a b c'         0.275        0.273        0.002         -0.7
s is True           0.171        0.164        0.007         -4.1
s == True           0.317        0.315        0.002         -0.6
s > True            0.338        0.316        0.022         -6.5
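
For reference, the delta and %chg columns in these tables appear to follow the convention sketched below: delta is before minus after, and %chg is the change relative to before, so a negative %chg means the patched build was faster. The function name delta_and_pct is made up, and the convention is inferred from the rows rather than stated in the thread:

```python
def delta_and_pct(before, after):
    """Return (delta, %chg) using the sign conventions of the tables above."""
    delta = before - after                    # positive: patched build is faster
    pct = (after - before) / before * 100.0   # negative: patched build is faster
    return round(delta, 3), round(pct, 1)

# Three rows taken from the Linux tables above.
print(delta_and_pct(0.116, 0.103))   # s = 'abc', s is 'abc'
print(delta_and_pct(0.175, 0.103))   # s = 4,     s is 4
print(delta_and_pct(0.121, 0.178))   # s = 4,     s > -1001
```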




----------------------------------------------------------------------

Comment By: Skip Montanaro (montanaro)
Date: 2004-03-21 09:33

Message:
Logged In: YES 
user_id=44345

I spent a fair amount of time yesterday refining and running a 
shell script (attached) to compare the before and after times for 
various comparisons of simple objects.  Here's the output:


s = 'abc'
operation          before        after        delta         %chg
---------          ------        -----        -----         ----
s is 'abc'          0.375        0.329        0.046        -12.3
s == 'abc'          0.491        0.493       -0.002          0.4
s > 'abc'           0.491        0.493       -0.002          0.4
s is 4              0.375        0.333        0.042        -11.2
s == 4              1.200        1.190        0.010         -0.8
s > 4               1.200        1.190        0.010         -0.8
s is -1001          0.378        0.332        0.046        -12.2
s == -1001          1.200        1.190        0.010         -0.8
s > -1001           1.200        1.180        0.020         -1.7
s is 34.7           0.370        0.325        0.045        -12.2
s == 34.7           1.620        1.590        0.030         -1.9
s > 34.7            1.600        1.590        0.010         -0.6
s is 'a b c'        0.369        0.328        0.041        -11.1
s == 'a b c'        0.475        0.476       -0.001          0.2
s > 'a b c'         0.559        0.563       -0.004          0.7
s is True           0.531        0.491        0.040         -7.5
s == True           1.400        1.390        0.010         -0.7
s > True            1.400        1.380        0.020         -1.4

s = 4
operation          before        after        delta         %chg
---------          ------        -----        -----         ----
s is 'abc'          0.369        0.325        0.044        -11.9
s == 'abc'          1.200        1.190        0.010         -0.8
s > 'abc'           1.200        1.190        0.010         -0.8
s is 4              0.353        0.353        0.000          0.0
s == 4              0.352        0.355       -0.003          0.9
s > 4               0.354        0.350        0.004         -1.1
s is -1001          0.347        0.350       -0.003          0.9
s == -1001          0.350        0.353       -0.003          0.9
s > -1001           0.346        0.345        0.001         -0.3
s is 34.7           0.367        0.327        0.040        -10.9
s == 34.7           0.773        0.769        0.004         -0.5
s > 34.7            0.771        0.772       -0.001          0.1
s is 'a b c'        0.370        0.327        0.043        -11.6
s == 'a b c'        1.200        1.190        0.010         -0.8
s > 'a b c'         1.200        1.190        0.010         -0.8
s is True           0.534        0.492        0.042         -7.9
s == True           0.905        0.911       -0.006          0.7
s > True            0.904        0.913       -0.009          1.0

s = None
operation          before        after        delta         %chg
---------          ------        -----        -----         ----
s is 'abc'          0.368        0.327        0.041        -11.1
s == 'abc'          0.962        0.950        0.012         -1.2
s > 'abc'           0.959        0.955        0.004         -0.4
s is 4              0.371        0.332        0.039        -10.5
s == 4              0.932        0.922        0.010         -1.1
s > 4               0.936        0.927        0.009         -1.0
s is -1001          0.370        0.330        0.040        -10.8
s == -1001          0.932        0.923        0.009         -1.0
s > -1001           0.935        0.925        0.010         -1.1
s is 34.7           0.368        0.325        0.043        -11.7
s == 34.7           1.110        1.110        0.000          0.0
s > 34.7            1.110        1.110        0.000          0.0
s is 'a b c'        0.370        0.325        0.045        -12.2
s == 'a b c'        0.963        0.948        0.015         -1.6
s > 'a b c'         0.961        0.949        0.012         -1.2
s is True           0.529        0.490        0.039         -7.4
s == True           1.110        1.110        0.000          0.0
s > True            1.120        1.110        0.010         -0.9

s = -1000
operation          before        after        delta         %chg
---------          ------        -----        -----         ----
s is 'abc'          0.371        0.326        0.045        -12.1
s == 'abc'          1.200        1.190        0.010         -0.8
s > 'abc'           1.200        1.190        0.010         -0.8
s is 4              0.349        0.350       -0.001          0.3
s == 4              0.347        0.353       -0.006          1.7
s > 4               0.349        0.347        0.002         -0.6
s is -1001          0.348        0.352       -0.004          1.1
s == -1001          0.349        0.352       -0.003          0.9
s > -1001           0.346        0.348       -0.002          0.6
s is 34.7           0.366        0.326        0.040        -10.9
s == 34.7           0.769        0.771       -0.002          0.3
s > 34.7            0.766        0.777       -0.011          1.4
s is 'a b c'        0.367        0.328        0.039        -10.6
s == 'a b c'        1.210        1.190        0.020         -1.7
s > 'a b c'         1.200        1.190        0.010         -0.8
s is True           0.536        0.490        0.046         -8.6
s == True           0.887        0.887        0.000          0.0
s > True            0.890        0.892       -0.002          0.2

s = 34.2
operation          before        after        delta         %chg
---------          ------        -----        -----         ----
s is 'abc'          0.369        0.327        0.042        -11.4
s == 'abc'          1.630        1.620        0.010         -0.6
s > 'abc'           1.640        1.620        0.020         -1.2
s is 4              0.372        0.332        0.040        -10.8
s == 4              0.791        0.795       -0.004          0.5
s > 4               0.797        0.798       -0.001          0.1
s is -1001          0.375        0.331        0.044        -11.7
s == -1001          0.792        0.792        0.000          0.0
s > -1001           0.790        0.791       -0.001          0.1
s is 34.7           0.367        0.482       -0.115         31.3
s == 34.7           1.080        0.536        0.544        -50.4
s > 34.7            0.560        0.621       -0.061         10.9
s is 'a b c'        0.387        0.337        0.050        -12.9
s == 'a b c'        1.760        1.710        0.050         -2.8
s > 'a b c'         1.710        1.680        0.030         -1.8
s is True           0.614        0.509        0.105        -17.1
s == True           1.050        1.020        0.030         -2.9
s > True            1.060        1.020        0.040         -3.8

s = 'a b c'
operation          before        after        delta         %chg
---------          ------        -----        -----         ----
s is 'abc'          0.379        0.345        0.034         -9.0
s == 'abc'          0.542        0.494        0.048         -8.9
s > 'abc'           0.586        0.593       -0.007          1.2
s is 4              0.430        0.344        0.086        -20.0
s == 4              1.260        1.230        0.030         -2.4
s > 4               1.370        1.230        0.140        -10.2
s is -1001          0.431        0.372        0.059        -13.7
s == -1001          1.250        1.640       -0.390         31.2
s > -1001           1.240        1.260       -0.020          1.6
s is 34.7           0.383        0.337        0.046        -12.0
s == 34.7           1.770        1.680        0.090         -5.1
s > 34.7            1.670        1.660        0.010         -0.6
s is 'a b c'        0.423        0.376        0.047        -11.1
s == 'a b c'        0.506        0.510       -0.004          0.8
s > 'a b c'         0.517        0.564       -0.047          9.1
s is True           0.550        0.514        0.036         -6.5
s == True           1.470        1.640       -0.170         11.6
s > True            1.450        1.430        0.020         -1.4

s = object()
operation          before        after        delta         %chg
---------          ------        -----        -----         ----
s is 'abc'          0.389        0.379        0.010         -2.6
s == 'abc'          1.220        1.370       -0.150         12.3
s > 'abc'           1.220        2.600       -1.380        113.1
s is 4              0.427        0.349        0.078        -18.3
s == 4              1.080        1.620       -0.540         50.0
s > 4               1.060        1.070       -0.010          0.9
s is -1001          0.437        0.343        0.094        -21.5
s == -1001          1.070        1.130       -0.060          5.6
s > -1001           1.060        1.090       -0.030          2.8
s is 34.7           0.419        0.338        0.081        -19.3
s == 34.7           1.710        1.520        0.190        -11.1
s > 34.7            1.520        1.540       -0.020          1.3
s is 'a b c'        0.380        0.347        0.033         -8.7
s == 'a b c'        2.020        1.210        0.810        -40.1
s > 'a b c'         1.260        1.210        0.050         -4.0
s is True           0.622        0.515        0.107        -17.2
s == True           1.220        1.220        0.000          0.0
s > True            1.210        1.210        0.000          0.0

s = []
operation          before        after        delta         %chg
---------          ------        -----        -----         ----
s is 'abc'          0.369        0.326        0.043        -11.7
s == 'abc'          1.220        1.200        0.020         -1.6
s > 'abc'           1.220        1.200        0.020         -1.6
s is 4              0.372        0.332        0.040        -10.8
s == 4              1.160        1.150        0.010         -0.9
s > 4               1.150        1.150        0.000          0.0
s is -1001          0.371        0.334        0.037        -10.0
s == -1001          1.150        1.140        0.010         -0.9
s > -1001           1.150        1.150        0.000          0.0
s is 34.7           0.368        0.326        0.042        -11.4
s == 34.7           1.500        1.480        0.020         -1.3
s > 34.7            1.490        1.490        0.000          0.0
s is 'a b c'        0.366        0.325        0.041        -11.2
s == 'a b c'        1.220        1.200        0.020         -1.6
s > 'a b c'         1.220        1.200        0.020         -1.6
s is True           0.531        0.484        0.047         -8.9
s == True           1.360        1.350        0.010         -0.7
s > True            1.350        1.350        0.000          0.0

I fully expected that the "is" tests would be faster and without 
question the "==" and ">" tests would be slower.  I was quite 
surprised that this wasn't always the case.  The above tests were 
run on an 800MHz PowerBook G4 running Mac OS X 10.2.8.  I don't 
have immediate access to Intel hardware, though I'll try to run 
these tests on Cygwin this week.  

I'd be happy to be shown that my shell script isn't measuring what 
I think it's measuring as well.

Skip
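
The attached shell script is not reproduced in this message. As a rough stand-in, the same grid of comparisons could be timed with the standard timeit module along the following lines. This is an assumption-laden sketch, not the script that produced the tables: the function name time_comparisons is made up, the right-hand operand is bound to a name t instead of being written as a literal, and on Python 3 ordering comparisons between unrelated types raise TypeError, so those cells are recorded as None.

```python
import timeit

# Operand values mirror the ones used in the tables above.
LEFT = ["'abc'", "4", "None", "-1000", "34.2", "'a b c'", "object()", "[]"]
RIGHT = ["'abc'", "4", "-1001", "34.7", "'a b c'", "True"]
OPS = ["is", "==", ">"]

def time_comparisons(number=100000, repeat=3):
    """Return {(left, op, right): best time in seconds, or None if unsupported}."""
    results = {}
    for left in LEFT:
        for right in RIGHT:
            setup = "s = %s; t = %s" % (left, right)
            for op in OPS:
                stmt = "s %s t" % op
                try:
                    best = min(timeit.repeat(stmt, setup=setup,
                                             number=number, repeat=repeat))
                except TypeError:
                    best = None   # e.g. 'abc' > 4 is an error on Python 3
                results[(left, op, right)] = best
    return results

if __name__ == "__main__":
    for (left, op, right), best in sorted(time_comparisons().items()):
        shown = "n/a" if best is None else "%.3f" % best
        print("s = %-10s s %-2s %-8s %s" % (left, op, right, shown))
```

To compare a patched and an unpatched interpreter, the same script would be run under each build and the two sets of results compared row by row.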


----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2004-03-20 13:27

Message:
Logged In: YES 
user_id=80475

Even "is" and "is not" are not helped by more than a couple
of cycles.  This fragment essentially inlines part of the code
for cmp_outcome().  Only the function call is saved.

It does slow down other code paths by introducing an
unpredictable branch.   

If the inlining were considered important, then the whole of
cmp_outcome() should be inlined.  Then, all comparisons save
a single call/return pair.  The cost is further increasing
the size of the eval loop.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2004-03-20 12:45

Message:
Logged In: YES 
user_id=31435

Well, there's little question that this will speed "is" and "is 
not", but it also slows all other cases by the cost of the 
switch-and-branch to determine that they're not the favored 
cases.  So why should we believe that speeding "is" and "is 
not" is more important than slowing other cases?

----------------------------------------------------------------------
