[Python-Dev] proposal+patch: sys.gettickeraccumulation()

Ralf W. Grosse-Kunstleve rwgk at cci.lbl.gov
Sat Nov 13 05:20:03 CET 2004


Proposal:

  >>> sys.gettickeraccumulation.__doc__
  'gettickeraccumulation() -> current count of bytecodes processed by the interpreter.'

Target environments:

  Python-based systems with (many) extension modules.

Motivation:

  A fast, easy, and non-intrusive method for estimating how much time
  is spent in Python code and how much time in extension modules.

  E.g. if numarray is used it is often difficult to estimate how much
  time is spent in the numarray extension and how much in
  application-specific Python code. Is it worth investing time
  reimplementing the application-specific code in C or C++? If most of
  the time is already spent in the numarray extension the answer will
  be no.

Method:

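  Determination of "micro-seconds per tick":

    elapsed_seconds / sys.gettickeraccumulation() * 1.e6

  Here elapsed_seconds is the wall-clock run time over which the ticks
  were accumulated, e.g. the difference of two time.time() readings.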

  A "tick" is the processing of one "Python virtual instruction."
  See also "setcheckinterval" here:
     http://www.python.org/doc/current/lib/module-sys.html
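
  A minimal sketch of this calculation follows. The helper names
  microseconds_per_tick and measure are made up for illustration, and
  sys.gettickeraccumulation() exists only in an interpreter built with
  the patch, so the sketch reports None for the tick values on an
  unpatched interpreter.

```python
import sys
import time


def microseconds_per_tick(elapsed_seconds, ticks):
    """Return the elapsed time per tick in micro-seconds."""
    if ticks <= 0:
        raise ValueError("ticks must be positive")
    return elapsed_seconds / ticks * 1.e6


def measure(func, *args):
    """Run func(*args); return (elapsed_seconds, ticks, us_per_tick).

    ticks and us_per_tick are None if the interpreter does not provide
    sys.gettickeraccumulation() (i.e. it was built without the patch).
    """
    get_ticks = getattr(sys, "gettickeraccumulation", None)
    ticks_before = get_ticks() if get_ticks is not None else None
    time_before = time.time()
    func(*args)
    elapsed = time.time() - time_before
    if get_ticks is None:
        return elapsed, None, None
    ticks = get_ticks() - ticks_before
    per_tick = microseconds_per_tick(elapsed, ticks) if ticks > 0 else None
    return elapsed, ticks, per_tick
```

  E.g. 0.659 seconds and 4446620 ticks give about 0.148 micro-seconds
  per tick.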

  A pure Python program will spend the bulk of the time interpreting
  bytecode. This will lead to a small value for "micro-seconds per
  tick." In contrast, if almost all the time is spent in extension
  modules, a calculation may run a long time without increasing
  sys.gettickeraccumulation(). For example:

                     time      ticks  time/tick in micro-seconds
      0.00% Python  0.008         20    405.908
     10.00% Python  0.072     444680      0.163
     20.00% Python  0.138     889340      0.155
     30.00% Python  0.203    1334000      0.152
    ...
    100.00% Python  0.659    4446620      0.148

  With higher resolution:

                     time      ticks  time/tick in micro-seconds
      0.00% Python  6.758         20 337893.748
      0.10% Python  0.728     444680      1.638
      0.20% Python  0.795     889340      0.894
      0.30% Python  0.837    1334000      0.627
      0.40% Python  0.923    1778660      0.519
      0.50% Python  0.978    2223320      0.440
      0.60% Python  1.091    2667980      0.409
      0.70% Python  1.111    3112640      0.357
      0.80% Python  1.191    3557300      0.335
      0.90% Python  1.236    4001960      0.309
      1.00% Python  1.315    4446620      0.296

  In real applications we observed times/tick of around 10
  micro-seconds on the same platform, indicating that significant
  performance increases could only be achieved through extensive
  recoding in a compiled language. On the other hand, if we see values
  smaller than 1 micro-second, we know that significant performance
  increases are achievable with a reasonable effort.
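
  This rule of thumb could be written down roughly as follows. The
  function name and the exact cut-offs are assumptions for
  illustration only: the values 1 and 10 come from the observations
  above on one platform and would have to be recalibrated elsewhere.

```python
def interpret_time_per_tick(us_per_tick, python_limit=1.0, extension_limit=10.0):
    """Classify a time/tick value (in micro-seconds) by the rule of thumb.

    The default limits are platform-specific observations, not constants.
    """
    if us_per_tick < python_limit:
        # Most time is spent interpreting bytecode: recoding the Python
        # parts in a compiled language is likely to pay off.
        return "mostly Python"
    if us_per_tick >= extension_limit:
        # Most time is already spent in extension modules.
        return "mostly extensions"
    return "inconclusive"
```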

Implementation:

  A full patch based on Python-2.4b2 is attached. The essence is this
  additional line in Python/ceval.c:

+			_Py_TickerAccumulation += _Py_CheckInterval - _Py_Ticker;

  Note that _Py_Ticker and _Py_CheckInterval exist already in the
  Python distribution. The impact of the additional code on the runtime
  performance of a pure Python program is minute. On Xeon/Linux the
  slowdown factor is smaller than 1.00005. The factor is even smaller
  if extension modules are used.
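
  To see why the bookkeeping adds up, here is a small Python model of
  the counter arithmetic only. The class TickerModel is hypothetical
  and is not part of the patch. It mirrors the countdown of _Py_Ticker,
  the accumulation step in ceval.c, and the correction applied in
  sys_gettickeraccumulation().

```python
class TickerModel:
    """Toy model of _Py_Ticker, _Py_CheckInterval, _Py_TickerAccumulation."""

    def __init__(self, check_interval=100):
        self.check_interval = check_interval
        self.ticker = check_interval  # counts down, like _Py_Ticker
        self.accumulation = 0         # like _Py_TickerAccumulation

    def tick(self):
        """Model one pass through the check: if (--_Py_Ticker < 0) {...}"""
        self.ticker -= 1
        if self.ticker < 0:
            # Add the ticks consumed since the last reset, then reset.
            self.accumulation += self.check_interval - self.ticker
            self.ticker = self.check_interval

    def get_ticker_accumulation(self):
        """Accumulated ticks plus those consumed since the last reset."""
        return self.accumulation + self.check_interval - self.ticker


model = TickerModel()
for _ in range(250):
    model.tick()
print(model.get_ticker_accumulation())  # prints 250
```

  In this model the reported count equals the number of calls to
  tick(), whether or not a reset happened in between, so the
  accumulation is only updated once per check interval.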

  The full patch, the patched files, and a complete Python distribution
  including the patched files can be found here:

    http://cci.lbl.gov/~rwgk/python/Python-2.4b2_ticker_patch
    http://cci.lbl.gov/~rwgk/python/Python-2.4b2_ticker_patched_files.tar.gz
    http://cci.lbl.gov/~rwgk/python/Python-2.4b2_ticker.tar.gz


diff -u -r Python-2.4b2/Include/ceval.h Python-2.4b2_ticker/Include/ceval.h
--- Python-2.4b2/Include/ceval.h	2004-10-10 19:40:35.000000000 -0700
+++ Python-2.4b2_ticker/Include/ceval.h	2004-11-12 18:16:28.000000000 -0800
@@ -68,6 +68,11 @@
 
 /* this used to be handled on a per-thread basis - now just two globals */
 PyAPI_DATA(volatile int) _Py_Ticker;
+#ifndef HAVE_LONG_LONG
+PyAPI_DATA(volatile long) _Py_TickerAccumulation;
+#else
+PyAPI_DATA(volatile PY_LONG_LONG) _Py_TickerAccumulation;
+#endif
 PyAPI_DATA(int) _Py_CheckInterval;
 
 /* Interface for threads.
diff -u -r Python-2.4b2/Objects/longobject.c Python-2.4b2_ticker/Objects/longobject.c
--- Python-2.4b2/Objects/longobject.c	2004-09-19 23:14:54.000000000 -0700
+++ Python-2.4b2_ticker/Objects/longobject.c	2004-11-12 18:20:53.000000000 -0800
@@ -38,6 +38,7 @@
 
 #define SIGCHECK(PyTryBlock) \
 	if (--_Py_Ticker < 0) { \
+		_Py_TickerAccumulation += _Py_CheckInterval - _Py_Ticker; \
 		_Py_Ticker = _Py_CheckInterval; \
 		if (PyErr_CheckSignals()) { PyTryBlock; } \
 	}
diff -u -r Python-2.4b2/PC/os2emx/python24.def Python-2.4b2_ticker/PC/os2emx/python24.def
--- Python-2.4b2/PC/os2emx/python24.def	2004-10-10 19:40:50.000000000 -0700
+++ Python-2.4b2_ticker/PC/os2emx/python24.def	2004-11-12 18:16:47.000000000 -0800
@@ -743,6 +743,7 @@
   "_Py_CheckRecursionLimit"
   "_Py_CheckInterval"
   "_Py_Ticker"
+  "_Py_TickerAccumulation"
 
 ; From python24_s.lib(compile)
   "PyCode_New"
diff -u -r Python-2.4b2/Python/ceval.c Python-2.4b2_ticker/Python/ceval.c
--- Python-2.4b2/Python/ceval.c	2004-10-10 19:40:50.000000000 -0700
+++ Python-2.4b2_ticker/Python/ceval.c	2004-11-12 18:22:55.000000000 -0800
@@ -373,6 +373,7 @@
 	pendinglast = j;
 
 	_Py_Ticker = 0;
+	_Py_TickerAccumulation = 0;
 	things_to_do = 1; /* Signal main loop */
 	busy = 0;
 	/* XXX End critical section */
@@ -476,6 +477,11 @@
    per thread, now just a pair o' globals */
 int _Py_CheckInterval = 100;
 volatile int _Py_Ticker = 100;
+#ifndef HAVE_LONG_LONG
+volatile long _Py_TickerAccumulation = 0;
+#else
+volatile PY_LONG_LONG _Py_TickerAccumulation = 0;
+#endif
 
 PyObject *
 PyEval_EvalCode(PyCodeObject *co, PyObject *globals, PyObject *locals)
@@ -776,6 +782,7 @@
                                    a try: finally: block uninterruptable. */
                                 goto fast_next_opcode;
                         }
+			_Py_TickerAccumulation += _Py_CheckInterval - _Py_Ticker;
 			_Py_Ticker = _Py_CheckInterval;
 			tstate->tick_counter++;
 #ifdef WITH_TSC
diff -u -r Python-2.4b2/Python/sysmodule.c Python-2.4b2_ticker/Python/sysmodule.c
--- Python-2.4b2/Python/sysmodule.c	2004-08-12 11:19:17.000000000 -0700
+++ Python-2.4b2_ticker/Python/sysmodule.c	2004-11-12 18:51:14.000000000 -0800
@@ -442,6 +442,20 @@
 "getcheckinterval() -> current check interval; see setcheckinterval()."
 );
 
+static PyObject *
+sys_gettickeraccumulation(PyObject *self, PyObject *args)
+{
+#ifndef HAVE_LONG_LONG
+	return PyInt_FromLong(_Py_TickerAccumulation + _Py_CheckInterval - _Py_Ticker);
+#else
+	return PyLong_FromLongLong(_Py_TickerAccumulation + _Py_CheckInterval - _Py_Ticker);
+#endif
+}
+
+PyDoc_STRVAR(gettickeraccumulation_doc,
+"gettickeraccumulation() -> current count of bytecodes processed by the interpreter."
+);
+
 #ifdef WITH_TSC
 static PyObject *
 sys_settscdump(PyObject *self, PyObject *args)
@@ -763,6 +777,8 @@
 	 setcheckinterval_doc}, 
 	{"getcheckinterval",	sys_getcheckinterval, METH_NOARGS,
 	 getcheckinterval_doc}, 
+	{"gettickeraccumulation", sys_gettickeraccumulation, METH_NOARGS,
+	 gettickeraccumulation_doc}, 
 #ifdef HAVE_DLOPEN
 	{"setdlopenflags", sys_setdlopenflags, METH_VARARGS, 
 	 setdlopenflags_doc},

