Optimizing C module

Paul Prescod paul at prescod.net
Wed Feb 9 15:56:15 EST 2000


The slowest thing in both PyExpat and Perl's equivalent is the "cross
over" between C and Python. I can think of a couple of ways to speed
that up but I want to know if they are feasible.

The basic pattern of the code I need to speed up is:

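sometype Foo( somearg1, somearg2, ... ){
	PyObject *pyargs, *rc;
	sometype real_rc;

	/* build the argument tuple for the Python callback */
	pyargs = Py_BuildValue( someformat, somearg1, somearg2, ... );

	rc = PyEval_CallObject( callback, pyargs );
	Py_XDECREF( pyargs );

	/* convert the Python result back to the C return type */
	real_rc = SomeConversion( rc );
	Py_XDECREF( rc );
	return real_rc;
}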

Py_BuildValue can't be very fast because it parses a format string and
creates new heap objects on every call. I'm wondering if I could keep a
fixed-length args tuple in my back pocket and just fill in the details
for each callback. I would really love to hang on to the string and int
objects in the tuple too. Could I check the refcount after the function
call and only create a new object if the refcount is >1? If the
refcount is 1, I would mutate the string and int objects under the
covers.
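
To make the refcount idea concrete, here is a toy model in plain C. It
is only a sketch of the bookkeeping: ToyInt, toyint_new, toyint_decref
and toyint_reuse_or_new are hypothetical names I made up, not part of
the Python C API, and real Python ints and strings are assumed to be
immutable by the rest of the interpreter, so whether the same trick is
safe on real objects is exactly the open question.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a refcounted int object (NOT the CPython API). */
typedef struct {
    int refcnt;
    long value;
} ToyInt;

/* Allocate a new object owned by the caller (refcount 1). */
static ToyInt *toyint_new(long value)
{
    ToyInt *obj = malloc(sizeof *obj);
    assert(obj != NULL);
    obj->refcnt = 1;
    obj->value = value;
    return obj;
}

/* Drop one reference; free the object when nobody holds it. */
static void toyint_decref(ToyInt *obj)
{
    if (--obj->refcnt == 0)
        free(obj);
}

/*
 * Return an object holding `value` for the next callback.
 * If we are the only owner of `cached` (refcount 1), mutate it in
 * place. Otherwise the previous callback kept a reference, so give up
 * our reference and allocate a fresh object.
 */
static ToyInt *toyint_reuse_or_new(ToyInt *cached, long value)
{
    if (cached != NULL && cached->refcnt == 1) {
        cached->value = value;
        return cached;
    }
    if (cached != NULL)
        toyint_decref(cached);
    return toyint_new(value);
}
```

The caller would keep the returned pointer between callbacks and pass
it back in as `cached` each time, so the common case (callback did not
keep the argument) costs one comparison and one store instead of an
allocation.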

The next question is whether I can go even further. Is there something I
can safely cache that is created in PyEval_CallObject and/or
eval_code2() (e.g. frame objects) so that the whole thing is just a
matter of setting a couple of values and jumping? 

I'm sure Stackless Python has something to do with all of this but I
missed the talk at IPC8. It looks to me like eval_code2 would need to be
broken up to allow me to set up and reuse my own frame object. That's
probably part of what Stackless does (the part I want!).
-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for himself
"The calculus and the rich body of mathematical analysis to which it
gave rise made modern science possible, but it was the algorithm that
made possible the modern world." 
        - from "Advent of the Algorithm" David Berlinski
	http://www.opengroup.com/mabooks/015/0151003386.shtml