Re: [SciPy-dev] Thoughts on weave improvements
Hey Pat,

This is a very rough outline of what I thought of on the airplane rides home. The "sketch" of a solution below has holes, but it is a start at merging pycod and weave.

The basic concept is to create a new class or type that wraps a function. When a wrapper object is called from Python, it will try to compile a new version of the function for the given types, using pycod and weave to do this. Instead of passing in the types (IntType, etc.), the types are determined from the calling argument types. weave's catalog will be used to keep up with all the available compiled functions. The C pointer will also be kept around. f2py and friends could be made to recognize this type of wrapper object and ask it for the C function pointer instead of having to make Python callbacks. We should work this all out in Python first and then move it to a C type later for speed.

And here are my notes from 30000 feet...

A code fragment ready for compilation:

    foo_code = """
    def foo(a, b):
        return a + b
    """

Generating a compiled function, foo:

    foo = weave.cod(foo_code)

foo is now a callable function_object that handles all the specialization issues. I think this is how Psyco does its thing also -- perhaps in a fancier way. For now, it'll be a Python class with dictionaries to handle argument type issues. This is way slow, but it'll work as a proof of concept.

When foo is called with unknown types, it attempts to specialize and compile the function for the call types. The function is compiled to both a pure C function and a Python function that wraps it. The pure C function is never called directly by Python code, but it is useful to keep around to pass into other C/Fortran functions that need to call the function (like map or many Fortran optimization methods).
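As a pure-Python illustration of that per-call-type specialization, something like the sketch below could work. The `SpecializingFunction` class and its `specialize` step are hypothetical stand-ins for the real pycod/weave compile machinery; the cache dictionary plays the role of weave's catalog:

```python
# Minimal sketch of type-based specialization. `specialize` stands in
# for the real "compile a C version for these types" step.

class SpecializingFunction:
    def __init__(self, func):
        self.func = func    # the original Python function
        self.cache = {}     # tuple of argument types -> specialized callable

    def __call__(self, *args):
        key = tuple(type(a) for a in args)
        try:
            specialized = self.cache[key]
        except KeyError:
            # In the real system this would compile and catalog a C
            # version for this signature; here we just reuse the
            # Python code.
            specialized = self.specialize(key)
            self.cache[key] = specialized
        return specialized(*args)

    def specialize(self, key):
        return self.func


def foo(a, b):
    return a + b

foo = SpecializingFunction(foo)
foo(1, 2)      # first call with (int, int) triggers "compilation"
foo(1.5, 2.5)  # a new signature triggers another specialization
```

The point is only the dispatch shape: a cache miss compiles and catalogs, a cache hit calls the specialized version directly.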
For a and b as integers, the C code in the generated extension module would look something like:

    int c_foo(int a, int b)
    {
        return a + b;
    }

    PyObject* foo(PyObject* self, PyObject* args)
    {
        int result;
        int exception_occurred = 0;
        PyObject* result_val = NULL;
        try {
            Py::Tuple t_args(args);
            int a = py_to_int(t_args[0]);
            int b = py_to_int(t_args[1]);
            // here is the call to the real C function.
            result = c_foo(a, b);
            result_val = Py::new_reference_to(Py::Int(result));
        }
        catch(...) {
            result_val = NULL;
            exception_occurred = 1;
        }
        if (result_val == NULL && !exception_occurred) {
            result_val = Py_None;
            Py_XINCREF(result_val);
        }
        return result_val;
    }

The function_object will just add the extension function to a list of available Python functions. Whenever the function is called as a Python function, it'll try each function in turn with the given arguments. If the function call fails with a ConversionError, the next function in the list is called. If we get to the end of the list without it working, then we compile a new function for the specified types and stick it on the front of the list. We also put the pointer for the underlying C function in the C function dictionary.

    func_obj = weave.pycompile(code)

    class function_object:
        def __init__(self, code):
            # not sure we need this, but I'll keep it for now.
            self.name = get_name(code)
            # store the code so we can compile it for various types.
            # we might store the bytecode also, but I doubt that is
            # really useful.
            self.code = code
            # cataloging of C functions
            # key = type signature, value = actual function pointer
            self.c_funcs = {}
            # cataloging of Python functions
            # need to compile the function and add it as the last
            # option to call.
            self.py_funcs = []
            self.py_cached = None

        def compile(self, *args):
            """ Compile a specialized version of the code for the
                given argument types. This should be persisted also.
                Need to look at weave to see how we can make its
                functionality into a class -- maybe catalog is already
                well suited for this.
""" # This is Pat Miller's function -- it needs to return: # function name: (maybe not completely necessary) # # type_info: some variable that tells the types of all the arguments # for the function. This could be as simple as a tuple # of strings, or, later, an array of integers that # specify types. This would be better for handling fast # in C code. # # wrapper_code: The code of the wrapper function i.e. the extension # function that returns a PyObject*. It is really # only in charge of type conversions and calls the # c_code function to do all the work. # # c_code: This is the heart of the function. It holds the C version # of the byte code for the given type. name, type_info, wrapper_code, c_code = translate_to_c(self.code,*args) # This is a slightly different that weave.inline because it returns a # Python extension function, and c_func a pointer. wrapper_func, c_func = weave.compile_wrap(name,wrapper_code, c_code) # now, stick catalog both the C and the Python functions for later use. self.py_catalog(wrapper_func) self.c_catalog(type_info, c_func) def py_catalog(self,func): """ Add a function to the python function list. This doesn't associate functions with a type information. This is for speed right now. Discovering types and then looking them up in a dictionary would be slow. In C, it might be a lot faster, so storing type information might be much more useful. """ self.py_funcs.insert(0,func) # cache for a fast calling. self.py_cached = func def c_catalog(self,type_info, func): """ Add a C funcion pointer to the function list. This is gonna be looked up in C if it is used (by map or something else), and then used many times, In C, the type_info can be fast to discover and it absolutely has to be so that a C func with the correct signature is called. """ self.c_funcs[type_info] = func def get_c_ptr(self,type_info): """ Grab the c_ptr for the given type signature. I guess if we ask for this, and it doesn't exist, we should build it by calling compile(). 
                Do this later. This will be grabbed inside C wrapper
                functions and passed to a function like map or something
                like an optimization function.
            """
            return self.c_funcs.get(type_info, None)

        def call_from_list(self, *args):
            """ Call each function in the list one after another until
                one with the correct signature is found. If all the
                functions fail, then throw a ConversionError.
            """
            success = 0
            for func in self.py_funcs:
                try:
                    result = apply(func, args)
                    success = 1
                    break
                except ConversionError:
                    pass
            if not success:
                raise ConversionError
            return result

        def __call__(self, *args):
            """ Call the function. Try the cached extension function
                first. If it fails because it has the incorrect types,
                try calling functions from the list of available
                extension functions. If all of these fail, compile a
                new version of the function based on the current types,
                cache the resulting functions, and then call it. If
                that fails, we're out of options and we call the Python
                code directly. If this throws an exception, the user
                will get it.
            """
            try:
                # Try calling the cached (last used) function.
                result = apply(self.py_cached, args)
            except ConversionError:
                # The cached function failed because it didn't have the
                # correct argument types. Now walk through all the
                # functions.
                try:
                    result = self.call_from_list(*args)
                except ConversionError:
                    try:
                        # We walked through all the compiled functions,
                        # and they all failed. Now try to compile a new
                        # one for the current types.
                        self.compile(*args)
                        result = apply(self.py_cached, args)
                    except:
                        # If the compilation or the function call failed
                        # for any reason, punt. Try executing the actual
                        # Python code as a final resort.
                        result = apply(self.python_version, args)
            return result

The above methodology is pretty slow for cache misses, but I think it could be sped up quite a bit by moving it to C and keeping track of types using some fast hash with type_info (in some byte format, not strings) as the key and functions as the values.

How would weave have to change to support this?
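As an aside, those compact type_info keys might look something like the sketch below. TYPE_CODES is a hypothetical table of supported types, and in the real system this lookup would live in C:

```python
# Sketch of compact type_info keys: small-integer codes instead of
# strings, so the signature is cheap to build, hash, and compare.

TYPE_CODES = {int: 0, float: 1, complex: 2, str: 3}

def type_info(args):
    # unknown types map to -1; a real system would compile for them
    return tuple(TYPE_CODES.get(type(a), -1) for a in args)

# stand-ins for cataloged C function pointers
c_funcs = {}
c_funcs[type_info((1, 2))] = "c_foo_int_int"
c_funcs[type_info((1.0, 2.0))] = "c_foo_dbl_dbl"

c_funcs[type_info((5, 7))]   # any (int, int) call hits the same entry
```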
Well, I think we need to encapsulate some of the work that is in inline in a class so we get a little reuse. For the time being, this'll cost us an extra Python function call, but it is worth it for the design process. Later we'll move to C and get rid of that expensive call. Jeremy commented that C function calls were noticeably expensive also, but I think we'll have to live with this. (Hmmm. Maybe the code is already pretty well structured. catalog and ext_tools might fit together quite well to handle both code generation and cataloging. Need to look.)

The biggest change is that we need to put the C code that actually does all the work in a separate C function and then call it from the wrapper function instead of inserting the code directly within the wrapper. This isn't really that hard, but it does require another "code template" to be added to each of the type conversion classes. It'll also require some changes to the "function template". We should work to make this stuff as similar as possible across all the code, but I think the C functions for inline are gonna require pass by reference so that variables changed in the function are also changed in the wrapper, while standard extension functions are gonna require pass by value. This is probably pretty easy to deal with.

Tasks:

1. Add machinery for returning changed variables from C to Python through the frame object. (From the list, it looks like Pat has already looked at this. :)

2. Convert the generation code so that it puts the C code in a separate function. (Perhaps this should be optional so that people can save the call if they want to?)

see ya,
eric

--
Eric Jones <eric at enthought.com>
Enthought, Inc. [www.enthought.com and www.scipy.org]
(512) 536-1057
eric wrote:
> This is a very rough outline of what I thought of on the airplane
> rides home. The "sketch" of a solution below has holes, but it is a
> start at merging pycod and weave. [...]
----------------------------------------------------------

Comments on the interface....

I prefer a weave method called accelerate rather than the mystifying cod. This way, we could add in bytecode optimization, function inlining, and clever bits like that through the same interface.

For individual functions, the interface might look like:

    import weave

    def f(a, b):
        << some big hairy thing >>

    f = weave.accelerate(f)

I think we should allow users to specify signatures if they want to:

    f = weave.accelerate(f, signatures=[[FloatType], [IntType]])

where signatures is a suggested list of types to compile for. If you wanted to allow other types, it could be:

    f = weave.accelerate(f, signatures=[[FloatType], [IntType], [None]])

where None indicates "any type is OK". The default for signatures would be equivalent to [[None]], so it is equivalent to Eric's scheme. Having a None signature means that, as a last resort, the system will call the original code object, which we can carefully squirrel away. This means never having to say you're sorry on a call!

The default implementation of weave.accelerate, on systems where we don't have a compiler, is:

    def accelerate(f, signatures=None):
        return f
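A rough pure-Python sketch of how that signatures argument could behave is below. The `matches` helper and the dispatch loop are my own illustration, not weave's API; the real implementation would dispatch to a compiled version per signature instead of calling the original function:

```python
# Hypothetical sketch of the proposed accelerate(f, signatures=...)
# interface. None in a signature slot means "any type is OK".

def matches(sig, args):
    if len(sig) != len(args):
        return False
    return all(t is None or isinstance(a, t) for t, a in zip(sig, args))

def accelerate(f, signatures=None):
    if signatures is None:
        signatures = [[None]]   # default: any single argument
    def wrapper(*args):
        for sig in signatures:
            if matches(sig, args):
                # a real implementation would call the compiled version
                # for this signature; falling back to the original code
                # object is the "never having to say you're sorry" case
                return f(*args)
        raise TypeError("no signature matched for %r" % (args,))
    return wrapper


def double(a):
    return a * 2

double = accelerate(double, signatures=[[float], [int], [None]])
double(3)      # falls through [float], matches [int]
double("ab")   # matches the catch-all [None] signature
```

Listing `[None]` last gives exactly the "last resort" behavior described above: anything that misses the specialized signatures still runs.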