"Christopher" == Christopher Fonnesbeck <chris@fonnesbeck.org> writes:
Christopher> I am already using pieces of SciPy in my Markov chain Christopher> Monte Carlo package (PyMC), mostly for plotting Christopher> functionality. I would also like to exploit the Christopher> distributions implemented in scipy.stats, but they Christopher> are far too slow for use in statistical simulation Christopher> applications like MCMC, where millions of random Christopher> draws may be taken. Therefore, I am thinking of Christopher> implementing many of these distributions (at least Christopher> the common ones) as C or Fortran extensions. I am Christopher> unsure whether to use Fortran through f2py for this Christopher> task, or C through weave.inline (for example). I have Christopher> used both in the past for various tasks, and was Christopher> generally happy with both. Any suggestions? I've wrote some code to fill Numeric arrays with levy distributions from gsl. GSL has many, many distributions - http://www.gnu.org/software/gsl/manual/gsl-ref_19.html#SEC284 Here's a sample test script from levy import randlevy, seed gamma = 0.003501 alpha = 1.341702 seed(1235) x = randlevy(100000, alpha, gamma) print min(x), max(x) Seeding is optional - the rng is seeded by the clock if you don't explicitly call seed. Below is the module code which you can use as a template if you're interested. I think it would be nice to extend this code to include all the gsl distributions, but haven't taken the time. pygsl wraps these distributions, but suffers from the repeated call overhead that is hurting you. It would be a natural extension of pygsl to include initializing numeric arrays from the distributions. #include "Python.h" #include "Numeric/arrayobject.h" #include <stdio.h> #include <time.h> #include <gsl/gsl_rng.h> #include <gsl/gsl_randist.h> #include <gsl/gsl_errno.h> static PyObject *ErrorObject; // get x[i] from a 1d array of type xtype #define get1d(x,i,xtype) \ *(xtype *)(x->data+i*x->strides[0]) int _levymodRNGSeeded = 0; gsl_rng * _levymodRNG; static PyObject * seed(PyObject *self, PyObject *args) { const gsl_rng_type * T; int N; T = gsl_rng_default; _levymodRNG = gsl_rng_alloc(T); gsl_rng_env_setup(); if (args==Py_None) { //printf("Default seed\n"); time_t t1; (void) time(&t1); srand48((long) t1); N = (int)t1; } else if (!PyArg_ParseTuple(args, "i", &N)) return NULL; //printf("Setting seed %d\n", N); gsl_rng_set (_levymodRNG, N); _levymodRNGSeeded = 1; Py_INCREF(Py_None); return Py_None; } static PyObject * randlevy(PyObject *self, PyObject *args) { if (_levymodRNGSeeded == 0) { seed(self, Py_None); } int dimensions[1]; int i; PyArrayObject *x; int N; float alpha; float gamma; if (!PyArg_ParseTuple(args, "iff", &N, &alpha, &gamma)) return NULL; dimensions[0] = N; x = (PyArrayObject *)PyArray_FromDims(1,dimensions,PyArray_FLOAT); for (i=0; i<N; ++i) { get1d(x,i,float) = gsl_ran_levy(_levymodRNG, gamma, alpha); } return Py_BuildValue("N", x); } static struct PyMethodDef _levy_methods[] = { {"randlevy", (PyCFunction)randlevy, METH_VARARGS, "Return a numeric array of levy numbers."}, {"seed", (PyCFunction)seed, METH_VARARGS, "Seed the random number generator."}, {NULL, NULL} /* sentinel */ }; DL_EXPORT(void) init_levy(void) { PyObject *m, *d; /* Create the module and add the functions */ m = Py_InitModule4("_levy", _levy_methods, "Fill Numeric arrays with random numbers ", (PyObject*)NULL,PYTHON_API_VERSION); /* Import the array object */ import_array(); /* Add some symbolic constants to the module */ d = PyModule_GetDict(m); ErrorObject = PyString_FromString("_levy.error"); PyDict_SetItemString(d, "error", ErrorObject); /* Check for errors */ if (PyErr_Occurred()) Py_FatalError("can't initialize module _levy"); }