numarray.random_array number generation in C code
Dear People,
I want to write some C++ code to link with Python, using the Boost.Python interface. I need to generate random numbers in the C++ code, and I was wondering what the best way of doing this would be.
Note that it is important that the random number generation interoperate seamlessly with Python, in the sense that the behavior of the calls to the RNG is the same whether calls are made at the C level or the Python level. I hope the reasons why this is important are obvious.
I was thinking that the method should go like this.
1) When the C/C++ code is called, it reads the seed from the Python random state.
2) It does its stuff.
3) It writes the seed back to the Python level when it exits.
After doing a little investigation of the numarray.random_array Python library and associated extension modules, it seems possible that the answer is simpler than I had supposed. However, I would appreciate it if someone would tell me if my understanding is incorrect in some places.
Summary: It seems that I can just call all the C entry point routines defined in ranlib.h, without worrying about getting or setting seeds.
Rationale:
The structure of this random number facility has three parts, all under Packages/RandomArray2.
1) Low-level C routines: Packages/RandomArray2/Src/com.c and Packages/RandomArray2/Src/ranlib.c.
com.c: basic RNG stuff; getting and setting seeds etc. ranlib.c: random number generator algorithms for different distributions etc.
2) Python to C interface: Packages/RandomArray2/Src/ranlibmodule.c.
This interfaces the stuff in com.c and ranlib.c.
3) Python wrapper: Packages/RandomArray2/Lib/RandomArray2.py.
This wraps the C interface. In most cases it does not do much else besides some basic argument error checking.
From my perspective, the important thing is that the random number seed is only defined at C level as a static object, all the RNG stuff happens at C level, and the Python code just calls the C code as necessary. (I'm sketchy about the details of what is defined as the seed etc.)
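The layout described above can be illustrated with a small sketch. This is not the ranlib code: the function names, the module-level variable, and the toy Lehmer generator are all hypothetical stand-ins, chosen only to show that when a single piece of state lives in the lowest layer, calls made directly to that layer and calls made through a thin wrapper draw from the same stream.

```python
# Hypothetical stand-in for the three-layer layout: one piece of generator
# state owned by the lowest layer, with a thin wrapper on top.
# The names and the toy generator are illustrative, not ranlib's.

_MODULUS = 2147483647   # 2**31 - 1
_state = 12345          # plays the role of the static seed in the C layer


def set_seed(seed):
    """Reset the single shared state (roughly the com.c role)."""
    global _state
    _state = seed % _MODULUS or 1   # the state must never be zero


def low_level_next():
    """Advance the shared state and return a float in (0, 1)."""
    global _state
    _state = (_state * 48271) % _MODULUS   # Lehmer (MINSTD) step
    return _state / _MODULUS


def wrapper_random(n):
    """Thin wrapper (roughly the RandomArray2.py role): check, then delegate."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return [low_level_next() for _ in range(n)]


# Mix direct calls and wrapper calls ...
set_seed(42)
mixed = [low_level_next()] + wrapper_random(2) + [low_level_next()]

# ... and compare with the same number of draws made through one layer only.
set_seed(42)
single_layer = wrapper_random(4)

print(mixed == single_layer)
```

Because neither layer keeps a copy of the state, there is nothing to read or write back when control passes from one to the other. Whether the real extension module behaves this way for externally linked C++ code is exactly the question being asked here.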
This is in contrast with the R RNG facility (the only other RNG facility I am familiar with), which uses macros SetRNGstate() and GetRNGstate() to read and write the seed, which is defined at R level.
Therefore, the upshot is that the C routines in ranlib.h read and write the same seed as the Python level functions do, so no special action is necessary with regard to the seed.
Is this correct?
In any case, it would be nice if something like the above was documented, so lost souls like myself don't have to go trawling through the source code to figure out what is going on. Of course it is nice that the source code is available, otherwise even that would be impossible.
R documents this stuff in the "Writing R Extensions" manual, online at http://cran.r-project.org/doc/manuals/R-exts.pdf. Perhaps the Numarray manual could have a small section about this too.
Regards, Faheem.
I'm not sure I understand what you want to do. Do you want to link directly to the extension code from your C++ code?
If so I'm wondering why. It would make the most sense if the C++ code needed to obtain small numbers of random numbers in some iterative loop, and you wish to use the same random number library that numarray is using.
Otherwise, I would normally obtain the random number array in Python, then call the C++ extension.
Perhaps I didn't read carefully enough. Normally linking to an extension module involves some hacks that I'm not sure were done for the randomarray module (the gory details are in the Python docs for extension modules). Todd can check on that; I'm not sure I will have time (a superficial check seems to indicate that it doesn't support direct linking, though one could link to the underlying library I suppose).
As an aside, it is likely that a better module can be done, as some have suggested; we just took what Numeric had at the time. Doing that is not a high priority with us at the moment (anyone else want to tackle that?). Right now integration with scipy is our biggest priority, so things like this will have to take a back seat for a while.
Furthermore, we did what we needed to do to port these modules from Numeric, but that didn't necessarily make us experts in how they worked. I wish we were, but we've generally been directing our energy elsewhere. I'd presume that the sensible way for the module to work is to initialize its seed from a time-based seed in the absence of any other seed initialization, and to keep the seed state in the extension module, but I could be wrong.
Perry
On Tue, 5 Oct 2004, Perry Greenfield wrote:
> I'm not sure I understand what you want to do. Do you want to link directly to the extension code from your C++ code?
Yes.
> If so I'm wondering why. It would make the most sense if the C++ code needed to obtain small numbers of random numbers in some iterative loop, and you wish to use the same random number library that numarray is using.
I need to obtain an arbitrary (not known in advance) number of random numbers in the C++ code. I'm thinking of using the same random number library mostly because I assumed that using the same seed across the python/C interface would be supported. This is how it works in R (the only other place I have used this). Also, I had been using the same routines in the Python code I'm trying to convert to C++, so it would be a relatively smooth transfer. If I was to use a pure C/C++ library, I'd have to worry about copying the seed back and forth between Python and C. Is this what I'll have to do then?
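For comparison, the "copy the state back and forth" approach can be sketched as follows. This uses Python's standard-library random module purely as a stand-in, since numarray's own seed functions are not shown in this thread; the helper name run_with_private_generator is hypothetical. The sketch hands the current state to a separate generator, lets it draw, and writes the final state back, so the combined stream matches what a single generator would have produced.

```python
import random


def run_with_private_generator(state, n):
    """Draw n variates from a separate generator that starts at `state`.

    Returns the draws together with the generator's final state, so that
    the caller can write that state back.
    """
    rng = random.Random()
    rng.setstate(state)                        # 1) read the state handed over
    draws = [rng.random() for _ in range(n)]   # 2) do the work
    return draws, rng.getstate()               # 3) hand the state back


random.seed(2004)
first = random.random()
draws, new_state = run_with_private_generator(random.getstate(), 3)
random.setstate(new_state)                     # write the state back
last = random.random()

# Reference: the same five draws taken from one generator without any handoff.
random.seed(2004)
reference = [random.random() for _ in range(5)]

print([first] + draws + [last] == reference)
```

The handoff only works if both sides run the same algorithm and agree on the state format, which is the bookkeeping a shared C-level state would avoid.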
> Otherwise, I would normally obtain the random number array in python, then call the C++ extension.
Yes, this is what everyone suggests. But in my case, the number of random variates required is not known in advance. I get the feeling this situation does not arise very often for most people, but I work with stochastic processes which terminate according to some stopping criterion, and that is the standard situation in this case. Also generating these numbers in Python would give rise to serious performance issues.
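A minimal illustration of why the count cannot be fixed in advance, using a hypothetical example (a symmetric random walk stopped at a barrier) and the standard-library random module rather than numarray:

```python
import random


def steps_until_exit(rng, barrier=5):
    """Run a +/-1 random walk from 0 until |position| reaches `barrier`.

    Returns the number of variates consumed, which is only known once the
    walk has stopped.
    """
    position = 0
    draws = 0
    while abs(position) < barrier:
        position += 1 if rng.random() < 0.5 else -1
        draws += 1
    return draws


rng = random.Random(1)
counts = [steps_until_exit(rng) for _ in range(5)]
print(counts)   # the count differs from run to run of the process
```

Pre-generating an array in Python would mean guessing an upper bound on the count, or repeatedly returning to Python for more variates.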
> Perhaps I didn't read carefully enough. Normally linking to an extension module involves some hacks that I'm not sure were done for the randomarray module (the gory details are in the python docs for extension modules), Todd can check on that, I'm not sure I will have time (a superficial check seems to indicate that it doesn't support direct linking, though one could link to the underlying library I suppose).
Hmm. Well, this is unwelcome news. You mean I cannot link to ranlib.so? I assumed that including the ranlib.h header and linking my C++ module against ranlib.so would be enough. I suppose that was too optimistic.
> As an aside, it is likely that a better module can be done as some have suggested, we just took what Numeric had at the time. Doing that is not a high priority with us at the moment (anyone else want to tackle that?). Right now integration with scipy is our biggest priority so things like this will have to take a back seat for a while.
> Furthermore, we did what we needed to to port these modules from Numeric, but that didn't necessarily make us experts in how they worked. I wish we were, but we've generally been directing our energy elsewhere. I'd presume that the sensible way for the module to work is to initialize its seed from a time-based seed in the absence of any other seed initialization, and to keep the seed state in the extension module, but I could be wrong.
Yes. That is how R does it, anyway. Specifically, you declare the seed static, and then it persists across the Python/C interface. That is what I thought you had in the numarray code. Would it be hard to make it work like this? I'm no expert either. Faheem.
On Tue, 2004-10-05 at 21:10, Perry Greenfield wrote:
> I'm not sure I understand what you want to do. Do you want to link directly to the extension code from your C++ code? If so I'm wondering why. It would make the most sense if the C++ code needed to obtain small numbers of random numbers in some iterative loop, and you wish to use the same random number library that numarray is using. Otherwise, I would normally obtain the random number array in python, then call the C++ extension. Perhaps I didn't read carefully enough. Normally linking to an extension module involves some hacks that I'm not sure were done for the randomarray module (the gory details are in the python docs for extension modules), Todd can check on that,
I checked and there's no C level export of the ranlib interface, at least not in the "hacked" sense of an extension module C-API where the linkage is made indirect via an API pointer and bizarre macros.
> I'm not sure I will have time (a superficial check seems to indicate that it doesn't support direct linking, though one could link to the underlying library I suppose).
Ordinary C linkage to numarray.random_array.ranlib2 may be supported since as an extension it is also a shared library, but I've never tried it myself and I wonder if it would actually work. If anyone has tried something like that I'd be interested in hearing how it turned out. Without a really compelling reason, I'd avoid it myself. Regards, Todd
participants (3)
- Faheem Mitha
- Perry Greenfield
- Todd Miller