Hello everyone,

For my internship <http://morepypy.blogspot.com/2012/12/pypy-related-internship-at-ncar.html>, I am working on implementing a solver for partial differential equations in RPython <https://github.com/seanfisk/rpython-stencil-language>. I am investigating the possibility of parallelizing the code using multi-threading.

I have found RPython's basic threading library, rpython/rlib/rthread.py <https://bitbucket.org/pypy/pypy/src/a75c05b8580ec08fb12777332492254c57ebe0aa/rpython/rlib/rthread.py?at=default>, and am attempting to get a "Hello world!" threading program up and running. I have successfully been able to start threads that do really simple things like printing literal messages. However, I would now like to be able to pass parameters to the threads.

The only example I can seem to find of this library in use is the threading support in PyPy (at pypy/module/thread/os_thread.py <https://bitbucket.org/pypy/pypy/src/a75c05b8580ec08fb12777332492254c57ebe0aa/pypy/module/thread/os_thread.py?at=default>). I've read through the code a number of times and am using it as a reference. However, there are some features I don't need and some things that simply won't run in my interpreter (e.g., everything involving spaces).

Can anybody point me in the right direction as to passing parameters to threads? I know I need something similar to the Bootstrapper to synchronize my parameters, but everything I've tried so far has either not synchronized or segfaulted. The current code is segfaulting around rthread.gc_thread_die(). I can do some more digging on this if necessary.

Thanks in advance for help anyone might offer.

~ Sean

$ python --version
Python 2.7.3 (5acfe049a5b0, May 21 2013, 13:47:22)
[PyPy 2.0.2 with GCC 4.2.1 Compatible Apple LLVM 4.2 (clang-425.0.28)]

Here is the current code:

# rthread_hello.py
#
# Compile with: rpython --thread rthread_hello.py

from rpython.rlib import rthread

# "Library" code

class Arguments(object):
    pass

class FunctionData(object):
    args = Arguments()
    func = None
    lock = None

_function_data = FunctionData()

def _thread_func_wrapper():
    rthread.gc_thread_start()
    func = _function_data.func
    args = _function_data.args
    _function_data.lock.release()
    func(args)
    rthread.gc_thread_die()

def run_in_thread(func, args):
    _function_data.func = func
    _function_data.args = args
    if _function_data.lock is None:
        _function_data.lock = rthread.allocate_lock()
    _function_data.lock.acquire(True)
    rthread.gc_thread_prepare()
    rthread.start_new_thread(_thread_func_wrapper, ())

# "User" code

def func_to_call(args):
    print 'Within the function'
    print args.first
    print args.second

def main(argv):
    for i in xrange(5):
        # Create some dummy arguments.
        args = Arguments()
        args.first = i
        args.second = i * 10
        # Start the thread.
        run_in_thread(func_to_call, args)
    return 0

def target(x, y):
    return main, None
Hi Sean, On Fri, Jul 5, 2013 at 8:30 PM, Sean Fisk <seanfisk@gmail.com> wrote:
For my internship, I am working on implementing a solver for partial differential equations in RPython. I am investigating the possibility of parallelizing the code using multi-threading.
Sorry, RPython is the wrong tool. It doesn't support multithreading any more than CPython and PyPy: it relies on the GIL to avoid crashing, including (but not limited to) everything related to the GC. The best you can do is dividing the problem among multiple processes. A bientôt, Armin.
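A minimal sketch of that multi-process route, using only the standard multiprocessing module, could look like the following; the strip decomposition and the relax_strip() helper are invented for illustration and are not from this thread:

# Divide the grid into strips of rows, let each worker process smooth its
# own strip, then stitch the strips back together in the parent.
from multiprocessing import Pool

def relax_strip(strip):
    # Average each interior point with its left/right neighbours.
    smoothed = []
    for row in strip:
        new_row = list(row)
        for j in range(1, len(row) - 1):
            new_row[j] = (row[j - 1] + row[j] + row[j + 1]) / 3.0
        smoothed.append(new_row)
    return smoothed

if __name__ == '__main__':
    grid = [[float(i * 10 + j) for j in range(10)] for i in range(10)]
    strips = [grid[i:i + 5] for i in range(0, 10, 5)]   # two strips of 5 rows
    pool = Pool(processes=2)
    result = sum(pool.map(relax_strip, strips), [])     # flatten strips back into a grid
    pool.close()
    pool.join()
    print result[0]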
On Fri, Jul 05, 2013 at 11:17:15PM +0200, Armin Rigo wrote:
Hi Sean,
On Fri, Jul 5, 2013 at 8:30 PM, Sean Fisk <seanfisk@gmail.com> wrote:
For my internship, I am working on implementing a solver for partial differential equations in RPython. I am investigating the possibility of parallelizing the code using multi-threading.
Sorry, RPython is the wrong tool. It doesn't support multithreading any more than CPython and PyPy: it relies on the GIL to avoid crashing, including (but not limited to) everything related to the GC. The best you can do is dividing the problem among multiple processes.
Hi Sean,

Your internship sounds very exciting! RPython is frowned upon for general coding (I believe because it is not a stable, supported language target, it requires great care for managing the various resources, and it provides little benefit over targeting PyPy). I would modestly suggest that you instead implement your solver on PyPy and use your newfound RPython skills to optimise the parts of the code that perform poorly (or at least feed concise benchmarks back into PyPy). I think that the work on numpypy would have considerable overlap with your needs.

Ping me or the list if you need help, and tell us more about your project and its goals!

njh
Hi Sean, On Fri, Jul 5, 2013 at 11:17 PM, Armin Rigo <arigo@tunes.org> wrote:
Sorry, RPython is the wrong tool. It doesn't support multithreading (...)
I've updated the FAQ entry here to try to explain a bit more precisely the reason and motivations for which we, the "core PyPy group" of people, keep answering the question "How do I use RPython to..." with "Don't!". http://doc.pypy.org/en/latest/faq.html#do-i-have-to-rewrite-my-programs-in-r... A bientôt, Armin.
Nathan and Armin:

Thank you both for the advice! Sorry, I didn't make it clear: my project is actually an interpreter for a stencil-based language <https://stencil-language.readthedocs.org/en/latest/lang_spec.html> that is used for solving partial differential equations. So I think that RPython is the right tool to an extent. It worked very well for writing the interpreter (I also used Alex Gaynor's rply <https://github.com/alex/rply>). However, now I would like to parallelize the solver (i.e., the matrix operations).

It sounds as if this is where RPython stops being the right tool for the job, since threading is useful only for concurrency and not speed, as in Python. Thank you for letting me know before I spent too much time on it! I suppose I should have guessed from the code I was writing.

I am seeing two options, on which I welcome comment:

- Continue using RPython and use rffi <http://doc.pypy.org/en/latest/rffi.html> to make calls to C, where I implement the solver function for high speed, optionally with threading or message-passing. I've used pthreads, OpenMP, and MPI a number of times before, but combining that with rffi might be a little over my head.
- Switch to using PyPy with multiprocessing and/or numpypy. Happily, my code runs under PyPy and CPython, albeit with some RPython libraries and some RPython-specific workarounds (e.g., boxes <https://github.com/seanfisk/rpython-stencil-language/blob/master/stencil_lang/structures.py#L43>).

I will talk to my advisor, and other ideas are certainly welcome!

Thank you,
Sean

On Sun, Jul 7, 2013 at 1:42 AM, Armin Rigo <arigo@tunes.org> wrote:

Hi Sean,
On Fri, Jul 5, 2013 at 11:17 PM, Armin Rigo <arigo@tunes.org> wrote:
Sorry, RPython is the wrong tool. It doesn't support multithreading (...)
I've updated the FAQ entry here to try to explain a bit more precisely the reason and motivations for which we, the "core PyPy group" of people, keep answering the question "How do I use RPython to..." with "Don't!".
http://doc.pypy.org/en/latest/faq.html#do-i-have-to-rewrite-my-programs-in-r...
A bientôt,
Armin.
-- Sean Fisk
Hi Sean, On Sun, Jul 7, 2013 at 10:23 PM, Sean Fisk <seanfisk@gmail.com> wrote:
Sorry, I didn't make it clear: my project is actually an interpreter for a stencil-based language that is used for solving partial differential equations. So I think that RPython is the right tool to an extent. It worked very well for writing the interpreter (I also used Alex Gaynor's rply). However, now I would like to parallelize the solver (i.e., the matrix operations).
It sounds as if this is where RPython stops being the right tool for the job, since threading is useful only for concurrency and not speed, as in Python. Thank you for letting me know before I spent too much time on it! I suppose I should have guessed from the code I was writing.
Sorry for not answering you earlier. The point is that, indeed, you are right and I have no obvious solution to propose. RPython is really designed to write high-level interpreters in, and so it doesn't worry about real multicore usage. (In fact, the STM approach currently worked on is about giving multicore usage anyway, but without changing the basic model of a GIL, which is not what you need.)

It all depends on what the performance characteristics of your interpreter are. Do you spend a lot of time in the interpreter itself, or rather in library code? In the latter case, RPython is not the best approach --- e.g. its JIT is useless, and it's a case where multithreading would really help. I'm guessing that your interpreter is somewhere in the middle...
I am seeing two options, on which I welcome comment:
Continue using RPython and use rffi to make calls to C, where I implement the solver function for high speed, optionally with threading or message-passing. I've used pthreads, OpenMP, and MPI a number of times before, but combining that with rffi might be a little over my head.
Switch to using PyPy with multiprocessing and/or numpypy. Happily, my code runs under PyPy and CPython albeit with some RPython libraries and some RPython-specific workarounds (e.g., boxes).
If you want to run several regular Python threads and still get multicore usage, one of the reasonably easy approaches right now would be to write pure Python code containing calls to C functions written with CFFI's verify(). These calls are done with the GIL released (both on CPython and PyPy).

If instead you'd rather have a single Python thread driving calls to C functions that internally spawn several threads, then indeed OpenMP or MPI seems better. In this case it's easier to embed into an RPython program with rffi. You basically have to write C code that exposes some simple single-threaded API to the RPython program. You would use threads only internally, on the C side.

A bientôt,

Armin.
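A rough sketch of the first approach might look like this; the apply_stencil() kernel and its 5-point stencil are made up for the example, not taken from the thread. Because the GIL is released around the C call, several Python threads calling it on different parts of the grid can run on separate cores:

# Pure Python code calling a C kernel built with CFFI's verify().
import cffi

ffi = cffi.FFI()
ffi.cdef("void apply_stencil(double *grid, double *out, int nx, int ny);")
lib = ffi.verify("""
    void apply_stencil(double *grid, double *out, int nx, int ny)
    {
        /* 5-point stencil on interior points */
        int i, j;
        for (i = 1; i < nx - 1; i++)
            for (j = 1; j < ny - 1; j++)
                out[i*ny + j] = 0.25 * (grid[(i-1)*ny + j] + grid[(i+1)*ny + j]
                                        + grid[i*ny + j-1] + grid[i*ny + j+1]);
    }
""")

nx, ny = 100, 100
grid = ffi.new("double[]", nx * ny)   # zero-initialized C arrays
out = ffi.new("double[]", nx * ny)
# This call runs with the GIL released, so threads calling it in parallel
# (e.g. one per strip of the grid) can actually use several cores.
lib.apply_stencil(grid, out, nx, ny)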
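And a hedged sketch of the second approach, where the RPython program sees only a single-threaded entry point and any OpenMP/pthread parallelism lives entirely inside the C library. Here solve_grid(), solver.h and libsolver are hypothetical names, and this is only an outline of how rffi.llexternal() can wrap such a function:

# RPython side: declare the external C solver and marshal data into a raw buffer.
from rpython.rtyper.lltypesystem import rffi, lltype
from rpython.translator.tool.cbuild import ExternalCompilationInfo

eci = ExternalCompilationInfo(
    includes=['solver.h'],   # hypothetical header: double solve_grid(double *grid, int nx, int ny);
    libraries=['solver'],    # hypothetical libsolver built with -fopenmp (or pthreads)
)

DOUBLE_ARRAY_PTR = rffi.CArrayPtr(rffi.DOUBLE)   # C 'double *'

solve_grid = rffi.llexternal('solve_grid',
                             [DOUBLE_ARRAY_PTR, rffi.INT, rffi.INT],
                             rffi.DOUBLE,
                             compilation_info=eci)

def run_solver(values, nx, ny):
    # Copy the RPython list of floats into a raw C buffer, call the solver,
    # then free the buffer.  The C side may spawn its own threads internally.
    buf = lltype.malloc(DOUBLE_ARRAY_PTR.TO, len(values), flavor='raw')
    try:
        for i in range(len(values)):
            buf[i] = values[i]
        return solve_grid(buf, rffi.cast(rffi.INT, nx), rffi.cast(rffi.INT, ny))
    finally:
        lltype.free(buf, flavor='raw')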
On Tue, Jul 30, 2013 at 3:38 AM, Armin Rigo <arigo@tunes.org> wrote:
Hi Sean,
On Sun, Jul 7, 2013 at 10:23 PM, Sean Fisk <seanfisk@gmail.com> wrote:
Sorry, I didn't make it clear: my project is actually an interpreter for a stencil-based language that is used for solving partial differential equations. So I think that RPython is the right tool to an extent. It worked very well for writing the interpreter (I also used Alex Gaynor's rply). However, now I would like to parallelize the solver (i.e., the matrix operations).
It sounds as if this is where RPython stops being the right tool for the job, since threading is useful only for concurrency and not speed, as in Python. Thank you for letting me know before I spent too much time on it! I suppose I should have guessed from the code I was writing.
Sorry for not answering you earlier. The point is that, indeed, you are right and I have no obvious solution to propose. RPython is really designed to write high-level interpreters in, and so it doesn't worry about real multicore usage. (In fact, the STM approach currently worked on is about giving multicore usage anyway, but without changing the basic model of a GIL, which is not what you need.)
It all depends on what the performance characteristics of your interpreter are. Do you spend a lot of time in the interpreter itself, or rather in library code? In the latter case, RPython is not the best approach --- e.g. its JIT is useless, and it's a case where multithreading would really help. I'm guessing that your interpreter is somewhere in the middle...
I am in the process of profiling and running benchmarks right now, but that is absolutely something I would need to find out before starting on any parallel solution.
I am seeing two options, on which I welcome comment:
Continue using RPython and use rffi to make calls to C, where I implement the solver function for high speed, optionally with threading or message-passing. I've used pthreads, OpenMP, and MPI a number of times before, but combining that with rffi might be a little over my head.
Switch to using PyPy with multiprocessing and/or numpypy. Happily, my code runs under PyPy and CPython albeit with some RPython libraries and some RPython-specific workarounds (e.g., boxes).
If you want to run several regular Python threads and still get multicore usage, one of the reasonably easy approaches right now would be to write pure Python code containing calls to C functions written with CFFI's verify(). These calls are done with the GIL released (both on CPython and PyPy).
If instead you'd rather have a single Python thread driving calls to C functions that internally spawn several threads, then indeed OpenMP or MPI seems better. In this case it's easier to embed into an RPython program with rffi. You basically have to write C code that exposes some simple single-threaded API to the RPython program. You would use threads only internally, on the C side.
These are both very reasonable suggestions. Thank you for clearly spelling out my best options. Unfortunately, my internship is coming to an end, and I am unsure whether I will get the chance to implement them. If I do implement them, however, I'll make a point to write about my experience so that others can benefit from these solutions. Thank you for the answer, Armin. - Sean
A bientôt,
Armin.