Automatic SIMD vectorization

Hello all!

I just read with great interest the blog post "Automatic SIMD vectorization support in PyPy".

Please, I have a few questions:

- Does regular Python code benefit from the vectorization? I mean, the article on one hand says "it is not specifically targeted for the NumPy library" but on the other it says "Any interpreter (written in RPython)".

- I would like to write a vector class that is as suitable for PyPy as possible. What approach should I take to implement it? For example, what would suit the PyPy JIT best:

    class Vector3d:
        def __init__(a, x, y, z):
            (a.x, a.y, a.z) = x, y, z
        def __add__(a, b):
            return Vector3d(a.x + b.x, a.y + b.y, a.z + b.z)

    def add1(a, b):
        (ax, ay, az) = a
        (bx, by, bz) = b
        return [ax + bx, ay + by, az + bz]

    def add2(a, b):
        (ax, ay, az) = a
        (bx, by, bz) = b
        return (ax + bx, ay + by, az + bz)

    def add3((ax, ay, az), (bx, by, bz)):
        return (ax + bx, ay + by, az + bz)

    def add3: ???

- Is NumPyPy going to be included with regular PyPy download/install?

Thanks a lot in advance!

Hi,

glad you liked the post! See the answers below...

On 10/20/2015 04:20 PM, Tuom Larsen wrote:
> Hello all!
> I just read with great interest the blog post "Automatic SIMD vectorization support in PyPy".
> Please, I have a few questions:
> - Does regular Python code benefit from the vectorization? I mean, the article on one hand says "it is not specifically targeted for the NumPy library" but on the other it says "Any interpreter (written in RPython)".

Speaking about 'regular' Python code, there is potential, BUT only if enough time is spent in numeric code. What I meant in the article was this: if you have a vector construct in your language (like the one the R language has), you could use the optimization to vectorize operations on the variables that represent such vectors. Take a look at my test virtual machine implementing a small subset of R:

https://bitbucket.org/plan_rich/vecopt-test-vm

> - I would like to write a vector class as much suitable for PyPy as possible, what approach should I take in order to implement it? For example, what would suit PyPy JIT the best:
>
>     class Vector3d:
>         def __init__(a, x, y, z):
>             (a.x, a.y, a.z) = x, y, z
>         def __add__(a, b):
>             return Vector3d(a.x + b.x, a.y+b.y, a.z+b.z)
>
>     def add1(a, b):
>         (ax, ay, az) = a
>         (bx, by, bz) = b
>         return [ax + bx, ay + by, az + bz]
>
>     def add2(a, b):
>         (ax, ay, az) = a
>         (bx, by, bz) = b
>         return (ax + bx, ay + by, az + bz)
>
>     def add3((ax, ay, az), (bx, by, bz)):
>         return (ax + bx, ay + by, az + bz)
>
>     def add3: ???

I have made some tests with this already. You would need to use the array module. Python lists would also work, but they leave behind some instructions that are not well optimized. It is described in this post:

http://pypyvecopt.blogspot.co.at/2015/08/gsoc-vec-little-brother-of-numpy-ar...

The missing piece is then the --jit vec_all=1 parameter, which you must specify on the command line.

Be aware:

1) The vec_all=1 parameter might lead to a crash. I have tested it, but I found that it is not really ready for production. I'm still working on this, thus it is disabled by default.
2) With a very low number of vector elements (e.g. 3), the overhead of running the first iteration is quite significant. So I do not think there is much to gain from executing just one vector add in parallel on x86.
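
To illustrate the array-module approach described above, here is a minimal sketch. The function name vector_add and the flat x, y, z layout are assumptions made for this example and do not come from the linked post. Whether the loop is actually vectorized depends on the PyPy build and on passing --jit vec_all=1.

```python
from array import array


def vector_add(a, b, out):
    """Element-wise a + b written into out.

    All three arguments are assumed to be array('d') objects of equal length.
    """
    # A plain counted loop over typed arrays, with no per-element objects.
    for i in range(len(a)):
        out[i] = a[i] + b[i]
    return out


# Hypothetical layout: many 3D vectors stored back to back
# (x0, y0, z0, x1, y1, z1, ...), so that one call covers far more
# than three elements.
n = 1000
a = array('d', [1.0] * (3 * n))
b = array('d', [2.0] * (3 * n))
out = array('d', [0.0] * (3 * n))
vector_add(a, b, out)
```

Saved as a script (say, a hypothetical vecadd.py), this would be run as "pypy --jit vec_all=1 vecadd.py". Batching many vectors into one array is a design choice motivated by point 2: a single 3-element add leaves too little work per loop to offset the overhead.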

> - Is NumPyPy going to be included with regular PyPy download/install?

NumPyPy is included in normal PyPy release versions.

> Thanks a lot in advance!
> _______________________________________________
> pypy-dev mailing list
> pypy-dev@python.org
> https://mail.python.org/mailman/listinfo/pypy-dev

Cheers,
Richard

Awesome, thank you!

On Tue, Oct 20, 2015 at 4:41 PM, Richard Plangger <planrichi@gmail.com> wrote:
> Hi,
> glad you liked the post! See the answers below...

You're welcome. I forgot to mention that I'm happy to help if vec_all crashes. So if you give it a shot and get stuck, get in touch with me here (or via the issues on Bitbucket)!

Cheers,
Richard

On 10/20/2015 05:08 PM, Tuom Larsen wrote:
> Awesome, thank you!
participants (2): Richard Plangger, Tuom Larsen