You can test this code:

import time

def test(n):
    m = 10
    vals = []
    keys = []
    for i in xrange(m):
        vals.append(i)
        keys.append('a%s' % i)
    d = None
    for i in xrange(n):
        d = dict(zip(keys, vals))
    return d

if __name__ == '__main__':
    st = time.time()
    print test(1000000)
    print 'use:', time.time() - st

--
HomePage: cat.appspot.com
Hi, On Tue, Apr 30, 2013 at 5:26 AM, cat street <gamcat@gmail.com> wrote:
You can test this code: (...)
For no good reason it seems that on this example CPython is quite a bit faster on Linux64 than on Linux32. PyPy is also a bit faster on Linux64 but not by such a large margin. In my tests (PyPy vs CPython) it ends up the same on Linux32, and on Linux64 PyPy is a bit slower (20%?). I think it's good enough given the type of code (completely unoptimizable as far as I can tell, unless we go for "we can kill the whole loop in this benchmark", which is usually a bit pointless in real code). If others want to look in detail at JIT traces, feel free to. A bientôt, Armin.
This is a kind of example where our GC card marking does not quite work. I think the improve-rdict branch should improve this kind of code quite a bit (but I still have to finish it) On Tue, Apr 30, 2013 at 6:51 PM, Armin Rigo <arigo@tunes.org> wrote:
Hi,
On Tue, Apr 30, 2013 at 5:26 AM, cat street <gamcat@gmail.com> wrote:
You can test this code: (...)
For no good reason it seems that on this example CPython is quite a bit faster on Linux64 than on Linux32. PyPy is also a bit faster on Linux64 but not by such a large margin. In my tests (PyPy vs CPython) it ends up the same on Linux32, and on Linux64 PyPy is a bit slower (20%?). I think it's good enough given the type of code (completely unoptimizable as far as I can tell, unless we go for "we can kill the whole loop in this benchmark", which is usually a bit pointless in real code). If others want to look in detail at JIT traces, feel free to.
A bientôt,
Armin.
I don't think this is a GC case. I think this is a case where loops with only a few iterations aren't fast enough. Alex On Tue, Apr 30, 2013 at 3:13 PM, Maciej Fijalkowski <fijall@gmail.com> wrote:
This is a kind of example where our GC card marking does not quite work. I think the improve-rdict branch should improve this kind of code quite a bit (but I still have to finish it)
On Tue, Apr 30, 2013 at 6:51 PM, Armin Rigo <arigo@tunes.org> wrote:
Hi,
On Tue, Apr 30, 2013 at 5:26 AM, cat street <gamcat@gmail.com> wrote:
You can test this code: (...)
For no good reason it seems that on this example CPython is quite a bit faster on Linux64 than on Linux32. PyPy is also a bit faster on Linux64 but not by such a large margin. In my tests (PyPy vs CPython) it ends up the same on Linux32, and on Linux64 PyPy is a bit slower (20%?). I think it's good enough given the type of code (completely unoptimizable as far as I can tell, unless we go for "we can kill the whole loop in this benchmark", which is usually a bit pointless in real code). If others want to look in detail at JIT traces, feel free to.
A bientôt,
Armin.
-- "I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire) "The people's good is the highest law." -- Cicero GPG Key fingerprint: 125F 5C67 DFE9 4084
Hi Alex, On Wed, May 1, 2013 at 12:24 AM, Alex Gaynor <alex.gaynor@gmail.com> wrote:
I don't think this is a GC case. I think this is a case where loops with only a few iterations aren't fast enough.
Dudes, can anyone look seriously at the benchmark? :-) The core of this benchmark is a loop that does 1'000'000 times "dict(zip(keys, vals))", where keys and vals are lists of length 10. A bientôt, Armin.
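In other words, the part that dominates the runtime is roughly this (a paraphrase of the benchmark above, not the original script verbatim):

import time

keys = ['a%s' % i for i in xrange(10)]   # ten short string keys
vals = range(10)                         # ten small int values

st = time.time()
for i in xrange(1000000):
    d = dict(zip(keys, vals))            # the expression under discussion
print 'use:', time.time() - st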
On Wed, May 1, 2013 at 10:56 AM, Armin Rigo <arigo@tunes.org> wrote:
Hi Alex,
On Wed, May 1, 2013 at 12:24 AM, Alex Gaynor <alex.gaynor@gmail.com> wrote:
I don't think this is a GC case. I think this is a case where loops with only a few iterations aren't fast enough.
Dudes, can anyone look seriously at the benchmark? :-)
The core of this benchmark is a loop that does 1'000'000 times "dict(zip(keys, vals))", where keys and vals are lists of length 10.
Oops, indeed. I looked, but then swapped the numbers in my mind.
I read the benchmark, it's the loop inside of `zip()` which has very few iterations. Alex On Wed, May 1, 2013 at 1:56 AM, Armin Rigo <arigo@tunes.org> wrote:
Hi Alex,
On Wed, May 1, 2013 at 12:24 AM, Alex Gaynor <alex.gaynor@gmail.com> wrote:
I don't think this is a GC case. I think this is a case where loops with only a few iterations aren't fast enough.
Dudes, can anyone look seriously at the benchmark? :-)
The core of this benchmark is a loop that does 1'000'000 times "dict(zip(keys, vals))", where keys and vals are lists of length 10.
A bientôt,
Armin.
-- "I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire) "The people's good is the highest law." -- Cicero GPG Key fingerprint: 125F 5C67 DFE9 4084
Hi Alex, On Wed, May 1, 2013 at 4:08 PM, Alex Gaynor <alex.gaynor@gmail.com> wrote:
I read the benchmark, it's the loop inside of `zip()` which has very few iterations.
Ah oh. Sorry. I forgot that zip() is implemented at app-level. Could it be helpful to have a faster version of zip() specialized to two arguments? It would avoid the loop of length 2 that we do for each pair of items. A bientôt, Armin.
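For illustration, a two-argument fast path could look something like this at app level (zip2 is a made-up name; this is only a sketch of the idea for two lists, not PyPy's actual implementation):

def zip2(lst1, lst2):
    # hypothetical specialization for exactly two list arguments: each
    # produced item costs two indexings and one tuple, with no inner loop
    # over a length-2 tuple of iterators
    n = min(len(lst1), len(lst2))
    result = [None] * n
    for i in xrange(n):
        result[i] = (lst1[i], lst2[i])
    return result

# the benchmark's hot line would then become: d = dict(zip2(keys, vals))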
Yes, we have a specialized map for 2 arguments; a specialized zip makes sense. (Or figuring out how to specialize that loop for N arguments where N is ~smallish, so the inner loop is unrolled at app level; that's harder, but probably worthwhile in the long run -- a rough sketch of that idea follows after the quoted message below.) Alex On Wed, May 1, 2013 at 10:19 AM, Armin Rigo <arigo@tunes.org> wrote:
Hi Alex,
On Wed, May 1, 2013 at 4:08 PM, Alex Gaynor <alex.gaynor@gmail.com> wrote:
I read the benchmark, it's the loop inside of `zip()` which has very few iterations.
Ah oh. Sorry. I forgot that zip() is implemented at app-level.
Could it be helpful to have a faster version of zip() specialized to two arguments? It would avoid the loop of length 2 that we do for each pair of items.
A bientôt,
Armin.
-- "I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire) "The people's good is the highest law." -- Cicero GPG Key fingerprint: 125F 5C67 DFE9 4084
On 05/01/2013 07:22 PM, Alex Gaynor wrote:
Yes, we have a specialized map for 2 arguments, a specialized zip makes sense. (Or figuring out how to specialize that loop for N-arguments where N is ~smallish so the inner loop is unrolled at app level, that's harder, but probably worthwhile n the long run).
In general, it'd be very useful to have a way to say the equivalent of @unroll_safe at applevel, although then it could be used very badly if you don't know exactly what you are doing. I think that cfbolz once started a branch to give hints from applevel, but then he never finished. Is that correct? ciao, Anto
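For reference, the existing interp-level hint looks roughly like this (assuming the rpython source tree is importable; the app-level spelling in the trailing comment is purely hypothetical and does not exist):

# RPython (interp-level) code can already be marked as safe to unroll:
from rpython.rlib import jit

@jit.unroll_safe
def _pair_up(lst1, lst2):
    # the JIT is allowed to unroll this loop while tracing
    result = []
    for i in range(min(len(lst1), len(lst2))):
        result.append((lst1[i], lst2[i]))
    return result

# A hypothetical app-level spelling of the same hint might look like:
#
#     import __pypy__
#     @__pypy__.unroll_safe      # does not exist today; illustration only
#     def small_loop(...):
#         ...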
Re-Hi, On Wed, May 1, 2013 at 7:19 PM, Armin Rigo <arigo@tunes.org> wrote:
Could it be helpful to have a faster version of zip() specialized to two arguments? It would avoid the loop of length 2 that we do for each pair of items.
Done in ffe6fdf3a875. The zip() function is now apparently more than 4 times faster when called with two smallish lists :-) Thanks cat street for the original report. Your benchmark is more than 2 times faster now (the dict() is still taking the same time). A bientôt, Armin.
participants (5)
- Alex Gaynor
- Antonio Cuni
- Armin Rigo
- cat street
- Maciej Fijalkowski