Performance challenge of mutating class variables

I mentioned this issue in a private conversation with Carl and he suggested that I send the details to the mailing list. The following blog post is a nice summary of a significant performance problem in IronPython's early implementation, with good links to the key discussions: http://ironpython-urls.blogspot.com/2009/05/python-jython-and-ironpython.htm... Robert Smallshire implemented OWL Basic in IronPython and experienced some massive slowdowns relative to CPython and Jython. IronPython developer Dino Viehland investigated the problem and wrote a detailed blog post about it and the fix implemented in IronPython. The problem was due to the user program's use of a class variable as a counter, which caused "unfortunate" behavior in the IronPython implementation: mutating the variable invalidated the cache. PyPy already shows a good speedup compared with CPython, but it is an interesting corner case. The smaller testcase created by Robert might be a good benchmark to add to PyPy Speed Center (http://www.smallshire.org.uk/sufficientlysmall/2009/05/22/ironpython-2-0-and...). - David
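For readers without access to the linked posts, the pattern in question looks roughly like the sketch below. This is not Robert's testcase: the class layout, the attribute names `counter` and `ident`, and the helper `build_tree` are all assumptions made for illustration. The only point it shows is that every instance creation rebinds an attribute on the class itself, which is the kind of mutation that can invalidate an implementation's per-type caches.

```python
class Node(object):
    # Class-level counter shared by all instances (hypothetical name).
    counter = 0

    def __init__(self):
        # Rebinding a class attribute mutates the type object itself.
        # An implementation that caches lookups per type version may
        # have to discard that cache on every instantiation.
        Node.counter += 1
        self.ident = Node.counter
        self._children = []

    def add_child(self, child):
        self._children.append(child)


def build_tree(n):
    """Create a root node with n children (illustrative helper)."""
    root = Node()
    for _ in range(n):
        root.add_child(Node())
    return root


root = build_tree(1000)
```

Keeping the counter in a module-level variable or an `itertools.count()` object would avoid touching the class on each instantiation, although whether that matters depends on the interpreter and its version.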

Hi David, On Fri, May 13, 2011 at 10:10 PM, David Edelsohn <dje.gcc@gmail.com> wrote:
PyPy has indeed mostly the same behavior, but it is still faster than CPython on the example you give, because you don't do anything with the Nodes. If we modify the example (http://paste.pocoo.org/show/388541/) to read all '_children' attributes after all Nodes have been created, then PyPy is very slow. In this example it is 4x slower than CPython when called with 2000000. Thanks for pushing us to find a solution for this problem :-) A bientôt, Armin.
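The paste itself is not reproduced in this thread, so the following is only a guess at the shape of the modification described above: create all the Nodes first, then read the '_children' attribute of each one. The function names `create`, `read_all`, and `timed` are hypothetical, and the timings it produces will not match the numbers quoted here.

```python
import time


class Node(object):
    counter = 0  # class variable mutated on every instantiation

    def __init__(self):
        Node.counter += 1
        self._children = []


def create(n):
    """Create n nodes, bumping the class-level counter each time."""
    return [Node() for _ in range(n)]


def read_all(nodes):
    """Read the _children attribute of every node after creation."""
    total = 0
    for node in nodes:
        total += len(node._children)
    return total


def timed(n):
    """Return the seconds taken to create n nodes and read them back."""
    start = time.perf_counter()
    read_all(create(n))
    return time.perf_counter() - start
```

Running `timed(2000000)` under each interpreter gives a rough comparison, assuming the machine has enough memory to hold two million nodes.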

Hi Carl Friedrich, On Sat, May 14, 2011 at 8:15 PM, Carl Friedrich Bolz <cfbolz@gmx.de> wrote:
Did you try this with PyPy 1.5 or a recent nightly? I thought that the new type __dict__ implementation should have fixed the problem?
Indeed, with a recent PyPy things are much better. My example still takes 1.5x the time it takes on CPython 2.7, though... A bientôt, Armin.

apologies, folks, but did somebody change the settings recently on this list? i suddenly start receiving messages when i am subscribed as "not to receive messages" - whoever has changed the options - without my permission - please could they undo what they have done, so that my wishes are respected and i remain subscribed to this list *without* receiving messages? many thanks, l.

Hi Luke, On Sun, May 15, 2011 at 7:11 PM, Luke Kenneth Casson Leighton <lkcl@lkcl.net> wrote:
apologies, folks, but did somebody change the settings recently on this list?
This list has moved to python.org. Sorry about the settings of the subscribers, which have not all been transferred. I re-disabled mails for you, but if anyone else has a similar problem, please visit http://mail.python.org/mailman/listinfo/pypy-dev . A bientôt, Armin.

Hi David, On Sun, May 15, 2011 at 5:21 PM, Armin Rigo <arigo@tunes.org> wrote:
Well, I cannot reproduce this number... I have no clue how I got it :-( The original code you posted is actually not relevant to the JIT: it contains no loop. The fact is that it is still 2-3x faster on PyPy than on CPython, for GC reasons: our GC is better at handling quickly growing heaps, and the allocated objects are much smaller anyway. The code I posted used to be really slow in pypy-1.5, because it was trying again and again to JIT it. We have now fixed the issue, and it is again much faster than CPython (this time, about 10x). A bientôt, Armin.

Hi Maciej, On Tue, May 17, 2011 at 1:43 PM, Maciej Fijalkowski <fijall@gmail.com> wrote:
Might be CPython 2.6 vs CPython 2.7? There were significant GC changes in 2.7, especially when it comes to O(n^2) behavior.
I must have messed up my measurements. It's true that Python 2.7 is twice as fast as Python 2.6 in this particular example, but PyPy is still much faster than both. A bientôt, Armin.

On 17/05/11 14:11, Maciej Fijalkowski wrote:
Which reminds me that we should *really* start running python 2.7 as a baseline python
for this, it should be "enough" to upgrade ubuntu on tannit, which comes with python2.7. Unless we decide that we want to stay with the LTS release. ciao, Anto

On Tue, May 17, 2011 at 02:13:56PM +0200, Antonio Cuni wrote:
If you want to stay with an LTS, you can get Python 2.7 from this "quasi-official" PPA: https://launchpad.net/~pythoneers/+archive/lts There's also another PPA, called "deadsnakes", with more Python versions available: https://launchpad.net/~fkrull/+archive/deadsnakes Marius Gedminas -- Mosher's Law of Software Engineering: Don't worry if it doesn't work right. If everything did, you'd be out of a job.

participants (7): Antonio Cuni, Armin Rigo, Carl Friedrich Bolz, David Edelsohn, Luke Kenneth Casson Leighton, Maciej Fijalkowski, Marius Gedminas