On 13.02.13 08:42, Lennart Regebro wrote:
>> Something is needed - a patch for PyPy or for the documentation, I guess.
> Not arguing that it wouldn't be good, but I disagree that it is needed.
> This is only an issue when you, as in your proof, have a loop that does concatenation. This is usually when looping over a list of strings that should be concatenated together. Doing so in a loop with concatenation may be the natural way for people new to Python, but the "natural" way to do it in Python is with a ''.join() call.
> s = ''.join('X' for x in xrange(x))
> is more than twice as fast in Python 2.7 as your example. It is in fact also slower in PyPy 1.9 than in Python 2.7, but only by a factor of two:
> Python 2.7: time for 10000000 concats = 0.887
> PyPy 1.9:   time for 10000000 concats = 1.600
> (And of course s = 'X' * x takes only about a hundredth of the time, but that's cheating. ;-)
This is not about how to write efficient concatenation, and it is not about me. It is also not about a constant factor, which I don't really care about except in situations where speed matters.
This is about a possible algorithmic trap: code written for CPython may behave well, with roughly O(n) behavior, and then surprise you with O(n**2) behavior after a switch to PyPy. Such runtime explosions can damage trust in PyPy, especially when the offending code sits in some module that you did not even write yourself but just "pip install"-ed.
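To make the trap concrete, here is a minimal sketch of what I mean (the sizes and names are only for illustration): on CPython the += in the loop usually stays roughly linear, because CPython can resize the string in place when nothing else references it, while an implementation without that optimization has to copy the whole string on every iteration, which is quadratic overall.

import time

def build_naive(n):
    # repeated str +=: roughly O(n) overall on CPython thanks to the
    # in-place resize optimization, O(n**2) where that is not available
    s = ''
    for _ in xrange(n):
        s += 'X'
    return s

start = time.time()
build_naive(10000000)
print('time for 10000000 concats = %.3f' % (time.time() - start))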
So this is important to know, especially for newcomers and for the people who give them advice. For the sake of algorithmic compatibility, one implementation should not offer a feature with such a drastic side effect if it cannot be supported by all the other dialects.
To avoid such hidden traps in larger code bases, documentation is needed that gives a clear warning saying "don't do that", just as CS students learn for most other languages.
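The warning itself would boil down to the usual idiom, something like this sketch (nothing PyPy-specific about it): collect the pieces in a list and join them once, which is linear on every implementation.

pieces = ['X'] * 1000   # any list of string fragments

# don't do that: re-copies the growing string wherever += is not optimized
s = ''
for piece in pieces:
    s += piece

# do this instead: one copy pass, linear on every implementation
s = ''.join(pieces)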
cheers - chris