[pypy-dev] efficient string concatenation (yep, from 2004)

Maciej Fijalkowski fijall at gmail.com
Wed Feb 13 08:35:28 CET 2013


Hi Christian.

We have it, just not enabled by default. It is the translation option --objspace-std-withstrbuf, I think.
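
In the meantime, the idiom that does not depend on any implementation-specific trick is to collect the pieces in a list and join them once. A minimal sketch, not PyPy code: the names concat_with_plus, concat_with_join and time_call are made up for illustration, and the loop size is simply the one from the benchmark quoted below.

```python
from timeit import default_timer as timer


def concat_with_plus(n, piece='X'):
    """The += loop from the quoted benchmark."""
    s = ''
    for _ in range(n):
        s += piece
    return s


def concat_with_join(n, piece='X'):
    """Collect the pieces in a list and join them in one step."""
    parts = []
    for _ in range(n):
        parts.append(piece)
    return ''.join(parts)


def time_call(func, n):
    """Return (length of the result, elapsed seconds) for func(n)."""
    start = timer()
    result = func(n)
    return len(result), timer() - start


if __name__ == '__main__':
    for func in (concat_with_plus, concat_with_join):
        length, elapsed = time_call(func, 100000)
        print('{}: {} concats in {:0.3f}s'.format(
            func.__name__, length, elapsed))
```

Both functions return the same string; only the timing differs, and how much it differs depends on the interpreter and its version.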

On Wed, Feb 13, 2013 at 1:53 AM, Christian Tismer <tismer at stackless.com> wrote:
> Hi friends,
>
> efficient string concatenation was a topic in 2004.
> Armin Rigo proposed a patch with the name of the subject,
> more precisely:
>
> [Patches] [ python-Patches-980695 ] efficient string concatenation
> on sourceforge.net, on 2004-06-28.
>
> This patch was finally added to Python 2.4 on 2004-11-30.
>
> Some people might remember the larger discussion about whether such a patch
> should be accepted at all, because it changes the programming style for many
> of us from "don't do that, stupid" to "well, you may do it in CPython", which
> has quite some impact on other implementations (is it fast on Jython, now?).
>
> It changed for instance my programming and teaching style a lot, of course!
>
> But I think nobody but people heavily involved in PyPy expected this:
>
> Now, more than eight years after that patch appeared and made it into 2.4,
> PyPy (!) still does _not_ have it!
>
> Obviously I was misled by other optimizations, and the fact that
> this patch was from a/the major author of PyPy who invented the initial
> patch for CPython. That this would be in PyPy as well sooner or later was
> without question for me. Wrong... ;-)
>
> Yes, I agree that for PyPy it is much harder to implement without the
> refcounting trick, and probably even more difficult in case of the JIT.
>
> But nevertheless, I tried to find any reference to this missing crucial
> optimization, with no success after an hour (*).
>
> And I guess many other people are stepping into the same trap.
>
> So I can imagine that PyPy loses some of its speed in many programs, because
> Armin's great hack did not make it into PyPy, and this is not loudly declared
> anywhere. I believe the efficiency of string concatenation is something
> that people assume by default and add to the vague CPython compatibility
> claim, if not explicitly told otherwise.
>
> ----
>
> Some silly proof, using Python 2.7.3 vs PyPy 1.9:
>
> $ cat strconc.py
> #!/usr/bin/env python
>
> from timeit import default_timer as timer
>
> tim = timer()
>
> s = ''
> for i in xrange(100000):
>     s += 'X'
>
> tim = timer() - tim
>
> print 'time for {} concats = {:0.3f}'.format(len(s), tim)
>
>
> $ python strconc.py
> time for 100000 concats = 0.028
> $ pypy strconc.py
> time for 100000 concats = 0.804
>
>
> Something is needed - a patch for PyPy or for the documentation I guess.
>
> This is not just some unoptimized function in some module, but it is used
> all over the place and has become a very common pattern since it was introduced.
>
> How ironic that a foreseen problem occurs _now_, and _there_ :-)
>
> cheers -- chris
>
>
> (*)
> http://pypy.readthedocs.org/en/latest/cpython_differences.html
> http://pypy.org/compat.html
> http://pypy.org/performance.html
>
> --
> Christian Tismer             :^)   <mailto:tismer at stackless.com>
> Software Consulting          :     Have a break! Take a ride on Python's
> Karl-Liebknecht-Str. 121     :    *Starship* http://starship.python.net/
> 14482 Potsdam                :     PGP key -> http://pgp.uni-mainz.de
> phone +49 173 24 18 776  fax +49 (30) 700143-0023
> PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
>       whom do you want to sponsor today?   http://www.stackless.com/
>
>
>

