[pypy-issue] [issue866] [PATCH] join is slow about 2x slower than Python

Justin Peel tracker at bugs.pypy.org
Thu Sep 8 07:58:51 CEST 2011

New submission from Justin Peel <peelpy at gmail.com>:

I've attached my first attempt at speeding up string's join method. I'm creating 
an issue so that I can get some feedback. First, I used the following script for 

x = ['abcdef']*100
for i in xrange(100000):

Benchmark results:

             CPython 2.7:  0.22 seconds
PyPy nightly from Sept 3:  0.44 seconds
         PyPy with patch:  0.35 seconds

The attached patch does two things for speed: 1) the checks for whether the 
separator is non-empty and whether the index is nonzero are moved out of the 
loop by splitting it into two separate loops; 2) the code now calls unwrap on 
the strings instead of using str_w, a multi-method.
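
To illustrate the first change only, here is a minimal plain-Python sketch of 
hoisting the two checks out of the loop. It is not the code from 
strjoin.patch: the Builder class is a hypothetical stand-in for RPython's 
StringBuilder, and both function names are made up for this example.

```python
class Builder(object):
    """Hypothetical stand-in for RPython's StringBuilder (illustration only)."""

    def __init__(self):
        self._chunks = []

    def append(self, s):
        self._chunks.append(s)

    def build(self):
        return "".join(self._chunks)


def join_checks_in_loop(sep, items):
    # Shape before the change: both checks run on every iteration.
    b = Builder()
    for i in range(len(items)):
        if sep and i != 0:
            b.append(sep)
        b.append(items[i])
    return b.build()


def join_two_loops(sep, items):
    # Shape after the change: the separator check is decided once,
    # and the index check disappears by handling items[0] up front.
    b = Builder()
    if not items:
        return b.build()
    if not sep:
        for s in items:
            b.append(s)
    else:
        b.append(items[0])
        for i in range(1, len(items)):
            b.append(sep)
            b.append(items[i])
    return b.build()
```

Both functions should return the same result for any separator and list of 
strings; the second simply does less work per iteration at the cost of a 
little more code.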

The first speed-up is the lesser of the two, and while it makes for a little 
more code, I thought it wrong to keep unnecessary checks inside the loop. The 
speed-up from this change was only 3-5%.

The second speed-up is important because str_w is a slow multi-method. We 
already know from using space.isinstance that the objects are all strings, so I 
saw no reason to use the multi-method when unwrap is so much faster. Is there a 
reason that the str_w multi-method must be used? Unwrap appears to be 
implemented in the optional RopeObjects, StringBufferObjects and 
StringJoinObjects, but not for the StringSliceObjects. Do I need to implement 
unwrap for StringSliceObjects then? It should only be like 3 lines.

As for getting further speed improvements for join, space.isinstance is really 
quite slow. However, we should get a fast path once listmultiobjects are 
finished. I think that this will get us most of the rest of the way to join 
being at least as fast as CPython.

The other part that I see possibly being sped up is the StringBuilder part. With 
a debug build of pypy from trunk without this patch and using callgrind, 
appending to StringBuilder accounted for 26% of the total time, but only 
11% of that is spent in memcpy. The rest of the time is spent making sure that the 
string array doesn't need to grow, getting the correct addresses for the source 
and the destination, and calculating the number of bytes to be copied. Maybe the 
StringBuilder could save the current address to be copied to for appending so 
that it doesn't have to be calculated each time? Of course, if the string array 
is grown then the address could be faulty, but we can still speed up the case 
where the array isn't grown. Anyway, it is just an idea so feel free to shoot it 
down (and hopefully suggest a better one).
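
As a rough picture of the idea above, here is a toy plain-Python model of a 
builder that remembers its write position between appends. It is only a 
sketch under assumptions: CachedPosBuilder and its methods are invented for 
this example and are not PyPy's StringBuilder, and an integer offset into a 
bytearray stands in for the raw destination address.

```python
class CachedPosBuilder(object):
    """Toy model only; not PyPy's StringBuilder."""

    def __init__(self, capacity=16):
        self._buf = bytearray(capacity)  # preallocated storage
        self._pos = 0                    # cached write offset

    def append(self, data):
        end = self._pos + len(data)
        if end > len(self._buf):
            # Slow path: the storage has to grow before copying.
            self._grow(end)
        # Fast path: copy straight to the cached position.
        self._buf[self._pos:end] = data
        self._pos = end

    def _grow(self, needed):
        # Grow to at least double the current size.
        new_cap = max(needed, 2 * len(self._buf))
        self._buf.extend(bytearray(new_cap - len(self._buf)))

    def build(self):
        return bytes(self._buf[:self._pos])
```

In this model the offset stays valid after growth; with a cached raw address 
the slow path would also have to recompute it, which is the caveat mentioned 
above.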

files: strjoin.patch
messages: 3108
nosy: justinpeel, pypy-issue
priority: feature
status: unread
title: [PATCH] join is slow about 2x slower than Python

PyPy bug tracker <tracker at bugs.pypy.org>