[Python-3000] characters data type

Guido van Rossum guido at python.org
Sat May 6 02:45:00 CEST 2006


On 5/4/06, Josiah Carlson <jcarlson at uci.edu> wrote:
>
> "Guido van Rossum" <guido at python.org> wrote:
> > Can you please post the benchmarking code?
>
> No problem.

OK, so I have the advantage of a time machine... Or at least the p3yk
(sic) branch. I created a perhaps more reasonable benchmark; you can
see it in SVN here:

http://svn.python.org/view/sandbox/trunk/sio/bench_cat.py?rev=45918&view=log

Run with the latest Python 3.0, this shows clearly that the += idiom
is slower than the join idiom, although not by that much. It also
shows that the straightforward mutable bytes implementation in 3.0
consistently performs better than the string implementation --
probably due to being able to resize the buffer without moving the
object. Or maybe I missed something?

The benchmark concatenates 100,000 strings with a size uniformly
chosen from range(N) where N ("size" below) is varied from 10 to 1000
for different test runs. The average string is about N/2 bytes long,
so the final string varies between 0.5 MB and 50 MB. Each result
represents the best of 3 runs where each run executes the above
concatenation loop 10 times.
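
For readers without access to the SVN sandbox, here is a minimal
sketch of the procedure described above. It is not the bench_cat.py
script itself. The helper names (make_chunks, concat_plus,
concat_join, best_time, run) are made up for illustration, and
bytearray is assumed to be a reasonable modern stand-in for the
mutable bytes type on the p3yk branch.

```python
import random
import timeit


def make_chunks(count, size, seed=0):
    """Build `count` byte strings with lengths drawn uniformly from range(size)."""
    rng = random.Random(seed)
    return [b"x" * rng.randrange(size) for _ in range(count)]


def concat_plus(chunks, new_empty):
    """Concatenate with the += idiom, starting from a fresh empty object."""
    result = new_empty()
    for chunk in chunks:
        result += chunk
    return result


def concat_join(chunks, new_empty):
    """Concatenate with the join idiom."""
    return new_empty().join(chunks)


def best_time(func, chunks, new_empty, repeat=3, number=10):
    """Best of `repeat` runs, each run calling func `number` times."""
    timer = timeit.Timer(lambda: func(chunks, new_empty))
    return min(timer.repeat(repeat=repeat, number=number))


def run(count=100_000, size=10):
    """Time both idioms for bytearray and str; return {label: seconds}."""
    byte_chunks = make_chunks(count, size)
    str_chunks = [c.decode("ascii") for c in byte_chunks]
    results = {}
    for label, chunks, new_empty in (
        ("bytes", byte_chunks, bytearray),  # assumption: bytearray ~ 3.0 bytes
        ("str", str_chunks, str),
    ):
        results[label + "+="] = best_time(concat_plus, chunks, new_empty)
        results[label + ".join"] = best_time(concat_join, chunks, new_empty)
    return results
```

Calling run(size=N) for N in 10, 20, 50, 100, 200, 500, 1000 would
produce one group of four timings per size, in the same layout as the
results below. Absolute numbers on a current interpreter will not
match the ones posted here.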

Here are the results (on a 1.67 GHz PowerBook):

------ size = 10 ------
bytes+=    0.401
bytes.join 0.221
str+=      0.552
str.join   0.279
------ size = 20 ------
bytes+=    0.419
bytes.join 0.236
str+=      0.565
str.join   0.305
------ size = 50 ------
bytes+=    0.518
bytes.join 0.340
str+=      0.713
str.join   0.405
------ size = 100 ------
bytes+=    0.654
bytes.join 0.454
str+=      0.894
str.join   0.580
------ size = 200 ------
bytes+=    0.878
bytes.join 0.642
str+=      1.179
str.join   0.823
------ size = 500 ------
bytes+=    1.678
bytes.join 1.462
str+=      2.466
str.join   1.631
------ size = 1000 ------
bytes+=    3.051
bytes.join 2.762
str+=      4.220
str.join   2.822
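
To put a number on "not by that much", the ratio of += time to join
time can be computed directly from the table above. This is a small
sketch that only re-uses the posted timings; the RESULTS dict and the
slowdown helper are names made up for illustration.

```python
# Timings copied from the results table above: {size: {label: time}}
RESULTS = {
    10: {"bytes+=": 0.401, "bytes.join": 0.221, "str+=": 0.552, "str.join": 0.279},
    20: {"bytes+=": 0.419, "bytes.join": 0.236, "str+=": 0.565, "str.join": 0.305},
    50: {"bytes+=": 0.518, "bytes.join": 0.340, "str+=": 0.713, "str.join": 0.405},
    100: {"bytes+=": 0.654, "bytes.join": 0.454, "str+=": 0.894, "str.join": 0.580},
    200: {"bytes+=": 0.878, "bytes.join": 0.642, "str+=": 1.179, "str.join": 0.823},
    500: {"bytes+=": 1.678, "bytes.join": 1.462, "str+=": 2.466, "str.join": 1.631},
    1000: {"bytes+=": 3.051, "bytes.join": 2.762, "str+=": 4.220, "str.join": 2.822},
}


def slowdown(results, kind):
    """Return {size: time(+=) / time(join)} for kind 'bytes' or 'str'."""
    return {
        size: row[kind + "+="] / row[kind + ".join"]
        for size, row in results.items()
    }


for kind in ("bytes", "str"):
    for size, ratio in slowdown(RESULTS, kind).items():
        print(f"{kind:5s} size={size:4d}  +=/join = {ratio:.2f}")
```

For bytes the ratio falls from about 1.8 at size 10 to about 1.1 at
size 1000; for str it stays between roughly 1.4 and 2.0 across all
sizes.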

Gotta run,

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

