2.6, 3.0, and truly independent intepreters

Terry Reedy tjreedy at udel.edu
Fri Oct 24 23:39:16 EDT 2008


Glenn Linderman wrote:

> For example, Python presently has a rather stupid algorithm for string 
> concatenation.

Python the language has syntax and semantics.  Python implementations 
have algorithms that fulfill the defined semantics.

> It allocates only the exactly necessary space for the 
> concatenated string.  This is a brilliant move, when you realize that 
> strings are immutable, and once allocated can never change, but the 
> operation
> 
> for line in mylistofstrings:
>    string = string + line
> 
> is basically O(N-squared) as a result.  The better algorithm would 
> double the size of memory allocated for string each time there is not 
> enough room to add the next line, and that reduces the cost of the 
> algorithm to O(N).

If there is more than one reference to a guaranteed immutable object, 
such as a string, the 'stupid' algorithm seem necessary to me.  In-place 
modification of a shared immutable would violate semantics.

However, if you do

string = ''
for line in strings:
   string =+ line

so that there is only one reference and you tell the interpreter that 
you don't mind the old value being updated, then I believe in 2.6, if 
not before, CPython does overallocation and in-place extension.  (I am 
not sure about s=s+l.)  But this is just ref-counted CPython.

Terry Jan Reedy




More information about the Python-list mailing list