How naive is Python?

John Machin sjmachin at lexicon.net
Mon Jan 15 09:06:48 CET 2007


skip at pobox.com wrote:
> John> How naive (in the sense that compiler people use the term) is the
>     John> current Python system?  For example:
>
>     John>       def foo() :
>     John>           s = "This is a test"
>     John>           return(s)
>
>     John>          s2 = foo()
>
>     John> How many times does the string get copied?
>
> Never.  s and s2 just refer to the same object (strings are immutable).
> Executing this:
>
>     def foo() :
>         print id("This is a test")
>         s = "This is a test"
>         print id(s)
>         return(s)
>
>     s2 = foo()
>     print id(s2)
>
> prints the same value three times.
>
>     John> Or, for example:
>
>     John>       s1 = "Test1"
>     John>       s2 = "Test2"
>     John>       s3 = "Test3"
>     John>       s = s1 + s2 + s3
>
>     John> Any redundant copies performed, or is that case optimized?
>
> Not optimized.  You can see that using the dis module:
>
>   4           0 LOAD_CONST               1 ('Test1')
>               3 STORE_FAST               0 (s1)
>
>   5           6 LOAD_CONST               2 ('Test2')
>               9 STORE_FAST               1 (s2)
>
>   6          12 LOAD_CONST               3 ('Test3')
>              15 STORE_FAST               2 (s3)
>
>   7          18 LOAD_FAST                0 (s1)
>              21 LOAD_FAST                1 (s2)
>              24 BINARY_ADD
>              25 LOAD_FAST                2 (s3)
>              28 BINARY_ADD
>              29 STORE_FAST               3 (s)
>              32 LOAD_CONST               0 (None)
>              35 RETURN_VALUE
>
> The BINARY_ADD opcode creates a new string.
>
>     John> How about this?
>
>     John>       kcount = 1000
>     John>       s = ''
>     John>       for i in range(kcount) :
>     John>           s += str(i) + ' '
>
>     John> Is this O(N) or O(N^2) because of recopying of "s"?
>
> O(N).  Here's a demonstration of that:
>
>     #!/usr/bin/env python
>
>     from __future__ import division
>
>     def foo(kcount):
>         s = ''
>         for i in xrange(kcount) :
>             s += str(i) + ' '
>
>     import time
>
>     for i in xrange(5,25):
>         t = time.time()
>         foo(2**i)
>         t = time.time() - t
>         print 2**i, t, t/2**i
>
> On my laptop t roughly doubles for each iteration and prints around 5e-06
> for t/2**i in all cases.

Sorry, Skip, but I find that very hard to believe. The foo() function
would take quadratic time if it were merely adding on pieces of
constant size -- however len(str(i)) is not a constant, it is
O(log10(i)), so the time should be super-quadratic. Following are the
results I got with Python 2.5, Windows XP Pro, 3.2Ghz Intel dual-core
processor with the last line changed to print i, 2**i, t, t/2**i coz I
can't do log2() in my head :-)

[snip]
18 262144 0.889999866486 3.39508005709e-006
19 524288 3.21900010109 6.13975544184e-006
20 1048576 12.2969999313 1.17273330034e-005
21 2097152 54.25 2.58684158325e-005
22 4194304 229.0 5.45978546143e-005
[k/b interrupt]

Cheers,
John




More information about the Python-list mailing list