when do two names cease to refer to the same string object?
Steven D'Aprano
steve at REMOVETHIScyber.com.au
Fri Mar 3 06:19:09 EST 2006
On Thu, 02 Mar 2006 20:45:10 -0500, John Salerno wrote:
> To test this out I wrote a little script as an exercise:
>
> for num in range(1, 10):
>     x = 'c' * num
>     y = 'c' * num
>
>     if x is y:
>         print 'x and y are the same object with', num, 'characters'
>     else:
>         print 'x and y are not the same object at', num, 'characters'
>         break
>
> But a few questions arise:
>
> 1. As it is above, I get the 'not the same object' message at 2
> characters. But doesn't Python only create one instance of small strings
> and use them in multiple references? Why would a two character string
> not pass the if test?
Watch this:
>>> "aaaaaa" is "aaaaaa"
True
>>> "aaaaaa" is "aaa" + "aaa"
False
Does this give you a hint as to what is happening?
Some more evidence:
>>> "aaaaa"[0:1] is "aaaaa"[0:1]
True
>>> "aaaaa"[0:2] is "aaaaa"[0:2]
False
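To spell the hint out a little, here is a rough sketch (make is a made-up helper name, and the code is written so that it should run on both Python 2 and Python 3). A string literal is stored as a constant when the code is compiled, whereas the strings below are built only when the function is called. Equal contents are guaranteed; whether the two results are the same object is left to the implementation, so no particular answer is promised for the second line of output.

```python
def make(n):
    # n is only known at run time, so the string is built when the
    # function is called rather than stored as a compile-time constant.
    return "c" * n

x = make(6)
y = make(6)

values_equal = (x == y)   # compares the characters
same_object = (x is y)    # compares object identity; implementation-dependent

print("equal values: %s" % values_equal)
print("same object:  %s" % same_object)
```

If the question is "do these strings hold the same text?", == is the test to use; "is" only answers "are these two names bound to one object?".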
> 2. If I say x = y = 'c' * num instead of the above, the if test always
> returns true. Does that mean that when you do a compound assignment like
> that, it's not just setting each variable to the same value, but setting
> them to each other?
Yes. Both x and y will be bound to the same object, not just two objects
with the same value. This is not an optimization for strings, it is a
design decision for all objects:
>>> x = y = []
>>> x.append(1)
>>> y
[1]
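For contrast, a small sketch that puts the chained assignment next to two separate assignments (the names a and b are only for illustration):

```python
# Chained assignment: one list object, two names bound to it.
x = y = []
x.append(1)

# Separate assignments: two distinct list objects with equal values.
a = []
b = []
a.append(1)

print(y)        # the change made through x is visible through y
print(b)        # b is a different list and is still empty
print(x is y)   # same object
print(a is b)   # different objects
```

With the separate assignments, each [] is evaluated on its own and builds a new list, so a and b start out equal but never share an object.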
> Finally, I'd like to see how others might write a script to do this
> exercise.
filename = "string_optimization_tester.py"
s = "if '%s' is not '%s':\n    raise ValueError('stopped at n=%d')\n"
f = file(filename, "w")
for n in range(1000):
    f.write(s % ("c"*n, "c"*n, n))
f.write("""if 'ccc' is not 'c'*3:
    print 'Expected failure failed correctly'
else:
    print 'Expected failure did not happen'
""")
f.write("print 'Done!'\n")
f.close()
execfile(filename)
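The same idea can be tried without writing a file. This is only a sketch under a few assumptions: first_mismatch is a made-up name, the generated source is compiled in memory instead of being run with execfile, and the two literals are bound to names before being compared. It should run on both Python 2 and Python 3.

```python
def first_mismatch(limit=1000):
    """Return the first length n at which two identical literals, compiled
    in the same chunk of source, are different objects; None if they
    never differ below limit."""
    for n in range(limit):
        literal = repr("c" * n)
        # Generate a three-line program containing the literal twice.
        source = "a = %s\nb = %s\nresult = a is b\n" % (literal, literal)
        namespace = {}
        exec(compile(source, "<generated>", "exec"), namespace)
        if not namespace["result"]:
            return n
    return None

print(first_mismatch())
```

Whatever this prints describes only the interpreter it was run on, not the language.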
--
Steven.