Strange behavior in string interpolation of constants
מיקי מונין
mickey.munin at gmail.com
Mon Oct 16 19:39:54 EDT 2017
Hello, I am working on an article on Python string formatting. As part of
the article I am researching the different forms of Python string
formatting.
While researching string interpolation (i.e. the % operator) I noticed
something weird related to string lengths.
Given the following two functions:
    def simple_interpolation_constant_short_string():
        return "Hello %s" % "World!"

    def simple_interpolation_constant_long_string():
        return "Hello %s. I am a very long string used for research" % "World!"
Let's look at the bytecode generated for them, using the dis module.
The first example produces the following bytecode:
  9           0 LOAD_CONST               3 ('Hello World!')
              2 RETURN_VALUE
This looks normal: it appears that the Python compiler folds the
constant expression and removes the need for string interpolation at run time.
However, the output for the second function caught my eye:
 12           0 LOAD_CONST               1 ('Hello %s. I am a very long string used for research')
              2 LOAD_CONST               2 ('World!')
              4 BINARY_MODULO
              6 RETURN_VALUE
This was not optimized by the compiler! Normal string interpolation was
used!
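For anyone who wants to reproduce the comparison without reading the disassembly by eye, here is a minimal sketch. The helpers opnames and was_folded are hypothetical names introduced for illustration, not part of the dis module. Because opcode names differ between Python versions, the check does not match specific instructions; it assumes that a folded expression leaves the finished string among the code object's constants.

```python
import dis


def simple_interpolation_constant_short_string():
    return "Hello %s" % "World!"


def simple_interpolation_constant_long_string():
    return "Hello %s. I am a very long string used for research" % "World!"


def opnames(func):
    """Return the opcode names of func's bytecode, in order (hypothetical helper)."""
    return [ins.opname for ins in dis.get_instructions(func)]


def was_folded(func, expected):
    """Report whether the finished string is stored as a constant (hypothetical helper).

    Assumption: if the compiler evaluated the % expression at compile time,
    the resulting string appears in the code object's co_consts tuple.
    """
    return expected in func.__code__.co_consts


for func in (simple_interpolation_constant_short_string,
             simple_interpolation_constant_long_string):
    # func() performs the interpolation at run time if it was not folded,
    # so it always yields the string we expect to find among the constants.
    print(func.__name__, opnames(func), was_folded(func, func()))
```

The result depends on the interpreter; what I describe in this message is what I see on my installation.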
Based on some more testing, it appears that no optimization is done when
the resulting string would be longer than 20 characters, as these
examples show:
    def expected_result():
        return "abcdefghijklmnopqrs%s" % "t"

Bytecode:
 15           0 LOAD_CONST               3 ('abcdefghijklmnopqrst')
              2 RETURN_VALUE

    def abnormal_result():
        return "abcdefghijklmnopqrst%s" % "u"

Bytecode:
 18           0 LOAD_CONST               1 ('abcdefghijklmnopqrst%s')
              2 LOAD_CONST               2 ('u')
              4 BINARY_MODULO
              6 RETURN_VALUE
I am using Python 3.6.3
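The length threshold can also be probed mechanically rather than by hand. The following is only a sketch: folded_at_length is a hypothetical helper that compiles a small function whose formatted result has a given length and then checks whether that result is stored as a constant. The range of lengths tried is an arbitrary choice around the cutoff I observed, and other interpreter versions may show a different cutoff or none at all.

```python
import sys


def folded_at_length(total_len):
    """Report whether '<prefix>%s' % 'b' is folded when the result has total_len characters.

    Hypothetical helper. Assumption: a folded expression leaves the finished
    string in the compiled function's co_consts tuple.
    """
    if total_len < 1:
        raise ValueError("total_len must be at least 1")
    prefix = "a" * (total_len - 1)
    # Build source such as:  return "aaaa%s" % "b"
    source = 'def f():\n    return "%s%%s" %% "b"\n' % prefix
    namespace = {}
    exec(compile(source, "<probe>", "exec"), namespace)
    return (prefix + "b") in namespace["f"].__code__.co_consts


print("Python", sys.version.split()[0])
for length in range(15, 26):
    print(length, folded_at_length(length))
```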
I am curious as to why this happens. Can anyone shed further light on this
behaviour?