[Tutor] Surprised that print("a" "b") gives "ab"

eryk sun eryksun at gmail.com
Sun Mar 6 15:48:21 EST 2016


On Sun, Mar 6, 2016 at 9:52 AM, Steven D'Aprano <steve at pearwood.info> wrote:
> On Sun, Mar 06, 2016 at 01:03:01AM -0600, boB Stepp wrote:
>
>> get your semantics point, but are there two string objects created in
>> both approaches or does the first in fact create only a single object?
>>  If the first truly only creates a single object, then it seems that
>> this is a more efficient approach.
>
> In practical terms, in CPython today, there is no difference between the
> two, or if there is any difference, it's undetectable: by the time the
> compiler has generated the byte-code, the concatenation has been
> performed:
>
> py> import dis
> py> code = compile("""
> ... a = 'ab' 'cd'
> ... b = 'ab' + 'cd'
> ... """, "", "exec")
> py> dis.dis(code)
>   2           0 LOAD_CONST               0 ('abcd')
>               3 STORE_NAME               0 (a)
>
>   3           6 LOAD_CONST               4 ('abcd')
>               9 STORE_NAME               1 (b)
>              12 LOAD_CONST               3 (None)
>              15 RETURN_VALUE

Note that when CPython's bytecode optimizer folds a binary operation
whose operands are sized constants, such as strings, it keeps the
result only if its size is at most 20 items (characters, in the case
of strings). Whether or not the operation is folded, the original
constants remain in the resulting code object (and in the marshalled
.pyc file). This keeps the optimizer simple, since otherwise it would
have to track when a constant is no longer referenced by the final
code.

    >>> import dis
    >>> code = compile('"9876543210" + "0123456789";'
    ...                '"4321098765" + "56789012345"',
    ...                '<string>', 'exec')
    >>> code.co_consts
    ('9876543210', '0123456789', '4321098765', '56789012345',
     None, '98765432100123456789')
    >>> dis.dis(code)
      1           0 LOAD_CONST               5 ('98765432100123456789')
                  3 POP_TOP
                  4 LOAD_CONST               2 ('4321098765')
                  7 LOAD_CONST               3 ('56789012345')
                 10 BINARY_ADD
                 11 POP_TOP
                 12 LOAD_CONST               4 (None)
                 15 RETURN_VALUE

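The first sum is exactly 20 characters long, so it is folded into a
single constant. The second would be 21 characters, which exceeds the
limit, so its BINARY_ADD is left in the bytecode. Note also that
"98765432100123456789" is listed after None in co_consts because it
was appended in the optimization stage.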
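
The 20-item limit and the exact opcode names are details of the
CPython build shown above, so they may differ on other versions. A
minimal sketch for checking your own interpreter follows. It assumes
Python 3.4 or later (for dis.get_instructions), and compile_info is a
hypothetical helper name, not part of the standard library:

```python
import dis

def compile_info(source):
    """Compile source; return (co_consts, names of remaining BINARY_* opcodes)."""
    code = compile(source, '<string>', 'exec')
    ops = [ins.opname for ins in dis.get_instructions(code)
           if ins.opname.startswith('BINARY_')]
    return code.co_consts, ops

# Adjacent literals are joined when the source is parsed, so the compiler
# only ever sees the single string 'abcd' and emits no add instruction.
implicit_consts, implicit_ops = compile_info("a = 'ab' 'cd'")
print(implicit_consts, implicit_ops)

# Whether '+' is folded depends on the interpreter version and its size
# limits, so report what this interpreter does instead of assuming it.
for source in ('"9876543210" + "0123456789"',
               '"4321098765" + "56789012345"'):
    consts, ops = compile_info(source)
    print(source, '->', 'not folded' if ops else 'folded', consts)
```

Only the implicit form is independent of the optimizer. The results
for the two '+' expressions are whatever your interpreter chooses to
do.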

