Checking interned string after stringobjects concat?

Hi all:

I notice that when two string objects are concatenated, the PVM does not check the dictionary of interned strings. For example:

    >>> a = "qwerty"
    >>> b = "qwe"
    >>> c = "rty"
    >>> d = b+c
    >>> id(a)
    4572089736
    >>> id(d)
    4572111176
    >>> e = "".join(["qwe","rty"])
    >>> id(e)
    4546460280

But when two string literals are concatenated directly, the PVM does check the dictionary:

    >>> a = "qwerty"
    >>> b = "qwe"+"rty"
    >>> id(a)
    4546460112
    >>> id(b)
    4546460112

This happens in both Py2 and Py3. Is it necessary to fix this bug or not?

Cheers!
---
Yinbin

On Sat, Apr 21, 2018 at 8:25 PM, Yinbin Ma <mayinbing12@gmail.com> wrote:
> But when two string literals are concatenated directly, the PVM does check the dictionary:
>
>     >>> a = "qwerty"
>     >>> b = "qwe"+"rty"
>     >>> id(a)
>     4546460112
>     >>> id(b)
>     4546460112
>
> This happens in both Py2 and Py3. Is it necessary to fix this bug or not?

What you're seeing there is actually the peephole optimizer at work. Your assignment to 'b' here is actually the exact same thing as 'a', by the time you get to execution. If you're curious about what's happening, check out the dis.dis() function and have fun! :)

ChrisA
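
For anyone who wants to follow the dis.dis() suggestion, here is a minimal sketch. The function name assign_both is made up for illustration, and the exact bytecode printed differs between CPython versions. The point is only that, on a CPython that folds constants, both assignments load an already-built 'qwerty' constant instead of adding two strings at run time.

```python
import dis


def assign_both():
    # Hypothetical helper: the two assignments from the original post.
    a = "qwerty"
    b = "qwe" + "rty"  # two literals, so the compiler can fold this
    return a, b


# Print the bytecode. The listing format depends on the Python version.
dis.dis(assign_both)

# The folded string is stored among the function's constants.
consts = assign_both.__code__.co_consts
print(consts)
```

Whether a and b end up as the very same object is still an implementation detail; the listing only shows why it tends to happen.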

On 21 April 2018 at 11:42, Chris Angelico <rosuav@gmail.com> wrote:
> What you're seeing there is actually the peephole optimizer at work. Your assignment to 'b' here is actually the exact same thing as 'a', by the time you get to execution. If you're curious about what's happening, check out the dis.dis() function and have fun! :)

To clarify, though, this is not a bug. The language doesn't guarantee that the two strings will have the same id, just that they will be equal (in the sense of ==).

Paul
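
A small sketch of the distinction between equality and identity, reusing the variable names from the original post. The use of sys.intern() is an addition for illustration and was not suggested in the thread; it is the documented way to get a canonical object when identity really matters.

```python
import sys

a = "qwerty"
b = "qwe"
c = "rty"
d = b + c  # built at run time from two names, not from two literals

# Equality is what the language guarantees.
print(d == a)  # True

# Identity is an implementation detail, so no particular result is promised.
print(d is a)

# Interning both strings explicitly yields one shared object.
print(sys.intern(d) is sys.intern(a))  # True
```

Code that compares strings with == behaves the same whichever way the identity check turns out.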

21.04.18 13:42, Chris Angelico wrote:
> What you're seeing there is actually the peephole optimizer at work.

Since 3.7, constant folding is done by the AST optimizer. The end result is the same in most cases, though.

Another optimization takes place here too: constant strings that look like identifiers (short strings consisting of ASCII alphanumeric characters) are interned in the code object constructor.
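
One way to observe the interning of identifier-like constants is to compile the same literal in two separate compilation units and compare the resulting objects. This is only a sketch: compile_constant and the "<unit-N>" labels are hypothetical names, and the results of the identity checks depend on the CPython version and build, so no expected output is given.

```python
def compile_constant(source, label):
    """Compile `source` as its own unit and return the value it binds to `s`."""
    namespace = {}
    exec(compile(source, label, "exec"), namespace)
    return namespace["s"]


# Same identifier-like literal, compiled twice in separate units.
ident_1 = compile_constant('s = "qwerty"', "<unit-1>")
ident_2 = compile_constant('s = "qwerty"', "<unit-2>")

# Same literal containing a space and punctuation, also compiled twice.
other_1 = compile_constant('s = "qwe rty!"', "<unit-1>")
other_2 = compile_constant('s = "qwe rty!"', "<unit-2>")

# Identity results are implementation details and may vary.
print("identifier-like literal shared:", ident_1 is ident_2)
print("other literal shared:          ", other_1 is other_2)
```

Separate compile() calls are used so that sharing of constants within a single compilation unit does not hide the effect of interning.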

On Sat, Apr 21, 2018 at 10:29 PM, Serhiy Storchaka <storchaka@gmail.com> wrote:
> Since 3.7, constant folding is done by the AST optimizer. The end result is the same in most cases, though.
>
> Another optimization takes place here too: constant strings that look like identifiers (short strings consisting of ASCII alphanumeric characters) are interned in the code object constructor.

Ah, sorry, my bad. Anyhow, it's part of compile-time optimization, which means that it runs the exact same code for both assignments.

ChrisA

21.04.18 17:47, Chris Angelico wrote:
> Ah, sorry, my bad. Anyhow, it's part of compile-time optimization, which means that it runs the exact same code for both assignments.

Don't blame yourself for missing details of the implementation of the version that is not released yet. ;-)

But to the OP, this is not considered a bug.

On Sat, Apr 21, 2018, 07:59 Serhiy Storchaka <storchaka@gmail.com> wrote:
> 21.04.18 17:47, Chris Angelico wrote:
>> Ah, sorry, my bad. Anyhow, it's part of compile-time optimization, which means that it runs the exact same code for both assignments.
>
> Don't blame yourself for missing details of the implementation of the version that is not released yet. ;-)
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
participants (5)
- Chris Angelico
- Guido van Rossum
- Paul Moore
- Serhiy Storchaka
- Yinbin Ma