[Python-ideas] checking for identity before comparing built-in objects

Oscar Benjamin oscar.j.benjamin at gmail.com
Thu Oct 4 18:48:59 CEST 2012


On 4 October 2012 17:05, MRAB <python at mrabarnett.plus.com> wrote:
> On 2012-10-04 16:53, Steven D'Aprano wrote:
>>
>> On 05/10/12 01:08, Victor Stinner wrote:
>>>
>>> 2012/10/4 Steven D'Aprano<steve at pearwood.info>:
>>>>
>>>> On 04/10/12 21:48, Max Moroz wrote:
>>>>>
>>>>>
>>>>> It seems that built-in classes do not short-circuit `__eq__` method
>>>>> when the objects are identical, at least in CPython:
>>>>>
>>>>>       f = frozenset(range(200000000))
>>>>>       f1 = f
>>>>>       f1 == f # this operation will take about 1 sec on my machine
>>>>
>>>>
>>>>
>>>> You shouldn't over-generalize. Some built-ins do short-circuit __eq__
>>>> when the objects are identical. I believe that strings and ints both
>>>> do. Other types might not.
>>>
>>>
>>> This optimization is not implemented for Unicode strings.
>>
>>
>> That does not match my experience. In Python 3.2, I generate a large
>> unicode string, and an equal but not identical copy:
>>
>> s = "aЖcdef"*100000
>> t = "a" + s[1:]
>> assert s is not t and s == t
>>
>>
>> Using timeit, s == s is about 10000 times faster than s == t.
>>
> In Python 3.3 I get a similar result.

This was discussed not long ago in a different thread. Here is the
relevant line in unicodeobject.c:
http://hg.python.org/cpython/file/bd8afb90ebf2/Objects/unicodeobject.c#l10508

As I understood it, that line is the reason that comparisons of
interned strings are faster.
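
For anyone who wants to reproduce the timing difference quoted above,
here is a rough sketch. It assumes CPython, since the identity shortcut
is an implementation detail rather than a language guarantee. The names
same_time and diff_time are mine, and the absolute numbers will vary by
machine and version.

```python
import timeit

# Build a large non-ASCII string and an equal but distinct copy,
# mirroring the example quoted earlier in the thread.
s = "aЖcdef" * 100000
t = "a" + s[1:]
assert s is not t and s == t

# Time comparing the string with itself (identical objects) against
# comparing it with the equal copy (distinct objects). A callable is
# used so the snippet does not depend on the timeit globals argument.
same_time = timeit.timeit(lambda: s == s, number=1000)
diff_time = timeit.timeit(lambda: s == t, number=1000)

print("s == s: %.6f s for 1000 comparisons" % same_time)
print("s == t: %.6f s for 1000 comparisons" % diff_time)
```

If s == s comes out far cheaper than s == t, that is consistent with an
identity check happening before the character-by-character comparison.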


Oscar
