python string comparison oddity

Hrvoje Niksic hniksic at xemacs.org
Thu Jun 19 07:04:29 EDT 2008


Faheem Mitha <faheem at email.unc.edu> writes:

> Yes, but why is '-' and 'foo' cached, and not '--'? Do you know what
> the basis of the choice is?

Caches such as the intern dictionary/set and the one-character cache
are specific to the implementation (and also to its version, etc.).
In this case '-' is a 1-character string, all of which are
cached.  Python also interns strings that show up in Python source as
literals that can be interpreted as identifiers.  It also reuses
string literals within a single expression.  None of this should be
relied on, but it's interesting to get insight into the implementation
by examining the different cases:

>>> '--' is '--'
True              # string repeated within an expression is simply reused

>>> a = '--'
>>> b = '--'
>>> a is b
False             # not cached

>>> a = '-'
>>> b = '-'
>>> a is b
True              # all 1-character strings are cached

>>> a = 'flobozz'
>>> b = 'flobozz'
>>> a is b
True              # flobozz is a valid identifier, so it's cached

>>> a = 'flo-bozz'
>>> b = 'flo-bozz'
>>> a is b
False             # flo-bozz is not a valid identifier, so it's not cached
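
Since none of the caching above should be relied on, here is a minimal
sketch of what to do instead: compare contents with ==, and intern
explicitly only if identity is really wanted.  It assumes Python 3,
where the function is sys.intern (in Python 2 it was the builtin
intern); the variable names are only illustrative.

```python
import sys

# Build the strings at run time so the compiler cannot share one literal.
parts = ['flo', '-', 'bozz']
a = ''.join(parts)
b = ''.join(parts)

# Equality compares contents and is what ordinary code should use.
print(a == b)

# Identity depends on the implementation's caches; the result is not
# something to rely on, so no expected value is given here.
print(a is b)

# Explicit interning returns one canonical object per distinct value,
# so identity of the interned strings is well defined.
ia = sys.intern(a)
ib = sys.intern(b)
print(ia is ib)
```

The first and last lines print True; the middle one is whatever the
implementation happens to do.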


