python3 byte decode

Cameron Simpson cs at cskk.id.au
Sat Nov 4 21:06:24 EDT 2017


On 04Nov2017 01:47, Chris Angelico <rosuav at gmail.com> wrote:
>On Fri, Nov 3, 2017 at 8:24 PM, Ali Rıza KELEŞ <ali.r.keles at gmail.com> wrote:
>> Yesterday, while working with redis, I encountered a strange case.
>>
>> I want to ask why the following is `True`
>>
>> ```
>> "s" is b"s".decode()
>> ```
>>
>> while the following are `False`?
>>
>> ```
>> "so" is b"so".decode()
>> "som" is b"som".decode()
>> "some" is b"some".decode()
>> ```
>>
>> Or vice versa?
>>
>> I read that `is` compares same objects, not values. So my question is
>> why "s" and b"s".decode() are same objects, while the others aren't?
>>
>> My python version is 3.6.3.
>
>You shouldn't be comparing string objects with 'is'. Sometimes two
>equal strings will be identical, and sometimes they won't. All you're
>seeing is that the interpreter happened to notice and/or cache this
>particular lookup.

To be more clear here, usually when humans say "identical" they mean having 
exactly the same value or attributes. 

Here, Chris means that the two strings are actually the same object rather than 
two equivalent objects. "is" tests the former (the same object). "==" is for 
testing the latter (the objects have the same value).
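
A small sketch of that distinction, using lists because two separately built 
lists are always separate objects (the names here are just for illustration):

```python
# Two lists built separately: equal values, distinct objects.
first = [1, 2, 3]
second = [1, 2, 3]
alias = first  # a second name for the same object

print(first == second)  # True: same value
print(first is second)  # False: two separate list objects
print(first is alias)   # True: both names refer to one object
```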

For speed and memory reasons, Python (CPython at least) notices some small 
string and int values and allocates them only once. So when you write:

  a = "s"
  b = "s"

Python will reuse the same str object for both. Because of this, not only is "a 
== b" (i.e. they have the same value) but also "a is b" (a and b refer to the 
same object). But this is not guaranteed by the language, and for larger values 
Python often doesn't bother. E.g.:

  a = "ghghghghghg"
  b = "ghghghghghg"

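Here they will have the same value but may be different objects. So "==" will 
still return True, but "is" may return False. Whether two equal literals end up 
shared depends on the implementation and on how the code was compiled (CPython 
often shares identical literals as well), which is exactly why you can't rely 
on it.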
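
To see this without depending on what the interpreter does with literals, here 
is a rough sketch that builds the second string at run time. It deliberately 
does not rely on the result of "is" between the two, since that is an 
implementation detail; sys.intern() is the standard library call for explicitly 
asking for a single shared object per string value.

```python
import sys

literal = "ghghghghghg"
built = "".join(["gh"] * 5 + ["g"])  # same text, assembled at run time

print(literal == built)  # True: the values match
# Whether `literal is built` holds is an implementation detail,
# so this sketch does not depend on it.

# sys.intern returns one canonical object per distinct string value.
print(sys.intern(literal) is sys.intern(built))  # True
```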

You should usually be using "==" to compare things. "is" has its place, but it 
is usually not what you're after.
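
One place where "is" is the right tool is testing against a singleton such as 
None. A minimal sketch (describe() is a made-up function for illustration):

```python
def describe(value=None):
    # None is a singleton, so an identity test is the idiomatic check.
    # An "==" test could be fooled by an object with a custom __eq__.
    if value is None:
        return "no value given"
    return "got " + repr(value)

print(describe())    # no value given
print(describe(""))  # got ''
```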

In your example code, b"s".decode() returns the string value "s", and Python is 
internally deciding to reuse the existing "s" from the left half of your 
comparison. It can do this safely because strings are immutable: nothing can 
alter the shared object (for example, "+=" on a string makes a new string 
rather than changing the old one).
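
A short sketch of both points: "+=" rebinds the name to a new string and 
leaves the old object alone, and "==" is the dependable way to compare a 
decoded bytes value with a str.

```python
s = "abc"
t = s     # t refers to the same str object as s
s += "d"  # builds a new str and rebinds s; the old object is untouched

print(s)  # abcd
print(t)  # abc

# Compare decoded bytes by value, not by identity.
print(b"so".decode() == "so")  # True
```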

Hoping this is now more clear,
Cameron Simpson <cs at cskk.id.au> (formerly cs at zip.com.au)


