[Tutor] Clarification questions about how Python uses references.

Thu Jun 24 00:13:14 EDT 2021

On Wed, Jun 23, 2021 at 10:50 PM Cameron Simpson <cs at cskk.id.au> wrote:
>
> On 23Jun2021 17:18, boB Stepp <robertvstepp at gmail.com> wrote:
> >I continue to attempt to refine my understanding of how Python uses
> >identifiers to reference objects.  Below is an interpreter session
> >that I have a couple of questions about:
> >
> >>>> a = 'banana'
> >>>> b = 'banana'
> >>>> a is b
> >True
> >>>> a = 1000000000
> >>>> b = 1000000000
> >>>> a is b
> >False
> >>>> a = 'This is a much longer string than previously.  I wonder what the result will be?'
> >>>> b = 'This is a much longer string than previously.  I wonder what the result will be?'
> >>>> a is b
> >False

> >But the string example is new to me.  It appears that Python caches
> >smaller strings.  Is this true?  If yes, what might the caching limits
> >be?
>
> Yes.
>
> Unsure. Maybe read the source. I don't think the language specifies that
> this must occur.

Ugh.  Peter's last pointing to the source and links to a couple of
articles are still on my reading list.  Parsing C code when I've never
studied C is hard slogging for me.  But being ever curious I will
probably get around to looking at string cache limits, to no
useful point I'm sure!
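
For anyone following along, here is a small sketch of the difference
between equality and identity for strings.  Whether two equal strings
happen to share one object is an implementation detail (in CPython it can
even differ between the interactive prompt and a script file), so the
sketch builds its strings at run time and only treats sys.intern() as
reliable.  The variable names are just for illustration.

```python
import sys

# Build two equal strings at run time so the compiler cannot merge them
# into a single shared constant.
a = ''.join(['ban', 'ana'])
b = ''.join(['ban', 'ana'])

same_value = (a == b)    # equality compares contents: True here
same_object = (a is b)   # identity: implementation-dependent, don't rely on it

# sys.intern() returns the canonical copy of a string, so two interned
# strings with equal contents are the same object.
a_interned = sys.intern(a)
b_interned = sys.intern(b)
interned_same_object = (a_interned is b_interned)

print(same_value, same_object, interned_same_object)
```

The practical upshot: compare strings (and numbers) with ==, and keep
"is" for things like None.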

> >On to lists.  My current understanding is that lists don't actually
> >contain the objects themselves, but, instead, references to those
> >objects.  Is this correct?
>
> Yes. Just like _any_ variable or container.
>
> >How could I prove this to myself in the
> >interpreter?
>
>     >>> L1 = [1]
>     >>> L2 = [2]
>     >>> LL1 = [L1, L1]
>     >>> LL1
>     [[1], [1]]
>     >>> LL3 = [L1, L2]
>     >>> LL3
>     [[1], [2]]
>     >>> L1.append(2)
>     >>> LL1
>     [[1, 2], [1, 2]]
>     >>> LL3
>     [[1, 2], [2]]
>     >>> LL1[0] is L1
>     True

I actually did some similar experimenting prior to posting, but for
some reason talked myself out of what I thought I was demonstrating.
Sleepy I guess.
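
A compact version of the same experiment, as a sketch: "is" checks
whether two list slots refer to one object, and a mutation made through
one name shows up everywhere that object is referenced.  An independent
copy is included for contrast.  All names here are made up for the
example.

```python
inner = [1]
outer = [inner, inner]        # two slots, both referring to the same list
independent = list(inner)     # a new list object with equal contents

slots_share_object = outer[0] is outer[1]   # both slots are `inner`
copy_is_distinct = independent is not inner # the copy is a separate object

inner.append(2)               # mutate the one shared list object

print(outer)         # both slots show the change
print(independent)   # the copy does not
```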

> >Does this translate to tuples and sets?
>
> Of course. They just store references.
>
> >Even though
> >tuples are immutable they can contain mutable objects.
>
> Sure. Immutable means you can't change the references, not necessarily
> that the references are to other immutable things.

Ah!  This is a key point that has not clicked with me till now!  It's
the _references_ that can't be changed.  Thanks!!
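
A sketch of that point for my own notes: the tuple refuses to have one
of its slots rebound, but the list that a slot refers to can still be
changed in place.  The names are only illustrative.

```python
items = []
t = (items, 'x')

# Rebinding a slot of the tuple is not allowed.
rebind_failed = False
try:
    t[0] = ['something else']
except TypeError:
    rebind_failed = True

# Mutating the object that the slot refers to is fine.
items.append(1)

still_same_object = t[0] is items   # the reference itself never changed
print(rebind_failed, t, still_same_object)
```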

> >Playing around
> >in the interpreter it looks like even if sets contain tuples, no
> >mutable elements can be in the tuples.  Is this in general correct?
>
> That's because set elements, like dict keys, need to be hashable and
> stable. Lookups in sets and dicts rely on the objects' hash values
> staying the same, because the hash value governs which slot in the
> internal hash table contains the reference.
>
> If the hash of an object changes, the interpreter will look in a
> different hash slot. Badness ensues.
>
> The explicitly immutable types like str or tuple have a __hash__ method.
> The contract you obey if you implement a __hash__ method yourself is
> that it is stable regardless of what happens to the object. So most
> mutable types don't provide a __hash__, which is how you know you can't
> use them in a set: try to put one in a set and when the set tries to
> access the hash code it fails.
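
The two behaviours described above can be sketched as follows.  The
first part shows a tuple with a list inside being rejected by a set.
The second part uses UnstableKey, a hypothetical class written only for
this example, which deliberately breaks the __hash__ contract; the
"lost element" result at the end is what I would expect on CPython.

```python
# Tuples of hashable things are fine as set elements.
ok = {(1, 2), ('a', 'b')}

# A tuple containing a list cannot be hashed, so the set rejects it.
unhashable_error = None
try:
    bad = {(1, [2])}
except TypeError as exc:
    unhashable_error = exc


class UnstableKey:
    """Hypothetical class whose hash depends on mutable state (don't do this)."""

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return hash(self.value)

    def __eq__(self, other):
        return isinstance(other, UnstableKey) and self.value == other.value


key = UnstableKey(1)
bag = {key}
found_before = key in bag   # hash matches the one stored at insertion

key.value = 2               # the hash of `key` changes behind the set's back
found_after = key in bag    # lookup uses the new hash and misses

print(found_before, found_after, len(bag))
```

The object is still in the set (its length is still 1), but it can no
longer be found by lookup, which is the kind of trouble the hash
contract exists to prevent.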

Hashes/hashing have not made it into my education yet.  I am hoping
that this intro CSc book I am reading will eventually address this.
Plus it seems that "hashing" has other contexts/meanings, as in things
like checksums using MD5 or some such.

Another thing to learn...

Thanks!
boB Stepp