[Tutor] 3 simple, pithy and short questions (fwd)

Fri Nov 21 19:28:14 EST 2003

> > 1.) Where is the difference between "processing" some function (let's say
> > "str" function) like str(a) and like a.__str__()  ??
> >
> > See two examples below:
> >
> >
> > >>> a = "ww"
> > >>> str(a)
> > 'ww'
> > >>> a.__str__()
> > 'ww'

Hi Tadey,

The only difference between:

   str(a)

and a.__str__() is one of usage, of the intended audience.

The first one:

    str(a)

is intended for us programmers to use.  __str__(), on the other hand, is
intended to be used by Python: the underscores are there to highlight this
function as slightly "special" or "magical" to the system, and you often
shouldn't have to look at it too closely.

In fact, a while back, numbers did not have __str__() defined for them.
In Python 1.5.2, for example:

###
[dyoo@tesuque dyoo]$ python1.5
Python 1.5.2 (#1, Jan 31 2003, 10:58:35)  [GCC 2.96 20000731 (Red Hat
Linux 7.3 2 on linux-i386
Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
>>> x = 42
>>> str(x)
'42'
>>> x.__str__()
Traceback (innermost last):
  File "<stdin>", line 1, in ?
AttributeError: 'int' object has no attribute '__str__'
###

In the old days, __str__() didn't work on numbers, although str() did,
because str() had some special cases hardcoded in it to handle the
numbers.

In recent years, the __str__() method was added to numbers to make them
more uniform with the other Python types and classes.  So now __str__()
should work on everything, and underneath the surface, str() calls
__str__() to get a string representation of an object.  But it's still
a lot better to avoid calling the magic method __str__() directly, and
to use str() instead.
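
To watch str() hand the work off to __str__(), here is a small sketch,
assuming a current Python 3.  The Temperature class is made up purely for
illustration; it is not anything built into Python:

```python
class Temperature:
    """A hypothetical class, used only to illustrate __str__()."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __str__(self):
        # Python calls this for us whenever str() or print() needs a string.
        return "%d degrees" % self.degrees


t = Temperature(42)
via_builtin = str(t)       # the form meant for us programmers
via_magic = t.__str__()    # works too, but str(t) is the preferred spelling
print(via_builtin)
print(via_magic)
```

Both lines print the same text: we define __str__(), and then reach it
through str().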

> > 2.) What hash function does ?? I noticed, that in case if some random
> > string is assigned to variable, it returns some long/huge number ....
> >
> > See example below:
> >
> >
> > >>> b = "r"
> > >>> hash(b)
> > 1707142003
>
> The dictionary uses the hash values in finding the object to be
> retrieved.  In some languages dictionaries would be called hash tables.
> The key used for storing something in a dictionary needs to support the
> computation of a hash value.

That large "hash" value is used as part of a computer science technique
called "hashing".  The idea is not unique to computer science, though:
take a close look at your English dictionary: it uses similar principles!

Hashing works very similarly to how dictionaries work: an English
dictionary has 26 categories based on the first letter of each word, so
each word maps to a particular category.

    'Alphabet' ----->  'A'
    'Python'   ----->  'P'

The categories are there to make it faster to look things up: when we want
to look for a definition, instead of having to look through the whole
dictionary from scratch, we can just pick the right chapter.  We still do
have to go through some pages, but it's a lot better than starting from
page 1.

Hashing works on the same principle, but instead of using 26 categories,
we can use a whole lot more.  And, just as in a real dictionary, the
categories are keyed: not by first letter, though, but by "hash value".
So:

###
>>> hash('Alphabet')
-1438023159
>>> hash('Python')
-539294296
###

The number isn't random: within a running interpreter, the same string
always gives the same number:

###
>>> hash("Alphabet")
-1438023159
>>> hash("alphabet")
1273818537
###

Capitalization does matter, though.

In order to transform hash values into categories, we do a little
mathematical trick: we divide the hash value by the number of categories
we're keeping track of, and take the remainder.  The remainder is the
ultimate category number, or "hash bucket", for that item.  If we use 26
categories, for example, then:

###
>>> hash("Alphabet") % 26
25
>>> hash("Python") % 26
18
>>> hash("tayiper") % 26
11
###

Python doesn't necessarily use 26 hash bucket categories.  But it uses
quite a few.  The goal of a good hash function is to try to evenly
distribute things among all of the categories, so as to make a nicely
balanced dictionary that makes it easy to look things up quickly.  The
Python dictionary type itself will use hash() quite extensively to
construct an efficient structure.
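
To make the bucket idea concrete, here is a toy lookup table built on
hash() and the remainder trick.  It is only a sketch: the names
make_table, put and get are invented for this example, 26 buckets is an
arbitrary choice, and Python's real dictionary is considerably more
sophisticated than this.

```python
NUM_BUCKETS = 26  # arbitrary, chosen to match the 26-category example


def make_table():
    # One empty list (a "bucket") per category.
    return [[] for _ in range(NUM_BUCKETS)]


def put(table, key, value):
    bucket = table[hash(key) % NUM_BUCKETS]
    for pair in bucket:
        if pair[0] == key:
            pair[1] = value   # key already present: overwrite its value
            return
    bucket.append([key, value])


def get(table, key):
    # Only one bucket is searched, never the whole table.
    bucket = table[hash(key) % NUM_BUCKETS]
    for k, v in bucket:
        if k == key:
            return v
    raise KeyError(key)


table = make_table()
put(table, "Alphabet", "a set of letters")
put(table, "Python", "a programming language")
print(get(table, "Python"))
```

One caveat for recent Python versions: string hash values usually differ
from one run of the interpreter to the next, so the bucket a word lands
in can change between runs.  The lookup still works, because put() and
get() compute the bucket the same way within a run.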

That being said, you probably don't have to worry about all this kind of
hash() stuff: most people do not explicitly call hash(), and let Python
itself take care of the busywork.

> > 3.) About __getattribute__ function: In help it is "represented" by
> > a.getattribute('name') <=> a.name But whatever I typed, I couldn't get
> > some "reasonable" output, meaning I always got an error message ...

Some things have 'attributes', and other things have 'indices'.  It turns
out that tuples, like:

    a = ("ww", "rr", "gggg")

can be indexed by position number, but they're not accessible by
attribute.

###
>>> a = ("ww", "rr", "gggg")
>>> a[0]
'ww'
>>> a[2]
'gggg'
###

Attributes and indices serve a similar function, but they are subtly
different.

Indexing is used on things that are arranged in sequential order, like
lists and tuples.  Attributes are used on things that don't quite have a
prearranged order.  For example, if we were to represent a person as an
object:

###
>>> class Person:
...     def __init__(self, name, age):
...         self.name = name
...         self.age = age
...
>>> bart = Person('Bart Simpson', 10)
>>> lisa = Person('Lisa Simpson', 8)
###

then we can define --- and later look up --- attributes of each person:

###
>>> bart.name
'Bart Simpson'
>>> lisa.age
8
###
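
Coming back to __getattribute__: it is the magic method that sits behind
the dot.  Here is a short sketch, assuming a current Python 3, where
every class supports it.  The Person class is repeated so that the
example runs by itself:

```python
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age


bart = Person('Bart Simpson', 10)

dotted = bart.name                       # the usual way
magic = bart.__getattribute__('name')    # what the help text describes
builtin = getattr(bart, 'name')          # the spelling meant for programmers

print(dotted, magic, builtin)

# Asking for an attribute the object does not have raises AttributeError,
# unless getattr() is given a default value to fall back on.
nickname = getattr(bart, 'nickname', 'no nickname')
print(nickname)
```

All three forms fetch the same value.  The argument has to be a string
naming an attribute that the object really has, so asking a tuple for
'name' ends in an error.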

But if we try to get indices out of them, we shouldn't expect much:

###
>>> bart[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: Person instance has no attribute '__getitem__'
>>> lisa[1]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: Person instance has no attribute '__getitem__'
###

And if we think about it, we really shouldn't expect indexing to just
work!  When we say bart[0], are we trying to get at the name, or age,
or... ?  Python will not try to guess what we mean.
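
If we really do want indexing on a person, we have to tell Python what
it should mean, by defining __getitem__ ourselves.  Here is a sketch,
with the arbitrary (and purely illustrative) decision that position 0 is
the name and position 1 is the age.  IndexablePerson is a made-up name:

```python
class IndexablePerson:
    """Hypothetical variant of Person that spells out what indexing means."""

    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __getitem__(self, position):
        # Our own decision: 0 -> name, 1 -> age.  Anything else is an error.
        return (self.name, self.age)[position]


bart = IndexablePerson('Bart Simpson', 10)
print(bart[0])
print(bart[1])
```

Python still does no guessing here: it calls the __getitem__ we wrote.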

Likewise, if we have a list of things:

###
>>> even_numbers = range(0, 20, 2)
>>> even_numbers
[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
###

It makes sense to take indices:

###
>>> even_numbers[5]
10
>>> even_numbers[3]
6
###

but taking attributes is less successful:

###
>>> even_numbers.5
  File "<stdin>", line 1
    even_numbers.5
                 ^
SyntaxError: invalid syntax
###

as attributes are not really supposed to be numbers: attributes are not
positional in nature.  (It is possible to force the system to add numeric
attributes... but you are not supposed to know that.  *grin*)

So the usage of attributes and indices is meant to be different.  Lists do
have some attributes, but these correspond not to the elements in the
collection but to actions that we can perform on lists:

###
>>> dir(even_numbers)
['__add__', '__class__', '__contains__', '__delattr__', '__delitem__',
'__delslice__', '__doc__', '__eq__', '__ge__', '__getattribute__',
'__getitem__', '__getslice__', '__gt__', '__hash__', '__iadd__',
'__imul__', '__init__', '__le__', '__len__', '__lt__', '__mul__',
'__ne__', '__new__', '__reduce__', '__repr__', '__rmul__', '__setattr__',
'__setitem__', '__setslice__', '__str__', 'append', 'count', 'extend',
'index', 'insert', 'pop', 'remove', 'reverse', 'sort']
>>>
>>>
>>> even_numbers.reverse
<built-in method reverse of list object at 0x812c38c>
>>>
>>> even_numbers.reverse()
>>>
>>> even_numbers
[18, 16, 14, 12, 10, 8, 6, 4, 2, 0]
###

So lists have attributes that correspond to methods like reverse() or
extend() or pop().

Tuples support even fewer attributes, and many of them are there only to
support Python:

###
>>> l = (42, 43)
>>> dir(l)
['__add__', '__class__', '__contains__', '__delattr__', '__doc__',
'__eq__', '__ge__', '__getattribute__', '__getitem__', '__getslice__',
'__gt__', '__hash__', '__init__', '__le__', '__len__', '__lt__',
'__mul__', '__ne__', '__new__', '__reduce__', '__repr__', '__rmul__',
'__setattr__', '__str__']
###

The double-underscored attributes here also correspond to methods, but
they're not meant to be called by us directly.
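
As a rough illustration, assuming a current Python 3, here are a few of
those magic methods next to the friendlier spellings that Python
translates into them:

```python
t = (42, 43)

# Each pair holds (friendly spelling, magic method called directly).
pairs = [
    (len(t), t.__len__()),            # len()  -> __len__
    (repr(t), t.__repr__()),          # repr() -> __repr__
    (43 in t, t.__contains__(43)),    # in     -> __contains__
    (t[0], t.__getitem__(0)),         # t[0]   -> __getitem__
    (t + (44,), t.__add__((44,))),    # +      -> __add__
]

for friendly, magic in pairs:
    print(friendly, magic)
```

Each pair gives the same result, which is why we can stick with the
left-hand spellings and leave the right-hand ones to Python.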

Anyway, hope this clears some things up.  It looks like you're exploring
the Python language by using help().  But are you also looking through a
tutorial?

Talk to you later!