A Suggestion for Python Dictionary/Class

Wed Dec 27 11:33:05 EST 2000

<billtj at my-deja.com> wrote in message news:92cvte$b0i$1 at nnrp1.deja.com...
> Thanks for all the suggestions in making the Python class more like
> C/C++ struct or class.  However, it looks that the current solution is
> based on a "convention" to derive Python classes from a certain class.

"Convention"???

> I am not too familiar with how the Python interpreter works.  But based
> on my current understanding, the current Python class is implemented
> using dictionary, unlike in C/C++.  Therefore Python's
>
>     var_1.member_1.member_2.member_3
>
> will invoke three dictionary lookups, while C/C++'s

Four, actually: var_1 also has to be looked up (if it's a _local_
variable, then THAT 'look-up' is generally optimized away).
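
One way to see where these lookups come from is to disassemble the expression with the standard `dis` module. This is only a sketch, assuming a present-day CPython 3 (opcode names differ between versions); `global_chain`, `local_chain` and the two helpers are names made up for the illustration. Each dotted access becomes its own attribute-fetching instruction, while the initial name is fetched by a global load or by a fast local load.

```python
import dis

def global_chain():
    # var_1 is never assigned here, so the compiler treats it as a global name
    return var_1.member_1.member_2.member_3

def local_chain(var_1):
    # var_1 is a parameter, hence a local variable
    return var_1.member_1.member_2.member_3

def opnames(func):
    """Return the opcode names the compiler emitted for func."""
    return [ins.opname for ins in dis.get_instructions(func)]

def count_attribute_loads(func):
    """Count the instructions that fetch an attribute by name."""
    return sum(1 for name in opnames(func) if name.startswith("LOAD_ATTR"))

if __name__ == "__main__":
    # Print the full listings for inspection
    dis.dis(global_chain)
    dis.dis(local_chain)
```

Nothing is executed here beyond compilation, so the functions can be inspected even though no `var_1` exists.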

>     var_1.member_1.member_2.member_3
>
> is a (short) constant time operation, after compilation, of course.

Correct.  The compiler knows at compile-time the static types
of all objects involved -- enough of those types to know about
their *data* members, at least, since data can't be virtual in
C++.  There may be virtual *bases* involved, which require levels
of indirection; but, if there are, the compiler will know.  Access
will take longer, but the compiler will have placed the appropriate
machine code there to minimize the inevitable overhead.  And -- a
key idea of C++ -- the whole language is organized so
that you don't have to pay in terms of runtime for features you
are not actually using, at least in theory.

> My question is whether by defining a special class (call is struct?) in
> Python will result in better Python run-time performance.  For example,

Not by itself, no, because Python's typing is "latent", aka dynamic --
which is where so much of the language's power and flexibility come
from.  The compiler cannot determine statically, at compile-time,
what the runtime type of, e.g., var_1 will be -- not generally, and, at
this time, it doesn't even *try*.
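
A small illustration of why the type cannot be pinned down at compile-time, written for present-day Python 3; the class and function names are invented for the example. The same `read` function gets its result from an instance dictionary in one call and from a `__getattr__` hook in the other, and nothing in its source says which it will be.

```python
class Record:
    def __init__(self):
        # member_1 lives in the instance's own dictionary
        self.member_1 = "stored in the instance"

class Computed:
    def __getattr__(self, name):
        # called only when ordinary lookup finds nothing
        return "computed for " + name

def read(var_1):
    # nothing here tells the compiler what kind of object var_1 is
    return var_1.member_1
```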

So, once it has looked up (or resolved more rapidly, if it's a local
variable) the object corresponding to var_1, it will call its
"get-attribute" C function passing it the string "member_1"; this
call, in itself, is an overhead quite comparable to a virtual-function
call in C++ (call it a few machine instructions -- ones which C++
itself is not paying at all when no virtual behavior is involved,
but you can see Python [as, say, Java] as being 'always virtual').
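
The "pass the attribute name as a string" step is visible from Python itself through the builtin `getattr`. The sketch below assumes a throwaway `Node` class and a `follow` helper, both invented here, and shows that the dotted expression amounts to one `getattr` call per name.

```python
from functools import reduce

class Node:
    """Hypothetical container used only for this illustration."""
    def __init__(self, **members):
        self.__dict__.update(members)

var_1 = Node(member_1=Node(member_2=Node(member_3=42)))

def follow(obj, *names):
    """Apply getattr once per name, as the dotted expression does."""
    return reduce(getattr, names, obj)
```

With these definitions, `follow(var_1, "member_1", "member_2", "member_3")` and `var_1.member_1.member_2.member_3` reach the same object.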

It is then up to the 'type-object' to define its own appropriate
'get-attribute' function that turns the string 'member_1' into a
Python object pointer.  If the type of var_1's object is 'class
instance', its C-coded 'get-attribute' starts with a dictionary
lookup in the instance's dictionary, then, if that fails, with
one in the class-object's dictionary, then up the 'bases' DAG,
etc, etc.  See Python-2.0\Objects\classobject.c -- there are lots
of steps involved (e.g., checking for a __getattr__ method), but
none of them is complicated.
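
The lookup order just described can be modelled in a few lines of Python. This is a deliberately simplified toy, not a transcription of classobject.c: `ToyClass`, `ToyInstance`, `class_lookup` and `get_attribute` are made-up names, and method binding and the other special cases are left out.

```python
_MISSING = object()  # sentinel, so that None can be a legitimate attribute value

class ToyClass:
    def __init__(self, name, bases=(), **attrs):
        self.name = name
        self.bases = tuple(bases)
        self.dict = dict(attrs)

class ToyInstance:
    def __init__(self, klass, **attrs):
        self.klass = klass
        self.dict = dict(attrs)

def class_lookup(klass, name):
    """Search the class dictionary, then each base, depth first."""
    if name in klass.dict:
        return klass.dict[name]
    for base in klass.bases:
        found = class_lookup(base, name)
        if found is not _MISSING:
            return found
    return _MISSING

def get_attribute(inst, name):
    """Instance dictionary first, then the class and its bases, then the hook."""
    if name in inst.dict:
        return inst.dict[name]
    found = class_lookup(inst.klass, name)
    if found is not _MISSING:
        return found
    hook = class_lookup(inst.klass, "__getattr__")
    if hook is not _MISSING:
        return hook(inst, name)
    raise AttributeError(name)
```

Every step is a plain dictionary probe, which is why the cost grows with the depth of the class hierarchy when the attribute is not found early.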

It's not hard to define a type with fewer abilities, such as
the 'struct' you propose (having it have a special syntax as
you wish would be harder, as you'd have to mess with Python
internals a little bit; but just adding the semantics, a snap).

Actually, since 'struct' is already the name of an important
builtin module (which lets your Python code handle actual
binary-layout structs, just like C or C++ code would), we had
better name it differently -- say, 'bill'.

>     struct MyStruct:
>         def __init__ (self):
>             self.member_1
>             self.member_2
>             self.member_3

That might become something like:

import bill

def MyStruct(v1=None, v2=None, v3=None):
    return bill.bill(member_1=v1, member_2=v2, member_3=v3)
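
No 'bill' module exists, of course. To make the idea concrete, here is a pure-Python stand-in for what a `bill.bill` type might do, written for present-day Python 3. Everything in it is an assumption made for illustration (a real one would be coded in C): the members are fixed at creation, stored in a list, and found by position.

```python
class bill:
    """Hypothetical stand-in for a C-coded bill.bill type."""
    __slots__ = ("_names", "_values")

    def __init__(self, **members):
        # bypass our own __setattr__ while setting up the two real slots
        object.__setattr__(self, "_names", tuple(members))
        object.__setattr__(self, "_values", list(members.values()))

    def __getattr__(self, name):
        # reached only when ordinary lookup fails, i.e. for member names
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return self._values[self._names.index(name)]
        except ValueError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        # existing members may be rebound; new ones are refused
        try:
            self._values[self._names.index(name)] = value
        except ValueError:
            raise AttributeError("cannot add member %r" % name) from None

def MyStruct(v1=None, v2=None, v3=None):
    return bill(member_1=v1, member_2=v2, member_3=v3)
```

Note that the name still has to be turned into a position at runtime on every access, so this version is, if anything, slower than an ordinary instance.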

Or, to save some memory in each bill-object, you might want
'bill.struct' to be the factory function for factory objects
that encode everything known about a given kind of object (its
member names, say); each 'bill instance' object would then carry
the overhead of just a single pointer to its 'bill-kind' -- it's not
THAT much of an extra saving, and complexity zooms up fast.
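
A sketch of that factory-of-factories arrangement, again in pure present-day Python 3 and again with invented names (`struct`, `BillKind`, `BillInstance`): the kind object holds the member names and their offsets once, and each instance holds only a reference to its kind plus a list of values.

```python
class BillKind:
    """Hypothetical 'bill-kind': the layout shared by all its instances."""
    def __init__(self, *names):
        self.names = names
        self.offsets = {name: i for i, name in enumerate(names)}

    def __call__(self, *values):
        if len(values) > len(self.names):
            raise TypeError("too many values for this kind")
        # members not given a value start out as None
        padded = list(values) + [None] * (len(self.names) - len(values))
        return BillInstance(self, padded)

class BillInstance:
    __slots__ = ("_kind", "_values")

    def __init__(self, kind, values):
        object.__setattr__(self, "_kind", kind)
        object.__setattr__(self, "_values", values)

    def __getattr__(self, name):
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return self._values[self._kind.offsets[name]]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        try:
            self._values[self._kind.offsets[name]] = value
        except KeyError:
            raise AttributeError("cannot add member %r" % name) from None

def struct(*names):
    """Factory function returning a factory object (a kind)."""
    return BillKind(*names)
```

Usage would look like `MyStruct = struct("member_1", "member_2", "member_3")` followed by `s = MyStruct(1, 2, 3)`. The name-to-offset step is itself a dictionary lookup here, so the saving is in memory per instance, not in lookups.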

> Because it is a "struct" instead of a "class", we cannot add any new
> member.  Can then the Python interpreter be optimized to handle struct's
> so that the run-time performance is better than a class?  Coming from

Nah -- though the bill object itself _might_ save you a few machine
cycles per access over using a class-instance object, deciding to
accord some special object-type a superior status, to the point of
changing the interpreter itself to honor that status, is something
you'll never, but NEVER, get past Guido.  That is how come Python is
as powerful as it is while remaining simple -- NO ten zillion special
cases for 'convenience', '10% speed boosts', and other such ad-hockeries.

> C/C++ background and never saw the Python C code, I imagine that the

Since you're blessed with a C/C++ background -- why not download
and study the C sources?  They're blessedly CLEAN and SIMPLE -- you'll
wish all C code you ever had to maintain was written to such standards!-)

> interpreter allocates memory space for three labels (which are
> pointers/handles to Python objects?) and during
> interpretation/compilation, Python replaces the references for member_1,
> member_2, and member_3 simply to the pointers to the corresponding
> labels.  (I may not be precise in describing this, but hopefully you get
> the idea).  So no dictionary lookup is involved.

You might choose to do it differently than with Python dictionary
lookups (which are VERY FAST INDEED, as it happens), but you'd
still have (in your bill object) to translate string->offset at
runtime ('interned' strings, probably -- but that's designed to
make them even faster *for dictionary lookup purposes*...!).
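
A brief illustration of what interning buys, using `sys.intern` from present-day Python 3 (the variable names and the `offsets` table are invented for the example). Two equal strings built separately at runtime become one and the same object once interned, so, assuming CPython's usual identity-first key comparison, a lookup can match the key without comparing characters.

```python
import sys

# Build the same text twice at runtime, by two different routes.
a = "".join(["member", "_1"])
b = "_".join(["member", "1"])

# Interning maps equal strings to a single shared object.
ia = sys.intern(a)
ib = sys.intern(b)

# A hypothetical name -> offset table, keyed by interned names.
offsets = {sys.intern("member_1"): 0}
```

The translation from string to offset still happens at runtime, which is the point made above; interning only makes each such translation cheaper.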

If you really want to wring some real speed out of this scheme,
you have to get some measure of type-inference or static typing
into the Python compiler/interpreter; make the compiler able to
do what it already does for local variables, also for attributes
of some specially singled-out kind of object... but first it
has to 'pin down' a certain variable as SURE to refer at all
times to THAT kind of object and no other.  Two pretty hard
steps to get past Guido!-)
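
For reference, this is what "what it already does for local variables" looks like from the outside, in a sketch assuming present-day CPython 3 (`scale` is just an example function). The compiler can see every local name in the function body, so it assigns each one a fixed position, recorded in the code object's `co_varnames`, and no dictionary is consulted for them at runtime.

```python
def scale(values, factor):
    total = 0
    for v in values:
        total = total + v * factor
    return total

# Arguments come first, then other locals in order of first appearance;
# a local's position in this tuple is its slot number.
local_slots = scale.__code__.co_varnames
```

Attributes get no such treatment, because the set of names an arbitrary object will carry is not visible in the source of the function that uses it.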

Alex