dicts, instances, containers, slotted instances, et cetera.

Wed Jan 28 23:44:32 EST 2009

On Wed, 28 Jan 2009 15:20:41 -0800, ocschwar wrote:

> On Jan 28, 4:50 pm, Aaron Brady <castiro... at gmail.com> wrote:
>> On Jan 28, 2:38 pm, ocsch... at gmail.com wrote:
>>
>> Hello, quoting myself from another thread today:
>>
>> There is the 'shelve' module.  You could create a shelf that tells you
>> the filename of the 5 other ones.  A million keys should be no problem,
>> I guess.  (It's standard library.)  All your keys have to be strings,
>> though, and all your values have to be pickleable.  If that's a
>> problem, yes you will need ZODB or Django (I understand), or another
>> relational DB.
>>
>> There is currently no way to store live objects.
> 
> 
> The problem is NOT archiving these objects. That works fine.
> 
> It's the computations I'm using these things for that are slow, and that
> failed to speed up using __slots__.

You've profiled and discovered that the computations are slow, not the 
archiving?

What parts of the computations are slow?
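
One way to answer that question is to run the computation under the standard library profiler. The following is only a sketch for Python 3: the Point class and the compute() function are hypothetical stand-ins for the real objects and workload, which the thread does not show.

```python
import cProfile
import io
import pstats


class Point(object):
    """Hypothetical stand-in for the objects being computed on."""
    def __init__(self):
        self.x = 1.0
        self.y = 0.0


def compute(obj, n):
    """Hypothetical workload: repeated attribute reads and writes."""
    for _ in range(n):
        obj.y = obj.x * 0.5 + 1.0
        obj.x = obj.y * 0.5
    return obj.x


def profile_report(func, *args):
    """Run func under cProfile and return the stats text, sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func(*args)
    profiler.disable()
    out = io.StringIO()
    pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(10)
    return out.getvalue()


if __name__ == "__main__":
    print(profile_report(compute, Point(), 100000))
```

The report lists time per function, so it shows which functions dominate. It does not break time down per attribute access inside a function; for that, timeit on small snippets is the better tool.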

> What I need is something that will speed up getattr() or its equivalent,
> and to a lesser degree setattr() or its equivalent.

As you've found, __slots__ is not that thing.

>>> from timeit import Timer
>>> class Slotted(object):
...     __slots__ = 'a'
...     a = 1
...
>>> class Unslotted(object):
...     a = 1
...
>>> t1 = Timer('x.a', 'from __main__ import Slotted; x = Slotted()')
>>> t2 = Timer('x.a', 'from __main__ import Unslotted; x = Unslotted()')
>>>
>>> min(t1.repeat(10))
0.1138761043548584
>>> min(t2.repeat(10))
0.11414718627929688
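
For anyone repeating the measurement on Python 3, here is a self-contained sketch. It differs from the session above in one respect, which is an assumption on my part about what was intended: the attribute is set in __init__ rather than in the class body, because Python 3 raises ValueError when a name in __slots__ is also assigned as a class variable. The best_time() helper is hypothetical, and no timings are given since they depend on the machine and interpreter.

```python
from timeit import Timer


class Slotted(object):
    __slots__ = ('a',)

    def __init__(self):
        self.a = 1


class Unslotted(object):
    def __init__(self):
        self.a = 1


def best_time(obj, number=100000, repeat=5):
    """Return the best of `repeat` timings for `number` reads of obj.a."""
    timer = Timer('x.a', globals={'x': obj})
    return min(timer.repeat(repeat=repeat, number=number))


if __name__ == "__main__":
    print('slotted:  ', best_time(Slotted()))
    print('unslotted:', best_time(Unslotted()))
```

Taking the minimum of several repeats, as in the session above, reduces noise from other processes.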

One micro-optimization you can do is something like this:

for i in xrange(1000000):
    obj.y = obj.x + 3*obj.x**2
    obj.x = obj.y - obj.x
    # 12 name lookups per iteration

Becomes:

y = None
x = obj.x
try:
    for i in xrange(1000000):
        y = x + 3*x**2
        x = y - x
        # 6 name lookups per iteration
finally:
    obj.y = y
    obj.x = x
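
To check that the rewrite gives the same result and to measure whether it helps, both loops can be wrapped in functions and timed. This is a sketch for Python 3 (range instead of xrange); the Obj class, the function names and the starting value 0.1 are assumptions made for illustration. The local y starts from obj.y rather than None so that a zero-iteration run leaves the object unchanged.

```python
import time


class Obj(object):
    """Hypothetical object with the two attributes used in the loops."""
    def __init__(self, x):
        self.x = x
        self.y = None


def with_attributes(obj, n):
    """Original form: every read and write goes through the object."""
    for _ in range(n):
        obj.y = obj.x + 3 * obj.x ** 2
        obj.x = obj.y - obj.x


def with_locals(obj, n):
    """Rewritten form: work on local names, write back once at the end."""
    y = obj.y
    x = obj.x
    try:
        for _ in range(n):
            y = x + 3 * x ** 2
            x = y - x
    finally:
        obj.y = y
        obj.x = x


def elapsed(func, obj, n):
    """Return wall-clock seconds taken by func(obj, n)."""
    start = time.perf_counter()
    func(obj, n)
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 1000000
    print('attributes:', elapsed(with_attributes, Obj(0.1), n))
    print('locals:    ', elapsed(with_locals, Obj(0.1), n))
```

How much the local-variable version gains varies by interpreter and version, so it is worth measuring on the actual workload before restructuring code around it.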

Unless you've profiled and have evidence that the bottleneck is attribute 
access, my bet is that the problem is some other aspect of the 
computation. In general, your intuition about what's fast and what's slow 
in Python will be misleading if you're used to other languages. E.g. in C 
comparisons are fast and moving data is slow, but in Python comparisons 
are slow and moving data is fast.

-- 
Steven