Creating an object that can track when its attributes are modified
Steven D'Aprano
steve+comp.lang.python at pearwood.info
Fri Mar 8 13:50:58 EST 2013
On Wed, 06 Mar 2013 16:26:57 -0800, Ben Sizer wrote:
> On Thursday, 7 March 2013 00:07:02 UTC, Steven D'Aprano wrote:
[...]
>> Actually I lie. I would guess that the simple, most obvious way is
>> faster: don't worry about storing what changed, just store
>> *everything*. But I could be wrong.
>
> The use case I have is not one where that is suitable. It's not the
> snapshots that are important, but the changes between them.
I'm afraid that doesn't make much sense to me. You're performing
calculations, and stuffing them into instance attributes, but you don't
care about the result of the calculations, only how they differ from the
previous result?
I obviously don't understand the underlying problem you're trying to
solve.
>> Fortunately, Python development is rapid enough that you can afford to
>> develop this object the straightforward way, profile your application
>> to see where the bottlenecks are, and if it turns out that the simple
>> approach is too expensive, then try something more complicated.
>
> I don't see a more straightforward solution to the problem I have than
> the one I have posted. I said that a system that took snapshots of the
> whole object and attempted to diff them would probably perform worse,
> but it would probably be more complex too, given the traversal and
> copying requirements.
Yes, and I said that your intuition of what will be fast and what will be
slow is not necessarily trustworthy. Without testing, neither of us knows
for sure.
Given the code you showed in the original post, I don't see that
traversal and copying requirements are terribly complicated. You don't do
deep-copies of attributes, so a shallow copy of the instance __dict__
ought to be enough. Assuming you have a well-defined "start processing"
moment, just grab a snapshot of the dict, which will be fast, then do
your calculations, then call get_changes:
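    def snapshot(self):
        self._snapshot = self.__dict__.copy()

    def get_changes(self):
        # The sentinel never equals a real value, so attributes added
        # since the snapshot are reported as changed too.
        sentinel = object()
        return dict( [ (k,v) for k,v in self.__dict__.items()
                       if v != self._snapshot.get(k, sentinel) ] )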
This doesn't support *deleting* attributes, but neither does your
original version.
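For reference, here is a minimal self-contained sketch of the snapshot approach. The class name `Tracked` and its attributes are hypothetical, invented only for illustration. It also assumes you want the `_snapshot` attribute itself left out of the comparison, since otherwise it would show up as a "change". Note that it reports the *new* values of changed attributes.

```python
class Tracked(object):
    """Hypothetical example class; name and attributes are illustrative."""

    def __init__(self, **attrs):
        self.__dict__.update(attrs)

    def snapshot(self):
        # Shallow copy of the instance dict, leaving out any earlier snapshot.
        self._snapshot = dict(
            (k, v) for k, v in self.__dict__.items() if k != "_snapshot"
        )

    def get_changes(self):
        # A fresh sentinel never compares equal to a stored value, so
        # attributes added after the snapshot count as changed.
        sentinel = object()
        return dict(
            (k, v) for k, v in self.__dict__.items()
            if k != "_snapshot" and v != self._snapshot.get(k, sentinel)
        )


obj = Tracked(x=1, y=2)
obj.snapshot()
obj.x = 10      # modified after the snapshot
obj.z = 3       # added after the snapshot
changes = obj.get_changes()
print(changes)
```

Only `x` and `z` are reported; `y` is untouched and so is left out.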
Obviously I don't know for sure which strategy is fastest, but since your
version already walks the entire __dict__, this shouldn't be much slower,
and has a good chance of being faster.
(Your version slows down *every* attribute assignment. My version does
not.)
By the way, your original version describes the get_changes_and_clean()
method as cleaning the dirty *flags*. But the implementation doesn't
store flags. Misleading documentation is worse than no documentation.
But if you insist on the approach you've taken, you can simplify the
__setattr__ method:
    def __setattr__(self, key, value):
        # If the first modification to this attribute, store the old value
        dirty = self._dirty_attributes
        if key not in dirty:
            dirty[key] = getattr(self, key, None)
        # Set the new value
        object.__setattr__(self, key, value)
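To show how that method fits into a whole class, here is a rough sketch. The class name `DirtyTracker`, the way `_dirty_attributes` is initialised, and what `get_changes_and_clean()` returns are all assumptions on my part, since the original class is not reproduced in this message.

```python
class DirtyTracker(object):
    """Hypothetical host class for the simplified __setattr__."""

    def __init__(self):
        # Bypass our own __setattr__ so the bookkeeping dict exists
        # before the first tracked assignment.
        object.__setattr__(self, "_dirty_attributes", {})

    def __setattr__(self, key, value):
        # On the first modification of an attribute, remember its old value
        # (None if the attribute did not exist yet).
        dirty = self._dirty_attributes
        if key not in dirty:
            dirty[key] = getattr(self, key, None)
        # Set the new value
        object.__setattr__(self, key, value)

    def get_changes_and_clean(self):
        # Assumed behaviour: hand back the recorded old values, start afresh.
        changes = self._dirty_attributes
        object.__setattr__(self, "_dirty_attributes", {})
        return changes


t = DirtyTracker()
t.hp = 100
first = t.get_changes_and_clean()
t.hp = 90
t.hp = 80
second = t.get_changes_and_clean()
print(first, second)
```

The second batch records only the value `hp` had before its first modification in that batch, not every intermediate value.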
You might try this (slightly) obfuscated version, which *could* be faster
still, although I doubt it.
    def __setattr__(self, key, value):
        # If the first modification to this attribute, store the old value
        self._dirty_attributes.setdefault(key, getattr(self, key, None))
        # Set the new value
        object.__setattr__(self, key, value)
If you really need to get every bit of performance, it's worth trying
them both and seeing which is faster.
(P.S. I trust you know to use timeit for timing small code snippets,
rather than rolling your own timing code?)
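Something along these lines would do for the comparison. This is only a sketch: the class names `WithIf` and `WithSetdefault`, the helper `assign_many`, and the loop sizes are made up for the example, and the numbers you get will depend on your machine and Python version. One thing to keep in mind when reading the results is that the setdefault version calls getattr on *every* assignment, because arguments are evaluated before the call, whereas the if-version only calls it the first time a key is seen.

```python
import timeit


class WithIf(object):
    def __init__(self):
        object.__setattr__(self, "_dirty_attributes", {})

    def __setattr__(self, key, value):
        dirty = self._dirty_attributes
        if key not in dirty:
            dirty[key] = getattr(self, key, None)
        object.__setattr__(self, key, value)


class WithSetdefault(object):
    def __init__(self):
        object.__setattr__(self, "_dirty_attributes", {})

    def __setattr__(self, key, value):
        # getattr runs on every call, even when the key is already present.
        self._dirty_attributes.setdefault(key, getattr(self, key, None))
        object.__setattr__(self, key, value)


def assign_many(obj, n=1000):
    # Repeatedly assign to the same attribute, the common case to measure.
    for i in range(n):
        obj.x = i


results = {}
for cls in (WithIf, WithSetdefault):
    timer = timeit.Timer(lambda: assign_many(cls()))
    # Take the best of several runs, as the timeit docs recommend.
    results[cls.__name__] = min(timer.repeat(repeat=3, number=100))

for name, seconds in sorted(results.items()):
    print("%-15s %.4f s" % (name, seconds))
```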
--
Steven