weird dict problem, how can this even happen?
joel.hedlund at gmail.com
Wed Dec 17 14:17:28 CET 2008
Steven D'Aprano wrote:
> On Tue, 16 Dec 2008 14:32:39 +0100, Joel Hedlund wrote:
>> Duncan Booth wrote:
>>> Alternatively give up on defining hash and __eq__ for FragmentInfo and
>>> rely on object identity instead.
>> Object identity wouldn't work so well for caching. Objects would always
>> be drawn as they appeared for the first time. No updates would be shown
>> until the objects were flushed from the cache.
> Perhaps I don't understand your program structure, but I don't see how
> that follows.
First off, please note that I consider my problem to be solved, many
thanks to c.l.p and especially Duncan Booth. But of course continued
discussion on this topic can be both enlightening and entertaining as
long as people are interested. So here goes:
I'm making a scientific program that visualizes data. It can be thought
of as a freely zoomable map viewer with a configurable stack of data
feature renderers, themselves also configurable to enhance different
aspects of the individual features. As you zoom in, it becomes possible
to render more types of features in a visually recognizable manner, so
consequentially more feature renderers become enabled. The determining
properties for the final image are then the coordinates and zoom level
of the view, and the current renderer stack configuration. Rendering may
be time consuming, so I cache the resulting bitmap fragments using a
"key object" called FragmentInfo that holds this information.
Some renderers may take a very long time to do their work, so in order
to keep the gui nice and responsive, I interrupt the rendering chain at
that point, put the remainder of the rendering chain in a job pool, and
postpone finishing up the rendering until there are cpu cycles to spare.
At that time, I go through the job pool and ask the cache which fragment
was most recently accessed. This fragment is the most likely to actually
be in the current view, and thus the most interesting for the user to
Now, when the user navigates the view to any given point, the gui asks
the cache if the bitmap fragments necessary to tile up the current view
have already been rendered, and if so, retrieves them straight from the
cache and paints them to the screen. And that's why object identity
wouldn't work. If the user changes something in the config for the
current renderer stack (=mutates the objects), the renderers still
retain the same object identities and therefore the old versions would
be retrieved from the cache, and no updates would be shown until the
fragments are flushed from the cache, and the fragment subsequently
I guess you could delve deep into the data members and pull out all
their object identities and hash wrt that if you'd really want to, but I
don't really see the point.
The stupid cache implementation that I originally employed used a dict
to store the FragmentInfo:BitmapFragment items plus a use tracker (list)
on the side. This obviously doesn't work, because as soon as the
renderer stack mutates, list and dict go out of sync and I can no longer
selectively flush old items from the dict, because it's not readily
apparent how to reconstruct the old keys. Purely using lists here is
vastly superior because I can just .pop() the least recently used items
from the tail and be done with them. Also for lists as small as this,
the cost in performance is at most negligible.
>> I've been experimenting with a list cache now and I can't say I'm
>> noticing any change in performance for a cache of 100 items.
> 100 items isn't very big though. If you have 50,000 items you may notice
> significant slow down :)
> If having many items in the cache is possible, you should consider using
> a binary search instead of a linear search through the cache. See the
> bisect module.
Thanks for the tip, but I don't forsee this cache ever needing to be
that big. ~100 is quite enough for keeping the gui reasonably responsive
in this case.
More information about the Python-list