creating garbage collectable objects (caching objects)

Simon Forman sajmikins at gmail.com
Sun Jun 28 14:19:26 EDT 2009


On Jun 28, 11:03 am, News123 <news... at free.fr> wrote:
> Hi.
>
> I started playing with PIL.
>
> I'm performing operations on multiple images and would like a
> compromise between speed and memory requirements.
>
> The fast approach would load all images upfront and then create
> multiple result files.  The problem is that I do not have enough
> memory to load all files.
>
> The slow approach is to load each potential source file only when it is
> needed and to release it immediately after (leaving it up to the gc to
> free memory when needed).
>
> The question I have is whether there is any way to tell Python that
> certain objects could be garbage collected if needed, and to ask Python
> at a later time whether the object has been collected so far (the image
> has to be reloaded) or not (the image does not have to be reloaded).
>
> # Fastest approach:
> imgs = {}
> for fname in all_image_files:
>     imgs[fname] = Image.open(fname)
> for creation_rule in all_creation_rules():
>     img = Image.new(...)
>     for img_file in creation_rule.input_files():
>         img = do_somethingwith(img,imgs[img_file])
>     img.save()
>
> # Slowest approach:
> for creation_rule in all_creation_rules():
>     img = Image.new(...)
>     for img_file in creation_rule.input_files():
>         src_img = Image.open(img_file)
>         img = do_somethingwith(img,src_img)
>     img.save()
>
> # What I'd like to do is something like:
> imgs = GarbageCollectable_dict()
> for creation_rule in all_creation_rules():
>     img = Image.new(...)
>     for img_file in creation_rule.input_files():
>         if img_file in imgs:  # if I'm lucky the object is still there
>             src_img = imgs[img_file]
>         else:
>             imgs[img_file] = src_img = Image.open(img_file)
>         img = do_somethingwith(img,src_img)
>     img.save()
>
> Is this possible?
>
> Thanks in advance for an answer, or any other ideas on
> how I could do smart caching without hogging all the system's
> memory.

Maybe I'm just being thick today, but why would the "slow" approach be
slow?  The same amount of I/O and processing would be done either way,
no?
Have you timed both methods?
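
If you want to check, a rough way to time the two would be something
like this (run_fast() and run_slow() are just hypothetical wrappers
around your two loops above):

    import time

    start = time.time()
    run_fast()   # hypothetical wrapper around the "fastest" loop
    print("fast approach: %.1f s" % (time.time() - start))

    start = time.time()
    run_slow()   # hypothetical wrapper around the "slowest" loop
    print("slow approach: %.1f s" % (time.time() - start))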

That said, take a look at the weakref module Terry Reedy already
mentioned, and maybe the gc (garbage collector) module too (although
that might just lead to wasting a lot of time fiddling with stuff that
the gc is supposed to handle transparently for you in the first place).
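
For what it's worth, here's a minimal sketch of roughly what your
GarbageCollectable_dict could look like built on
weakref.WeakValueDictionary (assuming PIL Image objects accept weak
references, which ordinary Python classes do; get_image() is just a
helper name I've made up):

    import weakref
    from PIL import Image   # older PIL installs may want plain "import Image"

    # Entries vanish from this dict as soon as the image they refer to
    # has been garbage collected, so a lookup tells you whether the
    # image is still alive or has to be reloaded.
    _cache = weakref.WeakValueDictionary()

    def get_image(fname):
        # Return the cached Image if it is still in memory, else reload it.
        img = _cache.get(fname)
        if img is None:
            img = Image.open(fname)
            _cache[fname] = img
        return img

One caveat: in CPython an image that nothing else references is freed
as soon as its reference count drops to zero, so the weak dictionary
only helps while the images are still in use somewhere else.  If you
want them to linger "as long as memory allows" you'd need an explicit,
size-bounded cache on top of (or instead of) this.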


