Processing huge datasets
Thomas Guettler
guettli at thomas-guettler.de
Mon May 10 10:36:48 EDT 2004
On Mon, 10 May 2004 12:00:03 +0000, Anders Søndergaard wrote:
> Hi,
>
> I'm trying to process a large filesystem (+20 million files) and keep the
> directories along with summarized information about the files (sizes,
> modification times, newest file and the like) in an instance hierarchy
> in memory. I read the information from a Berkeley Database.
>
> I'm keeping it in a Left-Child-Right-Sibling instance structure, that I
> operate on recursively.
>
> First I banged my head on the recursion limit, which could luckily be
> adjusted.
> Now I simply get MemoryError.
>
> Is there a clever way of processing huge datasets in Python?
> How would a smart Python programmer approach the problem?
Hi Anders,
I use ZODB. It stores a graph of persistent objects on disk and loads
them lazily on attribute access, so the whole hierarchy never has to
fit in memory at once.
http://zope.org/Wikis/ZODB/FrontPage/guide/index.html
HTH,
Thomas
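The recursion-limit problem mentioned above can also be sidestepped entirely by walking the tree with an explicit stack instead of recursive calls. A minimal standard-library sketch (the poster reads from a Berkeley DB rather than the filesystem, so this is illustrative only):

```python
# Sketch: per-directory size/mtime summaries without recursion, using
# an explicit stack, so Python's recursion limit never applies.
import os
import tempfile

def summarize(root):
    """Return {dirpath: (total_size, newest_mtime)} for every
    directory under root, files only (not rolled up to parents)."""
    summary = {}
    stack = [root]
    while stack:
        path = stack.pop()
        total, newest = 0, 0.0
        for entry in os.listdir(path):
            full = os.path.join(path, entry)
            if os.path.isdir(full):
                stack.append(full)      # visit subdirectory later
            else:
                st = os.stat(full)
                total += st.st_size
                newest = max(newest, st.st_mtime)
        summary[path] = (total, newest)
    return summary

# Tiny demo on a throwaway directory tree.
base = tempfile.mkdtemp()
os.mkdir(os.path.join(base, 'sub'))
with open(os.path.join(base, 'sub', 'a.txt'), 'w') as f:
    f.write('hello')
result = summarize(base)
```

The stack holds only pending directory paths, not call frames, so its depth is bounded by the number of unvisited directories rather than by `sys.setrecursionlimit`.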