Tremendous slowdown due to garbage collection
pavlovevidence at gmail.com
Tue Apr 15 05:18:06 CEST 2008
On Apr 14, 4:27 pm, Aaron Watters <aaron.watt... at gmail.com> wrote:
> > A question often asked--and I am not a big fan of these sorts of
> > questions, but it is worth thinking about--of people who are creating
> > very large data structures in Python is "Why are you doing that?"
> > That is, you should consider whether some kind of database solution
> > would be better. You mention lots of dicts--it sounds like some
> > balanced B-trees with disk loading on demand could be a good choice.
> Well, probably because you can get better
> than 100x improved performance
> if you don't involve the disk and use clever in memory indexing.
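As an aside on the thread's subject line: one commonly reported cause of slowdowns when building very large in-memory structures in CPython is the cyclic garbage collector repeatedly scanning the growing set of container objects. A hedged sketch of the usual workaround, temporarily disabling the collector during bulk construction (reference counting still reclaims ordinary garbage); the sizes and timings here are illustrative, not measurements from the original poster's workload:

```python
import gc
import time

def build(n):
    # Build a large dict of tuples; each tuple is a container object
    # that the cyclic collector must trace during a full collection.
    return {i: (i, i + 1) for i in range(n)}

n = 1_000_000  # illustrative size, not from the original post

start = time.perf_counter()
d1 = build(n)
with_gc = time.perf_counter() - start

gc.disable()
try:
    start = time.perf_counter()
    d2 = build(n)
    without_gc = time.perf_counter() - start
finally:
    gc.enable()  # always re-enable, even if build() raises

print(f"with gc: {with_gc:.3f}s  without gc: {without_gc:.3f}s")
```

Whether the second build is meaningfully faster depends on the Python version and the shape of the data; the point is only that the collector can be ruled in or out as the culprit this way.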
Are you sure it won't involve disk use? I'm just throwing this out
there, but if you're creating a structure of hundreds of megabytes in
memory, there's a chance the OS will swap it out to disk, which defeats
any latency improvement you would have gotten.
However, that is for the OP to decide. The reason I don't like the
sort of question I posed is that it's presumptuous--maybe the OP
already considered and rejected this, and has taken steps to ensure
the in-memory data structure won't be swapped--but a database solution
should at least be considered here.
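For what it's worth, the "lots of dicts" case doesn't even require an external database: the standard library's shelve module provides a dict-like object backed by a dbm file, so entries live on disk and are loaded on demand. A minimal sketch (the file path and data are made up for illustration):

```python
import os
import shelve
import tempfile

# shelve.open() gives a persistent, dict-like mapping backed by a
# dbm database on disk; values are pickled transparently.
path = os.path.join(tempfile.mkdtemp(), "demo_shelf")

with shelve.open(path) as db:
    for i in range(1000):
        db[str(i)] = {"value": i * i}  # shelve keys must be strings

# Reopen later: only the entries you touch are read from disk.
with shelve.open(path) as db:
    result = db["42"]["value"]

print(result)  # 1764
```

This obviously won't match clever in-memory indexing on latency, but it keeps the resident set small, which is exactly the trade-off being argued about above.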