Storing pairs of (int, int) in a database: which db to choose?

Stormbringer andreif at mail.dntis.ro
Wed Dec 24 04:44:49 EST 2003


Paul Rubin <http://phr.cx@NOSPAM.invalid> wrote in message news:<7xr7yvjle5.fsf at ruckus.brouhaha.com>...
> andreif at mail.dntis.ro (Stormbringer) writes:
> > The only thing that bothers me a little is the speed for building the
> > index, I tried with around 5000 messages and I am not quite thrilled,
> > it's not _extremely_ slow but it has to be faster for what I need.
> > Perhaps I'll use the C++ version with some Python bindings.
> 
> Why not do some profiling first.  Maybe it's limited by i/o traffic
> rather than cpu cycles.  I don't know how Lupy works but the one time
> I messed with full text indexing, the bottleneck was definitely the
> random disk accesses needed for every word of each update.  The
> solution is to batch the updates.  Sorting is much less seek intensive
> than random updates.

Sounds like the system you messed with was poorly designed.
Even if indexing is limited by i/o traffic, I am not sure how to
profile it to find out whether that is the case (although I have my
doubts). Any suggestions on how?
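
One rough way to check this, offered as a sketch rather than a definitive
method: time the indexing run twice over, once with a wall clock and once
with a CPU clock, and compare. If CPU time is a small fraction of wall
time, the process mostly sat waiting (likely on disk); if the two are
close, it is CPU bound. The helper name cpu_fraction and the build_index
call mentioned below are hypothetical, not part of Lupy.

```python
import time


def cpu_fraction(func, *args, **kwargs):
    """Run func once and return (wall_seconds, cpu_seconds, cpu / wall).

    A ratio near 1.0 suggests the call is CPU bound; a ratio well below
    1.0 suggests it spent most of its time waiting (for example on disk).
    """
    wall_start = time.perf_counter()   # elapsed real time
    cpu_start = time.process_time()    # user + system CPU time of this process
    func(*args, **kwargs)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    ratio = cpu / wall if wall > 0 else 0.0
    return wall, cpu, ratio


if __name__ == "__main__":
    # Stand-ins for a real indexing run: a busy loop (CPU bound) and a
    # sleep (which, like a process blocked on i/o, uses almost no CPU).
    def busy():
        return sum(i * i for i in range(2000000))

    for name, job in (("busy loop", busy), ("sleep", lambda: time.sleep(0.2))):
        wall, cpu, ratio = cpu_fraction(job)
        print("%-10s wall=%.3fs cpu=%.3fs cpu/wall=%.2f" % (name, wall, cpu, ratio))
```

To apply it, replace the stand-ins with something like
cpu_fraction(build_index, messages), where build_index is whatever
function does the indexing. If the ratio comes out close to 1.0, running
the script under "python -m cProfile -s cumulative" should show which
functions the CPU time goes to.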

Andrei



