Storing pairs of (int, int) in a database : which db to choose ?

Samuel Walters swalters_usenet at yahoo.com
Tue Dec 30 23:50:40 CET 2003


On Wed, 24 Dec 2003 02:07:10 -0800, Stormbringer wrote:
> I wrote a similar version of what I wanted a couple of years ago, not
> using a database, just C/C++ code, something like this (I think a
> similar method was suggested in this thread): I was processing messages
> in batches, so for every 10000 messages or so I would build in memory a
> list of words and the messages in which they occur, and for each word I
> would write all messages that contain it as consecutive entries in a file.
> Then I would update the global word index (kept in another file), which
> for each word kept a linked list of zones in the other file where msgIds
> containing this word were stored.
> 
> Worked pretty well, but this was for a server I controlled, i.e. at all
> times I was sure I had enough space and I was controlling it. For an
> application to be deployed to end-users I need more safety. That is why
> I am playing with sql.
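
For anyone following along, the batch scheme quoted above might be sketched
in Python roughly as follows.  This is only an illustration under my own
assumptions, not the original C/C++ code: the names index_batch and lookup,
the 4-byte entry format, and the in-memory dict standing in for the on-disk
global word index are all hypothetical.

```python
import io
import struct
from collections import defaultdict

# Assumed entry format: one message id per entry, 4-byte unsigned little-endian.
ENTRY = struct.Struct("<I")


def index_batch(messages, postings, zones):
    """Append one batch to the postings file and record where each word landed.

    messages: iterable of (msg_id, text) pairs
    postings: seekable binary file object (e.g. opened with "w+b" or "r+b")
    zones:    dict mapping word -> list of (offset, count); a stand-in for
              the global word index that the original kept in a second file
    """
    # Build the per-batch word list in memory.
    batch = defaultdict(set)
    for msg_id, text in messages:
        for word in text.lower().split():
            batch[word].add(msg_id)

    # Write each word's message ids as consecutive entries at the end of the file.
    postings.seek(0, io.SEEK_END)
    for word in sorted(batch):
        ids = sorted(batch[word])
        offset = postings.tell()
        postings.write(b"".join(ENTRY.pack(i) for i in ids))
        # Remember the zone (offset, number of entries) for this word.
        zones.setdefault(word, []).append((offset, len(ids)))


def lookup(word, postings, zones):
    """Return every message id recorded for `word`, zone by zone."""
    result = []
    for offset, count in zones.get(word, []):
        postings.seek(offset)
        data = postings.read(count * ENTRY.size)
        result.extend(msg_id for (msg_id,) in ENTRY.iter_unpack(data))
    return result
```

A real deployment would also have to persist the zones table and cope with
a full disk or an interrupted write, which is exactly the safety concern
raised in the quoted message.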

For future reference:
If you're like me and have a lot of C/C++ code that you've built and are
happy with, Pyrex ( http://www.cosc.canterbury.ac.nz/~greg/python/Pyrex/ )
can come in handy for reusing that code within Python.  Pyrex is easier to
write and maintain than code written directly against the Python/C API,
and with a little practice you can learn how to marshal data back and
forth between the two languages.  Essentially, it takes the grunt work out
of making C/C++ calls from Python.  It's especially tasty when you need to
use some shared library that Python has no interface to.

Sam Walters



