Databases and python
Bryan Olson
fakeaddress at nowhere.org
Fri Feb 17 06:18:02 EST 2006
Dan Stromberg wrote:
> Bryan Olson wrote:
[...]
>> Well, you could use simple files instead of fancy database tables.
>
> That's an interesting thought. Perhaps especially if australopithecine
> were saved in a filename like:
>
> ~/indices/au/st/ra/lo/pi/th/ec/in/e
Right, though the better filesystems can efficiently support large
numbers of files per directory.
>>Below is a demo of an alternate technique that uses bsddb B-Trees,
>>and puts both the word and the file-id in the key. I don't know
>>how efficient it is for real data, but at least the time won't grow
>>as Theta(n**2).
>
>
> Perhaps I'm missing something, but is it not roughly O(1) for
> individual insertions, but O(n*m) (n == number of files, m == number of
> words) for lookups?
Insertion into a B-Tree with n elements is Theta(lg(n)). Finding the
files associated with a word, call the number of them 'k', should
be Theta(lg(n) + k). That assumes words and files are roughly
constant size.
--
--Bryan
More information about the Python-list
mailing list