Is there any library for indexing binary data?

甜瓜 littlesweetmelon at gmail.com
Thu Mar 25 01:53:45 EDT 2010


Well, a database is not suitable because (1) the table is very big
(~10^9 rows) and (2) we need to support very fast *simple* queries,
that is, fetching the value for a single key (~10^7 queries / second).

Currently, I have implemented a special-purpose algorithm for my
problem. However, I would like to use a library to simplify the
coding; otherwise I have to write custom code for each big table. An
indexing library may well run slower than my homemade code, but if it
greatly simplifies my job and still delivers acceptable speed (e.g.
10^5~10^6 queries / second), it is still a good choice for me.
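
A minimal sketch of this kind of single-key lookup, assuming
fixed-size records kept sorted by id and searched by bisection. This
is not the homemade code mentioned above; the names RECORD,
build_table and lookup are hypothetical, and the 8-byte width chosen
for "long" is an assumption, since its size depends on the platform.

```python
import struct

# Assumed record layout: 8-byte signed id followed by an 8-byte double,
# little-endian, no padding (mirrors struct Item from the original post).
RECORD = struct.Struct("<qd")


def build_table(items):
    """Pack (id, value) pairs into one bytes object sorted by id."""
    return b"".join(RECORD.pack(i, v) for i, v in sorted(items))


def lookup(table, key):
    """Binary-search the packed table; return the value, or None if absent."""
    lo, hi = 0, len(table) // RECORD.size
    while lo < hi:
        mid = (lo + hi) // 2
        rec_id, value = RECORD.unpack_from(table, mid * RECORD.size)
        if rec_id < key:
            lo = mid + 1
        elif rec_id > key:
            hi = mid
        else:
            return value
    return None


table = build_table([(42, 1.5), (7, 0.25), (1000, -3.0)])
print(lookup(table, 7))
print(lookup(table, 8))
```

The sketch only illustrates the data layout and the lookup; it makes
no attempt at the throughput discussed above, and a sorted array does
not by itself handle incremental updates.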

--
ShenLei

2010/3/25 Gabriel Genellina <gagsl-py2 at yahoo.com.ar>:
> On Thu, 25 Mar 2010 00:28:58 -0300, 甜瓜 <littlesweetmelon at gmail.com>
> wrote:
>
>> I have been looking for a good library for building an index on
>> binary data. Xapian and Lucene, with their Python bindings, focus on
>> text processing rather than binary data. Could anyone give me a
>> recommendation? Is there any library for indexing binary data,
>> whether or not it is written in Python?
>>
>> In my case, there is a very big data table that stores structured
>> binary data, e.g.:
>> struct Item
>> {
>>    long id; // used as key
>>    double value;
>> };
>>
>> I want to build an index on the "id" field to speed up searching.
>> Since this data table is not constant, the library should support
>> incremental indexing. If there is no suitable library, I will have
>> to build the index myself...
>
> What about a database?
>
> --
> Gabriel Genellina
>
> --
> http://mail.python.org/mailman/listinfo/python-list
>


