[Tracker-discuss] Speeding up indexer_rdbms

"Martin v. Löwis" martin at v.loewis.de
Wed Aug 22 00:07:08 CEST 2007


I found that the creation of two additional PostgreSQL indexes, namely

create index words_textid_idx on __words(_textid);
create index textids_class_itemid_prop_idx on __textids (_class, _itemid, _prop);

speeds up indexing on change quite a bit (search speed itself is
not affected). The biggest visible speedup is on "roundup-admin
reindex", which needs to delete all words for a given textid before
re-adding them. It also helps on import and on creation of new messages,
where the indexer first checks whether the new (class, item, prop)
triple is really new or a modification of an existing one. As a
consequence, these operations go down from O(N**2) (in the number of
properties to be added/reindexed) to O(N log N).
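
To illustrate why the two lookups benefit, here is a minimal sketch. It
uses Python's built-in sqlite3 module in place of PostgreSQL so that it
runs without a server. The table layouts and the two statements are
simplified assumptions for illustration, not the actual indexer_rdbms
code. The planner output should show a full table scan without the
indexes and an index search with them.

```python
import sqlite3

# Assumed, simplified versions of the two indexer tables.
SCHEMA = [
    "CREATE TABLE __textids (_class TEXT, _itemid TEXT, _prop TEXT,"
    " _textid INTEGER PRIMARY KEY)",
    "CREATE TABLE __words (_word TEXT, _textid INTEGER)",
]

# The two indexes proposed in this message.
INDEXES = [
    "CREATE INDEX words_textid_idx ON __words (_textid)",
    "CREATE INDEX textids_class_itemid_prop_idx"
    " ON __textids (_class, _itemid, _prop)",
]

# Hypothetical statements standing in for what the indexer does on change.
DELETE_WORDS = "DELETE FROM __words WHERE _textid = ?"
FIND_TEXTID = ("SELECT _textid FROM __textids"
               " WHERE _class = ? AND _itemid = ? AND _prop = ?")


def make_db(with_indexes):
    """Create an in-memory database, optionally with the two indexes."""
    conn = sqlite3.connect(":memory:")
    for stmt in SCHEMA:
        conn.execute(stmt)
    if with_indexes:
        for stmt in INDEXES:
            conn.execute(stmt)
    return conn


def query_plan(conn, sql, params):
    """Return SQLite's plan for a statement as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row is the human-readable plan detail.
    return "; ".join(str(row[-1]) for row in rows)


if __name__ == "__main__":
    for with_indexes in (False, True):
        conn = make_db(with_indexes)
        print("with indexes:" if with_indexes else "without indexes:")
        print("  ", query_plan(conn, DELETE_WORDS, (1,)))
        print("  ", query_plan(conn, FIND_TEXTID, ("msg", "1", "content")))
        conn.close()
```

The exact plan wording differs between SQLite versions, and PostgreSQL's
planner makes its own choices, so this only shows the general effect: a
per-row scan repeated for every property becomes an index lookup.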

With these indexes, I can import the SF tracker on my machine in 40
minutes (assuming all files have been downloaded).

Regards,
Martin

