need advice... accessing a huge collection

GrelEns grelens at NOSPAMyahoo.NOTNEEDEDfr
Thu Oct 23 04:58:31 EDT 2003


hello,

I have almost 1,000 tar.gz files in different directories (I cannot change
that), and together these archives contain over 1,000,000 text files. I would
like to build a tool that can access any of these text files, or any
sub-collection of them, as quickly as possible, and serve them over HTTP upon
user request.

does anyone have ideas on a good way to do this?

(i was thinking of a dictionary whose keys would be filenames and whose
values the path to the archive containing each file, so that all requested
files from the same archive can be extracted in one pass)
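A minimal sketch of that idea (the function names `build_index` and `extract_members` are mine, and it assumes member filenames are unique across all archives):

```python
import os
import tarfile
from collections import defaultdict

def build_index(root):
    """Walk root once and map each member filename -> path of its tar.gz."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".tar.gz"):
                archive = os.path.join(dirpath, name)
                with tarfile.open(archive, "r:gz") as tf:
                    for member in tf.getnames():
                        index[member] = archive
    return index

def extract_members(index, wanted):
    """Group requested filenames by archive so each tar.gz is opened once."""
    by_archive = defaultdict(list)
    for member in wanted:
        by_archive[index[member]].append(member)
    contents = {}
    for archive, members in by_archive.items():
        with tarfile.open(archive, "r:gz") as tf:
            for member in members:
                contents[member] = tf.extractfile(member).read()
    return contents
```

Grouping the requests by archive before opening anything matters here, because decompressing a gzip stream is the expensive part; opening each archive exactly once per request amortizes that cost over all the files it contains.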

i also was wondering which is fastest:
- on each user request, rebuilding the dictionary by reading key/value pairs
from a file,
- or on the first request, generating a hard-coded Python dictionary as a
module and then importing it,
- or maybe other suggestions (storing the mapping in a database...)?
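On the persistence question, a sketch of two common options (function names and paths are mine): `pickle` serializes the whole dict to one file and reloads it in a single call, while `shelve` keeps the mapping on disk so individual lookups never load the full index into memory.

```python
import pickle
import shelve

def save_index(index, path):
    """Serialize the whole dict once at build time."""
    with open(path, "wb") as f:
        pickle.dump(index, f)

def load_index(path):
    """One call per server start-up (or per request, if you must)."""
    with open(path, "rb") as f:
        return pickle.load(f)

def save_index_shelf(index, path):
    """Alternative: a disk-backed mapping via the shelve module."""
    with shelve.open(path, "n") as db:
        db.update(index)

def lookup(path, filename):
    """Reads a single key from disk without loading the whole index."""
    with shelve.open(path, "r") as db:
        return db[filename]
```

Loading a pickle is generally much faster than re-parsing a text file of key/value pairs or importing a generated module (which Python must compile on first import); the shelve route trades slower individual lookups for near-zero start-up cost, which may suit a long-running HTTP server with a million keys better.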

thanx

More information about the Python-list mailing list