Large amount of files to parse/organize, tips on algorithm?
Tue Sep 2 20:28:39 CEST 2008
cnb <circularfunc at yahoo.se> writes:
> For each file I construct a list of reviews and then for each new file
> I merge the reviews so that in the end I have a list of reviewers and
> for each reviewer all their reviews.
> What is the fastest way to do this?
Scan through all the files sequentially, emitting records like
(movie, reviewer, review)
Then use an external sort utility to sort/merge that output file
on each of the 3 columns. Beats writing code.
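
If you do want a little Python around it, here is a rough sketch of the scan step plus the grouping you can do once the output is sorted. The input layout is purely an assumption on my part (one file per movie, named after the movie, with one "reviewer<TAB>review" line per review, and no tabs or newlines inside fields); the names emit_records and reviews_by_reviewer are made up for illustration.

```python
import itertools
import os


def emit_records(paths, out):
    """Write one tab-separated (movie, reviewer, review) line per review."""
    for path in paths:
        # Assumption: the file name minus its extension is the movie title.
        movie = os.path.splitext(os.path.basename(path))[0]
        with open(path) as f:
            for line in f:
                line = line.rstrip("\n")
                if not line:
                    continue  # skip blank lines
                # Assumption: each input line is "reviewer<TAB>review".
                reviewer, review = line.split("\t", 1)
                out.write(f"{movie}\t{reviewer}\t{review}\n")


def reviews_by_reviewer(sorted_lines):
    """Yield (reviewer, [(movie, review), ...]).

    Expects lines already sorted on the reviewer column (column 2),
    so each reviewer's records are adjacent and groupby sees them once.
    """
    rows = (line.rstrip("\n").split("\t", 2) for line in sorted_lines)
    for reviewer, group in itertools.groupby(rows, key=lambda r: r[1]):
        yield reviewer, [(movie, review) for movie, _, review in group]
```

With the records written to a file (say records.tsv, a hypothetical name), a POSIX sort can order them by reviewer with something like: sort -t "$(printf '\t')" -k2,2 records.tsv > by_reviewer.tsv. Feeding by_reviewer.tsv to reviews_by_reviewer then streams one reviewer at a time, so nothing has to be merged in memory.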