Data/File Structure and Algorithm for Retrieving Sorted Data Chunk Efficiently
janeaustine50 at hotmail.com
Fri Oct 8 05:19:25 CEST 2004
William Park <opengeometry at yahoo.ca> wrote in message news:<2sgc66F1enc5nU1 at uni-berlin.de>...
> Jane Austine <janeaustine50 at hotmail.com> wrote:
> > Hello.
> > I have student_names and their scores, which are between 0..100.
> > I want to retrieve in the sorted order in terms of score, Nth
> > student's name to N+10th student's name along with their scores.
> > (suppose a web page showing top 10 students and there are "next page"
> > link and if you follow the link there shows next top 10 students with
> > "prev" and "next" page links.)
> > The list is dynamically resized -- elements are added or deleted. The
> > list could go fairly long enough that pulling all the data on memory
> > at once is not a good choice.
> > What algorithm and data/file structure are most efficient? (the
> > environment is "cgi" based web environment)
> > Simplest approach would be using btree from bsddb. But there are only
> > "first"/"last" and "next"/"prev" moves. If I have to retrieve 20..29th
> > students, I have got to "first" and then "next" 19 times before I
> > finally get to 20th student, which is a waste of time(and memory).
> > Maybe I could keep the cursor position among sessions if I were using
> > a standalone server. However, it is in CGI environment.
> > What can I do instead? What data structure allows you to access
> > nth..n+10th data from the sorted list in a constant time?
> This is an example of "premature optimization". How many student do you
> have anyways? 10,000 students with 100 bytes for name and score comes
> to 1MB. Just keep the list as textfile, and read a block at time.
Thank you for reminding me Knuth's words. However, there are other
information on each records, and the whole list could grow upto
Another problem is, the list update action could take a long time and
big memory. The list could shrink and expand. Do I have to sort them
all in memory and then update the textfile _everytime_ it's changed?
More information about the Python-list