[Tutor] A file containing a string of 1 billion random digits.

Sun Jul 18 08:29:07 CEST 2010

"Richard D. Moores" <rdmoores at gmail.com> wrote

> I don't fully understand your "you always have two [chunks] in 
> memory
> at once". Does that take care of the overlap I think I need? I don't
> want you or anyone to give me the the whole answer, but could you
> clarify a bit about keeping 2 chunks in memory?

Luke gave a good illustration, I'll try to be a bit more clear than
my previous explanation.

Think of the buffer as being a window into the file.
We read the file in chunks and load these into the buffer.
Lets say 3 chunks per buffer.
When we first start up we read chunks 1,2,3 into the buffer.
We start processing chunk 1, then move into chunk 2. By the
time we are 50% through chunk 2 we no longer need chunk 1
so we move it out of memory and read in chunk 4. We move
chunk 2 & 3 to the top of the buffer and append chunk 4.
Now we continue processing until we hit chunk 3 and go
down 50%, we no longer need chunk 2 so delete it and
read chunk 5, move 3 and 4 to the top and load chunk 5
into the buffer.

And so it goes on.
The buffer is a continuous bit of memory holding 3 chunks
of data contiguously so you can process over the boundaries
but your program only has to store 3 chunks worth at a time.

I hope thats clearer.

It might help to think how this worked for an old style
text editor. The chunks represent a screens worth of display.
You only need to display the middle chunk. As you cursor
off the bottom of the screen you pull up the next chunk
in memory (and in the background read the next again
from file and lose the top chunk). This resulted in a user
experience that looked like the whole file was in memory
but in fact only about 6Kb was (3 screens full).

HTH,

-- 
Alan Gauld
Author of the Learn to Program web site
http://www.alan-g.me.uk/