[Tutor] reading random line from a file
a_n_lal at yahoo.com
Fri Jul 20 07:50:31 CEST 2007
The function random.randint(a, b) includes both endpoints,
i.e. b can also be returned. Thus for a file with a single
line (a=0, b=1) my algorithm can raise an IndexError.
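
To illustrate the off-by-one, here is a minimal sketch of two ways to
pick an index that stays in range. The list `lines` and the helper names
are made up for this example and are not taken from the earlier code in
the thread.

```python
import random

lines = ["only line\n"]  # stands in for a file with a single line

def pick_line(lines):
    # randrange excludes the upper bound: 0 <= index < len(lines)
    index = random.randrange(len(lines))
    return lines[index]

def pick_line_randint(lines):
    # randint includes both ends, so the upper bound must be len - 1
    return lines[random.randint(0, len(lines) - 1)]
```

random.choice(lines) does the same thing in one call.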
Significance of the number 4096:
A file is stored in blocks of 2K/4K/8K (depending
upon the machine). Reading from an offset fetches a whole
block rather than a single byte. Hence for a file smaller
than 4096 bytes (assuming a 4K block size), you will
end up reading all of it anyway, so you may as well load
it into memory.
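
A rough sketch of that small-file shortcut, assuming a 4K block size.
The constant SMALL_FILE_LIMIT and the function name are illustrative
only. On Unix-like systems os.stat(path).st_blksize reports the
preferred I/O block size, which could replace the hard-coded value.

```python
import os
import random

SMALL_FILE_LIMIT = 4096  # assumed block size in bytes; varies by machine

def random_line_small(path):
    """Return a random line if the file fits in one block, else None."""
    if os.path.getsize(path) >= SMALL_FILE_LIMIT:
        return None  # too big: caller should use another strategy
    with open(path) as f:
        lines = f.readlines()
    # An empty file has no lines to choose from
    return random.choice(lines) if lines else None
```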
Luke's suggestion of an index:
I think there is an implicit need to give equal
probability to each line. As an example, suppose we are
trying to pick a "quote of the day" from a dictionary of
quotations which may contain hundreds of thousands of
quotes. We would like to see a new one on each
invocation rather than favour the longest ones (a random
byte offset is more likely to land inside a long line).
So creating an index is the right solution. I just
want to add that since index creation is quite a
laborious task (in terms of CPU/time), one should do it
only once (or until the file is changed). Thus the index
should be kept on disk and re-created whenever the
file changes. I would like suggestions on index
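
One possible shape for this, as a sketch only and written for current
Python rather than the 2.4 mentioned below: record the byte offset of
each line start, save the list next to the file, and rebuild it when
the file's size or modification time differs from what was saved. The
".idx" suffix, the pickle format and all function names are
assumptions made for this example, as is decoding lines as UTF-8.

```python
import os
import pickle
import random

def build_index(path):
    """Scan the file once, recording the byte offset of each line start."""
    offsets = []
    pos = 0
    with open(path, "rb") as f:
        for line in f:
            offsets.append(pos)
            pos += len(line)
    return offsets

def load_index(path, index_path=None):
    """Return line offsets, reusing the saved index if the file is unchanged."""
    index_path = index_path or path + ".idx"  # hypothetical naming scheme
    st = os.stat(path)
    stamp = (st.st_size, st.st_mtime)
    try:
        with open(index_path, "rb") as f:
            saved = pickle.load(f)
        if saved["stamp"] == stamp:
            return saved["offsets"]
    except (OSError, pickle.UnpicklingError, EOFError, KeyError, TypeError):
        pass  # missing or unreadable index: rebuild below
    offsets = build_index(path)
    with open(index_path, "wb") as f:
        pickle.dump({"stamp": stamp, "offsets": offsets}, f)
    return offsets

def random_line(path):
    """Pick a line with equal probability, whatever its length."""
    offsets = load_index(path)
    if not offsets:
        raise ValueError("file has no lines")
    with open(path, "rb") as f:
        f.seek(random.choice(offsets))
        return f.readline().decode()  # assumes UTF-8 text
```

Comparing size and modification time is cheap but not airtight: an
edit that keeps the size the same within the timestamp resolution
would go unnoticed. Storing a checksum would be stricter, at the cost
of reading the whole file on every call.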
--- Luke Paireepinart <rabidpoobear at gmail.com> wrote:
> bhaaluu wrote:
> > Greetings,
> > Thanks for including the complete source code!
> > It really helps to have something that works to
> > look at.
> > I modified an earlier version of this to run on my
> > computer (GNU/Linux; Python 2.4.3).
> I think the best strategy for this problem would be
> to build an index of
> the offset of the start of each line, and then
> randomly select from this.
> That makes each line equally probable, and you can
> set up your class so
> that the index is only built on the first call to
> the function.