[Tutor] Simple Question...

Bill Mill bill.mill at gmail.com
Sat Oct 16 23:30:54 CEST 2004

```To do this without loading the file into memory, and without relying
on wc (which ought to be very fast even with large files, if you need
that), you could do:

import random

def linecount(f):
    """count the number of lines in the file then rewind it"""
    l = ' '
    c = 0
    while l:
        l = f.readline()    #read the next line; '' at end of file ends the loop
        if l: c += 1        #count every line read, even a last one with no newline
    f.seek(0)
    return c

def getrandline(f):
    """get a random line from f (assumes file pointer is at beginning)"""
    lines = linecount(f)
    r = random.randint(1, lines)         #randint is inclusive at both ends
    for i in range(1, r): f.readline()   #skip the r-1 lines before line r
    return f.readline()

Anybody have a faster implementation?

Peace
Bill Mill
bill.mill at gmail.com

On Sat, 16 Oct 2004 16:52:16 -0400, R. Alan Monroe
<amonroe at columbus.rr.com> wrote:
> >         This will work perfectly if your file is small enough to fit in your
> > computer's memory. If you want a function that does this on large
> > files, you'll have to use something along these lines:
>
> > import random
>
> > def randomLineFromBigFile(fileName, numLines):
> >         whatLine = random.randint(1, numLines)  # choose a random line number
> >         source = open(fileName, 'r')
> >         i = 0
> >         for line in source:
> >                 i += 1
> >                 if i == whatLine: return line
> >         return None
>
> >         This function uses very little (and a constant amount of) memory. The
> > downside is that you have to know the total number of lines in the file
> > (that's the numLines argument) before calling it. It's not a very hard
> > thing to do.
>
> Wouldn't it be much quicker to do something like this?
>
> import os.path
> import random
>
> size = os.path.getsize('test.txt')
> print size
>
> randline = random.randint(1, size)
> print randline
>
> testfile = open('test.txt', 'r')
> testfile.seek(randline)
> print testfile.readline()  #read what is likely half a line
> print testfile.readline()  #read the next whole line
> testfile.close()
>
> You'd just need to add some exception handling in the event you tried
> to read off the end of the file.
>
> Alan
```
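
For reference, here is a minimal sketch of the seek-based idea from the quoted message, with the end-of-file case handled by wrapping around to the first line instead of raising an exception. The function name `random_line_by_seek` and the wrap-around behaviour are assumptions made for illustration; they are not from the thread. The file is opened in binary mode so that seek offsets are plain byte counts, which means the line comes back as bytes.

```python
import os
import random

def random_line_by_seek(path):
    """Return one line from path by seeking to a random byte offset.

    Returns None for an empty file. The choice is NOT uniform: a line
    is picked whenever the offset lands in the line before it, so lines
    that follow long lines are favoured.
    """
    size = os.path.getsize(path)
    if size == 0:
        return None
    with open(path, 'rb') as f:
        f.seek(random.randrange(size))  # offset in 0 .. size-1
        f.readline()                    # discard the (probably partial) line we landed in
        line = f.readline()             # the next whole line
        if not line:                    # landed in the last line: wrap to the first
            f.seek(0)
            line = f.readline()
    return line
```

This needs no line count and reads at most a few lines, so it stays fast on large files. If every line must be equally likely, the count-then-pick approach in the message above is the safer choice.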