[Tutor] reading random line from a file

Tiger12506 keridee at jayco.net
Mon Jul 16 14:54:31 CEST 2007


Perhaps ~this~ is what you are worried about performance-wise?
Image Name    Mem Usage
-----------------------------
python.exe        11,096 K

That's not too bad considering ~this~
explorer.exe       14,356 K
svchost.exe        24,000 K

And I worry about the mp3 player I wrote in C using 2,520 K
I keep thinking I could cut that down if I mess with the compiler settings 
;-)

I wouldn't worry about it too much. Reading the whole file in at once only 
becomes a performance issue when you are dealing with millions and millions 
of lines of text, such as DNA sequences or databases.
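
If the file ever did grow that large, one way to pick a random line without 
holding the whole file in memory is reservoir sampling: read line by line 
and keep line number n with probability 1/n. This is only a rough sketch in 
Python 3 syntax, not something anyone posted in this thread; the name 
pick_random_line and the quotes.txt file in the comment are made up for 
illustration.

```python
import random


def pick_random_line(lines, rng=random):
    """Return one item from an iterable of lines, chosen uniformly at random.

    Only the current candidate is kept in memory, so this works on a file
    object without reading the whole file first. Returns None if the
    iterable is empty.
    """
    chosen = None
    for count, line in enumerate(lines, start=1):
        # Replace the current pick with probability 1/count.
        if rng.randrange(count) == 0:
            chosen = line
    return chosen


# Hypothetical usage with a quotes file:
#     with open("quotes.txt") as f:
#         print(pick_random_line(f))
```

It still reads every line once, so it saves memory, not time. For a file of 
a few thousand quotes, readlines() plus random.choice() is simpler.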

JS

> max baseman wrote:
>> cool thanks
>>
>> oh, as for performance: eventually I would like the file to contain many quotes
> Using readlines isn't exactly going to cause a performance bottleneck.
> I used the following code
> #make the file.py
> f = file("temp.txt","w")
> x = 100000
> while x > 0:
>    f.write("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\n")
>    x -= 1
> f.close()
> #-------
> this creates a file with a whole lot of lines of 'a's.
> 100,000 lines, to be exact, and 4,200,000 bytes.
>
> In other words, this is a fair approximation of a file holding, say,
> 25,000 quotes (since your quotes are likely to be longer, on average,
> than the lines of 'a's I used).
> I think you'll agree that that's quite a few quotes.
>
> Now how long does it take to use readlines() on this file?
>
> #test performance.py
> import timeit
> string = "f = file('temp.txt','r');f.readlines();f.close()"
> temp = timeit.Timer(stmt=string)
> print "1000 iterations took: " + str(temp.timeit(1000))
> #-----
> This code opens the file, reads all of its text, and closes the file.
> We call timeit with 1000 as the argument, so it repeats this process
> 1000 times.
>
> The output of this program on my machine is:
> 1000 iterations took: 51.0771701431
>
> In other words, if you have 25,000 quotes, you could read all of them
> into memory in approximately 51.07717/1000, or 0.05107, seconds.  And
> I'm skeptical that you would even have that many quotes.
> So, like I said before, I doubt this will cause any significant
> performance problem in pretty much any normal situation.
>
> Also, by the way - please reply to me on-list so that others get the
> benefit of our conversations.
> -Luke
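
For anyone who wants to repeat Luke's measurement on Python 3, here is a 
rough equivalent. It is a sketch under a few assumptions: the function name 
time_readlines is made up, it writes 40-character lines to a temporary file 
instead of temp.txt, and it defaults to fewer repeats than the 1000 used 
above. The number it prints depends on your machine and will not match the 
51-second figure.

```python
import os
import tempfile
import timeit


def time_readlines(num_lines=100000, repeats=10):
    """Return the average seconds per open/readlines/close cycle.

    Writes num_lines short lines to a temporary file, times reading it
    back with readlines(), and removes the file afterwards.
    """
    fd, path = tempfile.mkstemp(suffix=".txt")
    try:
        with os.fdopen(fd, "w") as f:
            for _ in range(num_lines):
                f.write("a" * 40 + "\n")

        def read_all():
            # Same steps as the timed statement above: open, read, close.
            with open(path, "r") as f:
                f.readlines()

        total = timeit.timeit(read_all, number=repeats)
        return total / repeats
    finally:
        os.remove(path)


if __name__ == "__main__":
    print("average seconds per read: %f" % time_readlines())
```

Dividing the total by the number of repeats is the same arithmetic as the 
51.07717/1000 step in the quoted message.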


