what happens when the file being read is too big for all lines to be read with "readlines()"
fdu.xiaojf at gmail.com
Sun Nov 20 11:49:42 CET 2005
bonono at gmail.com wrote:
>Xiao Jianfeng wrote:
>> First, I must say thanks to all of you. And I'm really sorry that I
>> didn't describe my problem clearly.
>> There are many tokens in the file. Every time I find a token, I have to
>> get the data on the next line and do some operation with it. It should be easy
>> for me to find just one token using the above method, but there are
>> My method was:
>> f_in = open('input_file', 'r')
>> data_all = f_in.readlines()
>> for i in range(len(data_all)):
>>     line = data_all[i]
>>     if token in line:
>>         # do something with data_all[i + 1]
>> Since my method needs to read the whole file into memory, I think it
>> may not be efficient when processing a very big file.
>> I really appreciate all suggestions! Thanks again.
>something like this:
>for x in fh:
>    if not has_token(x): continue
>    else: process(fh.next())
>you can also create an iterator by iter(fh), but I don't think that is
>using the "side effect" to your advantage. I was bitten before by the
>iterator's side effect but for your particular apps, it becomes an
Thanks all of you!
I have compared the two methods,
(1). "for x in fh:"
(2). read the whole file into memory first.
I have tested the two methods on two files: one is 80M and the other
is 815M.
The first method gained a speedup of about 40% for the first file, and
of about 25% for the second file.
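For anyone who wants to repeat the comparison, here is a rough sketch of the two methods, not the exact code I timed. The function names, the plain substring test for the token, and the generated test file are all assumptions made for illustration. next(fh) is the Python 3 spelling of fh.next(). To make both functions return the same result, the readlines() version skips the data line after a match, which is what the iterator version does as a side effect.

```python
import os
import tempfile
import time


def process_streaming(path, token):
    """Method (1): iterate over the file object, one line at a time."""
    results = []
    with open(path, "r") as fh:
        for line in fh:
            if token in line:
                # next(fh) advances the same iterator the for loop uses,
                # so the data line is consumed and never tested for the token.
                data = next(fh, None)  # None if the token is on the last line
                if data is not None:
                    results.append(data.rstrip("\n"))
    return results


def process_readlines(path, token):
    """Method (2): read the whole file into memory first."""
    with open(path, "r") as fh:
        data_all = fh.readlines()
    results = []
    i = 0
    while i < len(data_all):
        if token in data_all[i] and i + 1 < len(data_all):
            results.append(data_all[i + 1].rstrip("\n"))
            i += 2  # skip the data line, as the streaming version does
        else:
            i += 1
    return results


if __name__ == "__main__":
    # Build a small throwaway file; the contents are made up for the demo.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        for n in range(100000):
            tmp.write("TOKEN\n" if n % 10 == 0 else "value %d\n" % n)
        path = tmp.name
    try:
        for func in (process_streaming, process_readlines):
            start = time.perf_counter()
            found = func(path, "TOKEN")
            elapsed = time.perf_counter() - start
            print(func.__name__, len(found), "matches,", elapsed, "seconds")
    finally:
        os.remove(path)
```

The timings will depend on the file size, the disk cache and the Python version, so the numbers from this sketch should not be expected to match the ones above.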
Sorry for my bad English, and I hope I haven't made people confused.