Search and write to .txt file

Tue Aug 11 10:22:49 EDT 2009

Piet van Oostrum wrote:
>>>>>> Helvin <helvinlui at gmail.com> (H) wrote:
>> H> Hi everyone,
>> H> I am writing some python script that should find a line which contains
>> H> '1' in the data.txt file, then be able to move a certain number of
>> H> lines down, before replacing a line. At the moment, I am able to find
>> H> the line '1', but when I use f.seek to move, and then rewrite, what I
>> H> write goes to the end of the .txt file, instead of being adjusted by
>> H> my f.seek.
>> H>
>> H> Do you know what way I should take?
>> H>
>> H> Data.txt is a file of 3 lines:
>> H>    line1
>> H>    line2
>> H>    line3
>> H>
>> H> Code:
>> H>
>> H>    with open('data.txt', 'r+') as f:
>> H>        firstread = f.readlines()   # Take a snapshot of initial file
>> H>
>> H>        f.seek(0,0)    # Go back to beginning and search
>> H>        for line in f:
>> H>            print line
>> H>            if line.find('1'):
>> H>                print 'line matched'
>> H>                f.seek(1,1)       # Move one space along
>> H>                f.write('house\n')     # f.write overwrites the exact number of bytes.
>> H>                break                    # leave loop once '1' is found
>
> Mixing an iterator on the file with direct calls (seek/write) isn't
> going to work. The iterator does read ahead which causes the file
> position not to be what you think it is.
>
> See:
>
> >>> with open('data.txt', 'r+') as f:
> ...   for line in f:
> ...       print line, f.tell()
> ... 
> line1
> 18
> line2
> 18
> line3
> 18
>
>   
In addition to the buffering involved in the read loop, trying to
position ahead by some number of lines is error-prone: this is a text
file with varying-length lines, and the line ending written for \n
might occupy one byte on some OSes and two bytes on others (Windows).
If you feel you must do it in place, then switch the file mode to
binary and use read(), not readline(), keeping track of your own
position at all times.
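
For what it's worth, here is a rough sketch of the in-place route,
assuming Python 3 and a file small enough to read() in one go. The
function name and its parameters are made up for illustration, and it
only works when the replacement has exactly the same number of bytes as
the line it overwrites, since write() cannot grow or shrink a line in
the middle of a file.

```python
def overwrite_line_in_place(path, needle, lines_down, replacement):
    """Overwrite one line in place (hypothetical helper, not a library call).

    Finds the first line containing `needle` (bytes), moves `lines_down`
    lines further, and overwrites that line's content with `replacement`
    (bytes). Returns True if a line was overwritten, False otherwise.
    """
    with open(path, 'r+b') as f:           # binary mode: no newline translation
        data = f.read()                    # one read(), no file iterator involved
        lines = data.splitlines(keepends=True)

        # Track the byte offset of every line start ourselves.
        offsets = []
        pos = 0
        for line in lines:
            offsets.append(pos)
            pos += len(line)

        for index, line in enumerate(lines):
            if needle in line:
                target = index + lines_down
                break
        else:
            return False                   # needle not found

        if target >= len(lines):
            return False                   # not enough lines after the match

        body = lines[target].rstrip(b'\r\n')   # line content without its ending
        if len(replacement) != len(body):
            raise ValueError('replacement must be %d bytes long' % len(body))

        f.seek(offsets[target])            # absolute seek to the line start
        f.write(replacement)               # line ending is left untouched
        return True
```

With the three-line data.txt from the original post,
overwrite_line_in_place('data.txt', b'1', 1, b'house') would turn line2
into house, because both are five bytes long.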

As Kushal already suggested, if the file is small enough to just use
readlines() and manipulate that list, I'd do that.  If not, I'd scan
through the file, creating a new one as you go, then rename the new one
back when finished.  Actually, I'd create a new one even in the first
case, in case of a crash while rewriting the file.
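
A minimal sketch of the copy-and-rename approach, assuming Python 3.3 or
later (for os.replace). The function name and parameters are
hypothetical, chosen only for this example. The original file is left
alone until the new one is completely written, so a crash partway
through leaves data.txt intact.

```python
import os
import tempfile


def replace_line_via_copy(path, needle, lines_down, new_line):
    """Rewrite `path` through a temporary file (hypothetical helper).

    Finds the first line containing `needle`, moves `lines_down` lines
    further, and writes `new_line` (including its own line ending) in
    place of that line. Returns True if a line was replaced.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    replaced = False
    try:
        # newline='' keeps the existing line endings exactly as they are.
        with os.fdopen(fd, 'w', newline='') as dst, \
                open(path, 'r', newline='') as src:
            target = None                  # index of the line to replace
            for index, line in enumerate(src):
                if target is None and needle in line:
                    target = index + lines_down
                if index == target:
                    dst.write(new_line)    # may be any length
                    replaced = True
                else:
                    dst.write(line)
        os.replace(tmp_path, path)         # swap the new file into place
    except BaseException:
        os.remove(tmp_path)                # do not leave a stray temp file
        raise
    return replaced
```

Unlike the in-place route, the replacement here can be longer or shorter
than the line it replaces. One thing to watch: mkstemp creates the new
file with restrictive permissions, so the original file's mode is not
carried over by this sketch.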

DaveA