Replace in large text file?

Nobody nobody at nowhere.com
Sun Jun 6 15:55:21 EDT 2010


On Sat, 05 Jun 2010 16:35:42 +0100, MRAB wrote:

>>> In plain language what I wish to do is:
>>>
>>> Remove all comma's
>>> Replace all @ with comma's

>> input_file = open("some_huge_file.txt", "r")
>> output_file = open("newfilename.txt", "w")
>> for line in input_file:

> I'd probably process it in larger chunks:
> 
>      CHUNK_SIZE = 1024 ** 2 # 1MB at a time
>      input_file = open("some_huge_file.txt", "r")
>      output_file = open("newfilename.txt", "w")
>      while True:
>          chunk = input_file.read(CHUNK_SIZE)

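This is fine for the exact problem at hand, where every replacement
involves a single character. The moment the problem evolves into replacing
a sequence of two or more characters, a chunk boundary can fall in the
middle of the sequence, and that match is then missed. Processing
line-by-line avoids this, provided the sequence itself never spans a
newline.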
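
To illustrate, here is a minimal sketch of both approaches, run on a tiny
in-memory string rather than a real file. The function names, the "@@"
pattern, and the deliberately tiny chunk size of 3 are my own assumptions,
chosen only so that a match straddles a chunk boundary:

```python
import io

def replace_by_chunks(src, dst, old, new, chunk_size):
    """Naive chunked replace: misses matches that straddle a chunk boundary."""
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk.replace(old, new))

def replace_by_lines(src, dst, old, new):
    """Line-by-line replace: safe as long as `old` contains no newline."""
    for line in src:
        dst.write(line.replace(old, new))

text = "ab@@cd\nef@@gh\n"

# Chunks are "ab@", "@cd", "\nef", "@@g", "h\n": the first "@@" is split.
out = io.StringIO()
replace_by_chunks(io.StringIO(text), out, "@@", ",", chunk_size=3)
chunked = out.getvalue()

out = io.StringIO()
replace_by_lines(io.StringIO(text), out, "@@", ",")
by_lines = out.getvalue()

print(repr(chunked))   # 'ab@@cd\nef,gh\n'  (first match missed)
print(repr(by_lines))  # 'ab,cd\nef,gh\n'
```

With a 1MB chunk the missed match is much rarer, which arguably makes it
worse: the bug shows up only occasionally and only on large inputs.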



