Replace in large text file ?
Nobody
nobody at nowhere.com
Sun Jun 6 15:55:21 EDT 2010
On Sat, 05 Jun 2010 16:35:42 +0100, MRAB wrote:
>>> In plain language what I wish to do is:
>>>
>>> Remove all commas
>>> Replace all @ with commas
>> input_file = open("some_huge_file.txt", "r")
>> output_file = open("newfilename.txt", "w")
>> for line in input_file:
> I'd probably process it in larger chunks:
>
> CHUNK_SIZE = 1024 ** 2 # 1MB at a time
> input_file = open("some_huge_file.txt", "r")
> output_file = open("newfilename.txt", "w")
> while True:
>     chunk = input_file.read(CHUNK_SIZE)
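This is fine for the exact problem at hand, where every replacement
involves a single character. The moment the problem evolves into replacing
a sequence of two or more characters, a chunk boundary can fall in the
middle of the sequence, and that occurrence will be missed. Processing
line-by-line avoids this, provided the sequence itself cannot span a
newline.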
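
To make the boundary issue concrete, here is a minimal sketch. The helper
names (replace_by_chunks, replace_by_lines), the two-character pattern "@@"
and the deliberately tiny chunk size of 4 are all made up for illustration;
they are not from the posts quoted above. In-memory io.StringIO objects
stand in for the real files.

```python
import io


def replace_by_chunks(src, dst, old, new, chunk_size):
    """Naive chunked replace (hypothetical helper).

    An occurrence of `old` that straddles a chunk boundary is never
    seen whole, so it is left unreplaced.
    """
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk.replace(old, new))


def replace_by_lines(src, dst, old, new):
    """Line-by-line replace (hypothetical helper).

    Safe as long as `old` does not itself contain a newline.
    """
    for line in src:
        dst.write(line.replace(old, new))


text = "abc@@def\nghi@@jkl\n"

# chunk_size=4 is unrealistically small; it is chosen so that the first
# "@@" is split across two reads ("abc@" followed by "@def").
chunked = io.StringIO()
replace_by_chunks(io.StringIO(text), chunked, "@@", ",", chunk_size=4)

by_line = io.StringIO()
replace_by_lines(io.StringIO(text), by_line, "@@", ",")

print(repr(chunked.getvalue()))  # first "@@" survives untouched
print(repr(by_line.getvalue()))  # both occurrences replaced
```

With a 1MB chunk the same miss would be rare rather than guaranteed, which
is arguably worse, since it would only show up occasionally on real data.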