[Tutor] simple text replace

Dave Angel davea at ieee.org
Mon Jul 27 00:41:35 CEST 2009


j booth wrote:
> Hello,
>
> I am scanning a text file and replacing words with alternatives. My
> difficulty is that all occurrences are replaced (even if they are part of
> another word!)..
>
> This is an example of what I have been using:
>
>     for line in fileinput.FileInput("test_file.txt",inplace=1):
>         line = line.replace(original, new)
>         print line,
>         fileinput.close()
>
>
> original and new are variables that have string values from functions.
> original finds each word in a text file and new is a manipulated
> replacement. Essentially, I would like to replace only the occurrence that
> is currently selected-- not the rest. for example:
>
> python is great, but my python knowledge is limited! regardless, I enjoy
> pythonprogramming
>
>
> returns something like:
>
> snake is great, but my snake knowledge is limited! regardless, I enjoy
> snakeprogramming
>
>
> thanks so much!
>
>   
I'm not sure what you mean by "currently selected."  You're processing a
line at a time, and there can be multiple legitimate occurrences of the
word in the line.

The trick is to define what you mean by "word."  replace() has no such 
notion.  So we want to write a function along these lines:

Given three strings, line, inword, and outword, find all occurrences of 
inword in line that stand alone as a word, and replace each of them with 
outword.  The definition of word is a group of alphabetic characters 
(a-z perhaps) that is surrounded by non-alphabetic characters, or by the 
start or end of the line.

The approach that I'd use is to prepare a translated copy of the line as 
follows:  Replace each non-alphabetic character with a space.  Also 
insert a space at the beginning and one at the end.  Now take the 
inword, and similarly add a space at the beginning and the end.  Search 
the modified line for all occurrences of this modified inword, and make 
a list of the indices where it is found.  Because of the extra leading 
space, the index of each match in the modified line is also the index 
where the word starts in the original line.  In your example line, there 
would be 2 items in the list.
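
A minimal sketch of that search step, under a few assumptions of mine: 
the function name find_word_indices is made up for illustration, 
"alphabetic" is taken to mean ASCII letters of either case, and the 
match is case-sensitive.  It only builds the list of indices and does 
not change the line.

```python
import string

def find_word_indices(line, inword):
    """Return the start index in line of each whole-word occurrence of inword.

    Hypothetical helper: assumes inword is non-empty and made of ASCII letters.
    """
    # Translated copy: every non-alphabetic character becomes a space,
    # and one space is added at each end.
    padded = " " + "".join(
        c if c in string.ascii_letters else " " for c in line
    ) + " "
    target = " " + inword + " "

    indices = []
    pos = padded.find(target)
    while pos != -1:
        # The leading pad shifts everything right by one, so a match at
        # pos in the padded copy means the word starts at pos in line.
        indices.append(pos)
        # Advance by one only: two adjacent words share the space between
        # them, so skipping the whole match could miss the second one.
        pos = padded.find(target, pos + 1)
    return indices
```

For the example line in the question, the two standalone occurrences of 
"python" are found and the one inside "pythonprogramming" is skipped.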

Now, using the original line, use that list of indices to substitute the 
outword in the appropriate places.  Use slices to do it, preferably from 
right to left, so the remaining indices stay valid even though the 
string is changing.  (The easiest way to go right to left is to 
reverse() the list.)
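
The substitution step could look something like the sketch below.  The 
name substitute_at is hypothetical, and it assumes the list of start 
indices has already been built in ascending order, with each index 
pointing at a whole-word occurrence of inword in the original line.

```python
def substitute_at(line, indices, inword, outword):
    """Replace inword with outword at each start index listed in indices.

    Hypothetical helper: assumes indices is in ascending order and that
    each entry marks where an occurrence of inword begins in line.
    """
    # Walk the indices from right to left so that the positions still to
    # be processed are unaffected when the string changes length.
    for start in reversed(indices):
        end = start + len(inword)
        line = line[:start] + outword + line[end:]
    return line
```

With the first part of the example line and the indices 0 and 24, 
substitute_at(line, [0, 24], "python", "snake") replaces both standalone 
words and leaves everything else alone.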

DaveA


