newb: comparing two strings
johnzenger at gmail.com
Thu May 18 23:25:01 EDT 2006
manstey wrote:
> Hi,
>
> Is there a clever way to see if two strings of the same length vary by
> only one character, and what the character is in both strings.
You want zip.
def diffbyonlyone(string1, string2):
    diffcount = 0
    for c1, c2 in zip(string1, string2):
        if c1 != c2:
            diffcount += 1
            if diffcount > 1:
                return False
    return diffcount == 1
print diffbyonlyone("yaqtil","yaqtel") # True
print diffbyonlyone("yiqtol","yaqtel") # False
If your strings are long, it might be faster and more memory-efficient to
use itertools.izip instead: zip builds the whole list of pairs up front,
while izip yields them one at a time.
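
The function above only answers yes or no; the question also asked which
characters differ. A minimal sketch of one way to report that, assuming
both strings have the same length. The name single_difference and the
(index, char1, char2) return value are made up here for illustration:

```python
def single_difference(string1, string2):
    """Return (index, char1, char2) if the strings differ in exactly one
    position, otherwise None. Hypothetical helper, not from the original reply."""
    if len(string1) != len(string2):
        return None
    found = None
    for index, (c1, c2) in enumerate(zip(string1, string2)):
        if c1 != c2:
            if found is not None:
                return None  # second mismatch: more than one difference
            found = (index, c1, c2)
    return found

print(single_difference("yaqtil", "yaqtel"))  # (4, 'i', 'e')
print(single_difference("yiqtol", "yaqtel"))  # None
```

Returning None for "no single difference" keeps the result usable in an
if test, just like the boolean version.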
> My next problem is, I have a list of 300,000+ words and I want to find
> every pair of such strings. I thought I would first sort on length of
> string, but how do I iterate through the following:
>
> str1
> str2
> str3
> str4
> str5
>
> so that I compare str1 & str2, str1 & str3, str1 & str4, str1 & str5,
> str2 & str3, str2 & str4, str2 & str5, str3 & str4, str3 & str5,
> str4 & str5.
for index1 in xrange(len(words)):
    for index2 in xrange(index1+1,len(words)):
        if diffbyonlyone(words[index1], words[index2]):
            print words[index1] + " -- " + words[index2]
...but by all means run that only on sets of words that you have
already identified, pursuant to some criteria like word length, to be
likely matches. Do the math; that's a lot of comparisons!
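
To put a number on it: n words give n*(n-1)/2 pairs, so 300,000 words
come to roughly 4.5 * 10^10 comparisons if every word is checked against
every other. A rough sketch of the bucket-by-length idea, assuming words
is a plain list of strings. The names pair_count and
pairs_differing_by_one are invented for this example, and
itertools.combinations needs Python 2.6 or later:

```python
from collections import defaultdict
from itertools import combinations

def diffbyonlyone(string1, string2):
    # Same logic as the function earlier in this message.
    diffcount = 0
    for c1, c2 in zip(string1, string2):
        if c1 != c2:
            diffcount += 1
            if diffcount > 1:
                return False
    return diffcount == 1

def pair_count(n):
    # Number of unordered pairs among n items.
    return n * (n - 1) // 2

def pairs_differing_by_one(words):
    # Bucket by length first; words of different lengths are never compared.
    by_length = defaultdict(list)
    for word in words:
        by_length[len(word)].append(word)
    result = []
    for group in by_length.values():
        # combinations(group, 2) yields each unordered pair exactly once.
        for w1, w2 in combinations(group, 2):
            if diffbyonlyone(w1, w2):
                result.append((w1, w2))
    return result

print(pair_count(300000))  # 44999850000
print(sorted(pairs_differing_by_one(["yaqtil", "yaqtel", "yiqtol", "cat", "cot"])))
# [('cat', 'cot'), ('yaqtil', 'yaqtel')]
```

This still compares every pair inside each length group, so it helps
most when the word lengths are well spread out.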