[Tutor] Understanding a linear runtime implementation of anagram detection
Spyros Charonis
s.charonis at gmail.com
Thu Dec 10 07:38:46 EST 2015
Dear All,
I am learning about analysis of algorithms (python 2.7.6). I am reading a
book (Problem solving with Algorithms and Data Structures) where Python is
the language used for implementations. The author introduces algorithm
analysis in a clear and understandable way, and uses an anagram detection
program as a template to compare different runtime implementations
(quadratic, log linear, linear). In the linear, and most efficient
implementation, the code is as follows (comments added by me):
def anagram_test2(s1,s2):""" Checks if two strings are anagrams of each other
Runs with O(n) linear complexity """
if (not s1) or (not s2):
raise TypeError, "Invalid input: input must be string"
return None
# Initialize two lists of counters
c1 = [0] * 26
c2 = [0] * 26
# Iterate over each string# When a char is encountered, # increment
the counter at # its correspoding position for i in range(len(s1)):
pos = ord(s1[i]) - ord("a")
c1[pos] += 1
for i in range(len(s2)):
pos = ord(s2[i]) - ord("a")
c2[pos] += 1
j = 0
hit = Truewhile j < 26 and hit:
if c1[j] == c2[j]:
j += 1
else:
hit = False
return hit
My questions are:
1)
Is it computationally more/less/equally efficient to use an explicit while
loop as it is to just do "return c1 === c2" (replacing the final code block
following the two for loops). I realize that this single line of code
performs an implicit for loop over each index to test for equality. My
guess is that because in other languages you may not be able to do this
simple test, the author wanted to present an example that could be adapted
for other languages, unless the explicit while loop is less expensive
computationally.
2)
How could I go about adapting this algorithm for multiple strings (say I
had 20 strings and wanted to check if they are anagrams of one another).
def are_anagrams(*args):
""" Accepts a tuple of strings and checks if
they are anagrams of each other """
# Check that neither of strings are null
for i in args:
if not i:
raise TypeError, "Invalid input"
return None
# Initialize a list of counters for each string
c = ( [] for i in range(len(args) ) ???
Many thanks in advance!
More information about the Tutor
mailing list