[Tutor] longest common substring
lina
lina.lastname at gmail.com
Mon Nov 14 05:17:40 CET 2011
On Mon, Nov 14, 2011 at 11:56 AM, lina <lina.lastname at gmail.com> wrote:
> On Mon, Nov 14, 2011 at 6:28 AM, Andreas Perstinger
> <andreas.perstinger at gmx.net> wrote:
>> On 2011-11-11 14:44, lina wrote:
>>>
>>> You are right, I did not think of this parts before. and actually the
>>> initiative wish was to find possible paths, I mean, possible
>>> substrings, all possible substrings. not the longest one, but at
>>> least bigger than 3.
>>
>> I had some time today and since you have changed your initial task (from
>> finding the longest common path to finding all common paths with a minimum
>> length) I've modified the code and came up with the following solution:
>>
>> def AllCommonPaths(list1, list2, minimum=3):
>> """ finds all common paths with a minimum length (default = 3)"""
>>
>> # First we have to initialize the necessary variables:
>> # M is an empty table where we will store all found matches
>> # (regardless of their length)
>>
>> M = [[0] * (len(list2)) for i in range(len(list1))]
>>
>> # length is a dictionary where we store the length of each common
>> # path. The keys are the starting positions ot the paths in list1.
>>
>> length = {}
>>
>> # result will be a list of of all found paths
>>
>> result =[]
>>
>> # Now the hard work begins:
>> # Each element of list1 is compared to each element in list2
>> # (x is the index for list1, y is the index for list2).
>> # If we find a match, we store the distance to the starting point
>> # of the matching block. If we are in the left-most column (x == 0)
>> # or in the upper-most row (y == 0) we have to set the starting
>> # point ourself because we would get negative indexes if we look
>> # for the predecessor cell (M[x - 1][y - 1]). Else, we are one
>> # element farther away as the element before, so we add 1 to its
>> # value.
>>
>> for x in range(len(list1)):
>> for y in range(len(list2)):
>> if list1[x] == list2[y]:
>> if (x == 0) or (y == 0):
>> M[x][y] = 1
>> else:
>> M[x][y] = M[x - 1][y - 1] + 1
>>
>> # To get everything done in one pass, we update the length of
>> # the found path in our dictionary if it is longer than the minimum
>> # length. Thus we don't have to get through the whole table a
>> # second time to get all found paths with the minimum length (we
>> # don't know yet if we are already at the end of the matching
>> # block).
>>
>> if M[x][y] >= minimum:
>> length[x + 1 - M[x][y]] = M[x][y]
>>
>>
>> # We now have for all matching blocks their starting
>> # position in list1 and their length. Now we cut out this parts
>> # and create our resulting list
How silly I was, it's nothing to do with x,y, since I used i and j,
it's crystal clear.
Thanks again for your time,
Best regards,
lina
>
> This is a very smart way to store their starting position as a key. My
> mind was choked about how to save the list as a key before.
>
>>
>> for pos in length:
>> result.append(list1[pos:pos + length[pos]])
>>
>> return result
>>
>> I've tried to explain what I have done, but I'm sure you will still have
>> questions :-).
>
> I am confused myself with this matrix/array, about how to define
> x-axis, y-axis.
>
> I must understand some parts wrong, for the following:
>>
>> Is this close to what you want?
>>
>> Bye, Andreas
>>
>> PS: Here's the function again without comments:
>>
>> def AllCommonPaths(list1, list2, minimum=3):
>> """ finds all common paths with a minimum length (default = 3)"""
>>
>> M = [[0] * (len(list2)) for i in range(len(list1))]
>
> is it correct that the list2 as the x-axis, the list1 as y-axis:?
>
>> length = {}
>> result =[]
>>
>> for x in range(len(list1)):
>
> Here for each row ,
>
>> for y in range(len(list2)):
>
> This loop go through each column of certain row then,
>
>> if list1[x] == list2[y]:
>> if (x == 0) or (y == 0):
>> M[x][y] = 1
>
> Here M[x][y] actually means the x-row? y-column, seems conflicts with
> the x-axis and y-axis. they took y-axis as x row, x-axis as y column.
>
>> else:
>> M[x][y] = M[x - 1][y - 1] + 1
>> if M[x][y] >= minimum:
>> length[x + 1 - M[x][y]] = M[x][y]
>>
>> for pos in length:
>> result.append(list1[pos:pos + length[pos]])
>>
>> return result
>
> I have no problem understanding the other parts, except the array and
> axis entangled in my mind.
>
>> _______________________________________________
>> Tutor maillist - Tutor at python.org
>> To unsubscribe or change subscription options:
>> http://mail.python.org/mailman/listinfo/tutor
>>
>
More information about the Tutor
mailing list