[Tutor] Script to search in string of values from file A in file B

Dave Angel d at davea.name
Wed May 9 17:16:47 CEST 2012


On 05/09/2012 11:04 AM, Afonso Duarte wrote:
> 
> 
> -----Original Message-----
> From: Dave Angel [mailto:d at davea.name] 
> <SNIP>
>>
>> Please post your messages as plain-text.   The double-spacing I get is
>> very annoying.
> 
> Sorry for that my outlook mess-it-up

I'm sure there's a setting to say use plain-text.  In Thunderbird, I
tell it that any message to forums is to be plain-text.

> 
>> There's a lot you don't say, which is implied in your code.
>> Are the lines in file B.txt really alternating:
>>
>> key1
>> data for key1
>> key2
>> data for key2
>> ...
> 
> Sure, that's why I describe them in the email like that and didn't say that
> they weren't
> 
>> Are the key lines in file B.txt exact messages, or do they just
>> "contain" the key somewhere in the line? 
>>  Your code assumes the latter,
>> but the whole thing could be much simpler if it were always an exact match.
> 
> The entry in B has text before and after (the size of that text changes from
> entry to entry.

In other words, the line pairs are not like your sample, but more like:

trash  key1    more trash
Useful associated data for the previous key
trash2 key2    more trash
Useful associated data for the previous key


> 
> 
>> Are the keys in A.txt unique?  If so, you could store them in a set, and
> make lookup basically >instantaneous.
> 
> That indeed I didn't refer, the entries from A are unique in B

Not what I asked.  Are the keys in A.txt ever present more than once in
A.txt ?  But then again, if the key line can contain garbage before
and/or after the key, then the set idea is moot anyway.

> 
> 
>> I think the real question you had was how to access the line following the
> key, once you matched the key.
> 
> True that is my real question (as the code above works just for the title
> line, I basically want to print the next line of the B.txt for each entry)
> 
>> Something like this should do it (untested)
>>
>> lines = iter( object )
>> for key in lines:
>>    linedata = lines.next()
>>    if key in  mydictionary:
>> 	print key, "-->", linedata
> 
> 
>> Main caveat I can see is the file had better have an even number of lines.
> 
> 
> That changes from file to file, and its unlikely i have all even number.

In that case, what do you use for data of the last key?


If you really have to handle the case where there is a final key with no
data, then you'll have to detect that case, and make up the data
separately.  That could be done with a try block, but this is probably
clearer:

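rawlines = object.readlines()
if len(rawlines) % 2 != 0:
    # += "" would add nothing to the list; append() adds one empty line
    rawlines.append("")      #add an extra line
lines = iter(rawlines)

for keyline in lines:
    linedata = next(lines)   #the data line that follows the key line
    for word in searches:
        if word in keyline:
            print word, "-->", linedata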
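
For comparison, here is a rough sketch of the try-block variant mentioned
above. The generator name match_pairs and the sample lists are made up for
illustration (they are not from the original script), and it assumes the
lines of B.txt have already been read into a list and the search words
from A.txt into another.

```python
def match_pairs(rawlines, searches):
    """Yield (word, data) for every search word found in a key line.

    Hypothetical helper: rawlines holds alternating key/data lines,
    searches holds the words to look for inside each key line.
    """
    lines = iter(rawlines)
    for keyline in lines:
        try:
            linedata = next(lines)      # data line following the key line
        except StopIteration:
            linedata = ""               # final key had no data line
        for word in searches:
            if word in keyline:
                yield word, linedata.rstrip("\n")


# Made-up sample input: three lines, so the last key has no data.
rawlines = ["trash key1 more trash\n",
            "data one\n",
            "trash2 key2 more trash\n"]
searches = ["key1", "key2", "key9"]

for word, data in match_pairs(rawlines, searches):
    print("%s --> %s" % (word, data))
```

Whether an empty string is the right stand-in for the missing data is a
judgment call; the padding approach and this one should behave the same
for files with an even number of lines.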


> 
> Thanks
> 
> 
> Afonso
> 
> 


-- 

DaveA

