[Tutor] Script to search in string of values from file A in file B

Dave Angel d at davea.name
Wed May 9 16:52:23 CEST 2012


On 05/09/2012 10:00 AM, Afonso Duarte wrote:
> Dear All,
>
>  
>
> I'm new to Python and started to use it to search text strings in big
> (>500Mb) txt files. 
>
> I have a list on text file (e.g. A.txt) that I want to use as a key to
> search another file (e.g. B.txt), organized in the following way:
>
>  
>
> A.txt:
>
>  
>
> Aaa
>
> Bbb
>
> Ccc
>
> Ddd
>
> .
>
> .
>
> .
>
>  
>
> B.txt
>
>  
>
> Bbb
>
> 1234
>
> Xxx
>
> 234
>
>  
>
>  
>
> I want to use A.txt to search in B.txt and have as output the original
> search entry (e.g. Bbb) followed by the line that follows it in the B.txt
> (e.g.  Bbb / 1234).
>
> I wrote the following script:
>
>  
>
>  
>
> object = open('B.txt', 'r')
>
> lista = open('A.txt', 'r')
>
> searches = lista.readlines()
>
> for line in object.readlines():
>
>      for word in searches:
>
>           if word in line: 
>
>                print line+'\n'
>
>  
>
>  
>
>  
>
> But from here I only get the searching entry and not the line afterwards, I
> tried to google it but I got lost and didn't manage to do it.
>
> Any ideas ? I guess that this is basic scripting but I just started .
>
>  
>
> Best 
>
>  
>
> Afonso
>
>
Please post your messages as plain-text.   The double-spacing I get is
very annoying.

There's a lot you don't say, which is implied in your code.

Are the lines in file B.txt really alternating:
 
key1
data for key1
key2
data for key2
...

Are the key lines in file B.txt exact matches, or do they just
"contain" the key somewhere in the line?   Your code assumes the latter,
but the whole thing could be much simpler if it were always an exact match.
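
To make the difference concrete, here is a small sketch (Python 3 syntax; the
sample lines are made up for illustration, not taken from your files). It also
shows a side effect worth knowing: readlines() keeps the trailing newline on
each key, so the "in" test in your script only matches lines that end with the
key.

```python
# A key as readlines() returns it, newline included.
key = "Bbb\n"
# Hypothetical lines from B.txt, for illustration only.
candidates = ["Bbb\n", "xxBbb\n", "Bbb 1234\n"]

# Substring test with the raw key: the newline is part of the match.
raw_contains = [key in line for line in candidates]
# Substring test with the newline stripped from the key.
stripped_contains = [key.strip() in line for line in candidates]
# Exact test: both sides stripped, then compared for equality.
exact = [key.strip() == line.strip() for line in candidates]

print(raw_contains)       # [True, True, False]
print(stripped_contains)  # [True, True, True]
print(exact)              # [True, False, False]
```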

Are the keys in A.txt unique?  If so, you could store them in a set, and
make lookup basically instantaneous.
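
A minimal sketch of the set idea, assuming exact matches are what you want.
The helper name load_keys and the sample list are mine; the list stands in for
the open A.txt file so the example runs on its own. Duplicate keys, if any,
simply collapse into one entry.

```python
def load_keys(lines):
    """Return the non-empty lines as a set, surrounding whitespace removed."""
    return {line.strip() for line in lines if line.strip()}

# Stand-in for open('A.txt'); any iterable of lines works, including a file.
a_lines = ["Aaa\n", "Bbb\n", "Ccc\n", "Ddd\n"]
keys = load_keys(a_lines)

print("Bbb" in keys)   # True
print("Xxx" in keys)   # False
```

Membership in a set is a hash lookup, so each test costs about the same no
matter how many keys there are, unlike the inner loop over every key in your
script.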

I think the real question you had was how to access the line following
the key, once you matched the key.

Something like this should do it (untested)

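lines = iter(object)                  # object is B.txt, opened as in your script
for key in lines:
    linedata = next(lines)            # the line that follows the key
    if key.strip() in mydictionary:   # mydictionary: the stripped keys from A.txt
        print key.strip(), "-->", linedata.strip()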



The main caveat I can see is that the file had better have an even number of
lines; otherwise the final attempt to fetch the following line raises
StopIteration.
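
One way to guard against an odd line count is to give next() a default value.
The sketch below is written for Python 3 and assumes that B.txt strictly
alternates key and data lines and that keys must match exactly; the name
find_pairs and the sample data are made up for illustration.

```python
def find_pairs(b_lines, keys):
    """Yield (key, data) for each key line in b_lines that is in keys.

    Assumes b_lines alternates: key line, data line, key line, ...
    """
    lines = iter(b_lines)
    for key_line in lines:
        data_line = next(lines, None)   # None if the input ends on a key line
        if data_line is None:
            break                       # unpaired last line: ignore it
        key = key_line.strip()
        if key in keys:
            yield key, data_line.strip()

# Stand-ins for the two files; the odd line count is deliberate.
b_lines = ["Bbb\n", "1234\n", "Xxx\n", "234\n", "Ccc\n"]
keys = {"Aaa", "Bbb", "Ccc", "Ddd"}

for key, data in find_pairs(b_lines, keys):
    print(key + " / " + data)   # Bbb / 1234
```

With the real files you would pass the open file object for B.txt instead of
the list. Iterating over the file object reads one line at a time, which
matters for a 500 MB file, whereas readlines() loads the whole file into
memory first.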

-- 

DaveA