[Tutor] Script to search in string of values from file A in file B

Afonso Duarte aduarte at itqb.unl.pt
Wed May 9 17:04:25 CEST 2012



-----Original Message-----
From: Dave Angel [mailto:d at davea.name] 
Sent: Wednesday 9 May 2012 15:52
To: Afonso Duarte
Cc: tutor at python.org
Subject: Re: [Tutor] Script to search in string of values from file A in
file B

On 05/09/2012 10:00 AM, Afonso Duarte wrote:
> Dear All,
> I'm new to Python and started to use it to search for text strings in big
> (>500 MB) txt files.
> I have a list in a text file (e.g. A.txt) that I want to use as a key to
> search another file (e.g. B.txt), organized in the following way:
>
> A.txt:
>
> Aaa
> Bbb
> Ccc
> Ddd
> .
> .
> .
>
>  
>
> B.txt
> Bbb
> 1234
> Xxx
> 234
>
> I want to use A.txt to search in B.txt and have as output the original 
> search entry (e.g. Bbb) followed by the line that follows it in the 
> B.txt (e.g.  Bbb / 1234).
> I wrote the following script:
>
>  
>
>  
>
> object = open('B.txt', 'r')
> lista = open('A.txt', 'r')
> searches = lista.readlines()
> for line in object.readlines():
>      for word in searches:
>           if word in line: 
>                print line+'\n'

>  
>
> But from here I only get the search entry and not the line
> afterwards. I tried to google it but I got lost and didn't manage to do it.
> Any ideas? I guess that this is basic scripting, but I just started.
>
> Best
>
> Afonso
>
>
>Please post your messages as plain-text.   The double-spacing I get is
>very annoying.

Sorry about that, my Outlook messed it up.

>There's a lot you don't say, which is implied in your code.
>Are the lines in file B.txt really alternating:
> 
>key1
>data for key1
>key2
>data for key2
>...

Sure, that's why I described them like that in the email and didn't say
otherwise.

>Are the key lines in file B.txt exact matches, or do they just
>"contain" the key somewhere in the line?  Your code assumes the latter,
>but the whole thing could be much simpler if it were always an exact match.

The entry in B has text before and after the key (the size of that text
changes from entry to entry).


>Are the keys in A.txt unique?  If so, you could store them in a set, and
>make lookup basically instantaneous.

Indeed, I didn't mention that: the entries from A are unique in B.


>I think the real question you had was how to access the line following the
>key, once you matched the key.

True, that is my real question: the code above prints only the title line,
and I want to print the line that follows it in B.txt for each entry.
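
To make the goal concrete, here is a minimal sketch of "print the line after each match". It assumes Python 3, and it assumes that each key from A.txt appears as a substring of a title line in B.txt. The function name find_following_lines and the small in-memory samples are hypothetical, used only for illustration. Note that readlines() keeps the trailing newline on every key, so the keys are stripped before the comparison.

```python
def find_following_lines(keys, b_lines):
    """Return (key, next_line) pairs for every line of b_lines that contains a key.

    keys    -- iterable of search strings (e.g. the lines of A.txt)
    b_lines -- iterable of lines (e.g. an open file object for B.txt)
    """
    # Drop trailing newlines and blank entries from the keys.
    keys = [k.strip() for k in keys if k.strip()]
    results = []
    lines = iter(b_lines)
    for line in lines:
        for key in keys:
            if key in line:
                # Consume the line that follows the match; None if the input ends here.
                following = next(lines, None)
                if following is not None:
                    results.append((key, following.rstrip("\n")))
                break  # the following line is already consumed, stop checking keys
    return results


# Hypothetical sample data standing in for A.txt and B.txt.
sample_a = ["Aaa\n", "Bbb\n", "Ccc\n"]
sample_b = ["xx Bbb yy\n", "1234\n", "zz Xxx\n", "234\n"]

for key, following in find_following_lines(sample_a, sample_b):
    print(key, "/", following)
```

With real files, an open file object for B.txt could be passed in place of the list, so the 500 MB file is read one line at a time rather than loaded whole.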

>Something like this should do it (untested)
>
>lines = iter(object)
>for key in lines:
>    linedata = lines.next()
>    if key in mydictionary:
>        print key, "-->", linedata


>Main caveat I can see is the file had better have an even number of lines.


That changes from file to file, and it is unlikely that all of them have an
even number of lines.
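
One possible way around the even-line requirement, sketched under the assumption of Python 3 and strictly alternating title/data lines: itertools.zip_longest can read the lines two at a time and pad a missing final data line with an empty string. The helper name paired_lines and the sample list are hypothetical.

```python
from itertools import zip_longest


def paired_lines(lines):
    """Yield (title, data) pairs from alternating lines.

    If the input has an odd number of lines, the last title is paired with "".
    """
    it = iter(lines)
    # Passing the same iterator twice makes zip_longest take two lines per step.
    for title, data in zip_longest(it, it, fillvalue=""):
        yield title.rstrip("\n"), data.rstrip("\n")


# Hypothetical sample with an odd number of lines.
sample_b = ["xx Bbb yy\n", "1234\n", "zz Xxx\n"]
for title, data in paired_lines(sample_b):
    print(title, "-->", data)
```

This only pads the final pair; if a line is missing in the middle of the file, every later pair would still be shifted by one.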

Thanks


Afonso


-- 

DaveA