[Tutor] Searching for word in text file

Kent Johnson kent37 at tds.net
Tue Jul 17 01:52:36 CEST 2007

Vladimir Strycek wrote:
> Hi all,
> I need a script which opens two text files, takes the first word from the
> first file, runs through the first words of the second file, and tells me
> whether it is there or not...
> What I have so far is:
> import re, string
> # Load the files to compare
> subor1 = open("snow.txt", "r")
> subor2 = open("anakin.txt", "r")
> def prehladaj(najdi):
>     for riadok1 in subor2.readlines():
>         z = riadok1.rsplit(" ")
>         if najdi in z[2]:
>             return z[3]
> def porovnaj():
>     for riadok in subor1.readlines():
>         # split the line into a list: index 2 holds the name, index 3 the version
>         x = riadok.rsplit(" ")
>         print x[2] + "    " + x[3]
>         print prehladaj(x[2])
> # print the table header
> print "     Server: snow                                  Server: anakin"
> print "--------------------------------------------------------------------"
> print "   Name     Version                                   Version"
> porovnaj()
> subor1.close()
> subor2.close()
> the snow.txt looks like:
>   B3693AA C.03.86.00     HP GlancePlus/UX for s800 11i
>   B3901BA B.11.11.14     HP C/ANSI C Developer's Bundle for HP-UX (S800)
>   B3913DB C.03.65        HP aC++ Compiler (S800)
>   B4967AA C.03.86.00     HP MeasureWare Server Agent for s800 11i
>   B5458DA C.01.18.04     HP-UX Runtime Environment for Java*
>   B5725AA B.3.5.89       HP-UX Installation Utilities (Ignite-UX)
> etc...
> anakin.txt is the same but with different versions of the programs... I'm
> not sure why my script doesn't work (it works only for the first one).
> What I basically need is to check whether the versions of the programs
> match in both text files. diff in Linux won't work for this because the
> same programs are not on the same lines...
> Any idea why my script is not working, or any suggestion for a different
> approach?

You don't say how it is failing, but one problem is that the
subor2.readlines() in prehladaj() will only work the first time: that
call reads to the end of the file, so every later call returns an empty
list. If you want to read the lines from the file again you will have to
rewind it with subor2.seek(0) or close and re-open it.
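
The effect is easy to reproduce. The following is a minimal sketch,
written for Python 3, that uses a throwaway temporary file with made-up
contents rather than anakin.txt. It also shows that rewinding with
seek(0) is an alternative to closing and re-opening the file:

```python
import os
import tempfile

# Create a small throwaway file; its contents are made up for illustration.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("first line\nsecond line\n")

with open(path, "r") as subor2:
    first_read = subor2.readlines()    # reads everything up to end of file
    second_read = subor2.readlines()   # position is already at EOF: nothing left
    subor2.seek(0)                     # rewind to the start of the file
    third_read = subor2.readlines()    # the lines are available again

os.remove(path)

print(len(first_read), len(second_read), len(third_read))  # prints: 2 0 2
```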

If the files are long this approach will be slow. Possibly a better way
to do this is to build a dict from the data in anakin.txt, though this
assumes that field 2 in anakin.txt is unique and that you want
exact-match searching, which is not what your program does (it uses a
substring test, najdi in z[2]). The keys can be field 2 (that you search
on) and the values field 3. For example,

subor2 = open("anakin.txt", "r")
d = {} # you can think of a better name, I'm sure
for riadok1 in subor2.readlines():
    z = riadok1.rsplit(" ")
    d[z[2]] = z[3]

Then the matching in prehladaj() is

    if najdi in d:
        return d[najdi]

with no looping over subor2 needed.
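
Put together, the dict approach might look like the sketch below. It is
written for Python 3 and is only one possible arrangement: the helper
names parse_line, load_versions and compare are hypothetical, and the
anakin lines (including the C.03.70 version) are invented for
illustration. It also swaps rsplit(" ") for split() with no argument,
which splits on any run of whitespace, so the name and version are
fields 0 and 1 instead of 2 and 3, and it skips lines that are too short
to hold both fields.

```python
def parse_line(line):
    """Return (name, version) from one listing line, or None if too short."""
    fields = line.split()  # split on any run of whitespace
    if len(fields) < 2:
        return None
    return fields[0], fields[1]


def load_versions(lines):
    """Build a dict mapping product name -> version."""
    versions = {}
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None:
            name, version = parsed
            versions[name] = version
    return versions


def compare(lines1, lines2):
    """Return (name, version1, version2) tuples; version2 is None if missing."""
    other = load_versions(lines2)  # built once, then looked up by exact key
    result = []
    for line in lines1:
        parsed = parse_line(line)
        if parsed is not None:
            name, version = parsed
            result.append((name, version, other.get(name)))
    return result


# Sample data: the snow lines come from the question, the anakin lines
# are invented for illustration.
snow = [
    "  B3693AA C.03.86.00     HP GlancePlus/UX for s800 11i",
    "  B3913DB C.03.65        HP aC++ Compiler (S800)",
]
anakin = [
    "  B3913DB C.03.70        HP aC++ Compiler (S800)",
    "  B3693AA C.03.86.00     HP GlancePlus/UX for s800 11i",
]

for name, v1, v2 in compare(snow, anakin):
    print(name, v1, v2)
```

With real files, the lists could be replaced by the open file objects,
since iterating over a file yields its lines one at a time.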

