[Tutor] Nested loop of I/O tasks
Christian Witts
cwitts at compuscan.co.za
Wed Nov 25 07:38:09 CET 2009
Bo Li wrote:
> Dear Python
>
> I am new to Python and having questions about its usage. Currently I
> have to read two .csv files INCT and INMRI which are similar to this
>
> INCT
> NONAME 121.57 34.71 14.81 1.35 0 0 1
> Cella 129.25 100.31 27.25 1.35 1 1 1
> Chiasm 130.3 98.49 26.05 1.35 1 1 1
> FMagnum 114.89 144.94 -15.74 1.35 1 1 1
> Iz 121.57 198.52 30.76 1.35 1 1 1
> LEAM 160.53 127.6 -1.14 1.35 1 1 1
> LEAM 55.2 124.66 12.32 1.35 1 1 1
> LPAF 180.67 128.26 -9.05 1.35 1 1 1
> LTM 77.44 124.17 15.95 1.35 1 1 1
> Leye 146.77 59.17 -2.63 1.35 1 0 0
> Nz 121.57 34.71 14.81 1.35 1 1 1
> Reye 91.04 57.59 6.98 1.35 0 1 0
>
>
> INMRI
> NONAME 121.57 34.71 14.81 1.35 0 0 1
> Cella 129.25 100.31 27.25 1.35 1 1 1
> Chiasm 130.3 98.49 26.05 1.35 1 1 1
> FMagnum 114.89 144.94 -15.74 1.35 1 1 1
> Iz 121.57 198.52 30.76 1.35 1 1 1
> LEAM 160.53 127.6 -1.14 1.35 1 1 1
> LEAM 55.2 124.66 12.32 1.35 1 1 1
> LPAF 180.67 128.26 -9.05 1.35 1 1 1
> LTM 77.44 124.17 15.95 1.35 1 1 1
> Leye 146.77 59.17 -2.63 1.35 1 0 0
>
>
> My job is to match the names in the two files and combine the first
> three attributes together. So far I have managed to read the two
> files, but when I tried to match the names using a nested loop,
> Python stops me after 1 iteration. Here is what I got so far.
>
> INCT = open(' *.csv')
> INMRI = open(' *.csv')
>
> for row in INCT:
>     name, x, y, z, a, b, c, d = row.split(",")
>     print aaa,
>     for row2 in INMRI:
>         NAME, X, Y, Z, A, B, C, D = row2.split(",")
>         if name == NAME:
>             print aaa
>
>
> The results are shown below
>
> "NONAME" "NONAME" "Cella " "NONAME" "Chiasm" "NONAME" "FMagnum"
> "NONAME" "Inion" "NONAME" "LEAM" "NONAME" "LTM" "NONAME" "Leye"
> "NONAME" "Nose" "NONAME" "Nz" "NONAME" "REAM" "NONAME" "RTM" "NONAME"
> "Reye" "Cella" "Chiasm" "FMagnum" "Iz" "LEAM" "LEAM" "LPAF" "LTM"
> "Leye" "Nz" "Reye"
>
>
> I was a MATLAB user and am really confused by what is happening. I
> wish someone could help me with this intro problem and perhaps
> suggest a convenient way to do pattern matching. Thanks!
>
What's happening is that you iterate over the first file and, on its
first line, start iterating over the second file. Once the second file
has been read through to the end it is exhausted, so on later
iterations over file 1 the inner loop over file 2 has nothing left to
read.
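A minimal sketch of that behaviour, using io.StringIO as a stand-in for
an open file (an assumption made only so the example is self-contained;
the two rows are shortened from the sample data above):

```python
import io

# StringIO stands in for an open file here; a real file object
# behaves the same way when it is iterated.
inmri = io.StringIO("NONAME,121.57\nCella,129.25\n")

first_pass = [line.split(",")[0] for line in inmri]
second_pass = [line.split(",")[0] for line in inmri]  # already at end of file

inmri.seek(0)  # rewind to the start so the lines can be read again
third_pass = [line.split(",")[0] for line in inmri]

print(first_pass)   # ['NONAME', 'Cella']
print(second_pass)  # []
print(third_pass)   # ['NONAME', 'Cella']
```

Calling seek(0) on the second file at the top of the outer loop would
let the original nested loop run, at the cost of re-reading the second
file once for every line of the first.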
If both files are sorted the same way, so that you know NONAME will be
on the same line in both files, what you can do is
INCT = open('something.csv', 'r')
INMRI = open('something_else.csv', 'r')
rec_INCT = INCT.readline()
rec_INMRI = INMRI.readline()
# readline() returns '' at end of file, which ends the loop
while rec_INCT and rec_INMRI:
    name, x, y, z, a, b, c, d = rec_INCT.split(',')
    NAME, X, Y, Z, A, B, C, D = rec_INMRI.split(',')
    if name == NAME:
        print('Matches')
    # advance both files by one line
    rec_INCT = INCT.readline()
    rec_INMRI = INMRI.readline()
INCT.close()
INMRI.close()
What will happen is that you open the files, read the first line of
each, and then enter the while loop. The loop only runs as long as both
the INCT and INMRI files have more lines to read; if either of them
runs out, the loop exits. Inside the loop it splits both lines and
checks whether the names match, at which point you can do your further
processing, and after that it reads another line from each file.
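The same lockstep walk can also be sketched with zip(), which stops as
soon as either input runs out, much like the while condition above.
This is only an illustration: the helper name matching_names is
hypothetical, io.StringIO stands in for the open files, and it assumes
Python 3, where zip() reads the files lazily.

```python
import io

def matching_names(inct, inmri):
    """Return the names that sit on the same line number in both inputs."""
    matches = []
    # zip() ends when the shorter input is exhausted.
    for rec_inct, rec_inmri in zip(inct, inmri):
        name = rec_inct.split(",")[0]
        other = rec_inmri.split(",")[0]
        if name == other:
            matches.append(name)
    return matches

inct = io.StringIO("NONAME,121.57\nCella,129.25\nNz,121.57\n")
inmri = io.StringIO("NONAME,121.57\nChiasm,130.3\n")
print(matching_names(inct, inmri))  # ['NONAME']
```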
Of course, if the files are not sorted the same way then you would have
to process them a little differently. If the files are small you can
use one of them to build a dictionary, with the `name` as the key and
the rest of the row as the value, and then iterate over the second file
checking whether each name is in the dictionary. That approach also
works for this scenario of perfectly matching data.
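A rough sketch of the dictionary approach follows. The function name
combine_by_name and the use of io.StringIO in place of real files are
assumptions for illustration, as is reading "the first three
attributes" as the x, y, z columns. Note that a name occurring twice in
the first file (LEAM in the sample data) keeps only its last row here.

```python
import io

def combine_by_name(inct, inmri):
    """Pair the x, y, z values of rows whose names occur in both inputs."""
    # Build a lookup table from the first input: name -> [x, y, z].
    # A repeated name overwrites the earlier row (limitation of this sketch).
    ct_rows = {}
    for row in inct:
        fields = row.strip().split(",")
        ct_rows[fields[0]] = fields[1:4]

    combined = []
    for row in inmri:
        fields = row.strip().split(",")
        name = fields[0]
        if name in ct_rows:
            combined.append((name, ct_rows[name], fields[1:4]))
    return combined

inct = io.StringIO(
    "Nz,121.57,34.71,14.81,1.35,1,1,1\n"
    "Cella,129.25,100.31,27.25,1.35,1,1,1\n"
)
inmri = io.StringIO(
    "Cella,129.25,100.31,27.25,1.35,1,1,1\n"
    "Leye,146.77,59.17,-2.63,1.35,1,0,0\n"
)
for name, ct_xyz, mri_xyz in combine_by_name(inct, inmri):
    print(name, ct_xyz, mri_xyz)
```

Because the lookup is by name, the order of the rows in either file
does not matter, and each file is read only once.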
Hope that helps.
--
Kind Regards,
Christian Witts