Reading data from 2 different files and writing to a single file

Dave Angel davea at
Mon Jan 28 16:10:48 CET 2013

On 01/28/2013 09:47 AM, inshu chauhan wrote:
>> Your current logic tries to scan through the first file, and for each line
>> that has 12 elements, scans through the entire second file.  It fails to
>> actually do it, because you never do a seek on the second file.
>> Now it appears your requirement is entirely different.  I believe you have
>> two text files each having the same number of lines.  You want to loop
>> through the pair of lines (once from each file, doing some kind of
>> processing and printing).  If that's the case, your nested loop is the
>> wrong thing, and you can forget my caveat about nesting file reads.
>> What you want is the zip() function
>> for l,s in zip(f1, f2):
>>      #you now have one line from each file,
>>      #   which you can then validate and process
>> Note, this assumes that when a line is "bad" from either file, you're
>> going to also ignore the corresponding line from the other.  If you have to
>> accommodate variable misses in the lining up, then your work is *much*
>> harder.
> Actually these are Arff files used in Weka (Data Mining ), So they have a
> certain amount of header information which is the same in both files(in
> same no. of lines too )  and both files have equal lines, So when I read
> basically In both files I am trying to ignore the Header information.
> then it is like reading first line from f1 and first line from f2,
> extracting the data I want from each file and simply write it to a third
> file line by line...
> What does actually Zip function do ?
> Thanks and Regards

That's  "zip"  not  "Zip"

Have you tried looking at the docs?  Or even typing help(zip) at the 
python interpreter prompt?

In rough terms, zip takes one element (line) from each of the iterators, 
and creates a new list that holds tuples of those elements.  If you use 
it in this form:

      for item1, item2 in zip(iter1, iter2):

then on the first pass through the loop, item1 will be the first item 
of iter1 and item2 will be the first item of iter2.  You then process 
them and loop around to get the next pair.  It stops when either 
iterator runs out of items.
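
As a rough illustration of that pairing and of the stop-at-the-shorter 
behavior, here is a small sketch.  The sample lists are made up for the 
example; with files you would pass the two open file objects instead.

```python
# Two iterables of unequal length (made-up sample data).
iter1 = ["a1", "a2", "a3"]
iter2 = ["b1", "b2"]

pairs = []
for item1, item2 in zip(iter1, iter2):
    # item1 and item2 come from the same position in each iterable.
    pairs.append((item1, item2))

# "a3" is never seen, because iter2 ran out after two items.
print(pairs)  # [('a1', 'b1'), ('a2', 'b2')]
```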
    gives me

as the first link.

In Python 2.x, zip() builds the whole list up front, so it will read the 
entire content of both files into memory.  If they are more than 100 MB 
or so, you might want to use itertools.izip() instead.  (In Python 3.x, 
zip does what izip does on Python 2.x.)
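
Putting the pieces together, something along these lines might work for 
the two-files-into-one case.  It is only a sketch under assumptions: the 
names merge_files, HEADER_LINES and lazy_zip, the file names, and the 
comma-joined output format are all hypothetical, and it assumes both 
files have the same fixed number of header lines.

```python
import itertools
import os
import tempfile

try:
    from itertools import izip as lazy_zip  # Python 2: lazy version of zip
except ImportError:
    lazy_zip = zip  # Python 3: zip is already lazy

HEADER_LINES = 2  # assumed header length, identical in both files


def merge_files(path1, path2, out_path, header_lines=HEADER_LINES):
    """Write one combined line per pair of data lines; return the pair count."""
    count = 0
    with open(path1) as f1, open(path2) as f2, open(out_path, "w") as out:
        # Skip the same number of header lines in each file.
        data1 = itertools.islice(f1, header_lines, None)
        data2 = itertools.islice(f2, header_lines, None)
        for line1, line2 in lazy_zip(data1, data2):
            # Hypothetical processing: join the two lines with a comma.
            out.write(line1.rstrip("\n") + "," + line2.rstrip("\n") + "\n")
            count += 1
    return count


# Tiny demonstration with throwaway files.
tmpdir = tempfile.mkdtemp()
p1 = os.path.join(tmpdir, "first.arff")
p2 = os.path.join(tmpdir, "second.arff")
p3 = os.path.join(tmpdir, "merged.txt")

with open(p1, "w") as f:
    f.write("@relation first\n@data\n1,2\n3,4\n")
with open(p2, "w") as f:
    f.write("@relation second\n@data\nx\ny\n")

count = merge_files(p1, p2, p3)
with open(p3) as f:
    merged = f.read()
```

Only one line from each file is held in memory at a time, so the size of 
the files should not matter.  Skipping a fixed number of lines is a 
simplification; a real header would more likely be skipped by reading up 
to the @data line.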

