[Tutor] merging 2 files.

Martin A. Brown martin at linux-ip.net
Thu Feb 24 11:41:50 CET 2011


Hi Nitin,

 : currently the data in both files is 6 - 10,000 rows max.

Many ways to skin this cat.  You say that the files are 6-10,000 
lines.  These are small files.  Load them into memory.  Learn how to 
use csv.reader.
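
If csv.reader is new to you, here is a minimal sketch of what it gives 
you.  The sample data is made up for illustration (it is not from your 
files), and io.StringIO just stands in for an open file:

```python
import csv
import io

# Hypothetical sample data standing in for a small CSV file.
sample = io.StringIO('id,name\n101,"Smith, J."\n102,Jones\n')

# csv.reader yields each line as a list of strings and takes care of
# quoted fields, so the comma inside "Smith, J." does not split the field.
rows = list(csv.reader(sample))
for row in rows:
    print(row)
```

Each row is a plain list, so row[0] is the first column, row[1] the 
second, and so on.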

 : PROBLEM : I need to pick the "first column" from test.csv AND 
 : SEARCH in jhun.csv "second column" , IF matches read that row 
 : from jhun.csv, break it into individual values , concat with the 
 : first file, test.csv, individual values and write to a third 
 : file, eg. merged2.csv

Always break your problem into its parts and examine your data.  
There's probably a data structure that suits your needs.  You have a 
lookup table, 'jhun.csv' (your second file).  Given your problem 
description, it seems like the second column in 'jhun.csv' has your 
unique identifiers.  

If that's accurate, then read that second file into some sort of 
in-memory lookup table.  A key, perhaps in a dictionary, would you 
say?

Then, you can simply read your other file (test.csv) and print to 
output.  This is one quick and dirty solution:

  import csv

  # -- build the lookup table, keyed on the second column of jhun.csv
  #
  lookup = dict()
  file0 = csv.reader(open('jhun.csv','r'))
  for row in file0:
      lookup[ row[1] ] = row

  # -- now, read through the other file (test.csv) and look up each row
  #
  file1 = csv.reader(open('test.csv','r'))
  for row in file1:
      exists = lookup.get( row[0], None )
      if exists:
          print(row, exists)  # -- print out only what you want
      else:
          pass  # -- do you need to do something if no lookup entry?

At 10^4 lines in the lookup file, you could easily do this in 
memory.
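
Since you want the result in a third file rather than printed, here is 
a rough sketch of the same idea using csv.writer.  The function name 
merge_files and its parameters are my own invention, and I am assuming 
you want each output line to be the test.csv values followed by the 
matching jhun.csv values; adjust to taste:

```python
import csv

def merge_files(lookup_path, source_path, out_path,
                lookup_col=1, source_col=0):
    """Join rows of source_path to rows of lookup_path, write to out_path.

    Returns the number of merged rows written.
    """
    # Build the in-memory lookup table from the lookup file.
    # If a key repeats, the last row seen for that key wins.
    lookup = {}
    with open(lookup_path, newline='') as f:
        for row in csv.reader(f):
            if len(row) > lookup_col:  # skip blank or short lines
                lookup[row[lookup_col]] = row

    written = 0
    with open(source_path, newline='') as src, \
         open(out_path, 'w', newline='') as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):
            if len(row) <= source_col:
                continue  # skip blank or short lines
            match = lookup.get(row[source_col])
            if match is not None:
                # Assumed layout: source values first, then lookup values.
                writer.writerow(row + match)
                written += 1
    return written
```

For your files that would be something like 
merge_files('jhun.csv', 'test.csv', 'merged2.csv').  Rows in test.csv 
with no match are silently skipped here; you may want to log them 
instead.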

There are many tools for dealing with structured data, even loosely 
structured data such as csv.  When faced with a problem like this in 
the future, ask yourself not only about what tools like csv.reader 
you may have at your disposal, but also what data structures are 
suited to the questions you are asking of your data.
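
As one example of picking the structure to fit the question: a plain 
dictionary keeps only one row per key.  If it turns out your 
identifiers can repeat in the lookup file (I don't know whether they 
do), a dictionary of lists keeps every match.  A sketch, with a 
made-up helper name and made-up rows:

```python
from collections import defaultdict

def build_multi_lookup(rows, key_col):
    """Group rows by the value in key_col, keeping every row for each key."""
    lookup = defaultdict(list)
    for row in rows:
        if len(row) > key_col:  # skip blank or short rows
            lookup[row[key_col]].append(row)
    return lookup

# Hypothetical rows, shaped the way csv.reader would produce them.
rows = [['x', 'a1', 'foo'], ['y', 'a2', 'bar'], ['z', 'a1', 'baz']]
lookup = build_multi_lookup(rows, key_col=1)
print(lookup['a1'])  # both rows whose second column is 'a1'
```

Use lookup.get(key) when searching, so that a missing key returns None 
rather than quietly adding an empty list to the table.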

 : I am in need of the solution as client breathing down my neck.

They always do.  Wear a scarf.

-Martin

-- 
Martin A. Brown
http://linux-ip.net/

