[Tutor] csv merge with different column title

Peter Otten __peter__ at web.de
Sat Apr 14 06:32:22 EDT 2018


Pareshkumar Panchal wrote:

> I am trying to merge two csv files with following condition.
> 
> filea.csv column( a1,a2,a3)
> fileb.csv column( b1,b2,b3)
> 
> i wanted to merge if a1 & b1 reference (as the title is different but
> contents are valid for comparison). 

If the column title is the problem you can rename it.

> also the result i need is
> a1,a2,a3,b1,b2 (not b3).

Then delete that column either before or after the merge.
 
> i am using openpyxl,csv modules at this moment.
> 
> any help appreciated.

Here's a demo using pandas:

>>> import pandas as pd

Read the original files:

>>> left = pd.read_csv("left.csv")
>>> left
  alpha   beta  gamma
0   foo    one    two
1   bar  three   four
2   baz   five    six
3   ham  seven  eight

[4 rows x 3 columns]
>>> right = pd.read_csv("right.csv")
>>> right
  delta epsilon   zeta
0   foo     the  quick
1   bar   brown    fox
2   baz   jumps   over
3  spam     the   lazy

[4 rows x 3 columns]

Merge by alpha/delta column:

>>> left.merge(right, left_on="alpha", right_on="delta", how="outer")
  alpha   beta  gamma delta epsilon   zeta
0   foo    one    two   foo     the  quick
1   bar  three   four   bar   brown    fox
2   baz   five    six   baz   jumps   over
3   ham  seven  eight   NaN     NaN    NaN
4   NaN    NaN    NaN  spam     the   lazy

[5 rows x 6 columns]
>>> both = _
>>> import sys

Write to file (or stream as I did for demonstration purposes):

>>> both.to_csv(sys.stdout)
,alpha,beta,gamma,delta,epsilon,zeta
0,foo,one,two,foo,the,quick
1,bar,three,four,bar,brown,fox
2,baz,five,six,baz,jumps,over
3,ham,seven,eight,,,
4,,,,spam,the,lazy

Delete a column you don't need (could have done that earlier):

>>> del both["zeta"]
>>> both.to_csv(sys.stdout)
,alpha,beta,gamma,delta,epsilon
0,foo,one,two,foo,the
1,bar,three,four,bar,brown
2,baz,five,six,baz,jumps
3,ham,seven,eight,,
4,,,,spam,the




More information about the Tutor mailing list