[Tutor] merging two csv files based on a common join

spir denis.spir at free.fr
Sat Nov 1 10:49:19 CET 2008


qsqgeekyogdty at tiscali.co.uk wrote:
    Hello again,
    Thanks for the replies on my previous post, but I have a different
    problem now and don't see how to deal with it in a smooth way.

    I have two csv files where:

    1.csv

    "1", "text", "aa"
    "2", "text2", "something else"
    "3", "text3", "something else"

    2.csv

    "text", "xx"
    "text", "yy"
    "text3", "zz"

    now I would like to have an output like:

    "1", "text", "aa"
    "1", "text", "xx"
    "2", "text2", "something else"
    "3", "text3", "something else"
    "3", "text3", "zz"

    I basically need to merge the two csv files based on the column-2

Two points are unclear:
-1- As Kent asks, what happened to "yy" up there? Did you mean to keep 
it?
-2- You do not seem to be merging the two files on column #2, but rather 
building a union of them. Right? The following outline applies only if 
this guess is correct; if not, just skip it. You need to add an index 
field to the second csv file, then build a union of both:

for record in csv2:
    <extract n in "textn" as index>
    <insert index as first field>
<build union = csv1 + csv2>
<sort union on index field>
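The outline above could be sketched in Python roughly as follows. This is only a sketch under assumptions: the sample data is embedded as strings instead of being read from 1.csv and 2.csv, "yy" is kept, and the helper names (read_rows, index_of, merge_union) are made up for illustration. It also assumes that a bare "text" stands for index 1 and "textN" for index N.

```python
import csv
import io
import re

# Sample data from the question, embedded so the sketch runs on its own.
CSV1 = '''"1", "text", "aa"
"2", "text2", "something else"
"3", "text3", "something else"
'''
CSV2 = '''"text", "xx"
"text", "yy"
"text3", "zz"
'''

def read_rows(text):
    # skipinitialspace=True copes with the blank after each comma.
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    return [row for row in reader if row]

def index_of(label):
    # Assumption: "text" means index 1, "textN" means index N.
    match = re.search(r"(\d+)$", label)
    return int(match.group(1)) if match else 1

def merge_union(rows1, rows2):
    # csv1 already carries the index as its first field.
    indexed1 = [[int(row[0])] + row[1:] for row in rows1]
    # For csv2, extract the index from the text field and insert it first.
    indexed2 = [[index_of(row[0])] + row for row in rows2]
    # sorted() is stable: csv1 rows stay ahead of csv2 rows with equal index.
    union = sorted(indexed1 + indexed2, key=lambda row: row[0])
    return [[str(row[0])] + row[1:] for row in union]

merged = merge_union(read_rows(CSV1), read_rows(CSV2))
for row in merged:
    print(row)
```

If "yy" really should be dropped, that would need an extra filtering rule which the question does not state.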

Possibly, all you really want is for the union to be /sorted/ on the 
"text" field, and the index only serves this purpose. If so, the index 
field may be kept anyway (but need not be output), as it will speed up 
the sort.
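If only the sort order matters, the index can serve as a temporary sort key and be dropped before output. A minimal sketch, assuming the rows are already parsed into lists and that the "text" field is the second-to-last column in both files (text_key is a made-up helper name):

```python
import re

rows1 = [["1", "text", "aa"],
         ["2", "text2", "something else"],
         ["3", "text3", "something else"]]
rows2 = [["text", "xx"], ["text", "yy"], ["text3", "zz"]]

def text_key(row):
    # Assumption: the "text" field is the second-to-last column.
    match = re.search(r"(\d+)$", row[-2])
    # A plain string sort would put "text10" before "text2";
    # comparing the numeric part avoids that.
    return int(match.group(1)) if match else 1

# Decorate each row with its key and original position, sort, undecorate.
# The position breaks ties, so rows with the same key keep their order.
decorated = [(text_key(row), position, row)
             for position, row in enumerate(rows1 + rows2)]
decorated.sort()
by_text = [row for _, _, row in decorated]
```

Here the index never appears in the output; it only drives the sort.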



