python compare and process a csv file

Chris Rebert clp2 at rebertia.com
Wed Nov 4 03:49:31 EST 2009


On Tue, Nov 3, 2009 at 7:43 AM, Siva Subramanian
<elpostssv at rediffmail.com> wrote:
> Hello all,
>
> I am new on this list and computer programming
>
> I have two distinct statistical files (both csv)
>
> 1.       Report_2_5 – this is a report dump containing over a 10 million records and is different every day
>
> 2.       Customer_id dump – this is a daily dump of customers who have made payments. This is generally a million record
>
> I need to extract past history depending on customers who make regular payments
>
> For example,
>
> Report_2_5
>
> Customer ID, Plan_NO, stat1, vol2, amount3
> 2134, Ins1, 10000, 20000, 10
> 2112, Ins3, 30000, 20000, 10
> 2121, Ins3, 30000, 20000, 10
> 2145, Ins2, 15000, 10000, 5
> 2245, Ins2, 15000, 10000, 5
> 0987, Ins1, 10000, 20000, 10
>
> 4546, Ins1, 10020, 21000, 10
>
> 6757, Ins1, 10200, 22000, 10
>>
> customer_id dump
>
>
> 0987
>
> 4546
>
> 6757
>
> 2134
>
> I need to process the Report_2_5 and extract the following output
>
> Stat1: 40220
> Vol2 : 83000
> Amount3 : 40
>
> I am new to programming and I have been extracting this data using MS – Access and I badly need a better solution.

Have you considered using a proper SQL database? (See
http://en.wikipedia.org/wiki/SQL ; MySQL is one example:
http://en.wikipedia.org/wiki/MySQL)
Mucking around with CSV files like this is basically doing the work of
some simple SQL queries, only in an ad-hoc, inefficient manner. (MS
Access is essentially a non-industrial-strength SQL for
non-programmers.)

Cheers,
Chris
--
http://blog.rebertia.com



More information about the Python-list mailing list