[Tutor] Another fine mess

Kent Johnson kent37 at tds.net
Thu Jun 1 04:22:49 CEST 2006


GNULinuxGeek wrote:
> All,
> 
> I posted some time back for another task.
> 
> Have a new task now.  You were all so kind with advice that I thought I 
> might check my thoughts on this.
> 
> Being an old guy and a newbie to Python is the double-whammy, but here goes:
> 
> I get a CSV file with ~26 fields and ~7000 records.  This data needs to 
> be parsed based on specific field values.

Do you know about the csv module? It will take care of all the parsing 
and turn each line into a list of values. You can filter these as you 
like and make a list of lists of row values. This can be sorted and 
otherwise processed. I don't think you will need a temp file; at ~7000 
records of ~26 fields, your data should easily fit in memory.
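
Here is a rough sketch of what I mean, not a finished program. The
sample data, the column names ("id", "site", "result") and the helper
name load_rows are made up for illustration; substitute your own file
and fields.

```python
import csv
import io

# Hypothetical sample standing in for the real ~26-field file.
SAMPLE = """id,site,result
3, east ,success
1,west,fail
2, east,fail
"""

def load_rows(source, field, wanted):
    """Return the header and the stripped rows whose `field` equals `wanted`."""
    reader = csv.reader(source)
    header = [name.strip() for name in next(reader)]
    index = header.index(field)
    rows = []
    for row in reader:
        # Strip extraneous whitespace from every value in the row.
        row = [value.strip() for value in row]
        # Skip blank lines, keep only the matching records.
        if row and row[index] == wanted:
            rows.append(row)
    return header, rows

header, rows = load_rows(io.StringIO(SAMPLE), "site", "east")
rows.sort(key=lambda row: int(row[0]))  # sort on the first column
print(header)
print(rows)
```

With a real file you would pass open(filename, newline="") instead of
the io.StringIO object. Everything stays in memory as a list of lists,
so there is no need to write a cleaned copy and re-open it.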

Kent

> 
> So, here is my "draft" set of steps that I think need to be performed.
> 
>    1. Select the "file" of data
>    2. Clean the data with "strip" to make sure there is no extraneous
>       whitespace
>    3. Write the file back to a temp file.
>    4. On the new file, select records based on the value of some of the
>       fields (like a "sort on" in a spreadsheet).
>    5. Parse to obtain the matching records
>    6. Do a little math based on how many records and the data in one
>       field (success vs. fail)
>    7. Output the results of the math so it can be used in a spreadsheet
>       to make cute graphics.
> 
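
Steps 6 and 7 could look something like the sketch below. It assumes
the outcome is in the last column and takes the literal values
"success" and "fail"; those values, the sample rows and the helper
name summarize are my guesses, not something from your data.

```python
import csv
import io
from collections import Counter

# Hypothetical filtered rows; the last column holds the outcome.
rows = [
    ["1", "east", "success"],
    ["2", "east", "fail"],
    ["3", "east", "success"],
    ["4", "east", "success"],
]

def summarize(rows, result_index=-1):
    """Count outcomes and compute the success rate for the given rows."""
    counts = Counter(row[result_index] for row in rows)
    total = len(rows)
    rate = counts["success"] / total if total else 0.0
    return {
        "total": total,
        "success": counts["success"],
        "fail": counts["fail"],
        "success_rate": rate,
    }

summary = summarize(rows)

# io.StringIO stands in for open("summary.csv", "w", newline="").
out = io.StringIO()
writer = csv.writer(out)
writer.writerow(summary.keys())    # header row for the spreadsheet
writer.writerow(summary.values())  # one row of results
print(out.getvalue())
```

The output is itself a small CSV file, so a spreadsheet can open it
directly for graphing.
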
> We have a Perl guy in the division, but I am trying to use Python only 
> to get this done.  My question is
> 
> "What is a good way to read and process the lines of the CSV file?" 
> Do I pull them in one line at a time and write them to an output file? 
> Do I read the whole shebang, clean the data and write an output file, 
> then re-open the cleaned file?
> 
> 
> Thanking you all prematurely,
> 
> Regards,
> 
> Ralph