CSV api with conversion

Hello!

My name is Denis Kolodin. I live in Tambov, Russia. I have spent a long time developing in C, Java, C# and R, but two months ago I started using Python. It's really cool. Now I'm moving ALL my projects to it, and I have some ideas about API extensions that may be useful.

The first thing I want to talk about is an extension of the CSV API. In the R language I could set a type for every column in a csv file. I propose adding a similar function to Python's standard library. Here it is (Python 3 version):

    import csv

    def reader2(csvfile, frame, delimiter=';', **fmtparams):
        reader = csv.reader(csvfile, delimiter=delimiter, **fmtparams)
        for row in reader:
            l = min(len(row), len(frame))
            yield [frame[idx](row[idx]) for idx in range(l)]

This is a generator function which converts every column to its associated type. In the frame argument you must pass a tuple/list of functions, which will be used to convert the values at the same positions in each row of the csv file. The frame looks like a list of types :) By default it uses ';' as the delimiter so that float values can be converted.

As a sample, say you have a csv file like this:

    Any spam...; 1; 2.0; 3

I've saved it to "sample.csv" :) If you use the reader function from the standard "csv" module, you get each row as a list of strings :(

    >>> reader = csv.reader(open("sample.csv"), delimiter=";")
    >>> print(next(reader))
    ['Any spam...', ' 1', ' 2.0', ' 3']

That's not bad in certain situations. But with the "reader2" function you can get a list whose items already have the necessary types.

Now you can work with the items without extra conversions. I think it would be good to add this function to the standard library. I've already used it many times. It could be useful for many people who work with csv files, and I suppose it conforms to the "batteries included" philosophy.

What do you think about this extension? Is it possible to add this function to the standard library, or to add the same behavior to the standard "reader" function in Python's "csv" module?

Best Regards,
Denis Kolodin
Russia, Tambov
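A minimal sketch of what such a call might look like. The frame (str, int, float, int) is an assumption chosen to match the sample row, and an in-memory io.StringIO stands in for "sample.csv" so that the snippet is self-contained; reader2 is repeated from the post.

```python
import csv
import io

def reader2(csvfile, frame, delimiter=';', **fmtparams):
    reader = csv.reader(csvfile, delimiter=delimiter, **fmtparams)
    for row in reader:
        l = min(len(row), len(frame))
        yield [frame[idx](row[idx]) for idx in range(l)]

# Stand-in for "sample.csv" (assumption: the single row shown in the post).
sample = io.StringIO("Any spam...; 1; 2.0; 3\n")

# Assumed frame: one converter per column, in column order.
frame = (str, int, float, int)

row = next(reader2(sample, frame))
print(row)  # ['Any spam...', 1, 2.0, 3]
```

The leading spaces in ' 1' and ' 2.0' are harmless here because int() and float() ignore surrounding whitespace.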

Hello list,

This API would be very useful. (I'm using Python right now to filter hundreds of spreadsheet records. Loving it.)

Suggestions:
1) Name the argument "converters" (it's an iterable);
2) Make it a positional argument.

Related wish: add an argument for a row factory. The default would be list, and use cases include tuple, a named tuple class, or any custom callable.

Adding converters and rowfactory would remove the need for looping over CSV reader objects and manually applying row and cell converters.

Cheers
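One way the two suggestions could fit together, as a rough sketch only. The function name typed_reader and the Record class are hypothetical; only the argument names converters and rowfactory come from the message above, and nothing like this exists in the csv module.

```python
import csv
import io
from collections import namedtuple

def typed_reader(csvfile, converters, rowfactory=list, **fmtparams):
    """Hypothetical helper: convert each cell, then build the row with rowfactory."""
    for row in csv.reader(csvfile, **fmtparams):
        # zip() stops at the shorter of the converters and the row.
        yield rowfactory(conv(cell) for conv, cell in zip(converters, row))

# Hypothetical named tuple class used as a row factory.
Record = namedtuple("Record", "label count ratio")

text = "spam;1;2.0\neggs;4;0.5\n"
converters = (str, int, float)

as_tuples = list(typed_reader(io.StringIO(text), converters,
                              rowfactory=tuple, delimiter=";"))
# Record._make builds a Record from any iterable.
as_records = list(typed_reader(io.StringIO(text), converters,
                               rowfactory=Record._make, delimiter=";"))

print(as_tuples)            # [('spam', 1, 2.0), ('eggs', 4, 0.5)]
print(as_records[1].count)  # 4
```

In this sketch converters is positional and rowfactory defaults to list, as suggested; any callable that accepts an iterable of converted cells can serve as the factory.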

If you do this, you'll probably want to support mapping empty values. That is, [int, str] could map

    1,
    ,2

to

    [1, None]
    [None, '2']

or

    [1, '']
    [0, '2']

I'm not sure what the defaults should be, but there are reasonable use cases for both.

--- Bruce
http://www.vroospeak.com

On Mon, Apr 12, 2010 at 3:24 AM, Éric Araujo <merwok@netwok.org> wrote:

Bruce Leban wrote:
The actual values read from the CSV file are strings and you're passing them to functions:

    int("1"), str("")
    int(""), str("2")

If you want it to return a default value instead of raising an exception on an empty field, then you should pass a conversion function which does that, for example:

    def int_or_none(field):
        if field.strip():
            return int(field)
        else:
            return None
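The same idea can be generalised so that each caller picks its own default. The sketch below is only an illustration: or_default and the converter names built from it are hypothetical, not an existing API. It reproduces both mappings mentioned earlier in the thread.

```python
def or_default(convert, default=None):
    """Hypothetical wrapper: apply convert, but return default for blank fields."""
    def wrapper(field):
        return convert(field) if field.strip() else default
    return wrapper

int_or_none = or_default(int)
int_or_zero = or_default(int, default=0)
str_or_none = or_default(str)

# The two rows "1," and ",2" as csv.reader would return them.
rows = [["1", ""], ["", "2"]]

with_none = [[int_or_none(a), str_or_none(b)] for a, b in rows]
with_zero = [[int_or_zero(a), str(b)] for a, b in rows]

print(with_none)  # [[1, None], [None, '2']]
print(with_zero)  # [[1, ''], [0, '2']]
```

Because the choice lives in the converter, the reader itself needs no special handling for empty fields.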

participants (4)
- Bruce Leban
- Denis Kolodin
- MRAB
- Éric Araujo