[Csv] Problems with CSV Module
Skip Montanaro
skip at pobox.com
Wed May 21 16:28:29 CEST 2003
Andreas> 1. Documentation:
Andreas> What's a row? (The word row means a list or a tuple.)
Andreas> How does DictReader & DictWriter work? Having a couple of examples would
Andreas> help ;-))
Thanks, I'll add a couple examples and better define row. DictReader works
pretty much like dict cursors in the various Python database packages,
returning a dictionary instead of a tuple for each row of data. Here's an
example of using csv.DictReader. This particular snippet parses CSV files
dumped by Checkpoint Software's Firewall-1 product.
    class fw1dialect(csv.Dialect):
        lineterminator = '\n'
        escapechar = '\\'
        skipinitialspace = False
        quotechar = '"'
        quoting = csv.QUOTE_ALL
        delimiter = ';'
        doublequote = True
    csv.register_dialect("fw1", fw1dialect)

    fieldnames = ("num;date;time;orig;type;action;alert;i/f_name;"
                  "i/f_dir;product;src;s_port;dst;service;proto;"
                  "rule;th_flags;message_info;icmp-type;icmp-code;"
                  "sys_msgs;cp_message;sys_message").split(';')

    rdr = csv.DictReader(f, fieldnames=fieldnames, dialect="fw1")
    for row in rdr:
        if row["num"] is None:
            continue
        nrows += 1
        if action is not None and row["action"] != action:
            continue
        source = row.get("src", "unknown")
        ...
Note that instead of returning a tuple for each row, a dictionary is
returned. Its keys are the elements of the fieldnames parameter of the
constructor.
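For anyone who wants something they can paste and run, here is a minimal, self-contained sketch of the same idea. It assumes Python 3; the sample data and the three field names are made up for illustration and are not real Firewall-1 output.

```python
import csv
import io

# Hypothetical sample data: three semicolon-separated records, the last one short.
sample = io.StringIO(
    '"1";"21May2003";"accept"\n'
    '"2";"21May2003";"drop"\n'
    '"3"\n'
)
fieldnames = ["num", "date", "action"]  # illustrative subset of field names

reader = csv.DictReader(sample, fieldnames=fieldnames, delimiter=";")
rows = list(reader)

for row in rows:
    # Each row is a mapping keyed by the field names given above.
    print(row["num"], row["action"])
```

Fields missing from a short record are filled with None, the default restval.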
Andreas> 2. Locale:
Andreas> The CSV module doesn't use locale. The default delimiter for Austria
Andreas> (+Germany) in Windows is a semicolon ';' not a comma ','.
Andreas> The result is that you can't import a list generated by csv.writer()
Andreas> into Excel without changing your regional settings or using
Andreas> csv.writer(delimiter=';').
Andreas> It would be nice if the CSV module would adapt to the language settings.
How can I get that from Python? Or do I have to know that if the locale is
de, the default Excel delimiter is a semicolon? What other locales have a
semicolon as the default? I suspect that if we have to enumerate them all,
it may not get done. Also, note that the
Andreas> This could be really simple to implement using the locale
Andreas> module. But I took a short look at the locale module and it
Andreas> seems like there is no way to get the list separator sign
Andreas> (probably it's not POSIX compliant).
That would make it difficult to do.
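One possible workaround, sketched below, is to guess the delimiter from the locale's decimal point, since that much is available from locale.localeconv(). This rests on an assumption: that Excel defaults to a semicolon wherever the comma is the decimal point. That is a heuristic, not a documented rule, and it will be wrong for some locales. Both function names are hypothetical, and the sketch assumes Python 3.

```python
import locale


def guess_excel_delimiter(decimal_point):
    """Heuristic only (an assumption): ';' where ',' is the decimal point."""
    return ";" if decimal_point == "," else ","


def delimiter_for_current_locale():
    """Apply the heuristic to whatever numeric locale is currently active."""
    # localeconv() reports the active locale's conventions; the caller is
    # expected to have called locale.setlocale() beforehand if desired.
    return guess_excel_delimiter(locale.localeconv()["decimal_point"])


print(delimiter_for_current_locale())
```

The result can then be passed as the delimiter argument to csv.writer() or csv.reader().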
Andreas> Another possibility would be to have a dialect like 'excel_ger'
Andreas> with the correct settings.
But what about all the other locales which must use a semicolon as the
default delimiter?
How about this in your code:
    class excel(csv.excel):
        delimiter = ';'
    csv.register_dialect("excel", excel)
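Here is a runnable sketch of that approach, assuming Python 3. It registers the subclass under a separate, made-up name ("excel-semicolon") rather than replacing "excel", which is only a choice made for this example so that the stock dialect stays available.

```python
import csv
import io


class excel_semicolon(csv.excel):
    """Excel conventions, but with ';' as the delimiter."""
    delimiter = ";"


# "excel-semicolon" is a name chosen for this example.
csv.register_dialect("excel-semicolon", excel_semicolon)

# Write one row; the field containing the delimiter is quoted automatically.
buf = io.StringIO()
csv.writer(buf, dialect="excel-semicolon").writerow([1, "a;b", 2.5])
line = buf.getvalue()

# Read it back with the same dialect.
parsed = list(csv.reader(io.StringIO(line), dialect="excel-semicolon"))
print(repr(line), parsed)
```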
Andreas> 3. There is no .close()
Note that the "file-like object" can be any object which supports the
iterator protocol, so it need not have a close() method. In the test code
we often use lists, e.g.:
    def test_read_with_blanks(self):
        reader = csv.DictReader(["1,2,abc,4,5,6\r\n","\r\n",
                                 "1,2,abc,4,5,6\r\n"],
                                fieldnames="1 2 3 4 5 6".split())
        self.assertEqual(reader.next(), {"1": '1', "2": '2', "3": 'abc',
                                         "4": '4', "5": '5', "6": '6'})
        self.assertEqual(reader.next(), {"1": '1', "2": '2', "3": 'abc',
                                         "4": '4', "5": '5', "6": '6'})
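To make the point about the iterator protocol concrete, here is a small sketch that feeds the readers a plain list and a generator, neither of which has a close() method. It assumes Python 3; the data and the generator name are made up.

```python
import csv


def lines():
    """Hypothetical line source: any iterable of strings will do."""
    yield "1,2,abc\r\n"
    yield "\r\n"          # blank line
    yield "4,5,def\r\n"


# A list of strings as input.
from_list = list(csv.reader(["1,2,abc\r\n", "4,5,def\r\n"]))

# A generator as input; DictReader skips the blank line.
from_gen = list(csv.DictReader(lines(), fieldnames=["a", "b", "c"]))

print(from_list)
print(from_gen)
```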
Andreas> f=file(FILE_CSV,'w')
Andreas> w=csv.writer(f,dialect='excel',delimiter=';')
Andreas> w.writerow((1,5,10,25,100,250,500,1000,1500))
Andreas> f.close()
Andreas> f=file(FILE_CSV,'r')
Andreas> r=csv.reader(file(FILE_CSV,'r'),dialect='excel',delimiter=';')
Andreas> print r.next()
Andreas> f.close()
Yes, this is what you'll have to do, though note that if you reuse f the
first call to f.close() is unnecessary.
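For comparison, here is the same write-then-read round trip as a sketch in Python 3, where open() replaces file() and a with block closes the file on exit. The temporary path is made up for the example.

```python
import csv
import os
import tempfile

# Hypothetical location for the example file.
path = os.path.join(tempfile.mkdtemp(), "example.csv")

# newline="" lets the csv module control line endings itself.
with open(path, "w", newline="") as f:
    w = csv.writer(f, dialect="excel", delimiter=";")
    w.writerow((1, 5, 10, 25, 100, 250, 500, 1000, 1500))

with open(path, newline="") as f:
    r = csv.reader(f, dialect="excel", delimiter=";")
    first_row = next(r)

print(first_row)
```

Values come back as strings; the reader does no type conversion under these settings.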
Andreas> 4. There is no .readrow()
Andreas> This should be just another name for .next(). It's more
Andreas> intuitive if you write a row via .writerow() and read it via
Andreas> .readrow().
I think we can probably squeeze this in.
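In the meantime, a thin wrapper gives the same effect in user code. The sketch below assumes Python 3; RowReader and readrow() are hypothetical names and not part of the csv module.

```python
import csv


class RowReader:
    """Hypothetical wrapper that adds readrow() on top of csv.reader."""

    def __init__(self, iterable, **kwargs):
        # Keyword arguments (dialect, delimiter, ...) go straight to csv.reader.
        self._reader = csv.reader(iterable, **kwargs)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._reader)

    def readrow(self):
        """Return the next row; just another name for advancing the iterator."""
        return next(self._reader)


rdr = RowReader(["a;b\r\n", "c;d\r\n"], delimiter=";")
first = rdr.readrow()
second = next(rdr)
print(first, second)
```

As with the underlying reader, readrow() raises StopIteration once the input is exhausted.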
Skip