better csv modules and where have object-craft gone?
Skip Montanaro
skip at pobox.com
Tue May 18 11:51:38 EDT 2004
Tim> I have been using Object Craft's csv module for quite a few
Tim> projects, mainly because I found the csv module in Python, in its
Tim> current incarnation, functionally inferior to Object Craft's. The
Tim> Object Craft module, for instance, let you build up CSV gradually
Tim> (i.e. a field at a time, rather than the Python csv module, where the
Tim> writer does the work a record at a time), which isn't always the way
Tim> I would like to work. Also, I have always had encoding problems
Tim> (specifically, it doesn't support Unicode, as per the docs) every
Tim> time I used the Python module, whereas the Object Craft one always
Tim> worked out of the box.
I guess beauty is in the eye of the beholder. The Object Craft folks were
key authors of what's in the Python distribution. If you want to write a
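field at a time, you should be able to wrap a csv.writer object in a small
class with writefield() and commit() methods. (csv.writer is a factory
function, not a class, so it can't be subclassed directly.) The first appends
to an internal list. The second calls writerow() and clears the list.
Something like this (untested) code might work:

    import csv

    class FieldWriter:
        def __init__(self, *args, **kwds):
            # csv.writer is a factory function, so wrap the object it returns
            self.writer = csv.writer(*args, **kwds)
            self.temp = []      # fields collected for the current row

        def writefield(self, val):
            self.temp.append(val)

        def commit(self):
            # write the collected fields as one row, then start a new row
            self.writer.writerow(self.temp)
            self.temp = []
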
(Be careful. Any fields added since the last commit() are lost unless you
write them out during cleanup, for example in a __del__ method.)
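
One way to avoid relying on __del__ is to make the wrapper a context manager, so leftover fields get written when the block exits. This is only a minimal sketch, assuming a Python version that has the with statement; BufferedFieldWriter and demo() are made-up names for illustration, not part of the csv module:

```python
import csv
import io


class BufferedFieldWriter:
    """Hypothetical wrapper: collects fields, writes them a row at a time."""

    def __init__(self, fileobj, **kwds):
        self.writer = csv.writer(fileobj, **kwds)
        self.temp = []

    def writefield(self, val):
        self.temp.append(val)

    def commit(self):
        # Write the pending fields as one row, then reset the buffer.
        self.writer.writerow(self.temp)
        self.temp = []

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Flush a partially built row so it is not silently dropped.
        if self.temp:
            self.commit()
        return False


def demo():
    out = io.StringIO()
    with BufferedFieldWriter(out) as w:
        w.writefield("a")
        w.writefield("b")
        w.commit()
        w.writefield("c")  # never committed explicitly
    return out.getvalue()
```

Whether a half-built row should be written or discarded on exit is a design choice; this sketch writes it.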
As for the lack of Unicode support, that's a known issue. I suppose it hasn't
been high enough on anyone's list of itches to have attracted any scratching
yet. Still, you might be able to get most of the way there with a wrapper
class (again untested):

    import csv

    class UnicodeWriter:
        def __init__(self, *args, **kwds):
            # pull out our own keyword before handing the rest to csv.writer
            self.encoding = kwds.pop('encoding', 'utf-8')
            self.writer = csv.writer(*args, **kwds)

        def writerow(self, row):
            # encode unicode fields to byte strings; leave other fields alone
            encoded = []
            for f in row:
                if isinstance(f, unicode):
                    f = f.encode(self.encoding)
                encoded.append(f)
            self.writer.writerow(encoded)

I'm almost certain that reading data in multibyte encodings won't work
though, as the low-level reader is byte-oriented instead of
character-oriented. Patches are welcome to resolve that deficiency.
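
For comparison, in later Python versions (3.x) the csv module works on text strings rather than bytes, so the decoding can happen before the reader ever sees the data. Here is a small sketch under that assumption; read_rows() and the sample data are made up for illustration:

```python
import csv
import io


def read_rows(raw_bytes, encoding="utf-8"):
    """Decode bytes first, then let csv.reader split the decoded text."""
    # newline="" leaves line-ending handling to the csv module.
    text = io.TextIOWrapper(io.BytesIO(raw_bytes), encoding=encoding, newline="")
    return list(csv.reader(text))


# Hypothetical sample: two rows of Japanese text encoded as Shift_JIS.
sample = "名前,都市\r\n太郎,東京\r\n".encode("shift_jis")
rows = read_rows(sample, encoding="shift_jis")
```

The point is only that the reader is handed characters, not bytes, so a multibyte sequence can't be split at a delimiter.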
Skip