CSV writer question

Chris Rebert clp2 at rebertia.com
Mon Oct 24 02:08:59 EDT 2011


On Sun, Oct 23, 2011 at 10:18 PM, Jason Swails <jason.swails at gmail.com> wrote:
> Hello,
>
> I have a question about a csv.writer instance.  I have a utility that I want
> to write a full CSV file from lots of data, but due to performance (and
> memory) considerations, there's no way I can write the data sequentially.
> Therefore, I write the data in chunks to temporary files, then combine them
> all at the end.  For convenience, I declare each writer instance via a
> statement like
>
> my_csv = csv.writer(open('temp.1.csv', 'wb'))
>
> so the open file object isn't bound to any explicit reference, and I don't
> know how to reference it inside the writer class (the documentation doesn't
> say, unless I've missed the obvious).  Thus, the only way I can think of to
> make sure that all of the data is written before I start copying these files
> sequentially into the final file is to unbuffer them so the above command is
> changed to
>
> my_csv = csv.writer(open('temp.1.csv', 'wb', 0))
>
> unless, of course, I add an explicit reference to track the open file object
> and manually close or flush it (but I'd like to avoid it if possible).

Why avoid it? Especially when the performance cost of unbuffered writes is
likely to be nontrivial...
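
For what it's worth, here is a rough sketch of the explicit-reference
approach. It assumes Python 3 (under Python 2 you would open with 'wb' as in
your snippet and drop the newline argument), and write_chunk and combine are
names made up purely for illustration, not anything from the csv module:

```python
import csv
import shutil


def write_chunk(path, rows):
    """Write rows to one temporary CSV file (hypothetical helper)."""
    # Binding the file to a name lets the with block close it on exit,
    # which flushes any buffered data out to the OS.
    with open(path, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerows(rows)


def combine(chunk_paths, final_path):
    """Concatenate the finished chunk files, in order, into final_path."""
    with open(final_path, 'w', newline='') as out:
        for chunk_path in chunk_paths:
            # newline='' on both sides leaves the csv line endings untouched.
            with open(chunk_path, newline='') as chunk:
                shutil.copyfileobj(chunk, out)
```

Every chunk file is closed before combine() opens it, so buffering stays on
and there is nothing to flush by hand.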

> Is there a way to do that directly via the CSV API,

Very doubtful; csv.writer (and csv.reader, for that matter) is implemented
in C, doesn't expose a ._file or similar attribute, and has no
.close() or .flush() methods.
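
If you would rather carry a single object around, one option is a thin
wrapper that holds both the file and the writer. This is only a sketch under
the same Python 3 assumption; ClosableCSVWriter is a hypothetical name, not
part of the csv API:

```python
import csv


class ClosableCSVWriter(object):
    """Hypothetical helper pairing a csv.writer with the file it writes to."""

    def __init__(self, path):
        # Keep our own reference to the file, since csv.writer exposes none.
        self._file = open(path, 'w', newline='')
        self._writer = csv.writer(self._file)

    def writerow(self, row):
        self._writer.writerow(row)

    def writerows(self, rows):
        self._writer.writerows(rows)

    def flush(self):
        # Push buffered data out to the OS without closing the file.
        self._file.flush()

    def close(self):
        self._file.close()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()
        return False
```

You would then write my_csv = ClosableCSVWriter('temp.1.csv') and call
my_csv.close() (or use it in a with statement) before combining the files.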

Cheers,
Chris
--
http://rebertia.com


