<font color="#000000">Hello,<br><br>I have a question about a csv.writer instance. I have a utility that I want to write a full CSV file from lots of data, but due to performance (and memory) considerations, there's no way I can write the data sequentially. Therefore, I write the data in chunks to temporary files, then combine them all at the end. For convenience, I declare each writer instance via a statement like<br>
<br>my_csv = csv.writer(open('temp.1.csv', 'wb'))<br><br>so the open file object isn't bound to any explicit reference, and I don't know how to reference it inside the writer class (the documentation doesn't say, unless I've missed the obvious). Thus, the only way I can think of to make sure that all of the data is written before I start copying these files sequentially into the final file is to unbuffer them so the above command is changed to<br>
<br>my_csv = csv.writer(open('temp.1.csv', 'wb', 0))<br><br>unless, of course, I add an explicit reference to track the open file object and manually close or flush it (but I'd like to avoid that if possible). My question is two-fold. First, is there a way to do that directly via the CSV API, or is the approach I'm taking the only way without binding the open file object to another reference? Second, if these files are potentially very large (anywhere from ~1 KB to 20 GB depending on the amount of data present), what kind of performance hit will I be looking at by disabling buffering on these types of files?<br>
<br>Tips, answers, comments, and/or suggestions are all welcome.<br><br>Thanks a lot!<br>Jason</font><br><br>As an afterthought, I suppose I could always subclass the csv.writer class and add the reference I want to that, which I may do if there's no other convenient solution.<br>
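<br>A minimal sketch of that last idea, with some assumptions spelled out: csv.writer is a factory function rather than a class, so it can't be subclassed directly, and this wraps it instead. The names ChunkWriter and demo are hypothetical, and the sketch assumes Python 3, where the file is opened in text mode with newline='' instead of 'wb'.<br>

```python
import csv
import os
import tempfile


class ChunkWriter(object):
    """Hypothetical helper: pairs a csv.writer with the file it writes to."""

    def __init__(self, path):
        # Python 3 form; under Python 2 this would be open(path, 'wb').
        self._file = open(path, 'w', newline='')
        self._writer = csv.writer(self._file)

    def writerow(self, row):
        self._writer.writerow(row)

    def writerows(self, rows):
        self._writer.writerows(rows)

    def flush(self):
        # Push buffered rows to the OS without closing the file.
        self._file.flush()

    def close(self):
        # Closing flushes any remaining buffered data.
        self._file.close()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()
        return False  # do not suppress exceptions


def demo():
    """Write one temporary chunk, close it, and read it back."""
    path = os.path.join(tempfile.mkdtemp(), 'temp.1.csv')
    with ChunkWriter(path) as my_csv:
        my_csv.writerow(['a', 1])
        my_csv.writerows([['b', 2], ['c', 3]])
    # The file is closed here, so everything written is on disk.
    with open(path, newline='') as f:
        return list(csv.reader(f))
```

<br>With something like this the files could stay buffered, and each chunk would be closed (and therefore flushed) before the final copy step starts.<br>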