
On Tue, Jul 21, 2015 at 12:44 PM, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
> On Mon, Jul 20, 2015 at 10:10 PM, Steve Dower <Steve.Dower@microsoft.com> wrote:
>> If text formatting is your bottleneck, congratulations on fixing your network, disk, RAM and probably your users.
>
> Thank you, but one of my servers just spent 18 hours loading 10GB of XML data into a database. Given that CPU was loaded 100% all this time, I suspect neither network nor disk and not even RAM was the bottleneck. Since XML parsing was done by C code and only formatting of database INSERT instructions was done in Python, I strongly suspect string formatting had a sizable carbon footprint in this case.
>
> Not all string formatting is done for human consumption.

Well-known rule of optimization: Measure, don't assume. There could be something completely different that's affecting your performance. I'd be impressed and extremely surprised if the formatting of INSERT queries took longer than the execution of those same queries, but even if that is the case, it could be the XML parsing (just because it's in C doesn't mean it's inherently faster than any Python code), or the database itself, or suboptimal paging of virtual memory. Before pointing fingers anywhere, measure. Measure. Measure!

ChrisA
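
To make "measure" concrete, here is a minimal sketch of how one might compare the formatting step with the execution step using the standard library's cProfile. Everything in it is an assumption for illustration: the `items` table, the `format_insert` and `load_rows` helpers, and the in-memory SQLite database are hypothetical stand-ins, not the actual loader discussed above.

```python
import cProfile
import io
import pstats
import sqlite3


def format_insert(row):
    # Hypothetical stand-in for the Python-side formatting step.
    # (Interpolating values into SQL mirrors the scenario in the thread;
    # parameterized queries are generally the safer choice.)
    return "INSERT INTO items (id, name) VALUES ({}, '{}')".format(row[0], row[1])


def load_rows(conn, rows):
    # Format each statement in Python, then hand it to the database.
    for row in rows:
        conn.execute(format_insert(row))
    conn.commit()


def profile_load(n_rows=10000):
    # Synthetic rows and an in-memory database, purely for illustration.
    rows = [(i, "name{}".format(i)) for i in range(n_rows)]
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER, name TEXT)")

    # Profile only the load itself, not the setup.
    profiler = cProfile.Profile()
    profiler.enable()
    load_rows(conn, rows)
    profiler.disable()

    count = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
    conn.close()

    # Render the ten most expensive entries by cumulative time.
    out = io.StringIO()
    pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(10)
    return count, out.getvalue()


if __name__ == "__main__":
    count, report = profile_load()
    print("rows inserted:", count)
    print(report)
```

The report lists `format_insert` and the database's `execute` call on separate lines, so their cumulative times can be compared directly. The numbers from a toy in-memory run say nothing about the real 10GB load; the same approach would have to be pointed at the actual loader.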