How to find bad row with db api executemany()?
Roy Smith
roy at panix.com
Fri Mar 29 21:19:22 EDT 2013
In article <mailman.3977.1364605026.2939.python-list at python.org>,
Chris Angelico <rosuav at gmail.com> wrote:
> On Sat, Mar 30, 2013 at 11:41 AM, Roy Smith <roy at panix.com> wrote:
> > In article <mailman.3971.1364595940.2939.python-list at python.org>,
> > Dennis Lee Bieber <wlfraed at ix.netcom.com> wrote:
> >
> >> If using MySQLdb, there isn't all that much difference... MySQLdb is
> >> still compatible with MySQL v4 (and maybe even v3), and since those
> >> versions don't have "prepared statements", .executemany() essentially
> >> turns into something that creates a newline delimited "list" of
> >> "identical" (but for argument substitution) statements and submits that
> >> to MySQL.
> >
> > Shockingly, that does appear to be the case. I had thought during my
> > initial testing that I was seeing far greater throughput, but as I got
> > more into the project and started doing some side-by-side comparisons,
> > the differences went away.
>
> How much are you doing per transaction? The two extremes (everything
> in one transaction, or each line in its own transaction) are probably
> the worst for performance. See what happens if you pepper the code
> with 'begin' and 'commit' statements (maybe every thousand or ten
> thousand rows) to see if performance improves.
>
> ChrisA
We're doing it all in one transaction, on purpose. We start with an
initial dump, then get updates about once a day. We want to make sure
that the updates either complete without errors, or back out cleanly.
If we ever had a partial daily update, the result would be a mess.
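If the bulk insert ever does blow up, something like this (untested sketch;
the table name, columns, and sample rows are made up) is roughly what I had
in mind for backing out cleanly and then finding the bad row, which was the
original question:

import MySQLdb

conn = MySQLdb.connect(host="localhost", user="me", passwd="secret", db="test")
cur = conn.cursor()

sql = "INSERT INTO daily_update (id, value) VALUES (%s, %s)"
rows = [(1, "a"), (2, "b")]   # in real life, the day's update from the dump

try:
    # One executemany() in a single transaction: the whole daily update
    # lands, or none of it does.
    cur.executemany(sql, rows)
    conn.commit()
except MySQLdb.Error:
    conn.rollback()
    # Fall back to row-at-a-time execute() to locate the offending row.
    # Much slower, but it tells you exactly which row failed and why.
    for i, row in enumerate(rows):
        try:
            cur.execute(sql, row)
        except MySQLdb.Error as e:
            print("bad row %d: %r (%s)" % (i, row, e))
            break
    conn.rollback()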
Hmmm, on the other hand, I could probably try doing the initial dump the
way you describe. If it fails, we can just delete the whole thing and
start again.
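Something along these lines, maybe (again untested, reusing the connection
and rows from the sketch above; the batch size is a guess), committing every
few thousand rows the way you suggest:

BATCH = 10000   # commit every ten thousand rows

sql = "INSERT INTO initial_dump (id, value) VALUES (%s, %s)"
for start in range(0, len(rows), BATCH):
    cur.executemany(sql, rows[start:start + BATCH])
    conn.commit()
# If a batch fails partway through, we just drop the table and rerun the
# whole initial load, so the partial commits don't hurt us.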