How to find bad row with db api executemany()?
Dave Angel
davea at davea.name
Fri Mar 29 14:53:30 EDT 2013
On 03/29/2013 10:48 AM, Roy Smith wrote:
> I'm inserting a gazillion rows into a MySQL database using MySQLdb and cursor.executemany() for efficiency. Every once in a while, I get a row which violates some kind of database constraint and raises Error.
>
> I can catch the exception, but don't see any way to tell which row caused the problem. Is this information obtainable, short of retrying each row one by one?
>
I don't know the direct answer, or even whether there is one (a way to
get MySQL to tell you which row failed), but ...
This all assumes that executemany() is much cheaper than a million
separate execute() calls.
-- single bad row --
If you have a million items, and you know exactly one is different, you
can narrow it down more quickly than just sequencing through them. You
can do half of them at a time, carefully choosing which subset of the
total you use each time. After 20 such calls (2**20 is just over a
million), you can then calculate exactly which one is different.
Standard CS algorithm.
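One way to read the halving idea above is as a plain binary search over the row range. Here is a minimal sketch of that reading. It assumes you can write a probe, here the hypothetical batch_ok(), that tries a batch and reports success or failure without leaving anything behind; with MySQLdb that would presumably mean calling cursor.executemany() inside try/except and rolling back, which only works on transactional tables. The names find_single_bad_row and demo_batch_ok are made up for illustration.

```python
def find_single_bad_row(rows, batch_ok):
    """Return the index of the single bad row in `rows`.

    `batch_ok` is a hypothetical callable: it takes a list of rows and
    returns True if inserting that whole batch would succeed.
    Assumes exactly one row in `rows` is bad.
    """
    lo, hi = 0, len(rows)  # invariant: the bad row is in rows[lo:hi]
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if batch_ok(rows[lo:mid]):
            lo = mid  # first half is clean, so the bad row is in the second
        else:
            hi = mid  # first half failed, so the bad row is in it
    return lo


# Stand-in probe for demonstration only: a row of None counts as "bad".
def demo_batch_ok(batch):
    return all(row is not None for row in batch)


rows = list(range(1000))
rows[637] = None
bad_index = find_single_bad_row(rows, demo_batch_ok)
print(bad_index)  # 637
```

Each pass through the loop costs one probe, so a million rows need at most 20 probes. If more than one row is bad, this still returns a bad row, but only one of them.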
-- sparse set of rows --
If you know that it's at least one, but still less than a dozen or so,
it's a little trickier, but you should still converge on a final list
pretty quickly. Each time you do half, you also do the complementary
half. If either of them has no "differences", you can then eliminate
half the cases.
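The sparse case can be sketched as a recursive split: probe a batch, and if it fails, probe both halves and keep descending only into the halves that fail. This is one possible implementation, not the only one. As before, batch_ok() is a hypothetical side-effect-free probe (something that tries the batch and rolls it back), and find_bad_rows and demo_batch_ok are names invented for the example.

```python
def find_bad_rows(rows, batch_ok, offset=0):
    """Return the sorted indexes of all bad rows in `rows`.

    `batch_ok` is a hypothetical callable: it takes a list of rows and
    returns True if inserting that whole batch would succeed.
    `offset` is the position of rows[0] in the original full list.
    """
    if not rows or batch_ok(rows):
        return []  # nothing bad in this batch, eliminate it entirely
    if len(rows) == 1:
        return [offset]  # a failing batch of one row is the culprit
    mid = len(rows) // 2
    # The batch failed, so check each half and its complement in turn.
    return (find_bad_rows(rows[:mid], batch_ok, offset)
            + find_bad_rows(rows[mid:], batch_ok, offset + mid))


# Stand-in probe for demonstration only: a row of None counts as "bad".
def demo_batch_ok(batch):
    return all(row is not None for row in batch)


rows = list(range(1000))
for i in (3, 500, 501, 999):
    rows[i] = None
bad_indexes = find_bad_rows(rows, demo_batch_ok)
print(bad_indexes)  # [3, 500, 501, 999]
```

With k bad rows out of n, the number of probes grows roughly like k times log2(n), so this stays cheap as long as the bad rows really are sparse. If a large fraction of rows is bad, going row by row is no worse.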
If you don't get a specific answer where MySQL can tell you the bad row,
and if you don't know what I'm talking about, ask and I'll try to
elaborate on one of the two above cases.
--
DaveA