PEP 249 (database api) -- executemany() with iterable?
martin at address-in-sig.invalid
Thu Oct 14 14:25:02 CEST 2010
On Thu, 14 Oct 2010 16:36:34 +1300, Lawrence D'Oliveiro wrote:
> In message <4cb5e659$0$1650$742ec2ed at news.sonic.net>, John Nagle wrote:
>> Also note that there are some issues with doing a huge volume of
>> updates in one MySQL InnoDB transaction. The system has to keep the
>> data needed to undo the updates, and there's a limit on the amount of
>> pending transaction history that can be stored.
> How does “load data” avoid this? Is that not a transaction too?
Not usually. It's faster because there's no journalling overhead. The
loader takes out an exclusive table lock, dumps the data into the table,
rebuilds the indexes and releases the lock. I can't comment on MySQL
(I don't use it), but this has been the case on the RDBMSes I have used.
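On the thread's subject line: PEP 249 only promises that executemany()
takes a "sequence of sequences", but some drivers are more liberal. The
stdlib sqlite3 module, for one, accepts any iterable of parameter sets,
so you can stream rows from a generator without materialising a huge
list first (table and column names below are made up for illustration;
check whether your own driver accepts a bare iterator):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (id INTEGER, value REAL)")

def rows():
    # A generator: rows are produced lazily, one parameter tuple
    # at a time, rather than building a 1000-element list up front.
    for i in range(1000):
        yield (i, i * 0.5)

# sqlite3's executemany() consumes any iterable of parameter sets.
conn.executemany("INSERT INTO readings VALUES (?, ?)", rows())
conn.commit()
print(conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0])  # 1000
```

Drivers that insist on a real sequence will need list(rows()) instead,
at the cost of holding everything in memory.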
> Seems to me this isn’t going to help, since both old and new tables are
> on the same disk, after all. And it’s the disk access that’s the
> bottleneck.
There's a lot of overhead in journalling - much more than in applying
changes to a table. The before and after images *must* be flushed to disk
on commit. In UNIX terms fsync() must be called on the journal file(s)
and this is an expensive operation on every OS: committing a series of
small transactions can force the same disk block to be written several
times. However, the table pages can safely be left in the DBMS
cache and flushed as part of normal cache operation since, after a crash,
the table changes can always be recovered from a journal roll-forward. A
good DBMS will do that automatically when it's restarted.
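The practical upshot for DB-API code is to commit in batches: each
commit forces the journal to disk, so one commit per row pays the
fsync() price a row at a time, while one enormous transaction runs into
the undo-log limits mentioned above. A middle road is to chunk the
iterable and commit per chunk. A minimal sketch (chunked() and the
batch size of 1000 are my own invention, not anything from PEP 249):

```python
import itertools
import sqlite3

def chunked(iterable, size):
    """Yield lists of up to `size` items from any iterable."""
    it = iter(iterable)
    while True:
        chunk = list(itertools.islice(it, size))
        if not chunk:
            return
        yield chunk

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (n INTEGER)")

rows = ((i,) for i in range(2500))
for batch in chunked(rows, 1000):
    conn.executemany("INSERT INTO t VALUES (?)", batch)
    conn.commit()   # one journal flush per 1000 rows, not per row

print(conn.execute("SELECT COUNT(*) FROM t").fetchone()[0])  # 2500
```

The right batch size is workload-dependent; the point is only that the
number of commits, not the number of rows, governs the journalling cost.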
martin@ | Martin Gregorie
gregorie. | Essex, UK