PEP 249 (database api) -- executemany() with iterable?

Lawrence D'Oliveiro ldo at geek-central.gen.new_zealand
Sat Oct 16 18:27:17 EDT 2010


In message <i96squ$aqj$1 at localhost.localdomain>, Martin Gregorie wrote:

> On Thu, 14 Oct 2010 16:36:34 +1300, Lawrence D'Oliveiro wrote:
> 
>> In message <4cb5e659$0$1650$742ec2ed at news.sonic.net>, John Nagle wrote:
>> 
>>>      Also note that there are some issues with doing a huge volume of
>>> updates in one MySQL InnoDB transaction.  The system has to keep the
>>> data needed to undo the updates, and there's a limit on the amount of
>>> pending transaction history that can be stored.
>> 
>> How does “load data” avoid this? Is that not a transaction too?
>>
> Not usually. Its faster because there's no journalling overhead.

So what happens if there’s an error part way? Do you end up with some 
records inserted and others not?

> The loader takes out an exclusive table lock, dumps the data into the
> table, rebuilds indexes and releases the lock.

That’s not acceptable in a situation where other processes are concurrently 
making queries on the table.

>> Seems to me this isn’t going to help, since both old and new tables are
>> on the same disk, after all. And it’s the disk access that’s the
>> bottleneck.
>>
> There's a lot of overhead in journalling - much more than in applying
> changes to a table. The before and after images *must* be flushed to disk
> on commit. In UNIX terms fsync() must be called on the journal file(s)
> and this is an expensive operation on all OSes because committing a
> series of small transactions can cause the same disk block to be written
> several times. However, the table pages can safely be left in the DBMS
> cache and flushed as part of normal cache operation since, after a crash,
> the table changes can always be recovered from a journal roll-forward. A
> good DBMS will do that automatically when its restarted.

And how does this address the point of whether to load the new records into 
a new table or the same table?



More information about the Python-list mailing list