sqlite INSERT performance
duncan smith
buzzard at urubu.freeserve.co.uk
Thu May 31 11:25:10 EDT 2012
On 31/05/12 06:15, John Nagle wrote:
> On 5/30/2012 6:57 PM, duncan smith wrote:
>> Hello,
>> I have been attempting to speed up some code by using an sqlite
>> database, but I'm not getting the performance gains I expected.
>
> SQLite is a "lite" database. It's good for data that's read a
> lot and not changed much. It's good for small data files. It's
> so-so for large database loads. It's terrible for a heavy load of
> simultaneous updates from multiple processes.
>
Once the table is created the data will not be changed at all.
Corresponding integer codes will have to be generated for the column
values. (I want to do this lazily, because some columns might never be
needed for output files, and processing all columns was relatively
expensive in my initial solution.) After that it's a series of 'SELECT
a, b, ... FROM table WHERE f="g" ORDER BY a, b, ...' style queries
dumped to space-separated text files.
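For concreteness, the dump step looks roughly like this; the table and
column names are made up, but the pattern is the one I'm using:

import sqlite3

def dump_query(db_path, out_path, g):
    # "data" and columns a, b, f are placeholders for the real schema.
    con = sqlite3.connect(db_path)
    try:
        # Parameterised filter; the ORDER BY does the sorting that
        # used to be a separate step.
        cur = con.execute(
            "SELECT a, b FROM data WHERE f = ? ORDER BY a, b", (g,))
        with open(out_path, "w") as out:
            # One space-separated line per result row.
            for row in cur:
                out.write(" ".join(str(v) for v in row) + "\n")
    finally:
        con.close()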
> However, wrapping the inserts into a transaction with BEGIN
> and COMMIT may help.
>
Unfortunately there's no discernible difference.
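For reference, the batched insert I tried looks roughly like this (the
table name and column count stand in for the real ones):

import sqlite3

def bulk_insert(db_path, rows, ncols=67):
    # rows is an iterable of ncols-tuples; "data" is a placeholder.
    con = sqlite3.connect(db_path)
    placeholders = ", ".join(["?"] * ncols)
    try:
        with con:  # a single BEGIN ... COMMIT around the whole batch
            con.executemany(
                "INSERT INTO data VALUES (%s)" % placeholders, rows)
    finally:
        con.close()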
> If you have 67 columns in a table, you may be approaching the
> problem incorrectly.
>
Quite possibly. The start and end points are fixed, though: the data
are contained in text files, and I need to map the values to integer
codes and generate output files for subsets of variables, conditional
on the levels of other variables. (I was doing the subsequent sorting
separately, but if I'm using SQL I guess I might as well fold it into
the query.) The output files are inputs for other (C++) code that I
have no control over. Any approach that doesn't consume large amounts
of memory will do. Cheers.
Duncan
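P.S. In case it makes the coding step concrete, here is a rough sketch
of the sort of lazy value-to-code mapping I have in mind (all names
illustrative):

def make_coder():
    codes = {}  # column name -> {value: integer code}
    def code(column, value):
        # A column's code dict is only created the first time the
        # column is used; each new distinct value gets the next code.
        col = codes.setdefault(column, {})
        return col.setdefault(value, len(col))
    return code

coder = make_coder()
coder("f", "g")   # 0 the first time "g" is seen in column f
coder("f", "h")   # 1
coder("f", "g")   # 0 again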