sqlite INSERT performance
Jon Clements
joncle at googlemail.com
Thu May 31 12:06:49 EDT 2012
On Thursday, 31 May 2012 16:25:10 UTC+1, duncan smith wrote:
> On 31/05/12 06:15, John Nagle wrote:
> > On 5/30/2012 6:57 PM, duncan smith wrote:
> >> Hello,
> >> I have been attempting to speed up some code by using an sqlite
> >> database, but I'm not getting the performance gains I expected.
> >
> > SQLite is a "lite" database. It's good for data that's read a
> > lot and not changed much. It's good for small data files. It's
> > so-so for large database loads. It's terrible for a heavy load of
> > simultaneous updates from multiple processes.
> >
>
> Once the table is created the data will not be changed at all.
> Corresponding integer codes will have to be generated for columns. (I
> want to do this lazily because some columns might never be needed for
> output files, and processing all columns was relatively expensive for my
> initial solution.) After that it's a series of 'SELECT a, b, ... FROM
> table WHERE f="g" ORDER by a, b, ...' style queries dumped to space
> separated text files.
>
> > However, wrapping the inserts into a transaction with BEGIN
> > and COMMIT may help.
> >
>
> Unfortunately there's no discernible difference.
>
> > If you have 67 columns in a table, you may be approaching the
> > problem incorrectly.
> >
>
> Quite possibly. I have defined start and end points. The data are
> contained in text files. I need to do the mapping to integer codes and
> generate output files for subsets of variables conditional on the levels
> of other variables. (I was doing the subsequent sorting separately, but
> if I'm using SQL I guess I might as well include that in the query.) The
> output files are inputs for other (C++) code that I have no control over.
>
> Any approach that doesn't consume large amounts of memory will do. Cheers.
>
> Duncan
It might be worth checking out https://sdm.lbl.gov/fastbit/ which has Python bindings (nb: the library itself takes a while to compile), but I'm not I00% sure it would meet all your requirements.
Jon
More information about the Python-list
mailing list