[Twisted-Python] How to correctly run Sqlite with Twisted?

Hello,

I'm using the sqlite3 module through Twisted's enterprise.adbapi. I create the ConnectionPool instance like this:

    self.dbpool = ConnectionPool("sqlite3", db_filename, check_same_thread=False)

You can see the code at https://repos.goffi.org/sat/file/tip/src/memory/sqlite.py

Sometimes a write fails with the following exception:

    Failure instance: Traceback: <class 'sqlite3.OperationalError'>: database is locked

So I wonder if I am using the database correctly; has anybody experienced something similar with Twisted and SQLite? Should I just increase the timeout as advised at https://stackoverflow.com/a/8618328? That looks more like a workaround than a clean solution.

The Python 2 documentation doesn't mention the check_same_thread argument, but the Python 3 documentation at https://docs.python.org/3.5/library/sqlite3.html#sqlite3.connect says that write operations should be serialized by the user (I thought that was the default, as stated at https://sqlite.org/threadsafe.html). How should I achieve that?

Also, PRAGMAs are not working (especially "PRAGMA foreign_keys = ON"), I guess because of multi-threading. What is the right way to activate foreign_keys for all following requests?

Thanks in advance
Goffi
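[Editorial note: a quick stdlib illustration of why the PRAGMA appears not to stick. PRAGMA foreign_keys is per-connection state in SQLite, so setting it on one connection of a multi-connection pool leaves the others at the default:]

```python
import sqlite3

# Two independent connections, as a connection pool would hold.
a = sqlite3.connect(":memory:")
b = sqlite3.connect(":memory:")

a.execute("PRAGMA foreign_keys = ON")

# Only connection `a` is affected; `b` keeps SQLite's default (off).
print(a.execute("PRAGMA foreign_keys").fetchone()[0])  # 1
print(b.execute("PRAGMA foreign_keys").fetchone()[0])  # 0
```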

The best answer is probably https://github.com/alex/alchimia

On Tue, Sep 19, 2017 at 1:59 PM Goffi <goffi@goffi.org> wrote:

Aloha Goffi

I'm using minimal code like the one below for MySQL interaction, should be
easy to transition this to SQLite:

    from twisted.internet import reactor, task, defer
    from store import Store

    #debugging func, printing the result on the console
    def _transformResult(result):
        ## print '####', result
        if result:
            return result[0]  #unpack the list
        else:
            return None

    class Somestore(object):
        name = "somestore"

        def __init__(self, dbapiName, **params):
            self.store = Store(dbapiName, **params)
            ka = task.LoopingCall(self.keepAlive)  #db keepalive
            ka.start(307)

        def dbdisconn(self, reason):
            print 'db disconnected for ', reason

        def keepAlive(self):
            try:
                d = self.store.runQuery('SELECT 1')
                d.addErrback(self.dbdisconn)
            except:
                pass
            else:
                pass  #do whatever MUST occur here in all cases

        def getTableCount(self):
            d = self.store.runQuery('SELECT tables FROM user WHERE servername = %s', 'total')
            d.addCallback(_transformResult)
            d.addErrback(self.dbdisconn)
            return d

        def setUserCount(self, waiting, playing, tables):
            d = self.store.runOperation('UPDATE user SET waiting = %s, playing = %s, tables = %s WHERE servername = %s',
                                        waiting, playing, tables, self.loggername)
            d.addErrback(self.dbdisconn)
            return d

The module store.py, used with canned queries like the above:

    from itertools import izip
    from twisted.enterprise import adbapi

    class Store(object):
        def __init__(self, dbapiName, **params):
            self.__pool = adbapi.ConnectionPool(dbapiName, **params)
            print self.__pool.__getstate__()
            self.runOperation('SET autocommit = %s', 1)

        def runQuery(self, query, *args):
            d = self.__pool.runInteraction(self.mapQuery, query, args)
            return d

        def runInsert(self, query, *args):
            def mapQ(curs, query, *args):
                try:
                    curs.execute(query, *args)
                except adbapi.ConnectionLost:
                    print
                    print '++++++++++++ rerunning query'
                    print
                    curs.execute(query, *args)  #simply resend query, assuming cp_reconnect=True
                    return {'lastrowid': -1}
                return {'lastrowid': curs.lastrowid}
            d = self.__pool.runInteraction(mapQ, query, args)
            return d

        def mapQuery(self, curs, query, *args):
            try:
                curs.execute(query, *args)
            except adbapi.ConnectionLost:
                curs.execute(query, *args)  #simply resend query, assuming cp_reconnect=True
            result = curs.fetchall()
            columns = [d[0] for d in curs.description]
            return [dict(izip(columns, r)) for r in result]

        def runOperation(self, query, *args):
            d = self.__pool.runOperation(query, args)
            return d

On 9/19/2017 10:59 AM, Goffi wrote:
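[Editorial note: one caveat when transitioning the code above from MySQL to SQLite: MySQLdb uses the '%s' (format) paramstyle, while the stdlib sqlite3 module uses '?' (qmark), so the canned queries need their placeholders changed. A minimal check; the table and column names are just the ones from the example above:]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (servername TEXT, tables INTEGER)")
conn.execute("INSERT INTO user VALUES (?, ?)", ("total", 3))  # qmark style

row = conn.execute(
    "SELECT tables FROM user WHERE servername = ?", ("total",)
).fetchone()
print(row[0])  # 3
```

Note also that `SET autocommit = %s` in Store.__init__ is MySQL-specific and has no SQLite equivalent; sqlite3 controls transaction behavior via the connection's isolation_level instead.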

On Tue, Sep 19, 2017 at 4:59 PM, Goffi <goffi@goffi.org> wrote:
"Correctly" has so many shades. Here's another possible direction for exploration: https://github.com/twisted/axiom Jean-Paul http://as.ynchrono.us/

On 2017-09-19, at 21:59, Goffi <goffi@goffi.org> wrote:
I would subscribe to Moshe's and Jean-Paul's suggestions of using higher-level tools to handle database interactions. If, however, you'd rather continue with the adbapi approach, here are a few tips that may help your diagnosis, given that the behavior you are observing seems to result from multi-threaded/multi-process interactions:

- Confirm you have no other SQLite client at play, locking your data somehow.
- Correct serialization of independent SQL queries/transactions may be a complex topic.
- But you can try forcing the ConnectionPool thread count to 1 via the cp_min/cp_max init args.
- For your PRAGMA statements you should probably use the cp_openfun ConnectionPool init arg.
- Using the undocumented "check_same_thread" on Python 2 seems pointless. (Not sure it is, though; I did not look at the underlying Python 2 stdlib code.)

Lastly: I have experienced issues with SQLite3 + Python on one occasion in the past. Depending on the platform, SQLite itself is or is not distributed with Python, IIRC; for diagnostics, confirm the exact version of the sqlite .SO / .DLL in use and, ideally, its compile-time options. I recall my troubles were between CentOS 5 vs. 6, due to the specific SQLite versions included in each distribution.

A final thought, from the realm of intuition: if I were targeting heavily concurrent access to a database (subjective, yes), I'd consider replacing SQLite with some other engine. SQLite is a great tool for many purposes, but concurrent access is not its strong point (see https://www.sqlite.org/lockingv3.html).

Cheers,
--
exvito

On 2017-09-20 10:24, ex vito wrote:
Thanks for all your answers! I'll check every option before taking a decision, but I'm quite seduced by Alchimia, the SQLAlchemy wrapper. I'm a bit worried because it has not seen any commit in 11 months (I have had bad experiences with unmaintained libraries). This said, the wrapper is thin, so I can probably maintain it myself if needed. As a bonus, it would allow me to offer other storage backends than SQLite.

As for the use case: the database holds mainly cache and various data for an XMPP client which can handle several users at the same time, and only one backend is accessing it. There is already an automatic update mechanism that I would have to adapt, but it may simplify everything in the end. Also I need to check whether Full Text Search and the JSON data type are easily usable, but it seems that plain SQL requests are still possible at worst.

Thanks again for your answers, it's really helpful!
Goffi

participants (6)
- ex vito
- goffi
- Goffi
- Jean-Paul Calderone
- Moshe Zadka
- Werner Thie