[Tutor] How should my code handle db connections? Should my db manager module use OOP?
Martin A. Brown
martin at linux-ip.net
Thu Aug 27 06:42:17 CEST 2015
Hi there,
> My ongoing project will be centered around an SQLite db.
Not a bad way to start. There are many possible ways to access SQL
DBs. I'll talk about one of my favorites, since I'm a big fan of
sqlalchemy [0], which provides a broad useful toolkit for dealing
with SQL DBs and an abstraction layer.
To start, often the question is why any such abstraction tool, given
the additional complexity of a module, a.k.a. another layer of code?
Briefly, my main two reasons:
A) abstraction of data model from SQL implementation for the
Python program (allows switching from SQLite another DBAPI,
e.g. postgres, later with a minimum effort)
B) somebody has already implemented the tricky bits, such as ORMs
(see below), failover, connection pooling (see below) and
other DB-specific features
> Since almost all data needed by the program will be stored in this
> db, my thought is that I should create a connection to this db
> shortly after program startup and keep this connection open until
> program closure.
That is one possible approach. But, consider using a "connection
pooling" technique that somebody else has already implemented and
tested. This saves your time for working on the logic of your
program.
There are many different pooling strategies, which include things
like "Use only one connection at a time." or "Connect on demand." or
"Hold a bunch of connections open and let me use one when I need
one, and I'll release it when I'm done." and even "When the
connection fails, retry quietly in the background until a successful
connection can be re-established."
> I am assuming that opening and closing a db connection has enough
> overhead that I should only do this once. But I do not *know*
> that this is true. Is it? If not, then the alternative would
> make more sense, i.e., open and close the db as needed.
Measure, measure, measure. Profile it before coming to such a
conclusion. You may be correct, but, it behooves you to measure.
(My take on an old computing adage: Premature optimization can lead
you down unnecessarily painful or time consuming paths.)
N.B. Only you (or your development cohort) can anticipate the load
on the DB, the growth of records (i.e. data set size), the growth of
the complexity of the project, or the user count. So, even if the
measurements tell you one thing, be sure to consider the longer-term
plan for the data and application.
Also, see Steven D'Aprano's comments about concurrency and other
ACIDic concerns.
> In the first iteration of my project, my intent is to create and
> populate the db with tables external to the program. The program
> will only add entries to tables, query the db, etc. That is, the
> structure of the db will be pre-set outside of the program, and
> the program will only deal with data interactions with the db.
If the structure of the DB is determined outside the program,
this sounds like a great reason to use an Object Relational
Modeler (ORM). An ORM which supports reflection (sqlalchemy
does) can create Pythonic objects for you.
> My intent is to make the overall design of the program OO, but I
> am wondering how to handle the db manager module. Should I go OO
> here as well? With each pertinent method handling a very specific
> means of interacting with the db? Or go a procedural route with
> functions similar to the aforementioned methods? It is not clear
> to me that OOP provides a real benefit here, but, then again, I am
> learning how to OOP during this project as well, so I don't have
> enough knowledge yet to realistically answer this question.
I'm not sure I can weigh in intelligently here (OOP v. procedural),
but I'd guess that you could get that Object-Oriented feel by taking
advantage of an ORM, rather than writing one yourself. Getting used
to the idea of an ORM can be tricky, but if you can get reflection
working [1], I think you will be surprised at how quickly your
application logic (at the business layer) comes together and you can
(mostly) stop worrying about things like connection logic and SQL
statements executing from your Python program [2].
There probably are a few people on this list who have used
sqlalchemy and are competent to answer it, but if you have questions
specifically about sqlalchemy, you might find better answers on
their mailing list [3].
Now, back to the beginnings...a SQLite DB is a fine place to start
if you have only one thread/user/program accessing the data at any
time. Don't host it on a network(ed) file system if you have the
choice. If your application grows so much in usage or volume that
it needs a new and different DB, consider it all a success and
migrate accordingly.
Best of luck,
-Martin
[0] http://www.sqlalchemy.org/
[1] http://docs.sqlalchemy.org/en/rel_1_0/core/reflection.html
[2] Here, naturally, I'm assuming that you know your way around
SQL, since you are asserting that the DB already exists, is
maintained and designed outside of the Python program.
[3] https://groups.google.com/forum/#!forum/sqlalchemy
--
Martin A. Brown
http://linux-ip.net/
More information about the Tutor
mailing list