[Tutor] File vs. Database (possible off topic)
Steven D'Aprano
steve at pearwood.info
Tue Nov 22 14:13:43 CET 2011
Ken G. wrote:
> It occurred to me last week while reviewing the files I made in using
> Python, it could be somewhat similar to a database.
>
> What would be the difference between Python files and Python databases?
> Granted, the way they are created is different, but I really don't see
> any difference in the format of a file and a database.
A database is essentially a powerful managed service built on top of one
or more files. There's nothing you can do with a database that you can't
do with a big set of (say) Windows-style INI files and a whole lot of
code to manage them. A database does all the management for you,
handling all the complexity, data integrity, and security, so that you
don't have to re-invent the wheel. Since database software tends to be
big and complicated, there is a lot of wheel to be re-invented.
To be worthy of the name "database", the service must abide by the ACID
principles:
Atomicity
---------
The "all or nothing" principle. Every transaction must either completely
succeed, or else not make any changes at all. For example, if you wish
to transfer $100 from account A to account B, it must be impossible for
the money to be removed from A unless it is put into B. Either both
operations succeed, or neither.
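As a minimal sketch of the transfer example, assuming Python's standard sqlite3 module and an in-memory database: the "account" table, its CHECK rule and the transfer() helper are made up for illustration. The credit to B is deliberately done first, so that when the debit from A fails, the rollback visibly undoes the credit as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: a balance may never go below zero.
conn.execute(
    "CREATE TABLE account ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 150), ("B", 20)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one all-or-nothing transaction."""
    try:
        # "with conn" commits if the block succeeds and rolls back
        # if an exception escapes from it.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit broke the CHECK rule, so the credit was undone too.
        return False


def balances(conn):
    """Return the current balances as a dict keyed by account name."""
    return dict(conn.execute("SELECT name, balance FROM account ORDER BY name"))


print(transfer(conn, "A", "B", 100), balances(conn))  # succeeds: A has 150
print(transfer(conn, "A", "B", 100), balances(conn))  # fails: A only has 50
```

After the failed second call the balances are the same as before it, even though the credit to B had already been executed inside the transaction.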
Consistency
-----------
Any operation performed by the database must always leave the system in
a consistent state at the end of the operation. For example, a database
might have a table of "Money Received" containing $2, $3, $5, $1 and $2,
and another field "Total" containing $13, and a rule that the Total is
the sum of the Money Received. It must be impossible for an operation to
leave the database in an inconsistent state by adding $5 to the Money
Received table without increasing Total to $18.
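One way a database can enforce a rule like this is with a trigger. The following is only a sketch, assuming Python's standard sqlite3 module; the table names, the trigger name and the helper functions are invented for the example, and it covers inserts only (updates and deletes would need triggers of their own).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE money_received (amount INTEGER NOT NULL);
    CREATE TABLE summary (total INTEGER NOT NULL);
    INSERT INTO summary VALUES (0);

    -- Hypothetical rule: every insert also bumps the stored total.
    CREATE TRIGGER keep_total AFTER INSERT ON money_received
    BEGIN
        UPDATE summary SET total = total + NEW.amount;
    END;
""")

with conn:
    conn.executemany(
        "INSERT INTO money_received VALUES (?)",
        [(2,), (3,), (5,), (1,), (2,)],
    )


def total(conn):
    """Return the stored Total field."""
    return conn.execute("SELECT total FROM summary").fetchall()[0][0]


def is_consistent(conn):
    """Check that the stored total equals the sum of the rows."""
    summed = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM money_received"
    ).fetchall()[0][0]
    return summed == total(conn)


print(total(conn), is_consistent(conn))   # 13 True

with conn:
    conn.execute("INSERT INTO money_received VALUES (5)")

print(total(conn), is_consistent(conn))   # 18 True
```

The program adding the $5 never touches Total itself; the database keeps the two in step.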
Isolation
---------
Two transactions must always be independent. If two transactions
attempt to update a field at the same time, the database must make the
result the same as if they had run one after the other, as the effect
would otherwise be unpredictable.
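A rough illustration, assuming Python's standard sqlite3 module and SQLite's default locking behaviour: two connections to the same database file stand in for two users, and the "counter" table and isolation_demo() function are hypothetical. Other databases resolve the conflict differently (for example by making the second writer wait), so treat this as one possible behaviour, not the rule.

```python
import os
import sqlite3
import tempfile


def isolation_demo():
    """Return (blocked, value_seen_before_commit, value_seen_after_commit)."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "demo.db")
        # isolation_level=None: we issue BEGIN and COMMIT ourselves.
        # timeout=0: fail at once instead of waiting for a lock.
        first = sqlite3.connect(path, timeout=0, isolation_level=None)
        second = sqlite3.connect(path, timeout=0, isolation_level=None)
        try:
            first.execute("CREATE TABLE counter (value INTEGER)")
            first.execute("INSERT INTO counter VALUES (0)")

            # The first transaction takes the write lock and updates the field.
            first.execute("BEGIN IMMEDIATE")
            first.execute("UPDATE counter SET value = value + 1")

            # A second writer is refused while the first is still in progress.
            try:
                second.execute("BEGIN IMMEDIATE")
                second.execute("ROLLBACK")
                blocked = False
            except sqlite3.OperationalError:
                blocked = True

            # A reader sees only committed data, not the half-finished update.
            before = second.execute("SELECT value FROM counter").fetchall()[0][0]
            first.execute("COMMIT")
            after = second.execute("SELECT value FROM counter").fetchall()[0][0]
        finally:
            first.close()
            second.close()
    return blocked, before, after


blocked, before, after = isolation_demo()
print(blocked, before, after)
```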
Durability
----------
Once a transaction is committed, it must remain committed, even if the
system crashes or the power goes out. Once data is written to disk,
nothing short of corruption of the underlying bits on the disk should be
able to hurt the database.
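A real crash is hard to stage in a few lines, so this sketch only approximates one: assuming Python's standard sqlite3 module, it closes the connection with one change committed and one not, then reopens the file. The "ledger" table and durability_demo() function are invented for the example.

```python
import os
import sqlite3
import tempfile


def durability_demo():
    """Return the notes that survive closing and reopening the database."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "ledger.db")

        conn = sqlite3.connect(path)
        conn.execute("CREATE TABLE ledger (note TEXT)")
        conn.execute("INSERT INTO ledger VALUES ('committed')")
        conn.commit()  # from here on, this row must survive

        conn.execute("INSERT INTO ledger VALUES ('not committed')")
        conn.close()   # closed without commit: the pending insert is discarded

        reopened = sqlite3.connect(path)
        try:
            return [row[0] for row in reopened.execute("SELECT note FROM ledger")]
        finally:
            reopened.close()


print(durability_demo())  # ['committed']
```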
Note that in practice, these four ACID principles may be weakened
slightly, or a lot, for the sake of speed, convenience, laziness, or
merely by incompetence. Generally speaking, for any program (not just
databases!) the rule is:
"Fast, correct, simple... pick any two."
so the smaller, faster, lightweight databases tend to be not quite as
bullet-proof as the big, heavyweight databases.
Modern databases also generally provide an almost (but not quite)
standard interface for the user, namely the SQL query language.
Almost any decent relational database will understand SQL. For example,
this command:
    SELECT * FROM Book WHERE price > 100.00 ORDER BY title;
is SQL to:
* search the database for entries in the Book table
* choose the ones where the price of the book is greater than $100
* sort the results by the book title
* and return the entire record (all fields) for each book
So, broadly speaking, if you learn SQL, you can drive most databases, at
least well enough to get by.
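To try the SELECT command above from Python, here is a small sketch using the standard sqlite3 module. Only the Book table name and its price and title columns come from the query; the author column and the sample rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets us read fields by column name

# Hypothetical table layout and sample data.
conn.execute("CREATE TABLE Book (title TEXT, author TEXT, price REAL)")
conn.executemany(
    "INSERT INTO Book VALUES (?, ?, ?)",
    [
        ("Zebra Atlas", "A. Author", 120.00),
        ("Cheap Thrills", "B. Writer", 9.99),
        ("Big Reference", "C. Editor", 250.50),
        ("Exactly One Hundred", "D. Scribe", 100.00),
    ],
)
conn.commit()

rows = conn.execute(
    "SELECT * FROM Book WHERE price > 100.00 ORDER BY title;"
).fetchall()

titles = [row["title"] for row in rows]
print(titles)  # ['Big Reference', 'Zebra Atlas']

for row in rows:
    # Each row carries every field of the record, as SELECT * asks for.
    print(row["title"], row["author"], row["price"])
```

The book priced at exactly $100.00 is left out, because the query asks for prices strictly greater than 100.00.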
--
Steven