Which non SQL Database ?
Deadly Dirk
dirk at pfln.invalid
Sun Jan 23 01:15:38 EST 2011
On Sat, 04 Dec 2010 16:42:36 -0600, Jorge Biquez wrote:
> Hello all.
>
> Newbie question. Sorry.
>
> As part of my process to learn python I am working on two personal
> applications. Both will do fine with a simple structure of data
> stored in files. I know there are a lot of databases around I can use
> but I would like to know your advice on what other options you would
> consider for the job (it is training so no pressure on performance). One
> application will run as a desktop one, under Windows, Linux, Macintosh,
> being able to update data, not much, not complex, not many records. The
> second application, running behind web pages, will do the same, I mean,
> process simple data, updating and showing data. Not much info, not
> complex. As an exercise it is more than enough I guess and will let me
> learn what I need for now. Talking with a friend about what he would do
> (he uses C only) he suggested taking a look at the dBase file format
> since it is a stable format, fast and the index structure will be fine,
> or maybe going with the BD (Berkeley) database file format (I hope I
> understood this one correctly). Plain files are not an option since I
> would like to have the option to do rapid searches.
>
> What would you suggest I take a look at? If possible available under
> the 3 platforms.
>
> Thanks in advance for your comments.
>
> Jorge Biquez
Well, two NoSQL databases that I have some experience with are MongoDB
and CouchDB. The choice between them depends on your application. CouchDB
is extremely simple to set up. It is all about the web interface: as a
matter of fact, it communicates with the outside world over HTTP,
returning JSON objects, and you can configure it using curl. It is also
extremely fast, but it doesn't allow you to run ad hoc queries. You have
to create something called a "view", which is akin to what people in the
RDBMS world call a "materialized view". Views are created by running a
JavaScript function on every document in the database. The results are
stored in a B-tree index and then updated as documents are inserted,
updated or deleted. CouchDB is completely schema free: there are no
tables, collections or "shards". The primary language for programming
Couch is JavaScript.
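To make the "view" idea a bit more concrete, here is a rough pure-Python
sketch of the mechanism. It is not CouchDB's API (real views are
JavaScript functions stored in the database); the names View and
map_by_author are made up for illustration, and a sorted list stands in
for the B-tree.

```python
import bisect


def map_by_author(doc):
    """Hypothetical map function: emit (key, value) pairs for one document."""
    if "author" in doc:
        yield doc["author"], doc["title"]


class View:
    """Toy stand-in for a view index: sorted rows, updated per document."""

    def __init__(self, map_fn):
        self.map_fn = map_fn
        self.rows = []  # kept sorted as (key, doc_id, value)

    def delete(self, doc_id):
        # Drop every row that was emitted for this document.
        self.rows = [row for row in self.rows if row[1] != doc_id]

    def update(self, doc_id, doc):
        # Re-run the map function for this one document only.
        self.delete(doc_id)
        for key, value in self.map_fn(doc):
            bisect.insort(self.rows, (key, doc_id, value))

    def query(self, key):
        # Binary search for the first row with this key, then scan forward.
        start = bisect.bisect_left(self.rows, (key,))
        values = []
        for row_key, _doc_id, value in self.rows[start:]:
            if row_key != key:
                break
            values.append(value)
        return values
```

The point is that the query never looks at the documents themselves, only
at the precomputed index, which is why lookups are fast but every kind of
query needs its own view.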
Much the same applies to MongoDB, which is equally fast but does allow ad
hoc queries and has quite a few options for running them. It allows you
to do the same kind of querying as RDBMS software, with the exception of
joins. No joins. It also allows map/reduce queries using JavaScript and
is not completely schema free. Databases have sub-objects called
"collections", which can be indexed, partitioned across several machines
("sharding", an excellent thing for building shared-nothing clusters)
and aggregated using JavaScript in the style of Google's map/reduce.
Scripting languages like Python are very well supported through native
drivers, which tends to be faster than communicating over HTTP. I find
MongoDB well suited for what is traditionally known as data warehousing.
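For anyone who has not met the two query styles before, here is a small
pure-Python sketch of the ideas, using a list of dicts in place of a
collection. This is not the MongoDB driver API; find and map_reduce are
hypothetical helpers written only to show the shape of an ad hoc query
and of a map/reduce aggregation.

```python
from collections import defaultdict


def find(docs, criteria):
    """Ad hoc query: return documents whose fields equal the criteria."""
    return [
        doc for doc in docs
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


def map_reduce(docs, map_fn, reduce_fn):
    """Group the values emitted by map_fn per key, then reduce each group."""
    groups = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            groups[key].append(value)
    return {key: reduce_fn(key, values) for key, values in groups.items()}


def emit_total(doc):
    # Map step: one (customer, total) pair per order document.
    yield doc["customer"], doc["total"]


def sum_totals(_key, values):
    # Reduce step: add up all totals emitted for one customer.
    return sum(values)
```

With a list of order documents, find(orders, {"customer": "ann"}) picks
out matching documents with no index prepared in advance, while
map_reduce(orders, emit_total, sum_totals) gives one total per customer.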
Of course, traditional RDBMS specimens like MySQL, PostgreSQL, Firebird,
Oracle, MS SQL Server or DB2 still reign supreme, and most of the MVC
tools like Django or TurboGears are made for RDBMS schemas and can read
things like primary and foreign keys and build them into the application.
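Since the original question was about small data with rapid searches on
all three platforms, here is a minimal sketch of the RDBMS route using
the sqlite3 module from Python's standard library. The table and column
names (author, note) and the helper notes_by are invented for the
example, and it uses an in-memory database so nothing is written to disk.

```python
import sqlite3

# ":memory:" keeps the example self-contained; pass a file name instead
# to get a persistent database file.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default

conn.executescript("""
    CREATE TABLE author (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE note (
        id        INTEGER PRIMARY KEY,
        author_id INTEGER NOT NULL REFERENCES author(id),
        body      TEXT NOT NULL
    );
    -- The index is what makes lookups by author fast.
    CREATE INDEX note_author_idx ON note(author_id);
""")

cur = conn.execute("INSERT INTO author (name) VALUES (?)", ("jorge",))
author_id = cur.lastrowid
conn.executemany(
    "INSERT INTO note (author_id, body) VALUES (?, ?)",
    [(author_id, "first"), (author_id, "second")],
)
conn.commit()


def notes_by(name):
    """Return the note bodies for one author, joined through the foreign key."""
    rows = conn.execute(
        "SELECT note.body FROM note "
        "JOIN author ON author.id = note.author_id "
        "WHERE author.name = ? ORDER BY note.id",
        (name,),
    ).fetchall()
    return [row[0] for row in rows]
```

The join in notes_by is exactly the kind of query the document stores
above do not give you.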
In short, there is no universal answer to your question. If price is a
consideration, Couch, Mongo, MySQL, PostgreSQL, Firebird and SQLite 3
all cost about the same: $0. You will have to learn significantly less
to get started with a NoSQL database, but if you need to create a serious
application fast, RDBMS is still the right answer. You may want to look
at this YouTube clip entitled "MongoDB is web scale":
http://www.youtube.com/watch?v=b2F-DItXtZs
--
I don't think, therefore I am not.