[BangPypers] NoSQL

Sun Feb 13 10:36:43 CET 2011

On Sun, Feb 13, 2011 at 2:21 PM, Noufal Ibrahim <noufal at gmail.com> wrote:

> On Sun, Feb 13 2011, Anand Balachandran Pillai wrote:
>
> > I am sure many of you must have gone through this discussion, but
> > sharing it anyway since I liked the analogy he makes with SQL against
> > NoSQL compared to transmission in cars.
>
> I liked the analogy but don't agree with the second paragraph. It's not
> only about size. Thanks to the dominance of SQL databases, everyone
> tends to think of them as the "default" and use noSQL only if
> necessary. That needn't be the case. Small (non Google, non Facebook)
> applications that need to store documents (e.g. a wiki) *might* work
> better with a noSQL backend than a relational one
>

Hmmm, "better" in what sense? If by "better" you mean that the programmer's
or maintainer's work is reduced (no designing a schema, no writing SQL
to store/retrieve data, basically none of the RDBMS stuff) in favour of
designing a rather flat key/value layout in a document store (NoSQL), I am
not convinced.

Does it really matter in a small wiki project whether you are
using SQLite to store your data, versus a MongoDB instance that just
runs on your machine? I would rather go for the SQLite solution since,

1. It is the simplest RDBMS one can think of.
2. You get the power of SQL, and with it the option of writing
 ad hoc queries in the future.
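
To make the two points above concrete, here is a minimal sketch using
Python's standard sqlite3 module. The `pages` table, its columns and the
helper names are hypothetical, chosen only for illustration; a real wiki
would need revisions, links and so on.

```python
import sqlite3


def open_wiki(path=":memory:"):
    """Open (or create) the wiki database with a single hypothetical table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages ("
        " title TEXT PRIMARY KEY,"
        " body TEXT NOT NULL,"
        " author TEXT NOT NULL,"
        " updated TEXT NOT NULL)"
    )
    return conn


def save_page(conn, title, body, author, updated):
    """Store a page; saving an existing title overwrites it (key/value style)."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO pages (title, body, author, updated)"
            " VALUES (?, ?, ?, ?)",
            (title, body, author, updated),
        )


def load_page(conn, title):
    """Return the body of a page, or None if the title is unknown."""
    row = conn.execute(
        "SELECT body FROM pages WHERE title = ?", (title,)
    ).fetchone()
    return row[0] if row else None


def pages_per_author(conn):
    """An ad hoc query added later: how many pages each author last edited."""
    return conn.execute(
        "SELECT author, COUNT(*) FROM pages GROUP BY author ORDER BY author"
    ).fetchall()
```

save_page() and load_page() give the same put/get feel as a key/value
store, while pages_per_author() is the kind of query one can add later
without changing how the data is stored.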

I don't agree with that claim. One of the basic premises of the NoSQL
platforms is that they are trying to solve problems where your
data is distributed on a scale that traditional RDBMSs would find
difficult to address with sufficient performance. In "CAP" terms
(Consistency, Availability, Partition tolerance), NoSQL is solving the
problems of A and P on a large scale while making no promises on the
C side.

Unless your wiki needs to scale to 100K nodes or more,
I don't see a real technical reason to use document stores apart from
relieving you upfront of complex schema design and writing SQL queries.
If that is what you mean by "better", then we are talking about different
problems here. Mileages vary.

>
> > http://stackoverflow.com/questions/2559411/sql-mysql-vs-nosql-couchdb
> >
> > It might be a cliche, but I kind of feel the current "NoSQL movement"
> > is simply a case of "The grass must be greener on the other side".
>
> I don't really follow "movements" but disagree with your general
> statement.
>
> The kinds of data and hardware that people are dealing with have changed
> and different problems are cropping up. The constraints and requirements
> have changed as well. New technologies have come up to address these
> problems and given that we live in these times, it's quite possible that
> the problems we face might fall into the categories for which these
> systems have been designed. It's unwise to summarily dismiss document
> stores out of the box.
>
> Also, the transition is not abrupt (SQL yesterday, noSQL today). SQL
> databases have been used in a semi schemaless fashion e.g. Triple
> stores[1], Entity-attribute-value model[2] etc.
>
> For some kinds of datasets, sound RDBMS rules are violated to gain
> performance, e.g. Denormalisation[3]. These kinds of things indicate
> that RDBMS systems are not designed to handle certain classes of
> problems that are cropping up and new solutions have to be sought out.
>
> It's an engineering problem. Different situations call for different
> tools and solutions.
>
> I personally tend to ignore the whippersnappers with their "SQL suxx0rZ!
> noSQL roX!" outlook and the grumpy SQL advocates with their "Get off my
> lawn!" attitude.
>

You might have got me wrong. My point was that there seems to be a
trend where programmers and designers implicitly assume that,
just because their data is expected to grow to gigabytes or terabytes
in the future, the right choice upfront is a document store (I prefer
this term to the confusing "NoSQL" one). That is not the right way to
make the decision. I think any complex data storage problem
will in the end call for a mix of document stores and relational
stores. The O.P. in the link I quoted, for example, seems to come
from that kind of thinking. It has something to do with all the current
talk of "cloud" and "data out there", and the fashion of seeing
SQL as "that old thing" and document stores as "this flashy new thing".

Looking at the hardware side of it, RDBMSs have been severely limited
by current storage technology, i.e. spinning-platter disks, which is a
limiting factor when trying to optimize closer to the metal. SSDs
could solve a whole lot of the problems at that level, though nowadays
the trend is to blame the poor performance on the DB and reach for
a document store which scales to millions of nodes, as
Digg did for example.

Meanwhile, in a lighter vein, watch this NoSQL lightning talk by
Brian Aker. It is fun, if you haven't seen it :)

http://www.youtube.com/watch?v=LhnGarRsKnA

> [...]
>
>
> Footnotes:
> [1]  http://en.wikipedia.org/wiki/Triplestore
> [2]  http://en.wikipedia.org/wiki/Entity-attribute-value_model
> [3]  http://en.wikipedia.org/wiki/Denormalization
>

-- 
--Anand