[BangPypers] NoSQL

Sun Feb 13 16:43:10 CET 2011

On Sun, Feb 13, 2011 at 3:06 PM, Anand Balachandran Pillai <
abpillai at gmail.com> wrote:

> On Sun, Feb 13, 2011 at 2:21 PM, Noufal Ibrahim <noufal at gmail.com> wrote:
>
> > On Sun, Feb 13 2011, Anand Balachandran Pillai wrote:
> >
> > > I am sure many of you must have gone through this discussion, but
> > > sharing it anyway since I liked the analogy he makes with SQL against
> > > NoSQL compared to transmission in cars.
> >
> > I liked the analogy but don't agree with the second paragraph. It's not
> > only about size. Thanks to the dominance of SQL databases, everyone
> > tends to think of them as the "default" and use noSQL only if
> > necessary. That needn't be the case. Small (non Google, non Facebook)
> > applications that need to store documents (e.g. a wiki) *might* work
> > better with a noSQL backend than a relational one
> >
>
> Does it really matter in a small-size wiki project whether you are
> using SQLite to store your data, vs a mongodb that just runs on
> your machine ?

Well, it's easier to look up wiki pages by their slugs (primary keys).
Besides, you could easily store additional metadata about the page as
JSON, over and above the wiki markdown text, for example.
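
A rough sketch of what I mean, not a real backend: a plain Python dict
stands in for the document store here, and the function names and metadata
fields (tags, author) are made up for the example.

```python
import json

# Hypothetical in-memory stand-in for a document store: slug -> JSON document.
store = {}

def save_page(slug, markdown, **metadata):
    """Store the page text plus any extra metadata under its slug."""
    doc = {"slug": slug, "markdown": markdown}
    doc.update(metadata)  # arbitrary extra fields, no schema change needed
    store[slug] = json.dumps(doc)

def load_page(slug):
    """Look a page up directly by its slug (the primary key)."""
    return json.loads(store[slug])

save_page("nosql-vs-sql", "# NoSQL vs SQL\nSome text.",
          tags=["databases"], author="someone")
page = load_page("nosql-vs-sql")
```

Adding a new metadata field later is just another keyword argument; nothing
like an ALTER TABLE is involved.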

> I would rather go for the sqlite solution since,
>
> 1. It is the simplest RDBMS one can think of.
> 2. You get the power of SQL, and with it the chance of writing
>  ad hoc queries in the future.
>
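
For comparison, the SQLite route described above might look roughly like
this. It is only a sketch using the standard library sqlite3 module; the
table layout and sample rows are assumptions for illustration.

```python
import sqlite3

# In-memory database; a real wiki would use a file path instead.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pages (slug TEXT PRIMARY KEY, author TEXT, body TEXT)"
)
conn.executemany(
    "INSERT INTO pages VALUES (?, ?, ?)",
    [
        ("home", "alice", "Welcome"),
        ("cap-theorem", "bob", "Notes on CAP"),
        ("sqlite-notes", "alice", "Notes on SQLite"),
    ],
)

# An ad hoc query nobody planned for up front: number of pages per author.
pages_per_author = conn.execute(
    "SELECT author, COUNT(*) FROM pages GROUP BY author ORDER BY author"
).fetchall()
```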

Not specifically referring to a single-machine use case, but a nosql +
lucene solution could conceivably be more natural. In a wiki, ad hoc
queries beyond free text search are likely to be used less frequently.
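
To make the free text search part concrete, here is a toy inverted index
in pure Python. It is only a stand-in for the idea behind an engine like
Lucene, not Lucene's API; the sample pages and function names are
hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical pages, keyed by slug as they would be in a document store.
pages = {
    "cap-theorem": "Consistency, availability and partition tolerance",
    "sqlite-notes": "SQLite is a small embedded relational database",
    "mongodb-notes": "MongoDB is a document database",
}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(pages):
    """Map each word to the set of slugs whose text contains it."""
    index = defaultdict(set)
    for slug, text in pages.items():
        for token in tokenize(text):
            index[token].add(slug)
    return index

def search(index, query):
    """Return the slugs of pages containing every word in the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    results = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        results &= index.get(token, set())
    return results

index = build_index(pages)
hits = search(index, "document database")
```

The search returns slugs, so each hit is one key lookup away from the full
page in the store.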

>
> I don't agree with it. One of the basic premises behind the nosql
> platforms is that they are trying to solve problems where your data is
> distributed at a scale that a traditional RDBMS would find difficult
> to address with sufficient performance. In "CAP" terms, nosql is
> solving the problems of A and P on a large scale while making no
> promises on the C side.
>
> Unless your wiki needs to scale to 100K nodes or more,
> I don't see a real technical reason to use document stores apart from
> relieving you upfront of complex schema design and writing SQL queries.
> If you mean that is "better" for you, then we are talking of different
> problems here. Mileages vary.
>
>
> >
> > > http://stackoverflow.com/questions/2559411/sql-mysql-vs-nosql-couchdb
> > >
> > > It might be a cliche, but I kind of feel the current "NoSQL movement"
> > > is simply a case of "The grass must be greener on the other side".
> >
> > I don't really follow "movements" but disagree with your general
> > statement.
> >
> > The kinds of data and hardware that people are dealing with have changed
> > and different problems are cropping up. The constraints and requirements
> > have changed as well. New technologies have come up to address these
> > problems and given that we live in these times, it's quite possible that
> > the problems we face might fall into the categories for which these
> > systems have been designed. It's unwise to summarily dismiss document
> > stores out of the box.
> >
> > Also, the transition is not abrupt (SQL yesterday, noSQL today). SQL
> > databases have been used in a semi schemaless fashion e.g. Triple
> > stores[1], Entity-attribute-value model[2] etc.
> >
> > For some kinds of datasets, sound RDBMS rules are violated to gain
> > performance, e.g. Denormalisation[3]. These kinds of things indicate
> > that RDBMS systems are not designed to handle certain classes of
> > problems that are cropping up and new solutions have to be sought out.
> >
> > It's an engineering problem. Different situations call for different
> > tools and solutions.
> >
> > I personally tend to ignore the whippersnappers with their "SQL suxx0rZ!
> > noSQL roX!" outlook and the grumpy SQL advocates with their "Get off my
> > lawn!" attitude.
> >
>
> You might have got me wrong. My point was that there seems to be a
> trend where programmers and designers choose to implicitly assume that
> just because their data is expected to scale to gigabytes or terabytes
> in the future, the right choice upfront is a Document store (I prefer to
> use this term as against the confusing "nosql" one), which is not
> the correct way to do this.
>

+1. I tend to prefer viewing the situation as a set of prioritised CAP
requirements.

There are quite a few situations where one might require very high
scalability but CA requirements abound (e.g. banking / financial apps),
and there introducing noSQL would be a pain. Conversely, one might just
want a 3-node wiki (to reuse the example above) which simply is always
available regardless of partition failures, even though it serves, say,
only 100,000 users, and many noSQL solutions might serve that situation
just fine.

That noSQL emerged out of an unsatisfied demand for extreme scalability
hardly means there is a very strong correlation between noSQL and
extreme scalability.

-- 
--------------------------------------------------------
blog: http://blog.dhananjaynene.com
twitter: http://twitter.com/dnene