[BangPypers] NoSQL

Noufal Ibrahim noufal at gmail.com
Sun Feb 13 10:54:55 CET 2011


On Sun, Feb 13 2011, Anand Balachandran Pillai wrote:


[...]

> Hmmm, "better" in what sense ? By "better" if you mean the
> programmer's or maintainer's work is reduced from designing schema,
> writing SQL to store/retrieve data, basically all the RDBMS stuff vs
> designing a rather flat key/value store using a Document store
> (nosql), I am not convinced.

Better meaning that it models the domain properly. I have documents that
I want to store with versioning and couch (haven't used Mongo) does
exactly that.

> Does it really matter in a small-size wiki project whether you are
> using SQLite to store your data, vs a mongodb that just runs on your
> machine ?

Perhaps not, but you'd have to select one anyway.

> I would rather go for the sqlite solution since,
>
> 1. It is the most simplistic RDBMs one can think of.
> 2. You get the power of SQL, thereby chance of writing
>  adhoc queries in the future.

I'd consider using Couch simply because I don't have to write a *real*
application. Just a few views to present the pages and that's it. I
don't need to bother with a relational database, a schema, an ORM, a
separate web application and what not. At the end of the day, a wiki is
just a bunch of documents that are versioned. My example probably has
flaws because I'm not that comfortable with document stores yet but I
think the point I'm making is clear.
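
To make the "bunch of versioned documents" idea concrete, here is a
minimal sketch in Python. It is a toy in-memory store, not CouchDB's
API: the class PageStore and its methods save, get and history are
hypothetical names made up for illustration, and it assumes every save
simply appends a new revision of the whole page.

```python
class PageStore:
    """Toy in-memory document store (hypothetical, not CouchDB's API)."""

    def __init__(self):
        # Maps a page title to its list of revisions, oldest first.
        self._pages = {}

    def save(self, title, body):
        """Append a new revision and return its 1-based revision number."""
        revisions = self._pages.setdefault(title, [])
        rev = len(revisions) + 1
        revisions.append({"title": title, "body": body, "rev": rev})
        return rev

    def get(self, title, rev=None):
        """Return the latest revision, or a specific one if rev is given."""
        revisions = self._pages[title]
        return revisions[-1] if rev is None else revisions[rev - 1]

    def history(self, title):
        """Return the revision numbers recorded for a page, oldest first."""
        return [doc["rev"] for doc in self._pages[title]]


store = PageStore()
store.save("HomePage", "First draft")
store.save("HomePage", "Second draft")
```

Here store.get("HomePage") returns the second draft and
store.get("HomePage", rev=1) returns the first. The whole page is the
unit of storage, so there is no schema or mapping layer in between.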

As for point 2, that sounds like YAGNI to me.

My point is that there's no need to "default" to an RDBMS (except for
the fact that we're mostly "used to" relational databases rather than
document stores).


> I don't agree with it. One of the basic premises from where the nosql
> platforms come is that they are trying to solve problems where your
> data is distributed in a scale that traditional RDBMs would find it
> difficult to address with sufficient performance. If I use the "CAP"
> terminology, nosql is solving the problems of A and P on a large scale
> while making no promises on the C side.

Probably, but that doesn't mean that scale is the only reason to move
away from an RDBMS.

Once upon a time, I had binary file formats, then XML, then lightweight
markups. In the future more of these might come. Just like that, I have
a new way of storing data now. 

> Unless your wiki need to scale to at least 100K nodes or more, I don't
> see a real technical reason to use document stores apart from
> relieving you upfront of complex schema design and writing SQL
> queries.  If you mean that is "better" for you, then we are talking of
> different problems here. Mileages vary.

I don't think the schema design for a wiki is "complex". I just don't
see any reason to bother doing it at all. I also don't plan to scale to
100s of nodes. Maybe just run it as a personal wiki.

I have a problem and a data store that models the domain almost
exactly. Why do I have to restructure my data as "relations" and then
write "queries" to get them? The only reason I can think of to do that
is because I'm "used to" SQL.
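
For contrast, this is roughly what "restructuring as relations and
writing queries" involves for the same wiki. It is only a sketch using
Python's standard sqlite3 module; the table names (pages, revisions)
and the helpers save and latest are assumptions chosen for
illustration, not a recommended schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One table for page identity, one for the revisions of each page.
conn.executescript("""
    CREATE TABLE pages (
        id    INTEGER PRIMARY KEY,
        title TEXT UNIQUE NOT NULL
    );
    CREATE TABLE revisions (
        page_id INTEGER NOT NULL REFERENCES pages(id),
        rev     INTEGER NOT NULL,
        body    TEXT NOT NULL,
        PRIMARY KEY (page_id, rev)
    );
""")


def save(title, body):
    """Store a new revision of a page and return its revision number."""
    conn.execute("INSERT OR IGNORE INTO pages (title) VALUES (?)", (title,))
    (page_id,) = conn.execute(
        "SELECT id FROM pages WHERE title = ?", (title,)
    ).fetchone()
    (last,) = conn.execute(
        "SELECT COALESCE(MAX(rev), 0) FROM revisions WHERE page_id = ?",
        (page_id,),
    ).fetchone()
    conn.execute(
        "INSERT INTO revisions (page_id, rev, body) VALUES (?, ?, ?)",
        (page_id, last + 1, body),
    )
    return last + 1


def latest(title):
    """Return (rev, body) of the newest revision, or None if no such page."""
    return conn.execute(
        """
        SELECT r.rev, r.body
        FROM revisions AS r
        JOIN pages AS p ON p.id = r.page_id
        WHERE p.title = ?
        ORDER BY r.rev DESC
        LIMIT 1
        """,
        (title,),
    ).fetchone()


save("HomePage", "First draft")
save("HomePage", "Second draft")
```

None of this is hard, but the page has been split across two tables
and has to be put back together with a join every time it is read.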

[...]

> You might have got me wrong. My point was that there seems to be a
> trend where programmers and designers choose to implicitly assume that
> just because their data is expected to scale to gigabytes or terabytes
> in the future, the right choice upfront is a Document store (I prefer to
> use this term as against the confusing "nosql" one), which is not
> the correct way to do this. 

Agreed. 

> I think any complex data storage problem will at the end consist of a
> mix of Document stores and Relational stores. For example, in the link
> I quoted the O.P seems to come from that kind of a thinking. Something
> to do with all the current thinking in terms of "cloud" and "data out
> there" and the fashion to think of SQL as "that old thing" and
> Document store as "this flashy new thing".

Agreed there too. Technical reasons and decisions are fine. It's the
whole "SQL camp" vs. "noSQL camp" thing that's fruitless, and I think
Santosh's mail, which started this thread with the "god help you",
suggests that.

> Looking at the hardware part of it, RDBMs have been severely limited
> by current storage technology, i.e platter spinning disks, which is a
> limiting factor when trying to optimize closer to the metal. SSDs
> could solve a whole lot of the problems at that level, though nowadays
> the trend is to blame the poor performance on the DB and think
> of a document store which scales to millions of nodes, like Digg did
> for example.

Well, if you *already* have a relational setup, you can beef up the
machine with SSDs and the like to keep it going, but if that doesn't
work, you might have to consider scaling horizontally, and then things
change.

[...]

