xahlee at gmail.com
Mon Mar 8 16:12:05 CET 2010
Many people mentioned scalability... though I think it is fruitful to
talk about at what size NoSQL databases start to offer better
scalability than SQL databases.
For example, consider: if you are among the world's top 100 database
users in terms of database size, such as Google, then it may be
that the off-the-shelf tools are limiting. But how many users
really have such a massive amount of data?
Note that Google's need for databases today isn't just a search engine.
The db size for Google search alone is probably larger than that of all
the other search engine companies combined. Plus, there's YouTube (video
hosting), Gmail, Google Code (source code hosting), Google blog, Orkut
(social networking), Picasa (photo hosting), etc., each ranked
within the top 5 or so among its respective competitors in terms of
number of accounts... So Google's data size is probably number one among
the world's database users, probably double or triple that of the
user with the second-largest data size. At that point, it seems logical
that they need their own db, relational or not.
On Mar 4, 10:35 pm, John Nagle <na... at animats.com> wrote:
> Xah Lee wrote:
> > recently i wrote a blog article on The NoSQL Movement
> > at http://xahlee.org/comp/nosql.html
> > i'd like to post it somewhere public to solicit opinions, but in
> > 20 min or so, i couldn't find a proper newsgroup or private list
> > where my somewhat anti-NoSQL Movement article would fit.
> Too much rant, not enough information.
> There is an argument against using full relational databases for
> some large-scale applications, ones where the database is spread over
> many machines. If the database can be organized so that each transaction
> only needs to talk to one database machine, the locking problems become
> much simpler. That's what BigTable is really about.
> For many web applications, each user has more or less their own data,
> and most database activity is related to a single user. Such
> applications can easily be scaled up with a system that doesn't
> have inter-user links. There can still be inter-user references,
> but without a consistency guarantee. They may lead to dead data,
> like Unix/Linux symbolic links. This is a mechanism adequate
> for most "social networking" sites.
> There are also some "consistent-eventually" systems, where a query
> can see old data. For non-critical applications, those can be
> very useful. This isn't a SQL/NoSQL thing; MySQL asynchronous
> replication is a "consistent-eventually" system. Wikipedia uses
> that for the "special" pages which require database lookups.
> If you allow general joins across any tables, you have to have all
> the very elaborate interlocking mechanisms of a distributed database.
> The serious database systems (MySQL Cluster and Oracle, for example)
> do offer that, but there are usually
> substantial complexity penalties, and the databases have to be carefully
> organized to avoid excessive cross-machine locking. If you don't need
> general joins, a system which doesn't support them is far simpler.
> John Nagle
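
To make the per-user partitioning idea in the quoted message concrete,
here is a minimal sketch in Python. It is a toy, not how BigTable or any
real system works: the class name ShardedStore, its methods, and the use
of an MD5 hash to pick a shard are all assumptions made for illustration,
and each "machine" is just an in-memory dict. It shows two things: every
operation on a user's data touches exactly one shard, and inter-user
references are followed with no consistency guarantee, so they can
dangle like symbolic links.

```python
import hashlib


class ShardedStore:
    """Toy per-user sharded store (hypothetical; each dict stands in for one machine)."""

    def __init__(self, num_shards):
        self.shards = [{} for _ in range(num_shards)]

    def shard_for(self, user_id):
        # Stable hash, so the same user always maps to the same shard.
        digest = hashlib.md5(user_id.encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self.shards)

    def put(self, user_id, record):
        # A write touches only the one shard that owns this user.
        self.shards[self.shard_for(user_id)][user_id] = record

    def get(self, user_id):
        return self.shards[self.shard_for(user_id)].get(user_id)

    def delete(self, user_id):
        # No attempt is made to clean up references held by other users.
        self.shards[self.shard_for(user_id)].pop(user_id, None)

    def friends_of(self, user_id):
        # Inter-user references are plain ids with no consistency guarantee;
        # ids that no longer resolve are skipped, like dead symlinks.
        record = self.get(user_id) or {}
        return [f for f in record.get("friends", []) if self.get(f) is not None]


store = ShardedStore(4)
store.put("alice", {"friends": ["bob", "carol"]})  # "carol" was never stored
store.put("bob", {"friends": []})
live_before = store.friends_of("alice")  # only "bob" resolves
store.delete("bob")                      # alice's record still lists "bob"
live_after = store.friends_of("alice")   # nothing resolves any more
```

Since no operation ever needs a lock spanning two shards, adding
machines is mostly a matter of rehashing users. The price is that
anything resembling a join across users has to be done in application
code and must tolerate dead references.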