dvkeeney at gmail.com
Thu Mar 11 17:00:29 CET 2010
On Mar 8, 12:14 pm, Duncan Booth <duncan.bo... at invalid.invalid> wrote:
> You've totally missed the point. It isn't the size of the data you have
> today that matters, it's the size of data you could have in several years' time.
> Maybe today you've got 10 users each with 10 megabytes of data, but you're
> aspiring to become the next twitter/facebook or whatever. It's a bit late
> as you approach 100 million users (and a petabyte of data) to discover that
> your system isn't scalable: scalability needs to be built in from day one.

Do you have examples of sites that got big by planning their site
architecture from day 0 to be big?

Judging from published accounts, even Facebook and Twitter did not
plan to be 'the next twitter/facebook'; each started with routine
LAMP stack architecture and successfully re-engineered the
architecture multiple times on the way up.

Is there compelling reason to think the 'next twitter/facebook' can't
and won't climb a very similar path?

I see good reasons to think that they *will* follow the same path, in
that there are motivations at both ends of the path for re-engineering
as you go. When the site is small, resources committed to the backend
are not spent on making the frontend useful, so business-wise the best
backend is the cheapest one. When the site becomes super-large, the
backend gets re-engineered based on what that organization learned
while the site was just large; Facebook, Twitter, and Craigslist all
have architectures custom designed to support their specific needs.
Had they tried to design for large size while they were small, they
would have failed; they couldn't have known enough then about what
they would eventually need.

The only example I can find of a large site that architected large
very early is Google, and they were aiming for a market (search) that
was already known to be huge.

It's reasonable to assume that the 'next twitter/facebook' will *not*
be in web search, social-networking, broadcast instant messaging, or
classified ads, just because those niches are taken already. So
whichever 'high-scalability' model the aspiring site uses will be the
wrong one. They might as well start with a quick and cheap LAMP
stack, and re-engineer as they go.

Just one internet watcher's biased opinion...
www.rdbhost.com -> SQL databases via a web-service