OT: why are LAMP sites slow?

Paul Rubin http
Thu Feb 3 23:42:26 EST 2005


Skip Montanaro <skip at pobox.com> writes:
> It's more than a bit unfair to compare Wikipedia with Ebay or
> Google.  Even though Wikipedia may be running on high-performance
> hardware, it's unlikely that they have anything like the underlying
> network structure (replication, connection speed, etc), total number
> of cpus or monetary resources to throw at the problem that both Ebay
> and Google have.  I suspect money trumps LAMP every time.

I certainly agree about the money and hardware resource comparison,
which is why I thought the comparison with 1960's mainframes was
possibly more interesting.  You could not get anywhere near the
performance of today's servers back then, no matter how much money you
spent.  Re connectivity, I wonder what kind of network speed is
available to sites like Ebay that's not available to Jane Webmaster
with a colo rack at some random big ISP.  Also, you and Tim Danieliuk
both mentioned caching in the network (e.g. Akamai).  I'd be
interested to know exactly how that works and how much difference it
makes.

But the problems I'm thinking of are quite obviously with the server
itself.  This is clear when you try to load a page and your browser
immediately gets the static text on the page, followed by a pause while
the server waits for the dynamic stuff to come back from the database.
Serving a Slashdotting-level load of pure static pages on a small box
with Apache isn't too terrible ("Slashdotting" = the spike in hits
that a web site gets when Slashdot's front page links to it).  Doing
that with dynamic pages seems to be much harder.  Something is just
bugging me about this.  SQL servers provide a lot of capability (ACID
for complicated queries and transactions, etc.) that most web sites
don't really need, yet they pay the price in performance anyway.
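
To make that concrete, here is a minimal sketch in Python of the kind
of time-based caching that lets a mostly-read-only dynamic page survive
a Slashdotting.  The names, the TTL and the fake "query" are mine, not
anything Wikipedia or Slashdot actually does; the point is just that the
database gets hit at most once per interval no matter how many requests
arrive:

    import time

    _cache = {}           # key -> (expiry time, cached value)
    CACHE_TTL = 10        # seconds; slightly stale beats overloaded

    def cached(key, ttl=CACHE_TTL):
        """Memoize a zero-argument function for ttl seconds."""
        def wrap(func):
            def inner():
                now = time.time()
                hit = _cache.get(key)
                if hit is not None and hit[0] > now:
                    return hit[1]      # served from memory, no database hit
                value = func()         # at most one real query per ttl window
                _cache[key] = (now + ttl, value)
                return value
            return inner
        return wrap

    @cached('front_page')
    def front_page_stories():
        # Stand-in for the real SQL query a CMS would run; the sleep
        # represents the database round trip that dominates the page time.
        time.sleep(0.5)
        return ['story %d' % n for n in range(20)]

The first call takes the full half second; everything else in the next
ten seconds comes straight out of memory, which is the difference
between a box that survives a traffic spike and one that doesn't.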

> We also know Google has thousands of CPUs (I heard 5,000 at one point and
> that was a couple years ago).

It's at least 100,000 and probably several times that ;-).  I've heard
that every search query does billions of CPU operations and crunches
through hundreds of megabytes of data (search on "apple banana" and there
are hundreds of millions of pages with each word, so two lists of that
size must be intersected).  100,000 was the published number of
servers several years ago, and there were reasons to believe that they
were purposely understating the real number.
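
For what "intersecting two lists of that size" means, here's a toy
sketch (tiny made-up posting lists; the real index is of course sharded
across all those machines): with each word's list of document ids kept
sorted, the two lists can be merged in a single linear pass.

    def intersect(a, b):
        """Merge two sorted lists of doc ids, keeping ids present in both."""
        out = []
        i = j = 0
        while i < len(a) and j < len(b):
            if a[i] == b[j]:
                out.append(a[i])
                i += 1
                j += 1
            elif a[i] < b[j]:
                i += 1
            else:
                j += 1
        return out

    # toy posting lists: document ids containing "apple" and "banana"
    apple  = [2, 3, 5, 9, 14, 21, 30]
    banana = [3, 4, 9, 10, 21, 35]
    print(intersect(apple, banana))    # [3, 9, 21]

Linear in the list lengths, but when each list has hundreds of millions
of entries that is still a lot of memory bandwidth per query.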
