help! advocacy resources needed fast

Geoff Gerrietts geoff at gerrietts.net
Thu Mar 6 19:00:16 CET 2003


Quoting Kyler Laird (Kyler at news.Lairds.org):
> It sounds like you're running ZEO.  Is that correct?

Yes.

> So it's a problem in the ZEO clients/Zope servers (as
> opposed to the ZEO server), right?  Isn't it enough to
> throw more hardware at the problem?  If the data used
> by these pages doesn't change constantly (thus can be
> cached effectively), I'd expect that doubling the
> number of servers would come close to doubling the
> number of requests it can handle with the current
> response time.

That's correct: the problem is in Zope, not ZEO. Business estimates
suggest our traffic could multiply by as much as a factor of 10 by
June. If we do nothing on the business side -- no new deals or
products or anything else to increase traffic -- history suggests our
traffic will double by June. Business does not plan to take the next 3
months off, though.

Right now we have 22 boxes doing Zope, a couple ZEO servers, and 6
boxes doing back-end processing. The back-end services aren't written
to do "everything they could", which is a (basically irreparable) flaw
in our design. But the back-end services are not heavily loaded.

A tenfold increase in traffic with linear scalability would mean 220
Zope boxes. That does not seem like a reasonable solution to our
problem.
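To make that arithmetic explicit, here is a minimal sketch of the
estimate. It assumes perfectly linear scaling by default, which is the
optimistic case; the function name and the scaling_efficiency parameter
are mine, purely for illustration, and are not something we measure
today.

```python
import math


def boxes_needed(current_boxes, traffic_multiplier, scaling_efficiency=1.0):
    """Estimate how many Zope boxes a given traffic multiple would require.

    scaling_efficiency is a hypothetical knob: 1.0 means perfectly linear
    scaling, and anything lower means each added box contributes
    proportionally less capacity.
    """
    if not 0 < scaling_efficiency <= 1:
        raise ValueError("scaling_efficiency must be in (0, 1]")
    return math.ceil(current_boxes * traffic_multiplier / scaling_efficiency)


# The two scenarios from this message, starting from 22 Zope boxes.
print(boxes_needed(22, 2))   # traffic doubles if business does nothing
print(boxes_needed(22, 10))  # the tenfold business estimate
```

Anything short of linear scaling only pushes the tenfold figure higher.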

> >We have spent a year refactoring key components, and building caching
> >solutions to minimize the impact of load.
> 
> I realize you're trying to solve your problems now,
> but I'd enjoy hearing more about this.

I'm not sure how enlightening it would be, but I wouldn't mind writing
a bit about what I've done, and what I've seen done, to take load off
the system. Most of it's pretty standard stuff. I'll try to remember
to write something about it next week -- remind me if I forget.

> >And I need numbers, something to recommend the final solution I want
> >to push as most viable in Python beyond "a bunch of Python advocates
> >think it's quality shit". Or, I should say, I think I need numbers.
> 
> While I understand that you *must* handle the load
> with adequate response times, I think there's more
> to it than that.

I agree. What I need more than anything is someone who's walked this
road before, someone who's fielding something like a million requests
a day against a highly dynamic site.

> >As dire as I've made things look, every engineer who touches the
> >Python code, wants to work in Python. We like Python, we're good at
> >Python, and Python has proven to be our engineering department's
> >competitive advantage.
> 
> This reminds me of the situation where people jump
> up and down about Python not running as fast as 
> C/asm/whatever.  The *real* benefit is that you can
> *create* in it much more efficiently.  Need it to
> run faster?  Throw some hardware at it.  Hardware
> is cheap and you can always get more.

But there's a point at which that stops being true, and stops being
reasonable. Even if it takes you two or three times as long to create
the equivalent solution in Java, if it takes 22 boxes to run it instead
of 220, the long-term amortized cost of developing in Java is
significantly lower. With Python you save yourself some development
time, but at the expense of hiring more sysadmins and paying for more
real estate in your co-lo.

More significantly, it's difficult to quantify exactly what the
benefit is of using (say) Python + Webware over Java + Servlets, in
terms of project turnaround time, programmer productivity, and ease of
maintenance. Everyone who programs knows the benefit is real.

Meanwhile, we're in a position where there simply aren't very many
good answers for our scalability problems. Java + Servlets seems to be
the early favorite because the competition in our space uses that
technology successfully, and lots of other sites have risen to the
challenge of thousands of concurrent sessions using Java.

I believe Python can grow that far, and probably with greater
cost-effectiveness and organizational agility. But it's a
multimillion-dollar bet: if we run into similar scaling problems again, a year from
now, when our traffic doubles again, we aren't likely to survive a
second architecture shift.

--G.


-- 
Geoff Gerrietts             "I don't think it's immoral to want to  
<geoff at gerrietts net>     make money."      -- Guido van Rossum




