Against the CGI-is-slow chorus (Re: CGI with Python: advantages?)
Sam Penrose
Sun Aug 6 23:40:39 EDT 2000
In article <3cdd.398d5989.a554b at yetix.sz-sb.de>, ajung at suxers.de wrote:
> Mimmo <noname at unknown.com> wrote:
> > The question is: is Python a good language for CGI scripting?
>
> Python is not only a good language for CGI scripting but for
> lots of others tasks (especially in middleware environments).
>
> When you're just looking for Python in web environments take a look at
> mod_snake and mod_python for the Apache webserver, Zope for using Python
> inside an application server or FastCGI modules for Python for optimal
> performance. I would not use Python for CGI scripts when need a good
> performance. Loading the interpreter and all modules for every request
> is a pain. There are also some solutions available to embed Python into
> HTML (similar to ASP, JSP).
>
> Kind regards,
> Andreas
[[Redundancy warning: I have made similar posts recently, but they seem
to bear making again.]]
I disagree with all of this. Here's why:
I work for a company that does ecommerce sites, using Python CGIs to
move data between MySQL and HTML. Our busiest site (CGI page views in
the tens of thousands per day; ecommerce transactions up to 1,000 per
day, plus lots of administrative work done via Python CGIs) runs on
several thousand lines of Python, quite a bit of it involving
subclassing and multiple __init__ calls. The server is a 500 MHz Pentium
III box running recent versions of Apache secure server and Red Hat
Linux. With a 7200 RPM SCSI hard drive and 256 megs of RAM it probably
represents under $2,000 of hardware. We are not experts in tuning Apache
or Linux. Our Python scripts have never been optimized for speed. The
site is a lot busier than anyone expected, and it performs great. We are
about to go to a cluster, but for availability, not performance. Despite
doing all the things that are supposed to mean bad performance (Python
instead of Perl or name-your-app-server, lots of database connections, a
new Python interpreter for every page view, etc.), the only two
noticeable bottlenecks involve scripts that return ridiculous amounts of
content (~1 meg of stats/several dozen images). We could rewrite those
bits in C and it wouldn't speed them up much; the solution is to break
up the content across several page views.
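To put "tens of thousands per day" in perspective, here is a rough back-of-the-envelope sketch. The daily totals in it are illustrative assumptions, not measurements from our site, and it computes averages only; real traffic bunches up at peak hours.

```python
# Convert a daily page-view count into an average request rate.
# The daily figures used below are hypothetical, chosen only to span
# the "tens of thousands per day" range mentioned in the post.

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_per_second(requests_per_day):
    """Average number of requests arriving each second."""
    return requests_per_day / SECONDS_PER_DAY


def seconds_between_requests(requests_per_day):
    """Average gap, in seconds, between two consecutive requests."""
    return SECONDS_PER_DAY / requests_per_day


if __name__ == "__main__":
    for per_day in (10_000, 50_000, 90_000):
        print(
            "%6d/day = %.2f requests/s, one every %.2f s on average"
            % (per_day, average_per_second(per_day),
               seconds_between_requests(per_day))
        )
```

Even at the top of that range the average works out to roughly one request per second, which is the budget a fresh interpreter has to fit into.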
If this is so, why do so many smart, knowledgeable people think that
serving dynamic content with plain Python CGIs dooms performance? I
think the answer is that received wisdom erodes less quickly than
Moore's "Law" operates.
<hypothesis>
When Mosaic was all the rage, an inexpensive Web server might have
consisted of a 486/50 with 16 megs of RAM and a 4200 RPM ATA hard drive
running god knows what http server software. A CGI of any significant
functionality was probably hitting that lethargic hard drive pretty
heavily. We are talking s...l...o...w. Early web programmers therefore
warned newbies to stay away from CGI when performance was required. More
to the point, they came up with mod_perl, which rolled an interpreter for the
most common CGI language into the http software.
Nowadays, even those with modest budgets and expertise can use the
wonderful Apache http server to run CGIs essentially out of RAM, and the
RAM in turn is accessed by a vastly more powerful CPU over a 100+ MHz
bus. Yes, the bus is now faster than the processor was when the
CGI-is-slow opinion became established.
Meanwhile, Perl's awesome geek marketing force has had five years to
pound away at the idea that CGI is a problem and an Apache module is the
solution. Because mod_perl works so well and is so entrenched in the CGI
community, no one has had occasion to rethink this issue. So when we
Python coders look wistfully at CPAN and O'Reilly's CGI Programming with
Perl and on and on (an infrastructure I do envy), we acquire the
now-outdated take on CGI's inadequacy at the same time. The result is
that novice Python/CGI programmers who have yet to type the characters
"import cgi" are wasting time trying to figure out which Apache module
to install.
</hypothesis>
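For the novice who has not yet typed "import cgi", here is a minimal sketch of how little a plain Python CGI needs. It is an illustration, not code from our site: the "name" parameter and the render_page function are made up for the example, and it reads the query string straight from the environment with urllib.parse rather than the cgi module, since that module was removed from the standard library in Python 3.13.

```python
#!/usr/bin/env python3
# Minimal plain-CGI sketch: the web server starts this script once per
# request, passes request details in environment variables, and sends
# whatever the script writes to standard output back to the browser.
import html
import os
import sys
from urllib.parse import parse_qs


def render_page(environ):
    """Build a complete CGI response (header, blank line, body) as a string."""
    # QUERY_STRING is the standard CGI variable holding the text after "?".
    query = parse_qs(environ.get("QUERY_STRING", ""))
    # "name" is a hypothetical parameter used only for this illustration.
    name = query.get("name", ["world"])[0]
    # Escape user input before putting it into HTML.
    body = "<html><body><p>Hello, %s!</p></body></html>" % html.escape(name)
    # A blank line separates the headers from the body.
    return "Content-Type: text/html\n\n" + body + "\n"


if __name__ == "__main__":
    sys.stdout.write(render_page(os.environ))
```

Dropped into a directory the server treats as CGI and marked executable, something like this should answer a request such as ?name=Mimmo with no Apache module involved.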
I am grateful to the authors of the various Python Apache modules for
their efforts to fill a perceived need. And, sure, it would be nice to
have a Python interpreter rolled into Apache, provided only that:
1) It didn't leak memory, unlike the current/final version of PyApache
and, I am told, some of the other efforts at various stages.
2) It didn't force an idiosyncratic coding approach on anyone. There is
simply too much to learn about Python, HTML, XML, SQL...for me to spend
a week learning some special purpose syntax that is going to make my
already-screaming-fast executable 30% or 300% faster. For people who are
trying to build something on the Internet, there is no more valuable
commodity than programming time. That is why (to answer Mimmo's
question), Python is a wonderful language for CGIs. Python accelerates
programmers.
3) It was just there and worked, particularly when new versions of
Python or Apache came out. Apache gets updated several times a year.
No such module existed a couple months ago. Let's say that one will
exist as early as this fall. By that time, $2,500 will have you serving
CGIs out of RAM across a 200 MHz bus, with a 10,000 RPM SCSI hard drive
for data that isn't in RAM. That in mind, I want to ask:
1) Does anyone have firsthand experience of a slow website served on
contemporary hardware that was made fast by moving away from Python
CGIs? If so, what were the details?
2) Who exactly needs a faster Python interpreter for serving CGIs, who
wouldn't be better served by upgrading their hardware? Even a non-profit
that is handling tens of thousands of dynamic requests a day can
probably find funding for a better server a lot easier than they can get
more programmer/sysadmin hours.
3) Is it time for the received wisdom to state that CGI performance
should not be worried about unless your hardware is obsolete, you are
serving dozens of dynamic pages a minute, or you have evidence to the
contrary?
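For anyone who wants evidence either way, here is one rough way to measure the cost being argued about on your own hardware. It is a sketch under stated assumptions: it times only process launch plus interpreter startup (and, optionally, some imports), not Apache's own overhead or any database work, and the startup_seconds helper and the modules imported in the second run are arbitrary choices for the example.

```python
# Time how long it takes to start a fresh Python interpreter, which is
# the per-request cost that plain CGI pays and an embedded interpreter avoids.
import statistics
import subprocess
import sys
import time


def startup_seconds(runs=5, code="pass"):
    """Median wall-clock time to launch the interpreter and run `code`."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # sys.executable is the interpreter running this script.
        subprocess.run([sys.executable, "-c", code], check=True)
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


if __name__ == "__main__":
    bare = startup_seconds()
    # The imports here are arbitrary stand-ins for what a CGI might load.
    loaded = startup_seconds(code="import os, urllib.parse")
    print("bare interpreter:     %.1f ms" % (bare * 1000))
    print("with a few imports:   %.1f ms" % (loaded * 1000))
```

Comparing those numbers against the average gap between requests on your site gives a first idea of whether interpreter startup is worth worrying about.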
--
spenrose at well dot com