<html style="direction: ltr;">

  <head>

    <meta content="text/html; charset=ISO-8859-1"

      http-equiv="Content-Type">

    <style type="text/css">body p { margin-bottom: 0cm; margin-top: 0pt; } </style>

  </head>

  <body style="direction: ltr;"

    bidimailui-detected-decoding-type="latin-charset" bgcolor="#FFFFFF"

    text="#000000">

    If you use either NginX or Apache's WSGI modules, you get the

    following crucial features: management of the CPython processes,

    support for threading (provides limited concurrency benefits but

    does reduce RAM use), and multiplexing of the port. If you want to

    do anything similar with Tornado, you're on your own: you have to

    run several Tornado processes yourself, manage them somehow (what do

    you do when they get stuck? how do you find out?), each on its port,

    and introduce a load balancer in front for all ports. This doesn't

    seem very scalable or easy to maintain to me: just adding a "worker"

    process involves allocating a port, reconfiguring your load balancer

    (and deploying it), etc. In terms of programming, sure, Tornado is

    easy, but in terms of operations I think it's a nightmare. One man's

    bloat is another man's necessary feature.<br>

    <br>

    I didn't mean to sound so negative about Tornado! I appreciate its

    approach a lot. I just think it's overused in the web world, often

    chosen for the wrong reasons. If I have any criticism of Tornado,

    it is that it has virtually no support for threading -- which is really

    why the codebase is so small, coherent and easy to debug. That ends

    up being a great fit for the Python world, which generally abhors

    threads (for the right reasons, in context).<br>

    <br>

    The real comparison, in my view, is not Tornado vs. Django (Django

    over what server?), but Tornado vs. other lightweight async servers,

    such as Node.js.<br>

    <br>

    If I were to write an async Internet application right now from

    scratch, and were not limited to the Python world, I would definitely

    look towards something like Erlang. You get the advantages of

    threading while still maintaining code coherence and debuggability.<br>

    <br>

    <br>

    <div class="moz-cite-prefix">On 10/11/2012 04:08 PM, Japhy Bartlett

      wrote:<br>

    </div>

    <blockquote

cite="mid:CANTsVHKR+7Q6=MDBJurqphe3q5q1nv6-P8n7nLjYtz+HC_tg9A@mail.gmail.com"

      type="cite">As far as tornado being.. "quite bad ... for REST"...

       I guess I'll just say that I've been paid to write REST services

      using both tornado and django, and the tornado systems were

      easier to write, maintain and scale.

      <div>

        <br>

      </div>

      <div>It also happens to win a lot of benchmarks,

        and "over-simplification" is another man's lack of bloat.  The

        underlying code is quite nice, and a human being can read it.</div>

      <div><br>

      </div>

      <div>

        No.. it is not meant to serve static files, but it's certainly

        capable ("absolutely miserably"?), and that is a *really* weird

        criticism to make of a python web server.</div>

      <div><br>

      </div>

      <div>"fast" in the context of tornado actually means.. fast.  Like

        requests, natively asynchronous *or* synchronous through WSGI,

        tend to get served in fewer milliseconds than most other python

        frameworks.</div>

      <div><br>

      </div>

      <div><br>

      </div>

      <div>I think it's very underrated, and I hate to see people saying

        bad things about it.  Maybe it's worth a talk in the next month

        or two?</div>

      <div><br>

      </div>

      <div><br>

      </div>

      <div><br>

        <div class="gmail_quote">On Thu, Oct 11, 2012 at 3:00 PM, Tal

          Liron <span dir="ltr"><<a moz-do-not-send="true"

              href="mailto:tal.liron@threecrickets.com" target="_blank">tal.liron@threecrickets.com</a>></span>

          wrote:<br>

          <blockquote class="gmail_quote" style="margin:0 0 0

            .8ex;border-left:1px #ccc solid;padding-left:1ex">

            <div style="direction:ltr" bgcolor="#FFFFFF" text="#000000">

              <div class="im"> On 10/11/2012 01:59 PM, Jordan Bettis

                wrote:<br>

                <br>

                <blockquote style="direction:rtl" type="cite">

                  <pre style="direction:ltr">Of course you can have a dynamic worker pool. That's the way apache

works. Given that python has a fairly "big boned" runtime, there's a

substantial cost there, as well as doing other things like making DB

connections for the new workers. And anyway it still only partially

solves the problem. You're still going to run out of memory or file

descriptors or something eventually. Compare Apache's behavior in the

face of a slow DoS attack with that of an asynchronous server

like nginx.

</pre>

                </blockquote>

              </div>

              My life mission is to dispel this myth (especially because

              I used to believe it myself).<br>

              <br>

              Let's get rid of one myth first: a long time ago, it was

              the case that Linux's single-threaded epoll service was

              somehow more scalable than more simply using threads to

              read the socket, because thread switching was painful.

              This stopped being true a long time ago: the "fastest" web

              servers (lighttpd) do not use epoll. And, in any case,

              whether you have a single thread <i>accepting</i> the

              connections or not, you'll want a pool of threads (or

              "workers" of some kind) generating content for these

              connections. I'm saying this to point out that there's

              some confusion as to what counts as "async": so let's just

              get the idea that it has to do with <i>accepting</i> the

              connections out of the way.<br>

              <br>

              In a true "async" server, the server calls you to tell

              you, look, there's this new client connection here (the

              server maintains a pool of <i>information</i> -- not

              threads -- about each client). You can then call the

              server at your convenience when you have data to send to

              the client, or ask it to close the connection. (Again,

              let's forget how the server actually implements this

              internally; it has nothing to do with asynchronicity in

              the sense we are talking about here.) The quality of an

              async server has a lot to do with what kind of information

              it keeps.<br>
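              <br>
              As a rough illustration of that calling convention, here
              is a minimal sketch using Python's standard asyncio module
              (not the API of NginX, Tornado or Node.js). The handler
              name on_client, the greeting text and the use of port 0
              are assumptions made up for the example. The server calls
              the handler once per new connection, and the handler sends
              data and closes the connection when it chooses to.<br>
              <br>
<pre style="direction:ltr">
```python
import asyncio


async def on_client(reader, writer):
    # The server calls this once per new connection. All it keeps for the
    # client is a small record (the reader/writer pair), not a thread.
    writer.write(b"hello from the server\n")
    await writer.drain()
    # We, not the server, decide when the connection ends.
    writer.close()
    await writer.wait_closed()


async def demo():
    # Port 0 asks the OS for any free port (an assumption for this demo).
    server = await asyncio.start_server(on_client, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    async with server:
        # Act as a client against our own server and read one line back.
        reader, writer = await asyncio.open_connection("127.0.0.1", port)
        line = await reader.readline()
        writer.close()
        await writer.wait_closed()
    return line


if __name__ == "__main__":
    print(asyncio.run(demo()))
```
</pre>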

              <br>

              Think of it not only in terms of the server but also

              in terms of your application. At some point the server

              turns things over to your code. So, what is your

              application doing?<br>

              <br>

              For a typical "web" application (REST), each client

              connection returns an entity of some kind. So, you really

              need to process each client quickly in turn. While it's

              true that NginX or Tornado or Node.js can accept a great

              many connections (it's just a small record of information

              they keep for each, not a thread), if there's no thread

              (or "worker") ready at your application's end to generate

              an entity, then these connections will queue up and your

              clients will consider your site "down." Async or sync

              server makes no difference: your app is sync because it

              needs to handle one request at a time.<br>
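              <br>
              A small sketch of that queueing effect, again with the
              standard asyncio module. The names slow_entity,
              handle_inline, handle_in_pool and serve are hypothetical,
              and the 0.1 second sleep stands in for whatever blocking
              work generates the entity. Run inline on the event loop,
              three requests finish one after another; handed to a small
              pool of worker threads, they can overlap.<br>
              <br>
<pre style="direction:ltr">
```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def slow_entity(n):
    # Stand-in for blocking work such as a database query.
    time.sleep(0.1)
    return "entity %d" % n


async def handle_inline(n):
    # Blocks the event loop: no other request makes progress meanwhile.
    return slow_entity(n)


async def handle_in_pool(n, pool):
    # Hands the blocking work to a worker thread and waits for the result.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(pool, slow_entity, n)


async def serve(use_pool):
    # Handle three "requests" and report the results and elapsed seconds.
    start = time.monotonic()
    if use_pool:
        with ThreadPoolExecutor(max_workers=3) as pool:
            jobs = [handle_in_pool(n, pool) for n in range(3)]
            results = await asyncio.gather(*jobs)
    else:
        jobs = [handle_inline(n) for n in range(3)]
        results = await asyncio.gather(*jobs)
    return results, time.monotonic() - start


if __name__ == "__main__":
    print("inline:", asyncio.run(serve(False)))
    print("pooled:", asyncio.run(serve(True)))
```
</pre>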

              <br>

              So, when does the asynchronous approach make a real

              difference? Say your application is not typical REST, but

              instead you are streaming video. There's no single entity

              that the clients are waiting for. So, what you can do

              instead is have each of your threads divide their time

              between the open connections. The more load you have, the

              less data you want to send per client when their turn

              comes (or you can give paying clients more time per

              turn...) A good async server will provide you with

              statistics about load to help you do the right thing and

              degrade gracefully. The API approach Garret mentioned for

              WSGI is typical: your app can just return a null or

              otherwise tell the server: "Don't return anything to the

              client right now; in fact, don't you worry about it at all,

              I'll handle the data my own way and close the connection."

              Yes, such an approach enables async, but I wouldn't call

              it a good approach. The architectural burden becomes

              yours. If you're working with Tornado, for example, you're

              much better off working with its native API than using

              WSGI. Your app won't be portable, but then async rarely

              is.<br>
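              <br>
              The time-slicing idea can be sketched in plain Python.
              Everything here is hypothetical: the function name
              round_robin, the budget_per_round parameter and the
              in-memory "streams" (a real server would write to sockets
              rather than yield chunks). Each round, every open client
              gets one turn, and the chunk sent per turn shrinks as the
              number of open clients grows.<br>
              <br>
<pre style="direction:ltr">
```python
def round_robin(streams, budget_per_round=12):
    """Yield (client, chunk) pairs, one turn per client per round.

    streams maps a client name to the bytes still owed to that client.
    budget_per_round is the total number of bytes allowed per round.
    """
    pending = dict(streams)
    while pending:
        # More open clients means a smaller chunk for each of them.
        chunk_size = max(1, budget_per_round // len(pending))
        for client in list(pending):
            data = pending[client]
            yield client, data[:chunk_size]
            rest = data[chunk_size:]
            if rest:
                pending[client] = rest
            else:
                # Nothing left to send: this is where we would close
                # the connection.
                del pending[client]


if __name__ == "__main__":
    demo_streams = {"a": b"AAAAAAAA", "b": b"BBBB", "c": b"CC"}
    for client, chunk in round_robin(demo_streams):
        print(client, chunk)
```
</pre>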

              <br>

              There's also a kinda middle ground between these extremes:

              serving files. Text files are usually too small to make a

              difference, but what if you are serving a lot of images?

              They're big, and sending them to slow clients can hold

              things up if you are using the typical "web" approach of

              sending them everything they need immediately. So, instead

              you can kinda stream the file to them, chunk by chunk, and

              if you do this well you can degrade gracefully. It's

              async, but with more predictability (because you know the

              size of the files), so it's a use case that has been

              heavily optimized. For example, individual chunks can be

              cached (mmap files ftw). But this has nothing to do with

              whether the server presents your app with a sync or async

              interface. As I stated, some of the best file servers are

              synchronous servers. They provide only a traditional REST

              API for your apps, but internally they do semi-streaming

              for files very well.<br>
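              <br>
              A minimal sketch of chunk-by-chunk file sending, assuming
              a hypothetical helper name (file_chunks) and an arbitrary
              default chunk size. It maps the file with Python's
              standard mmap module, so each slice is backed by the OS
              page cache, and yields one chunk at a time so the caller
              decides when to send the next one. It is not how any
              particular server does it internally.<br>
              <br>
<pre style="direction:ltr">
```python
import mmap


def file_chunks(path, chunk_size=64 * 1024):
    """Yield the contents of the file at path as a series of bytes chunks."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file. Empty files cannot be mapped,
        # so this sketch assumes the file is not empty.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            for offset in range(0, len(mapped), chunk_size):
                # Slicing the map copies just this chunk into a bytes object.
                yield mapped[offset:offset + chunk_size]
```
</pre>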

              <br>

              (And actually there's another myth here: that somehow file

              servers that degrade more gracefully will help you scale.

              Well, do you ever really want any individual server of

              yours to get to the point where it starts to degrade

              performance in any way, let alone degrade gracefully?

              These days, Google and other search indexes will penalize

              you for degradation. The trick is to scale horizontally

              with cheap VMs, so you <i>never</i> hit that point in the

              graph where things start heading south. You don't care if

              you're heading south fast or slowly. So, at the high scale

              it makes almost no difference if you choose Apache or

              NginX or lighttpd for your REST apps. It will matter only

              if you're limited to one or two servers in your cluster.)<br>

              <br>

              As an opposite example, let's consider Tornado. Yes, it

              can serve files, but it does so absolutely miserably. Its

              devs make it clear that it was never their priority to

              compete with mature web file servers. Instead, the goal

              was to create a good, straightforward and (to be honest)

              overly simple async server. Tornado is great if you want

              to write a streaming server without bells and whistles.

              But it's quite bad, partly due to its over-simplification,

              for traditional REST. If you're picking Tornado for your

              web application because it's "async" and "fast" you might

              not be understanding what these terms mean in this

              context. Find a mature sync server and make sure <i>your</i>

              app, on your end, never holds up a thread for too long.<br>

              <br>

              Over and out.<span class="HOEnZb"><font color="#888888"><br>

                  <br>

                  -Tl<br>

                </font></span></div>

            <br>

            _______________________________________________<br>

            Chicago mailing list<br>

            <a moz-do-not-send="true" href="mailto:Chicago@python.org">Chicago@python.org</a><br>

            <a moz-do-not-send="true"

              href="http://mail.python.org/mailman/listinfo/chicago"

              target="_blank">http://mail.python.org/mailman/listinfo/chicago</a><br>

            <br>

          </blockquote>

        </div>

        <br>

      </div>

      <br>


    </blockquote>

    <br>

  </body>

</html>