[Twisted-web] Re: [Web-SIG] A more Twisted approach to async apps in WSGI

Thu Oct 7 07:28:42 CEST 2004

At 12:59 AM 10/7/04 -0400, James Y Knight wrote:
>On Oct 5, 2004, at 2:37 AM, Phillip J. Eby wrote:
>>Although you probably want something more like a pipe error if the input 
>>times out or the connection is broken.
>
>You normally only get pipe errors on writes, read just sees EOF.
>
>But that does bring up a good point: How does the server notify the 
>application that the client has gone away, and any further work is useless?
>- For non-async apps that use the iterator model: I think the server is 
>allowed to just call iterable.close() and never iterate again.

Yes.
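
A minimal sketch of that server-side behavior, assuming a modern Python and a server that can poll whether the client is still connected. The names send_response, client_connected, send and Body are hypothetical, not part of the spec:

```python
def send_response(iterable, client_connected, send):
    """Hypothetical server loop: stop iterating once the client is gone,
    then call close() on the iterable if it has one."""
    try:
        for chunk in iterable:
            if not client_connected():
                break  # further work is useless; never iterate again
            send(chunk)
    finally:
        close = getattr(iterable, "close", None)
        if close is not None:
            close()


class Body:
    """Hypothetical application iterable that records whether it was closed."""

    def __init__(self, chunks):
        self.chunks = chunks
        self.closed = False

    def __iter__(self):
        return iter(self.chunks)

    def close(self):
        self.closed = True
```

The close() call sits in a finally clause so that the application gets a chance to release resources whether the response finished, was abandoned, or failed.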

>- For async applications, with the proposed API, that may not be an 
>option, because the iterable returned is the special wrapper, not a 
>user-created class. Although, actually, I guess the app can return its own 
>iterable whose __iter__ calls through and returns the wrapper's __iter__.

Not if the server wants to be able to handle that iterable specially.  But 
anyway, it seems that the wrapper's constructor should take a close method, 
or have a way to set one.

>- What about for non-async applications that use the write callable? 
>Should write be allowed to raise an exception? Or should it just become a 
>no-op when the client is disconnected?

It's allowed to raise an exception, though this was never explicitly put in 
the spec; I'll have to fix that.  The actual process for that scenario 
looks something like this:

    * app calls write()
    * write() raises error
    * app catches error (maybe) and calls start_response() with exc_info
    * start_response() reraises the error, because it has already sent
      headers to the client and can't restart the response
    * application error handler bombs out and returns to server/gateway
    * server/gateway logs the exception (maybe) and gets on with life in
      the big 'net
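
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not a real server: ClientGone, make_start_response, run and the state dictionary are all hypothetical, and the "server" only pretends to send anything.

```python
import sys


class ClientGone(IOError):
    """Hypothetical error a server might raise from write()."""


def make_start_response(state):
    def write(data):
        # The first write is what pushes the headers out, so from the
        # server's point of view the response can no longer be restarted.
        state["headers_sent"] = True
        if not state["connected"]:
            raise ClientGone("client disconnected")
        state["sent"].append(data)

    def start_response(status, headers, exc_info=None):
        if exc_info is not None and state["headers_sent"]:
            # Too late to replace the response: re-raise the original error.
            raise exc_info[1].with_traceback(exc_info[2])
        state["status"] = status
        return write

    return start_response


def app(environ, start_response):
    write = start_response("200 OK", [("Content-Type", "text/plain")])
    try:
        write(b"hello")
    except Exception:
        # The app tries to report the error; start_response() re-raises it.
        start_response("500 Internal Server Error",
                       [("Content-Type", "text/plain")], sys.exc_info())
        return [b"error"]
    return []


def run(app, state, log):
    """Hypothetical server/gateway: log the exception and carry on."""
    try:
        for chunk in app({}, make_start_response(state)):
            state["sent"].append(chunk)
    except Exception as exc:
        log.append(exc)
```

With state = {"connected": False, "headers_sent": False, "sent": []}, calling run(app, state, log) leaves the ClientGone instance in log and nothing in state["sent"].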

>Hmm, yes. I totally missed the option of just yielding ''. Of course it's 
>a very bad idea to repeatedly yield '' to a server if you don't know the 
>server can properly handle it (by e.g. delaying longer and longer), but, 
>in this case, since the server itself is providing the special iterable, 
>that should be fine.

Yes.  Also, when we finally settle on an async API, I do want to cover the 
issue of backing off iteration when empty strings are yielded.  I'm 
actually inclined to suggest that an async application should take 
responsibility for doing the delaying if it's called repeatedly, and the 
async API isn't available.
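
One way an application might do that delaying itself is sketched below. It assumes a hypothetical poll() callable that returns bytes when data is ready, None when nothing is ready yet, and b"" when the response is complete; polling_body and its delay parameters are illustrative names only.

```python
import time


def polling_body(poll, sleep=time.sleep, first_delay=0.01, max_delay=1.0):
    """Yield b"" while no data is ready, sleeping longer between polls.

    poll() is a hypothetical callable: bytes when data is ready, None when
    it is not ready yet, b"" when the response is complete.
    """
    delay = first_delay
    while True:
        data = poll()
        if data is None:
            sleep(delay)                       # the app does the delaying
            delay = min(delay * 2, max_delay)  # back off, up to a ceiling
            yield b""
        elif data == b"":
            return
        else:
            delay = first_delay                # data arrived: reset the delay
            yield data
```

The sleep blocks the calling thread, which is only acceptable on a server that has no async API to offer in the first place.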

>It seems like it should be possible to make a generic class that 
>implements this async API for use with sync servers that do not support it 
>natively. That would allow async apps to run on a sync server without 
>modification, which is potentially useful. To do that, though, I think 
>it'd have to spawn an extra thread per request that is waiting to read 
>data, for the read() call to block on. Unless, of course, the app never 
>needs to yield outgoing data while waiting for incoming data.

Well, with Twisted you could deferToThread the read() operations, though 
it's hard for me to think straight about that scenario because I keep 
finding it hard to imagine an async web app that isn't just written to the 
Twisted API to start with... ;)
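
For illustration only, here is the same idea using the standard library rather than Twisted. It is a stand-in for the deferToThread approach, not Twisted's API: a worker thread performs the blocking read() while the generator keeps yielding empty strings. The names read_in_thread and echo_body are hypothetical.

```python
import io
from concurrent.futures import ThreadPoolExecutor


def read_in_thread(executor, stream, size):
    """Start a blocking read on a worker thread; return a Future for the data."""
    return executor.submit(stream.read, size)


def echo_body(stream, size, executor):
    """Hypothetical app body: echo the request input without blocking iteration."""
    future = read_in_thread(executor, stream, size)
    while not future.done():
        yield b""          # nothing to send yet; hand control back to the server
    yield future.result()


# Usage: io.BytesIO stands in for environ['wsgi.input'].
with ThreadPoolExecutor(max_workers=1) as pool:
    chunks = [c for c in echo_body(io.BytesIO(b"payload"), 7, pool) if c]
```

As written the loop yields empty strings as fast as the server iterates, so a real version would want some form of backing off.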

>The one remaining issue I have is the required thread-safeness of various 
>APIs.
>
>The spec doesn't mention much of anything about threadsafeness: is it ok 
>to call wsgi methods from a different thread than the one the server 
>originally called the request on? Especially interesting for implementing 
>the above sync->async adapter: environ['wsgi.input'].read(x) would be 
>called from a second thread.

Excellent question; I should add the answer to the spec, as soon as I 
decide precisely what it is. :)

One point: the spec should absolutely forbid servers from using thread 
identity to identify the application/caller.  The "what can you call while 
what else is executing" part of the question is a bit trickier.

>What thread (if there's a choice) does the on_get callback get called on? Etc.

My inclination is to make threading issues symmetrical.  That is, the 
application doesn't get any thread-identity guarantees either.

>  I haven't really thought about these thready questions much either, so 
> maybe the answers are obvious, but in my experience, that's usually not 
> the case when it comes to threads.

Yep.  :)  However, the more I think about it, the more it seems to me that 
WSGI should emulate single-threadedness with respect to any 
function/method/iterator invocations associated with a given application 
invocation.  However, it is *not* guaranteed that all such invocations will 
occur from the same thread.

Basically, it means "no multitasking with the other guy's objects", and 
puts the locking burdens on whoever's trying to mix multitasking into the 
works.
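
A sketch of what taking on that locking burden might look like, assuming whoever introduces the extra threads wraps the other side's objects before sharing them. Serialized is a hypothetical helper, not something the spec defines:

```python
import io
import threading


class Serialized:
    """Hypothetical wrapper: calls into the wrapped object never overlap,
    although they may arrive from different threads."""

    def __init__(self, target, lock=None):
        self._target = target
        self._lock = lock or threading.Lock()

    def __getattr__(self, name):
        # Only reached for attributes not found on the wrapper itself.
        attr = getattr(self._target, name)
        if not callable(attr):
            return attr

        def locked(*args, **kwargs):
            with self._lock:
                return attr(*args, **kwargs)

        return locked


# Usage: io.BytesIO stands in for environ['wsgi.input'].
stream = Serialized(io.BytesIO(b"abcdef"))
```

Passing the same lock to several wrappers would serialize every object belonging to one application invocation, which is closer to the "emulate single-threadedness" idea than a lock per object.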

>That's why async apps are nice. ;)

Not to mention fork().  :)

By the way, after all this discussion...  do you think it would be better to:

1) Push towards a full async API, nailing down all these loose ends

2) Use the simple-but-kludgy "pause iteration" API idea

3) Don't make an "official" async API, and just leave it open to server 
authors to create their own extensions, and maybe cherry pick the best 
ideas for WSGI 2.0, or

4) Do something else altogether?