[Twisted-web] Re: [Web-SIG] A more Twisted approach to async apps in WSGI

Thu Oct 7 06:59:47 CEST 2004

On Oct 5, 2004, at 2:37 AM, Phillip J. Eby wrote:
> Although you probably want something more like a pipe error if the 
> input times out or the connection is broken.

You normally only get pipe errors on writes; a read just sees EOF.

But that does bring up a good point: How does the server notify the 
application that the client has gone away, and any further work is 
useless?
- For non-async apps that use the iterator model: I think the server is 
allowed to just call iterable.close() and never iterate again.
- For async applications, with the proposed API, that may not be an 
option, because the iterable returned is the special wrapper, not a 
user-created class. Although, actually, I guess the app can return its 
own iterable whose __iter__ calls through and returns the wrapper's 
__iter__.
- What about for non-async applications that use the write callable? 
Should write be allowed to raise an exception? Or should it just become 
a no-op when the client is disconnected?
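
A minimal sketch of the first two cases, in Python 3 spelling. Everything
named here (serve_response, client_connected, send, CountingApp,
DelegatingIterable) is hypothetical and only for illustration; the spec
does not define any of it. The server loop stops iterating as soon as
the client is gone and then calls close() if the iterable has one. The
second class shows an app-owned iterable whose __iter__ returns the
wrapper's iterator, so the app still gets its own close().

```python
def serve_response(app_iterable, send, client_connected):
    """Hypothetical server loop: stop iterating once the client is gone,
    then call close() on the iterable if it provides one."""
    try:
        for chunk in app_iterable:
            if not client_connected():
                break  # further work is useless; never iterate again
            if chunk:
                send(chunk)
    finally:
        close = getattr(app_iterable, "close", None)
        if close is not None:
            close()


class CountingApp:
    """Toy application iterable that records whether close() was called."""

    def __init__(self, n):
        self.n = n
        self.produced = 0
        self.closed = False

    def __iter__(self):
        return self

    def __next__(self):
        if self.produced >= self.n:
            raise StopIteration
        self.produced += 1
        return b"chunk"

    def close(self):
        self.closed = True


class DelegatingIterable:
    """App-owned iterable: __iter__ hands back the wrapper's iterator,
    while close() stays under the application's control."""

    def __init__(self, wrapper, on_close):
        self._wrapper = wrapper
        self._on_close = on_close

    def __iter__(self):
        return iter(self._wrapper)

    def close(self):
        self._on_close()
```

Whether write() should raise or silently become a no-op is left open
here; the sketch only covers the iterator path.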

>>  and on_get seems like a fairly usable API for input. It doesn't 
>> let you pause the incoming data,
>
> Actually it does; it's supposed to be a one-shot.  You have to call it 
> again if you want to get called back again.

Ah, didn't see that it was one-shot. Yeah, in that case, the server can 
stop reading if there is no registered data callback and some 
predetermined buffer size is filled. Nice.

>>  If the input stream was iterable, an on_get callback could just be 
>> considered notice that you can iterate the input stream once without 
>> blocking, assuming the block boundary requirements were also in 
>> effect here.
>
> Yes, but this'd only work if the input were an iterator.  input.read() 
> returning an empty string would mean EOF, so the boundary stuff 
> doesn't work in that case.

Right -- just pointing out one plus to the iterator model. :)

>>  This means the .put/.next methods should communicate out-of-band, 
>> effectively calling pause/resume functions in the server so it knows 
>> when it's safe to iterate the vanilla iterator the middleware 
>> returned without the middleware blocking when calling the 
>> asyncwrapper-iterator.
>
> It could do that, certainly.  But, the truth is it's *always* safe to 
> iterate.  Note that the application can just use the on_get callback 
> to set a flag that it's ready to continue, and just keep yielding 
> empty strings till then.
>
> More to the point, the iterator-wrapper can simply yield empty strings 
> when its internal queue is empty, and a sensible async server should 
> back off its iterator.next() retry attempts when an application yields 
> empty strings.  This is pretty much always safe and sensible.
>
> However, the out-of-band communication you describe can also take 
> place, since it provides better communication in the case where the 
> extension is available.

Hmm, yes. I totally missed the option of just yielding ''. Of course 
it's a very bad idea to repeatedly yield '' to a server if you don't 
know the server can properly handle it (by e.g. delaying longer and 
longer), but, in this case, since the server itself is providing the 
special iterable, that should be fine.
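
Roughly, the two halves of that idea could look like the sketch below.
QueueIterable stands in for the server-provided wrapper and next_delay
for the server's retry policy; both names, the put/finish methods, and
the base/ceiling values are assumptions made up for this example, not
part of the proposed API.

```python
from collections import deque


class QueueIterable:
    """Hypothetical stand-in for the server-provided wrapper: it never
    blocks, and yields an empty string when nothing is queued."""

    def __init__(self):
        self._queue = deque()
        self._finished = False

    def put(self, data):
        self._queue.append(data)

    def finish(self):
        self._finished = True

    def __iter__(self):
        return self

    def __next__(self):
        if self._queue:
            return self._queue.popleft()
        if self._finished:
            raise StopIteration
        return b""  # nothing ready yet; tell the server to try later


def next_delay(current, chunk, base=0.01, ceiling=1.0):
    """Hypothetical backoff policy: reset to base after real data,
    double the delay (up to ceiling) after an empty string."""
    if chunk:
        return base
    return min(current * 2, ceiling)
```

A server would wait next_delay(...) seconds before calling next() again,
so an idle application costs progressively fewer wakeups.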

It seems like it should be possible to make a generic class that 
implements this async API for use with sync servers that do not support 
it natively. That would allow async apps to run on a sync server 
without modification, which is potentially useful. To do that, though, 
I think it'd have to spawn an extra thread per request that is 
waiting to read data, for the read() call to block on. Unless, of 
course, the app never needs to yield outgoing data while waiting for 
incoming data.
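
A rough sketch of such an adapter, under stated assumptions: the class
name ThreadedInputAdapter, the block_size parameter, and the exact shape
of on_get (a one-shot callback that receives the next block, with an
empty string meaning EOF) are all hypothetical. It only illustrates the
extra thread per request that blocks in read().

```python
import io
import threading


class ThreadedInputAdapter:
    """Hypothetical adapter: a per-request thread blocks in read() so
    that the application itself never has to."""

    def __init__(self, blocking_input, block_size=8192):
        self._input = blocking_input
        self._block_size = block_size
        self._cond = threading.Condition()
        self._callback = None
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def on_get(self, callback):
        """Register a one-shot callback for the next block of input."""
        with self._cond:
            self._callback = callback
            self._cond.notify()

    def _run(self):
        while True:
            # Only read while a callback is registered; otherwise pause.
            with self._cond:
                while self._callback is None:
                    self._cond.wait()
                callback, self._callback = self._callback, None
            data = self._input.read(self._block_size)
            callback(data)  # note: runs on the reader thread
            if not data:
                return  # EOF: the reader thread exits
```

Note that the callback fires on the reader thread and that read() is
called off the thread the server started the request on, which is
exactly where the thread-safety questions come in.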

The one remaining issue I have is the required thread-safeness of 
various APIs.

The spec says almost nothing about thread safety: is it OK to call WSGI 
methods from a different thread than the one the server originally 
called the request on? This matters especially for the sync->async 
adapter above, where environ['wsgi.input'].read(x) would be called 
from a second thread.

On what thread (if there's a choice) does the on_get callback get 
called? And so on. I haven't really thought about these thready 
questions much either, so maybe the answers are obvious, but in my 
experience, that's usually not the case when it comes to threads. 
That's why async apps are nice. ;)

James