[Web-SIG] Reviewing WSGI open issues, again...
Alan Kennedy
py-web-sig at xhaus.com
Thu Sep 9 18:01:51 CEST 2004
[Phillip J. Eby]
> * File-like objects -- I think anything we offer for file-like objects
> should be optional. The big question is whether to offer a single,
> introspection-based extension for all file-like things, or whether to
> use separate extensions for different sorts of things, like
> 'wsgi.fd_wrapper' for file descriptors and 'wsgi.nio_wrapper' for Java
> NIO objects, etc. Does anybody have any arguments/use cases one way
> or the other?
Optionality is fine by me.
But I don't see what reason there might be to have separate
class names per platform.
It's always been my understanding that the intention for this capability
is so that applications can give "hints", to servers that support
high-performance methods of file transmission, that the resource being
returned is a candidate for bulk transfer. So, as an application author,
I'll surely want that hinting process to work on as many servers as
possible, regardless of the platform.
So, if there is a choice of multiple such hinting processes, and I have
to look for each one of them at runtime, my code is longer and less
efficient than it could be, e.g.
def app_object(environ, start_response):
    start_response('200 AuQuay', [('content-type', 'x-humungous-pdf')])
    result = open('humungous.pdf')
    for cname in ['fd', 'nio', 'dotnet', 'stackless', 'pypy', 'smalltalk']:
        try:
            return environ['wsgi.%s_wrapper' % cname](result)
        except KeyError:
            pass
    return result
Instead, if a single class is used, the definition of which is different
per server, then I have only to look at that one class.
def app_object(environ, start_response):
    start_response('200 AuQuay', [('content-type', 'x-humungous-pdf')])
    result = open('humungous.pdf')
    if environ.has_key('wsgi.file_wrapper'):
        return environ['wsgi.file_wrapper'](result)
    return result
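To make the single-class approach concrete, here is a sketch of what a server might publish under that key. Everything here is an assumption for illustration, not anything specified so far: the class name, the `blksize` parameter, and the idea that the server introspects for a `fileno()` method to detect a bulk-transfer opportunity, falling back to plain chunked iteration otherwise.

```python
class FileWrapper:
    """A minimal wrapper a server could publish as
    environ['wsgi.file_wrapper'] (key name assumed).

    After the application returns, the server checks whether the
    returned iterable is an instance of this class; if the wrapped
    object also exposes a real descriptor via fileno(), the server
    can hand that descriptor to an OS-level bulk transfer such as
    sendfile().  Otherwise it simply iterates, sending fixed-size
    chunks."""

    def __init__(self, filelike, blksize=8192):
        self.filelike = filelike
        self.blksize = blksize
        if hasattr(filelike, 'fileno'):
            # Expose the descriptor so the server can spot the
            # bulk-transfer opportunity by introspection.
            self.fileno = filelike.fileno

    def __iter__(self):
        return self

    def __next__(self):
        data = self.filelike.read(self.blksize)
        if data:
            return data
        raise StopIteration

    next = __next__  # older iterator-protocol spelling
```

The point is that the application never touches the platform-specific transfer path; it only wraps and returns, and the hint degrades gracefully to ordinary iteration on servers with no fast path.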
One reason I can see for having multiple classes is if they really
represent fundamentally different concepts.
For example, there are possibly more types of optimisations available,
e.g. return a stream of bytes from a shared memory partition, if the
platform supported DMA access to that shared memory, which would then be
bulk-transferable, i.e. bypassing the CPU. Since shared memory is a
concept whose implementation varies subtly between platforms, should we
be trying to abstract that concept into one class with a single
interface, whose implementation differs between platforms, or into
separate classes, one for each platform?
What about an optimised transfer from an RDBMS, say a BLOB stored in a
database row? Should that be wrapped with a file_wrapper (because it's
really coming from a file descriptor?), or with a special
db_blob_wrapper class? Would these db_blob_wrappers differ between
different database platforms? Because it is quite possible that the
RDBMS data is also coming through the network subsystem, this bulk
transfer could potentially be arranged at the network level, conceivably
on a sophisticated network-card/router/etc, and thus never even reach
the bus on the serving machine. OK, that's a bit wild and unlikely :-),
but I'm just trying to foresee as many scenarios for bulk transfers as I
can, to see if the proposed WSGI model fits.
I suppose it's about recording enough meta-information for the server to
recognise such optimisable scenarios. So the question has to be asked:
how portable do we need these optimisations to be between servers? Is
medusa likely to have its middleware component dedicated to sendfile,
for example? And twisted its own, thread-pool-based, implementation? In
which case portability of, say, the sendfile optimisation becomes an
issue of server configuration, not of support classes.
Or might it be that we need to facilitate the application at two levels
in the server? Take the example of shared memory:
1. In the middleware stack, a component maps a certain URL space into
the shared memory partition, and returns a specialised wrapper class
that contains a shared memory reference, i.e. a handle, start/end/len, etc.
2. The application also needs to plug into the server, below the
middleware stack, so that it can implement the actual bulk transfer from
the shared memory (assuming that the shared_memory_wrapper wasn't
obscured by some component below it in the stack). Since shared memory
support, and probably DMA support, would vary between platform, this is
where the platform specific element comes in: there would be different
versions of that "server plug-in" for different platforms/servers.
Lastly, I should also point out that, with the current jython I/O
subsystem, the sendfile/transferTo optimisation is not currently
possible, inside most existing J2EE containers anyway. This is because
sockets created using the old java.net APIs do not by default have
java.nio channels associated with them. Most existing J2EE containers, which
must support blocking servlets by definition, don't bother to handle
sockets using java.nio, because it's more work, not necessary, and not
portable to older versions of the platform. So it's not possible to use
the sockets they create for bulk transfers.
A container could be redesigned to use the java.nio APIs, completely in
a blocking fashion, if desired. Which still wouldn't be any use in
existing jython, because jython's current socket modules are entirely
based on old java.net classes. Which means that jython code couldn't
access the channel nature of the sockets, even if those sockets
supported it, without modification of the standard library.
I have a (~60% complete) side-project to develop asynchronous socket
support on jython 2.1, by porting the socket, select and (maybe)
asyncore modules to java.nio. When that is complete (timescale==months,
v busy), I hope to see experimentation, from myself and others, on
running python asynchronous models on jython.
Here is what the jython file_wrapper code might look like:
class jython_file_wrapper:

    def __init__(self, wrapped):
        self.wrapped = wrapped

    def sendfile(self, jynio_socket):
        if hasattr(self.wrapped, 'getChannel'):
            # FileChannel.transferTo(position, count, target)
            channel = self.wrapped.getChannel()
            channel.transferTo(0, channel.size(), jynio_socket)
        else:
            self.send_in_chunks_instead(jynio_socket)
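The send_in_chunks_instead fallback above would presumably just be a plain read/send loop. A pure-python sketch, with the function name, the blksize parameter, and the assumption of a sendall-style socket interface all invented for illustration:

```python
def send_in_chunks(filelike, sock, blksize=8192):
    """Fallback bulk transfer: read the source in fixed-size chunks
    and push each chunk to the socket.  This is the path taken when
    no channel/descriptor is available for an OS-level transfer."""
    while 1:
        chunk = filelike.read(blksize)
        if not chunk:
            break
        sock.sendall(chunk)  # assumes a sendall()-style socket API
```

No platform dependency here at all, which is exactly why it makes a safe default beneath any of the faster, platform-specific paths.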
Regards,
Alan.