From pje at telecommunity.com Tue Oct 7 01:15:49 2014
From: pje at telecommunity.com (PJ Eby)
Date: Mon, 6 Oct 2014 19:15:49 -0400
Subject: [Web-SIG] Draft 2: WSGI Response Upgrade Bridging
Message-ID:

Based on last week's public and private feedback on my "native server APIs" pre-PEP, I've done an almost complete rewrite of the previous draft, in order to provide *concrete examples* of the proposal in use, along with code samples for Django and WebOb, as well as both HTTP/2 and Websockets.

The rationale has also been overhauled, and there is a new "Next Steps" section, plus a new discussion of how .close() is affected. In addition, there is a new explanation and example of how to use this proposal to build future standardized APIs atop currently-available native APIs, using straightforward middleware.

As before, you can find a "living" HTML version of the draft in progress at:

https://gist.github.com/pjeby/62e3892cd75257518eb0

(In addition to nice formatting, it also has a clickable table of contents.)

After the next round of feedback, I plan to convert this to reST and get a PEP number assigned -- assuming nobody comes up with a killer problem that sends me back to the drawing board, of course. ;-)

# WSGI Response Upgrade Bridging

### Contents

* Overview
    * The Problems
    * The Proposed Solution
    * Example Usage Scenarios
        * Example 1: HTTP/2 Response Pushing from inside Django
        * Example 2: Websocket Chat from inside a WebOb-based Framework
    * Proposal Scope
* Specification
    * Providing an API
        * Response Key Details
        * Closing and Resource Management
    * Accessing an API
    * Intercepting, Disabling, or Upgrading API Bridges
* Next Steps
    * Open Questions and Issues
    * Notes on the Current Design Rationale
* Acknowledgements
* References
* Copyright

# Overview

## The Problems

Current Python web frameworks and applications are built mostly on WSGI: a request/response API based on HTTP/0.9's simple request-per-connection model.
Web libraries and frameworks offer a wide variety of services for request routing, session management, authentication and authorization, etc., based on this model and working with WSGI. Modern web protocols, however, including Websockets, HTTP/2, SPDY, and so on, are based on a more sophisticated communication model that *doesn't* fit very well within WSGI. (For that matter, WSGI doesn't play well with Twisted or `asyncio`-style asynchronous APIs, either.) Other web API standards have been proposed or are in the process of being developed, but they are between a rock and a hard place, in that if they aim for compatibility with WSGI, then it is harder to provide new features, but if they focus on providing new features, then compatibility with existing frameworks, middleware, etc. is limited. At the same time, server developers are stuck in something of a holding pattern. Their servers may have (or want to add) new features, but what if they invest in a proposed API that doesn't pan out? Conversely, what if they get stuck needing to support multiple APIs? Meanwhile, application developers face their own dilemma as well: since existing Websocket and HTTP/2 APIs cannot be easily (and compatibly) accessed from within WSGI, they are unable to use their application and frameworks' existing code for managing routing, sessions, authentication, authorization, etc., when making use of either Websockets or HTTP/2. Instead, they must duplicate code, or else use sideband communications (e.g. via redis) to link between a server with the needed API and the main code of their application. But what if we could cut through *all three* of these dilemmas, in a way that would let us have our existing framework "cake", and get to "eat" our advanced protocols, too? 
## The Proposed Solution Since the majority of existing WSGI framework and middleware tools deal mainly with the WSGI *request*, what if we could keep using WSGI to handle our requests, but use a *different* API for the responses? In fact, what if we could use that different API, *only* for the responses that actually needed it, on a request-by-request basis? That way, for example, we could still use our existing middleware or framework code to make sure that a session has been established, authentication and authorization have been handled, and so on. Then, our existing framework and app code could send their existing login redirects and error pages. But, once everything is logged in and ready to go, we could finally switch over to that other API, to send the *real* response -- and still have access to our user objects, routing parameters, etc., within that other API. And this "real" response wouldn't have to be a single HTTP response, either. It could be a handler of some kind, sending or receiving packets of information via websockets, HTTP/2 push, the `asyncio` API, or whatever other specialized response APIs are available in the WSGI environment. What's more, if we could ask for these "other APIs" by *name*, then we could begin using these other APIs today, right now... *and* still define standardized Python APIs for these features later. And, developers of these other APIs wouldn't have to convince people to switch away from WSGI, nor struggle to come up with clever ways to "tunnel" their APIs through WSGI in a compatible way. Therefore, this PEP proposes a mechanism akin to HTTP's `Upgrade:` process, to allow an existing web framework and/or middleware to handle the initial incoming HTTP request and select an application/controller/view/etc., invoking it with information obtained from the request. 
Then, when it's time to respond to the request, the running application can choose to upgrade or "bridge" to using a more advanced API to handle the response (and possibly continue to manage an ongoing connection, depending on the nature of the protocols involved).

(But, if the request doesn't *need* any special handling, the application can simply issue a standard WSGI response, however it currently does that. So only the parts of an application that *need* this special handling ever have to use it.)

## Example Usage Scenarios

Below are two code samples, showing different use cases, different frameworks, and different "upgraded" APIs. In each case, there is an outer piece of framework-specific code (the **request handler**), and an inner piece of non-framework, API-specific code (the **response handler**).

To link the two handlers, a small bit of bridging code (shown in these examples as `request.upgrade_to()`) is used to request a desired API by name, register the response handler, and return a **bridging response**: a special WSGI response that tells the server to invoke the response handler in its place.

Please note, however, that these are *use case illustrations* only. This proposal does not specify *any* of the APIs shown in these examples, including the `request.upgrade_to()` method itself!

Also, depending on the framework and API involved, the request and response handlers could be functions, methods, instances, classes, or something else altogether. A framework might not provide an `upgrade_to()` API of its own (or might spell it differently), and an application developer always has the option of creating their own version of it as a utility function. (An example implementation will also be shown later in this spec.)

### Example 1: HTTP/2 Response Pushing from inside Django

    def main_view(request):

        def http2_handler(server):
            server.push(path='/css/myApp.css', ...)
            server.push(path='/js/myApp.js', ...)
            server.send_response(status=200,
                headers = [('content-type', 'text/plain')],
                body='Hello world!'.encode('ascii'))

        return request.upgrade_to('http2', http2_handler)

This example shows a relatively simple use case: adding pushed files to an HTTP response. The assumption here is that any routing, authentication, etc. have been handled by Django by the time the above code runs, and so it just needs to send a response using some non-WSGI/non-Django API: a hypothetical API named `http2`.

The hypothetical `request.upgrade_to(api_name, *args, **kw)` method takes a desired API name, looks it up in the WSGI environment, and invokes it to create a **bridging response**: a special response that tells the WSGI server to use the registered response handler to perform the response, bypassing any middleware that doesn't alter or replace this response.

(Again, please note that the actual `http2` API shown is a purely hypothetical illustration, loosely based on the [nghttp2] API; this proposal only covers the *behavior* of `request.upgrade_to()`, and not its existence or spelling, let alone the behavior of Django or `nghttp2`.)

### Example 2: Websocket Chat from inside a WebOb-based Framework

This next example is more complex, demonstrating how response upgrade bridging can be used to switch to a "conversational" or packet-oriented protocol such as Websockets:

    @someframework.route('/chat/:room_id')  # route to the request handler
    def chat(self, request, room_id):
        # code here looks up room, user, etc.
        # can redirect to login/registration
        # validate room existence, etc.
        # using the web framework's request and other tools
        ...

        # Ready to chat?  Define a handler for the websocket API:
        def websocket_handler(sock):
            # code here has access to request/room
            # *plus* whatever it gets passed by the websocket API
            sock.send("Welcome to the %s room, %s" % (room.name, user.name))
            room.sockets[user.name] = sock

            def sendall(msg):
                data = msg.encode('utf8')
                for s in room.sockets.values():
                    s.send(data)

            sendall("%s has entered the chat room" % user.name)

            @sock.on_receive
            def receive_handler(data):
                # the two values must be passed to % as a tuple
                sendall("%s: %s" % (user.name, data.decode('utf8')))

            @sock.on_close
            def close_handler():
                if room.sockets.get(user.name) is sock:
                    del room.sockets[user.name]
                # etc...

        return request.upgrade_to('websockets', websocket_handler)

Again, note that this `websockets` API is purely hypothetical; the point of this illustration is merely to show that response-upgrade bridging isn't limited to synchronous control flow or a single request-response pair. Upgraded response APIs can be event driven, callback-based, generator-oriented, or almost anything at all.

So, both of these examples show:

1. An outer function, used as a **request handler**
2. An inner function, used as a **response handler**, and
3. A `request.upgrade_to()` function, used to register the response handler and generate a **bridging response**

Please note again, however, that *none* of these three parts has to be implemented in the ways shown above. The request handler could have been a class, instance, or method, depending on the web framework in use, and the same is true for the response handler, depending on the API being bridged to. (And, as previously mentioned, `request.upgrade_to()` is a short bit of glue code that can be written by hand.)

## Proposal Scope

Goals of this proposal include:

1. Defining a way for WSGI applications, at runtime (i.e., during the execution of a request), to detect the existence of, and access, upgraded non-WSGI server APIs which can be used in place of WSGI for either effecting a response to the current request, or initiating a more advanced communications protocol (such as websocket connections, associated content pushing, etc.) as an upgrade to the current request.

2. Defining ways for WSGI middleware to:

    1. Continue to be used for request routing and other pre-response activities for all requests, as well as post-response activities for requests that do not require bridged API access
    2. Intercept and assume control of any bridged APIs to be used by wrapped applications or subrequests (assuming the middleware knows how to do this for a specific bridged API, and desires to do so)
    3. Disable any or even *all* bridged API access by its wrapped apps -- even without prior knowledge of *which* APIs might be used -- in the event that the middleware can only perform its intended function by denying such access

3. Defining a way for WSGI servers to negotiate a smooth transition of response handling between standard WSGI and their native API, while safely detecting whether intervening middleware has taken over or altered the response in a way that conflicts with elevating the current request to native API processing

Non-goals include:

* Actually defining any specification for the bridged APIs themselves ;-)

# Specification

The basic idea of this specification is to add a dictionary to the WSGI environment, under the key `wsgi.upgrades`. Within this dictionary, a single ASCII string key is allocated for each non-WSGI API offered by the server (or implemented via middleware).

So, for example, if Twisted were to offer an upgrade bridge, it might register a `twisted` key within the `wsgi.upgrades` dictionary.
And if uWSGI were to offer a websocket API bridge, it might register a `uwsgi.websocket` key (perhaps conditionally on whether the current request included a websocket upgrade header), and so on.

The registered key in the `wsgi.upgrades` dictionary MUST be an ASCII string containing a dot-separated sequence of one or more valid Python identifiers. (So, `http2` and `http.v2` are valid API keys, but `http.2` and `http/2` are NOT.)

The registered value, on the other hand, is a callable used to create a bridge between a web application's request handler, and a handler for the upgraded (non-WSGI, non-web framework) API.

## Providing an API

The implementation of an upgrade bridge consists of a callable object, looking something like this pseudocode:

    def some_api_bridge(environ, start_response, XXX...):
        response_key = new_unique_header_compatible_string()
        current_request.response_registry[response_key] = XXX...
        start_response('399 WSGI-Bridge: '+response_key, [
            ('Content-Type', 'application/x-wsgi-bridge; id='+response_key),
            ('Content-Length', str(len(response_key)))
        ])
        return [response_key]

    environ.setdefault('wsgi.upgrades',{})['some_api'] = some_api_bridge

As you can see, this is a little bit like a WSGI application -- and in fact it *is* a valid WSGI application, except that one or more positional or keyword arguments (shown here as `XXX...`) are included after the standard WSGI ones, to specify details of the desired response handler.

Depending on the needs of the API, these arguments could be a single "handler" callback, or they could be multiple objects, callbacks, or configuration values. The upgrade bridge's job is simply to generate a unique ASCII "native string" key to be used in the bridging response as a substitute for these additional arguments, and to register these arguments under that key for future use by the server.

Finally, the bridge sends a WSGI response as shown above, with the status, headers, and body all containing the generated response key.
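To make the pseudocode above a little more concrete, here is a minimal runnable sketch of a bridge for a hypothetical API called `demo_api`. The names `is_valid_api_name`, `make_bridge`, and the per-request `registry` dictionary are assumptions made purely for illustration, as is the `name-counter` key format; none of them is defined by this proposal. The body is emitted as bytes, since that is what PEP 3333 requires under Python 3.

```python
import itertools

# Process-wide counter used to keep keys unique (an assumption of this sketch).
_counter = itertools.count(1)


def is_valid_api_name(name):
    """True if name is ASCII and a dot-separated run of Python identifiers."""
    return name.isascii() and all(
        part.isidentifier() for part in name.split('.'))


def make_bridge(api_name, registry):
    """Build an upgrade bridge that stores handler arguments in registry."""
    if not is_valid_api_name(api_name):
        raise ValueError("invalid API name: %r" % (api_name,))

    def bridge(environ, start_response, *args, **kw):
        # API name plus a global counter; '-' cannot occur in an API name,
        # so keys issued for different APIs cannot collide.
        key = '%s-%d' % (api_name, next(_counter))
        registry[key] = (args, kw)
        start_response('399 WSGI-Bridge: ' + key, [
            ('Content-Type', 'application/x-wsgi-bridge; id=' + key),
            ('Content-Length', str(len(key))),
        ])
        return [key.encode('ascii')]

    return bridge


# Hypothetical server-side registration for one request:
registry = {}
environ = {}
environ.setdefault('wsgi.upgrades', {})['demo_api'] = make_bridge(
    'demo_api', registry)
```

A real server would keep one registry per request and would discard it once the response has been resolved.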
The server MUST NOT actually invoke or begin using the provided handler until *after* the standard WSGI response process has been completed, and it has verified that the response key is *still present* in all three parts of the WSGI response: the status, headers, and body. The continued presence of the response key is used to verify three things: 1. That the registered response handler is indeed a response to the original incoming request, and not merely a response to a subrequest created by middleware 2. That intervening middleware hasn't replaced the bridging response with a response of its own (for example, an error response created because of an error occurring after the bridged handler was registered, but before it was used) 3. *Which* response handler should be invoked, if more than one was registered So, a server providing an upgrade bridge MUST wait until it receives a WSGI response whose status, content-type, content-length, and body all unequivocally identify which of the response handlers registered for the current request should actually be used. In the event that the status, type, and body all match each other, the server MUST then activate the registered response handler for that key, allowing the current request (and possibly subsequent requests, depending on the API involved) to be handled via the associated API. (It also MUST discard any other registered response handlers for the current request.) In the event that neither the status nor headers designate a registered response handler, the server MUST treat the response as a standard WSGI response, and discard all registered response handlers for the current request. 
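The three outcomes this part of the specification describes (activate the handler, fall back to plain WSGI, or reject an inconsistent response) can be sketched as a single check. This is one possible reading of the rules, not a reference implementation: the function name `select_handler`, the `BridgeError` exception, and the assumption that the body arrives as a list of bytes are all hypothetical.

```python
class BridgeError(Exception):
    """The response resembles a bridging response but is inconsistent."""


STATUS_PREFIX = '399 WSGI-Bridge: '
TYPE_PREFIX = 'application/x-wsgi-bridge; id='


def select_handler(status, headers, body, registry):
    """Return the registered handler to activate, or None for plain WSGI.

    Raises BridgeError when status, headers, and body do not all name
    the same registered response key.
    """
    status_key = None
    if status.startswith(STATUS_PREFIX):
        status_key = status[len(STATUS_PREFIX):]

    header_map = {name.lower(): value for name, value in headers}
    ctype = header_map.get('content-type', '')
    type_key = ctype[len(TYPE_PREFIX):] if ctype.startswith(TYPE_PREFIX) else None

    if status_key is None and type_key is None:
        return None  # neither part designates a handler: ordinary response

    # From here on, anything short of full agreement is an error.
    body_key = b''.join(body).decode('ascii', 'replace')
    if (status_key != type_key
            or body_key != status_key
            or header_map.get('content-length') != str(len(status_key))
            or status_key not in registry):
        raise BridgeError("ambiguous or unregistered bridging response")
    return registry[status_key]
```

In every case the caller would then discard the other handlers registered for the request, as the text requires.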
In the event that the status and headers disagree on *which* handler is to be used (or *whether* one is to be used at all), or in the event that they *do* agree, but the body disagrees with them, or if all three agree but the supplied ID was not registered for this request or API, then the server MUST generate an error response, and discard both the WSGI response and any registered handlers. (In the face of ambiguity, refuse the temptation to guess; errors should not pass silently.)

### Response Key Details

The key used to distinguish responses MUST be an ASCII "native string" (as defined by PEP 3333). It SHOULD also be relatively short, and MUST contain only those characters that are valid in a MIME "token". (That is, it may contain any non-space, non-control ASCII character, except the special characters `(`, `)`, `<`, `>`, `@`, `,`, `;`, `:`, `\`, `"`, `/`, `[`, `]`, `?`, and `=`.)

Response keys generated for a given API MUST be unique for the duration of a given request, and MUST be generated in such a way so as not to collide with keys issued for any *other* API during the same request (e.g., by including the API's name in them).

Response keys SHOULD also be unique within the lifetime of the process that generates them, e.g. by including a global counter value.

(So, the simplest way of generating a response key that conforms to this spec is to just append a global counter to a string uniquely identifying the chosen API. However, there is nothing stopping a server from adding other information, such as a request ID or channel designator, as an aid to debugging. Just make sure there are no whitespace or special characters involved, as mentioned above.)

### Closing and Resource Management

Because the bridging response may have been wrapped by middleware -- e.g. session middleware that saves updated session data on `.close()`, database connection-pooling middleware that releases connections on `.close()`, etc.
-- the server MUST NOT invoke the WSGI response's `.close()` method (if any) before the new response handler is finished, in order to prevent premature resource release. If the response protocol implements something like websockets, or an extended HTTP/2 conversation, then the provided API SHOULD provide some way for the response handler to explicitly ensure that the response `.close()` method is called, at some point *before* the conversation is completed and the connection is closed. These two requirements exist because even if the response *content* is not altered by middleware, it is still possible for middleware to attach resource-release handlers to the WSGI response *object*. If these are not closed at all, or closed prematurely, it may cause problems with the underlying web framework. For example, some web frameworks offer a facility to tie database transaction scope to request scope, so that when a request is completely finished, the current transaction is automatically committed, and a database connection may be returned to a pool. A response handler might then be in the position of trying to use a connection that no longer "belonged" to it. In the simpler, more common case of a single response to a single request, deferring the `.close()` operation until the entire response is completed will help to preserve existing framework behavior and user expectations, so long as the framework is using a `.close()`-based mechanism to control these other features. Conversely, in the case where an extended conversation takes place, the user may wish to signal completion earlier, in order to avoid hanging on to unnecessary resources. Of course, if a framework uses some other mechanism to allocate its connections, scope its transactions, or do other resource management, then that may impose certain limitations on the user with respect to what framework features are still usable within a given response handler. 
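As a rough illustration of the two `.close()` requirements above, the following sketch defers closing the WSGI response until a synchronous response handler returns, while letting the handler release resources earlier if it wants to. The class `BridgedResponse` and its `release()` and `run()` methods are hypothetical names invented for this sketch. For an event-driven API, where the handler returns immediately after registering callbacks, a server would presumably call `release()` when the connection closes instead.

```python
class BridgedResponse:
    """Wraps a WSGI response iterable with a deferred, one-shot close()."""

    def __init__(self, wsgi_result):
        # Middleware may have attached a close() method; it is optional.
        self._close = getattr(wsgi_result, 'close', None)

    def release(self):
        """Call the WSGI response's close() once; later calls do nothing."""
        close, self._close = self._close, None
        if close is not None:
            close()

    def run(self, handler, *args):
        """Invoke the response handler, closing only after it finishes."""
        try:
            return handler(self, *args)
        finally:
            self.release()
```

Making `release()` idempotent means a handler that signals completion early does not cause a second `.close()` call when the handler eventually returns.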
Web frameworks supporting this spec MUST document what framework features will be unavailable from within a bridged API response handler (i.e. after the framework request handler returns a response), and SHOULD provide alternate ways to access those features from a response handler.

Further, a framework MAY intercept and wrap registered response handlers (for APIs whose control flow they understand) in order to transparently provide these features. (However, since this has to be done on an API-by-API basis, it's likely that most framework providers will only offer this interception feature for a few, community-standardized APIs. But they may -- and perhaps already do -- expose APIs that would let others do the necessary wrapping or interception themselves.)

## Accessing an API

Now that we have seen both the application and server sides of the bridging process, we can look at the bridge itself. Essentially, the bridging is done by:

1. Retrieving the appropriate upgrade bridge from the environ
2. Invoking that bridge as if it were a WSGI application, passing any extra arguments required by the specific bridged API (such as a handler)
3. Returning the bridge's WSGI response, as the WSGI response of the current app or framework.

Here's an example, using a pure WSGI app and no web framework:

    def my_wsgi_app(environ, start_response):
        foobar_api = environ.get('wsgi.upgrades', {}).get('foobar')
        if foobar_api is None:
            # appropriate error action here,
            # e.g. raise something, or return an error response
            raise RuntimeError("API unavailable")

        def my_foobar_handler(foobar_specific_arg, another_foobar_arg, *etc):
            # code here that uses the foobar API to do something cool
            ...

        # Delegate the WSGI response to the foobar API
        return foobar_api(environ, start_response, my_foobar_handler)

However, since most application code *isn't* pure WSGI and *does* use a framework, here's an example of how Django's `WSGIRequest` class might implement our previously-illustrated `request.upgrade_to()` method:

    def upgrade_to(self, api_name, *args, **kw):
        api_bridge = self.environ.get('wsgi.upgrades', {}).get(api_name)
        if api_bridge is None:
            raise RuntimeError("API unavailable")

        # Capture the bridging response as a Django response:
        response = StreamingHttpResponse()

        def start_response(status, headers):
            code, reason = status.split(' ', 1)
            response.status_code = int(code)
            response.reason_phrase = reason
            for h, v in headers:
                response[h] = v

        # Pass the handler arguments through to the bridge
        response.streaming_content = api_bridge(
            self.environ.copy(), start_response, *args, **kw
        )
        return response

And here's the `webob.Request` version of the same functionality (which is a lot simpler, since WebOb already provides a way to capture a WSGI app as a response):

    def upgrade_to(self, api_name, *args, **kw):
        api_bridge = self.environ.get('wsgi.upgrades', {}).get(api_name)
        if api_bridge is None:
            raise RuntimeError("API unavailable")
        return self.send(lambda env, s_r: api_bridge(env.copy(), s_r, *args, **kw))

Individual web frameworks can of course decide how best to expose this functionality to their users, whether via a request or response method, controller method, special object to return, exception to raise, or whatever other approach best suits their framework's API paradigm.

(And of course, as long as the framework provides access to the WSGI environ, and allows setting every aspect of the WSGI response, an application developer can implement their own variation of the above, without any extra assistance from the framework itself.)
## Intercepting, Disabling, or Upgrading API Bridges

Because all API upgrade bridges are contained in a single WSGI environment key, it is easy for WSGI middleware to disable access to them when creating subrequests, by simply deleting the entire `wsgi.upgrades` key before invoking an application.

Likewise, in the event that WSGI middleware wishes to disable one *specific* API, or intercept it, it can do so by removing or replacing the appropriate bridge in the upgrades dictionary.

Last, but far from least, WSGI middleware can add *new* bridges to the environment, though it should usually only do so if it implements the new bridge in terms of a bridge that already exists. (For example, to provide a standardized wrapper over a server's native API, or to emulate one server's API in terms of another server's API.)

These "middleware bridges" should work by delegating the actual bridging process to the base API, e.g.:

    def api_standardizing_middleware(app):

        def standard_api_bridge(environ, start_response, std_handler):
            def native_handler(...):
                # translate/wrap native args to std args, then pass them on
                std_handler(...)
            native_api = environ['wsgi.upgrades']['native_api']
            return native_api(environ, start_response, native_handler)

        def wrapped_app(environ, start_response):
            upgrades = environ.setdefault('wsgi.upgrades', {})
            if 'native_api' in upgrades:
                upgrades['standard_api'] = standard_api_bridge
            return app(environ, start_response)

        return wrapped_app

In this example, we show a piece of middleware that converts some server's native API (`native_api`) to some Python standard API (`standard_api`), if the required native API is available at request time.

It doesn't have to implement any other part of the bridging specification, since the server's native API bridge will register and invoke the native response handler (`native_handler`), which in turn will invoke the "standardized" handler (`std_handler`).
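The disabling cases described at the start of this section can be sketched in the same style. The middleware names `without_upgrades` and `without_api` are hypothetical; the sketch copies both the environ and the upgrades dictionary before editing them, on the assumption that the middleware does not want to affect what its caller sees.

```python
def without_upgrades(app):
    """Middleware: hide every upgrade bridge from the wrapped app."""
    def wrapped(environ, start_response):
        environ = dict(environ)              # leave the caller's environ intact
        environ.pop('wsgi.upgrades', None)   # drop the whole dictionary
        return app(environ, start_response)
    return wrapped


def without_api(app, api_name):
    """Middleware: hide one named bridge from the wrapped app."""
    def wrapped(environ, start_response):
        environ = dict(environ)
        # Copy before editing, so the original bridges remain available.
        upgrades = dict(environ.get('wsgi.upgrades', {}))
        upgrades.pop(api_name, None)
        environ['wsgi.upgrades'] = upgrades
        return app(environ, start_response)
    return wrapped
```

Intercepting a bridge follows the same pattern as `without_api`, except that the copied dictionary entry is replaced with a wrapper rather than removed.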
So, all the middleware needs to do is accept handler arguments for the API it wants to provide, and then register a linked handler with the native API. (Apart from the code shown above, everything else is just whatever is needed to implement the actual API translation.) This means that if a server exposes whatever its native API is, then any number of translated, standardized, or simplified versions of that API can be offered via middleware, without needing to alter the server itself, or the server's core WSGI implementation. Instead, those other APIs can just be implemented via the existing native API bridge. (Note: The `wsgi.upgrades` dictionary is to be considered volatile in the same way as the WSGI environment is. That is, apps or middleware are allowed to modify or delete its contents freely, so a copy MUST be saved by middleware if it wishes to access the original values after it has been passed to another application or middleware.) # Next Steps Once this specification is stable, the next step is to implement native server API bridges for existing web servers. These do not necessarily need to be provided by the server implementers themselves, but they do need to be implemented in the server's native API, and extend its WSGI implementation. Because it is possible for API bridges to be layered or upgraded by standard WSGI middleware, it is **not** necessary for servers to directly support multiple APIs. Servers can simply expose their existing API as an API bridge, and let third parties implement middleware to translate that API to any future standardized APIs. As soon as even one such native API exists, it is immediately beneficial for web frameworks to provide support for the bridging API, and possible for framework users to supply their own. (WebOb support would be especially useful, since a significant number of web frameworks base their request and response objects on WebOb.) 
It may also be helpful to publish a reference library for response key generation and response verification, along with perhaps a wsgiref update or at least some sample code showing how to modify the wsgiref request handler flow to initiate a bridge operation.

## Open Questions and Issues

* Transaction and object lifetimes -- is the current spec correct/sufficient?
* What if middleware adds headers but leaves the status and content-type unchanged? Should that be an error? What happens if middleware requests setting cookies?
* Do the chosen status/headers/body signatures actually make sense? Do they need to be more specified, or less specified?
* Are there any major obstacles to sending a special status from major web frameworks?
* Should a different status be used?
* Are there any other ways to corrupt, confuse, or break this?
* What else am I missing, overlooking, or getting wrong?

## Notes on the Current Design Rationale

* A dictionary is used for all bridged APIs, so they can be easily disabled for subrequests
* Multiple registrations are allowed, so that middleware invoking multiple subrequests is unaffected, so long as exactly one subrequest's response is returned to the top-level WSGI server
* A `Content-Type` header is part of the spec, because most response-altering middleware should avoid altering content types it does not understand, thereby increasing the likelihood that the response will be passed through unchanged

# Acknowledgements

(TBD, but should definitely include Robert Collins for research, inspiration, and use cases)

# References

TBD

[nghttp2]: http://nghttp2.org/documentation/package_README.html#python-bindings

# Copyright

This document has been placed in the public domain.
From graham.dumpleton at gmail.com Fri Oct 10 14:56:53 2014
From: graham.dumpleton at gmail.com (Graham Dumpleton)
Date: Fri, 10 Oct 2014 23:56:53 +1100
Subject: [Web-SIG] Draft 2: WSGI Response Upgrade Bridging
In-Reply-To:
References:
Message-ID: <34363A54-F5F6-46E8-851A-82E5C8FAF650@gmail.com>

On 07/10/2014, at 7:15 AM, PJ Eby wrote:

> As before, you can find a "living" HTML version of the draft in progress at:
>
> https://gist.github.com/pjeby/62e3892cd75257518eb0
>
> (In addition to nice formatting, it also has a clickable table of contents.)
>
> After the next round of feedback, I plan to convert this to reST and
> get a PEP number assigned -- assuming nobody comes up with a killer
> problem that sends me back to the drawing board, of course. ;-)

For those who were not aware, I personally haven't commented as yet on this discussion because I have been on a holiday for the last few weeks and I wasn't going to allow a discussion about this to ruin my holiday.

I haven't caught up yet on all the discussion, but it is sad to say that it has headed in exactly the direction I warned Robert Collins in private discussions it likely would, with certain people trying to rush things to push their own specific idea for how things should be done. The risk is that this will dominate the agenda and so push Robert out of the way as far as trying to coordinate this as a community effort, where anyone could feel confident about providing input and the result would then also be a community effort.

So PJE, please step back and do not go rushing out to create a PEP. That is the worst thing you could do at this point and will only serve to deter people from the community contributing and so stifle proper discussion about this whole topic.

You have no more experience or mandate to be specifying a standard for this than anyone else. Creating a PEP, though, gets perceived by many as meaning the discussion is over.
This is exactly what you did for PEP 3333, and it caused the previous discussion about improving WSGI to get shut down. The result was that the only thing that really got addressed in PEP 3333 was Python 3 compatibility, and a lot of the other bits of the WSGI specification which are poorly defined, contradictory or restrictive, and which cause WSGI server and application developers pain, never got addressed. If that prior discussion hadn't been shut down in that way, we could have been using a better defined and improved WSGI years ago already.

Robert has stuck his neck out to try and bring various parties together to work on this, where anyone who has an opinion or idea can raise them so we as a community can all together come up with something which is workable for both server implementers and web application developers.

Robert even set up a github repo specifically as a place to bring together all those ideas and described how people can add stuff there.

For whatever personal reason, you have decided to ignore the repo Robert set up and go it alone. If you have an issue with the way the repo was structured which didn't make it easy for you to contribute your work into it, then work with Robert to address that.

Right now, the fact that you have created your own separate space for writing up a specification, which you are now trying to rush into a PEP, comes across as you not really wanting to co-ordinate with Robert on this as a community effort. Instead it appears that you think you know better than anyone else and that nothing anyone else says will be of value. In the face of that, it is hardly surprising that no one has really responded to what you have proposed.

So slow down. This is not a race to see who can be the first to come out with a PEP and so dominate the discussion; it is meant to be a community effort.

Robert. What I would suggest you do is reboot this whole effort.
Go back and perhaps look at how the github repo you setup is structured and make it more obvious how anyone can add their work into it in separate areas of it as need be and not just as issues, if that isn't already clear enough. Document exactly what you want people to do as far as adding anything there. Find people who will work with you on making all this clearer and defining any process. The next step is to make a more definite statement about the timeline for this whole discussion. Specifically, give notice of a formal request for comment period and publicise it through any Python blogs of the PSF that might be able to be used, as well as through the different Python web communities. Also get prominent individuals in the Python WSGI and web community to also publicise the comment period. Set a specific date for the end of that comment period. There should be no rush on this and people should be given adequate time to respond. Most interested parties would only do this in their spare time and employers aren't going to allow them to waste their work time on it. So make the comment period something like 2 months from the date of announcing it. What can people comment on? They may want to comment on the process itself of how we get to the various specifications that may come out of this. They may want to comment on what should even be addressed in any revisions or extensions to the WSGI specification. In other words, don't limit this to just HTTP/2 and web sockets support. Allow people to raise their pet peeves about the existing WSGI specification so we can perhaps properly address them this time. The whole ASYNC issue with existing WSGI applications also should not be ruled out of scope as far as the comment period. Finally and hopefully, rather than people just complaining about things or giving wish lists, they will present properly fleshed out ideas for how to concretely solve ideas around ASYNC, HTTP/2 and web sockets. 
The point is that this should purely be a period for collecting information as was I believe your original intent. Make it easy for people to contribute, but defuse the idea that it is a competition or race played out on the WEB-SIG mailing list, to see who is better, and so take the heat out of the discussion by simply having people put their ideas in the github repo for later review in a followup step. Make it absolutely clear that the intent is not for people to start pushing out PEPs as competing proposals. Such a path is only going to be detrimental to the long term success of all this and make people in the community feel that they are unable to participate because of the higher expectations of what has to go into a PEP. At the end of the comment period would then come a period of review of what is submitted to better define the scope of what seems sensible as far as what can be tackled. Not every idea has to be addressed. Set a time line you would expect for that review to take place, but specifically say that how long it will take is going to be fuzzy as how that review is even to be run would have to be worked out first. Then from that review, a scope of work can be specified and broken out into different working groups using any proposals by people submitted during the comment period as a basis. Yes Robert, I understand that this is how you want things to work, but the structure and timeline has to be better specified as an overall process. If you don't then you will keep getting people who think that they can ignore the process you want followed and set their own agenda. Once the process is better defined, then we as a community can say, work within it, or go away. Graham
From robertc at robertcollins.net Fri Oct 10 20:38:26 2014 From: robertc at robertcollins.net (Robert Collins) Date: Sat, 11 Oct 2014 07:38:26 +1300 Subject: [Web-SIG] Draft 2: WSGI Response Upgrade Bridging In-Reply-To: <34363A54-F5F6-46E8-851A-82E5C8FAF650@gmail.com> References: <34363A54-F5F6-46E8-851A-82E5C8FAF650@gmail.com> Message-ID: On 11 October 2014 01:56, Graham Dumpleton wrote: > ... > Robert. What I would suggest you do is reboot this whole effort. > > Go back and perhaps look at how the github repo you setup is structured and > make it more obvious how anyone can add their work into it in separate areas > of it as need be and not just as issues, if that isn't already clear enough. > Document exactly what you want people to do as far as adding anything there. > Find people who will work with you on making all this clearer and defining > any process. > > The next step is to make a more definite statement about the timeline for > this whole discussion. Thanks for the process critique - I agree. I will put together such improvements in a little bit. I hadn't actually intended to go quiet - my intended next step was to collate the feedback we've had so far (and prompt you for some mod_wsgi orientated feedback). However, about 2 weeks back my Mum died, and that caused a rather big speed bump in the 'what I need to do' chore list, which still isn't over (but at least the crisis wise aspects are) .... > They may want to comment on what should even be addressed in any revisions > or extensions to the WSGI specification. In other words, don't limit this to > just HTTP/2 and web sockets support. Allow people to raise their pet peeves > about the existing WSGI specification so we can perhaps properly address > them this time. The whole ASYNC issue with existing WSGI applications also > should not be ruled out of scope as far as the comment period.
> > Finally and hopefully, rather than people just complaining about things or > giving wish lists, they will present properly fleshed out ideas for how to > concretely solve ideas around ASYNC, HTTP/2 and web sockets. TBH I'd be fine with complaints and wish lists - got to start somewhere, and having a clear list of the places WSGI has not met needs would be excellent. One thing you could do, if you like, is to put a PR together for the wsgi-ng repo that adjusts README in the light of your feedback. -Rob -- Robert Collins Distinguished Technologist HP Converged Cloud From pje at telecommunity.com Fri Oct 10 21:10:09 2014 From: pje at telecommunity.com (PJ Eby) Date: Fri, 10 Oct 2014 15:10:09 -0400 Subject: [Web-SIG] Draft 2: WSGI Response Upgrade Bridging In-Reply-To: <34363A54-F5F6-46E8-851A-82E5C8FAF650@gmail.com> References: <34363A54-F5F6-46E8-851A-82E5C8FAF650@gmail.com> Message-ID: On Fri, Oct 10, 2014 at 8:56 AM, Graham Dumpleton wrote: > So PJE, please step back and do not go rushing out to create a PEP. That is > the worst thing you could do at this point and will only serve to deter > people from the community contributing and so stifle proper discussion about > this whole topic. Huh? Have you *read* the PEP? The entire point of it is to provide a basis for *experimenting* with new standards, not to "stifle discussion" of them. It's not even an *API*, for heavens' sake. It's just a description of how to upgrade to new standards from within existing WSGI frameworks, without needing to tunnel responses and without breaking subrequest middleware. IOW, it's a WSGI *1.0* server extension protocol, and a fairly *minor* one at that. (Indeed, it's little more than an enhanced variation of wsgi.file_wrapper!) It's not any sort of competitor or alternative to what Robert's working on; it's a *stepping stone* for what Robert's working on. 
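For readers unfamiliar with the comparison just made: `wsgi.file_wrapper` is the optional server extension described in PEP 3333, which lets an application hand a file-like object back to the server instead of iterating over it itself. The following is only a minimal sketch of that existing pattern, not of the RUB proposal (whose keys and signatures are not shown in this thread); the `app` callable, the in-memory body, and the 8192-byte block size are assumptions chosen for illustration.

```python
import io


def app(environ, start_response):
    """Hypothetical WSGI app that returns a file-like body."""
    body = io.BytesIO(b"hello from a file-like object\n")
    start_response("200 OK", [("Content-Type", "text/plain")])

    # The server may advertise an optional extension in the environ.
    wrapper = environ.get("wsgi.file_wrapper")
    if wrapper is not None:
        # Let the server decide how to transmit the file-like object.
        return wrapper(body, 8192)

    # Portable fallback: read fixed-size blocks until the body is exhausted.
    return iter(lambda: body.read(8192), b"")
```

The point of the pattern is that the application only checks for a key in `environ` and falls back to ordinary WSGI behaviour when the key is absent, so servers and middleware that know nothing about the extension keep working.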
In early discussion with Robert -- both here and on github -- it became apparent to me that restricting post-WSGI specifications to what can be achieved in WSGI 1 tunneling is a bad idea. So I've created a *bridging* specification, that allows post-WSGI APIs to be accessed from within WSGI-based apps and frameworks. That's *all*. All of the things you've mentioned as being in scope for discussion, are *still* in scope for discussion. All *this* proposal does is show how those things could be *accessed*, *today*, from inside existing web apps and frameworks, once those new APIs exist. > You have no more experience or mandate to be specifying a > standard for this than anyone else. If by "this" you're referring to HTTP/2 or some other new post-WSGI API, then I agree with you. But that's not what the PEP is about. > By creating a PEP though that gets > perceived by many as meaning the discussion is over. This is exactly what > you did for PEP 3333 and which caused previous discussion about improving > WSGI to get shutdown. That's an interesting perspective, but I don't see how it can be reconciled with the facts. First off, I didn't write a new PEP; I wrote up some of *your* proposed clarifications for Python 3 as WSGI 1.0.1, which was intended to add new clarifying text to PEP 333, *not* to create a new PEP. It was *Guido* who said it must be a new PEP, as you will see here: https://mail.python.org/pipermail/web-sig/2010-September/004691.html and here (where he even says, "Don't see this as a new spec. See it as a procedural issue."): https://mail.python.org/pipermail/web-sig/2010-September/004694.html Second, I didn't make anybody stop discussing alternatives for moving things forward. Nobody *ever* said to stop working on a version 2 or even 1.1, certainly not me. 
See for example, this message, where I agreed with Ian's POV that there was room for both PEP 333 fixes *and* continued work on PEP 444: https://mail.python.org/pipermail/web-sig/2010-September/004662.html Third, and finally, as far as I can tell from the record of the discussion back then, it was you -- and *only* you -- who suggested that the acceptance of PEP 3333 meant the discussion was *over*. Indeed, on your blog you actually pushed back at Alice for bringing up more PEP 444 discussion! Nonetheless, discussion of PEP 444 and async APIs and such proceeded well past the introduction of PEP 3333, even without its original authors' participation. And, ironically enough, your posts show up in that discussion, bemoaning that Alice (the new PEP 444 champion) was creating confusion by calling that proposal WSGI 2.0! > The result was that the only thing that really got > addressed in PEP 3333 was Python 3 compatibility and a lot of the other bits > of the WSGI specification which are poorly defined, contradictory or > restrictive and which cause WSGI server and application developers pain > never got addressed. If that prior discussion hadn't been shutdown in that > way, we could have been using a better defined and improved WSGI years ago > already. Those things didn't get addressed because *you* didn't take up the lead -- a lead which I more than once mentioned you should take up. For example, as I said in https://mail.python.org/pipermail/web-sig/2010-September/004693.html : > The full list of things Graham and others have asked for or > recommended would indeed require a 1.1 version at minimum, and thus a > new PEP. But I really don't want to start down that road right now, > and therefore hope that I can talk Graham or some other poor soul > into shepherding a 1.1 PEP instead. ;-) You didn't, and haven't, taken up that slack. 
What you've consistently done is mutter and grumble on the sidelines about how it's never going to happen and disclaim any responsibility for actually writing a proposal because it's never going to go anywhere -- thereby *ensuring* that it's never going to go anywhere. (And one key reason I wrote the WSGI-RUB PEP is that I noticed I'd started doing what *you* do: grumbling at Robert about his proposals, without taking the time to write up my own, dumping the hard work on him instead of getting my own hands dirty.) PEPs don't magically arise from some mysterious group consensus. They happen because some poor shmuck does the work of *building* a consensus around something they want to have happen, hammering out agreements, and writing the thing up. The only way the improvements you want are ever going to happen are if you either lead the process yourself, or get somebody else to do it for you. If you want to ride Robert's back and get him to do the dirty work, that's A-OK by me. I don't have a horse in *that* race, and haven't for *ten years*. The PEP 444 discussion didn't stop because I did the dirty work of turning some of your gripes into concrete specifications. It stopped because the poor schmucks who initially volunteered to do the heavy lifting on PEP 444 were only doing it to get Python 3 sorted out, and were sufficiently happy with the 1.0.1 clarifications to be glad of dropping the *workload* involved... a workload which you declined to pick up, despite many people (myself included) *asking* for continued discussion of PEP 444. *PEPs* don't stifle discussions. *Lack of volunteers* stifles discussions. Without somebody driving the process, discussions about multiple substantive issues with a PEP tend to die a natural death as everybody voices their opinion *once*... and then shuts up. What keeps a PEP moving is somebody taking those raw opinions and pushing something forward from them, asking "so what about this, then? Will this work? 
What do you think of that?" Without that energy being put *in*, nothing comes back out, except maybe the occasional lengthy session of bikeshedding on the non-substantive parts of a proposal. Indeed, the more substantive the discussion, the fewer the participants, and the harder it is to actually get things moving... and the more likely the champion is to give up in the face of what seems like overwhelming opposition. So it's not surprising that Chris and Armin and Alice all gave up on doing that: it's a lot of hard work, and *I'm* not volunteering for it, either. If you want Robert to do the work of shepherding a new post-WSGI PEP under your guidance I am *all* in favor of it. I've been trying to get *you* off the sidelines on this thing for *years*. Indeed, if that is the only outcome of the work I did on the new RUB proposal, then I am as happy to drop out of it as Chris and Armin were to drop PEP 444. (Frankly, some of my admonishments to Robert were based on my expectation that you would continue to snipe from the sidelines and avoid getting your hands dirty.) Which is unfortunate, because AFAIK and IMO, *you* are the only person currently active in the community who's both *actually* qualified to ride herd on WSGI 1.1 *and* is an absolute "must-have" contributor for the success of any true post-WSGI specification. (Again, AFAIK and IMO, not intended as a slight to anybody else whose qualifications and contributions I'm presently unaware of.) So the PEP I've written *isn't* an attempt to make such a post-WSGI specification. It's my attempt to build a bridge from the WSGI we have, to whatever specification you and the rest of the Web-SIG come up with *next*. Indeed, it's intended as something of a "parting gift", to address the development, deployment, and long-term migration issues that *any* post-WSGI spec will have.
I'm tired of dealing with WSGI's limitations and corner cases and quirks, and I don't want to have to spend a lot of time reviewing post-WSGI specs to check for breakage on those quirks. The point of the RUB is to set the post-WSGI world entirely free of them, and it's my gift to you and Robert and anybody else who wants to clean away the mess and start over. With it, you can imagine completely new APIs (like the generator-based ones Robert was sketching), without needing to figure out how to make the bloody thing work with WSGI 1.0 middleware. Pre-empting that kind of free API design is the *last* thing I want to have happen, which is precisely *why* I've put forward the RUB spec. I don't *want* to spend a lot of time telling Robert all the things he *can't* do, because of WSGI 1's limitations. So instead, I've proposed something that will let him "have" whatever sort of post-WSGI cake he wants, while still letting WSGI 1 code "eat" it too. > Right now, that you have created your own separate space How is a Web-SIG thread a "separate space", let alone my "own" separate space? > for writing up a specification which you are now > trying to rush into a PEP comes across as you not really wanting to > co-ordinate with Robert on this as a community effort with it instead > appearing that you think you know better than anyone else and nothing anyone > else says will be of value. In the face of that, it is hardly surprising > that no one has really responded to what you have proposed. Well, if that's the case, it's certainly an unfortunate misunderstanding. Because all I have *actually* done is described a way that *everyone* can contribute to the development of new APIs, without *first* needing everyone *else* to agree on those APIs and modify their web frameworks to support them. I'm trying to make it possible for there to be *more* participation, not less. Any J. 
Random Developer with an idea or an itch to scratch should be able to throw together an implementation of a post-WSGI API, and start using it today from inside existing WSGI frameworks. It's from just such experiments that de facto -- and later, de jure -- standards can and should arise. Personally, I don't expect there to be much discussion of my proposal right now because nobody is yet trying to *implement* any post-WSGI specifications. The RUB spec is mostly a stake in the ground, to say, "Don't worry about WSGI 1 compatibility or tunneling; you can use any API paradigm you want, and the community will still be able to make an orderly transition to using it. Here's how." At this point, AFAIK, there are precisely two APIs that could have benefited from the prior existence of WSGI-RUB: nghttp2 and uwsgi.websockets. (Which is why the examples in the spec are based on them.) And I'm not especially aware of anybody else writing new ones, who would therefore be interested in it. Frankly, I wrote the thing to get it out of my head and to have a convenient place to point to when anybody makes the same mistake I did, of trying to limit post-WSGI API design to what can be safely shoehorned back through WSGI 1 response middleware. I don't expect much *detailed* discussion of WSGI-RUB, in other words, until there's at least a strawman post-WSGI API proposal. (Which, IIUC from his last message, Robert is doing some experimenting towards.) Perhaps it is not clear from your cursory review of the existing discussion, but both Robert and I *learned* some things from our interaction. And one of the things that *I* learned from that interaction was that nitpicking from the sidelines about what Robert couldn't do or how he should do it was not productive. Which is why I've put forth a proposal that eliminates the need for post-WSGI APIs to be nitpicked for WSGI 1 middleware compatibility. 
And it's why I hope *you* will read that proposal, and have something to say about its substance, because you are one of the people from whom I have most eagerly awaited such feedback, be it good, bad, or ugly. But I would *much* prefer some feedback about its *substance*, to a bunch of insinuations about whether I should've proposed it at all. From robertc at robertcollins.net Sun Oct 12 23:38:59 2014 From: robertc at robertcollins.net (Robert Collins) Date: Mon, 13 Oct 2014 10:38:59 +1300 Subject: [Web-SIG] REMOTE_ADDR and proxys In-Reply-To: References: Message-ID: On 30 September 2014 11:47, Alan Kennedy wrote: > [Robert] >> So it sounds like it should be the responsibility of a middleware to >> renormalize the environment? > > In order for that to be the case, you have to strictly define what > "normalization" means. For a given deployment it's well defined. I agree that in general it's not. > I believe that it is not possible to fully specify "normalization", and that > any attempt to do so is futile. > > If you want to attempt it for the specific scenarios that your particular > application has to deal with, then by all means code your version of > "normalization" into your application. Or write some middleware to do it. > > But trying to make "normalization" a part of a WSGI-style specification is > impossible. I don't recall proposing that it should be in a WSGI-style spec. -Rob -- Robert Collins Distinguished Technologist HP Converged Cloud From robertc at robertcollins.net Mon Oct 13 00:47:50 2014 From: robertc at robertcollins.net (Robert Collins) Date: Mon, 13 Oct 2014 11:47:50 +1300 Subject: [Web-SIG] Draft 2: WSGI Response Upgrade Bridging In-Reply-To: References: <34363A54-F5F6-46E8-851A-82E5C8FAF650@gmail.com> Message-ID: On 11 October 2014 08:10, PJ Eby wrote: > On Fri, Oct 10, 2014 at 8:56 AM, Graham Dumpleton > wrote: >> So PJE, please step back and do not go rushing out to create a PEP.
That is >> the worst thing you could do at this point and will only serve to deter >> people from the community contributing and so stifle proper discussion about >> this whole topic. > > Huh? Have you *read* the PEP? The entire point of it is to provide a > basis for *experimenting* with new standards, not to "stifle > discussion" of them. It's not even an *API*, for heavens' sake. It's > just a description of how to upgrade to new standards from within > existing WSGI frameworks, without needing to tunnel responses and > without breaking subrequest middleware. FWIW I'm totally fine with you bringing together that PEP - as you say it's complementary to what I'm focused on (I believe I even suggested you might want to do that). -Rob -- Robert Collins Distinguished Technologist HP Converged Cloud From robertc at robertcollins.net Mon Oct 13 03:13:10 2014 From: robertc at robertcollins.net (Robert Collins) Date: Mon, 13 Oct 2014 14:13:10 +1300 Subject: [Web-SIG] Draft 2: WSGI Response Upgrade Bridging In-Reply-To: <34363A54-F5F6-46E8-851A-82E5C8FAF650@gmail.com> References: <34363A54-F5F6-46E8-851A-82E5C8FAF650@gmail.com> Message-ID: On 11 October 2014 01:56, Graham Dumpleton wrote: I've pushed up https://github.com/python-web-sig/wsgi-ng/commit/df51d7d6fd4faa4efbe397fda2c323932f967020 which hopefully addresses the process and clarity concerns you expressed. (If not please help me tweak it appropriately). -Rob From pje at telecommunity.com Mon Oct 13 05:59:59 2014 From: pje at telecommunity.com (PJ Eby) Date: Sun, 12 Oct 2014 23:59:59 -0400 Subject: [Web-SIG] Draft 2: WSGI Response Upgrade Bridging In-Reply-To: References: <34363A54-F5F6-46E8-851A-82E5C8FAF650@gmail.com> Message-ID: On Sun, Oct 12, 2014 at 6:47 PM, Robert Collins wrote: > FWIW I'm totally fine with you bringing together that PEP - as you say > it's complementary to what I'm focused on (I believe I even suggested > you might want to do that). Did you have any feedback on the proposal itself?
I'm particularly counting on you to tell me if I've horribly misunderstood something important about the use cases or the requirements for the protocols themselves. I think that the "upgrade" model I've presented will enable you to happily design completely new API paradigms without having to figure out how to tunnel them through a maze of WSGI middleware, with the exception of having reasonable ways to present the incoming request as a WSGI request. But if I've missed something there, please let me know. From bchesneau at gmail.com Mon Oct 13 06:12:14 2014 From: bchesneau at gmail.com (Benoit Chesneau) Date: Mon, 13 Oct 2014 06:12:14 +0200 Subject: [Web-SIG] Draft 2: WSGI Response Upgrade Bridging In-Reply-To: References: <34363A54-F5F6-46E8-851A-82E5C8FAF650@gmail.com> Message-ID: On Fri, Oct 10, 2014 at 9:10 PM, PJ Eby wrote: > On Fri, Oct 10, 2014 at 8:56 AM, Graham Dumpleton > wrote: > > So PJE, please step back and do not go rushing out to create a PEP. That > is > > the worst thing you could do at this point and will only serve to deter > > people from the community contributing and so stifle proper discussion > about > > this whole topic. > > Huh? Have you *read* the PEP? The entire point of it is to provide a > basis for *experimenting* with new standards, not to "stifle > discussion" of them. It's not even an *API*, for heavens' sake. It's > just a description of how to upgrade to new standards from within > existing WSGI frameworks, without needing to tunnel responses and > without breaking subrequest middleware. > > IOW, it's a WSGI *1.0* server extension protocol, and a fairly *minor* > one at that. (Indeed, it's little more than an enhanced variation of > wsgi.file_wrapper!) It's not any sort of competitor or alternative to > what Robert's working on; it's a *stepping stone* for what Robert's > working on. 
> > In early discussion with Robert -- both here and on github -- it > became apparent to me that restricting post-WSGI specifications to > what can be achieved in WSGI 1 tunneling is a bad idea. So I've > created a *bridging* specification, that allows post-WSGI APIs to be > accessed from within WSGI-based apps and frameworks. > > That's *all*. > > All of the things you've mentioned as being in scope for discussion, > are *still* in scope for discussion. All *this* proposal does is show > how those things could be *accessed*, *today*, from inside existing > web apps and frameworks, once those new APIs exist. > > > > You have no more experience or mandate to be specifying a > > standard for this than anyone else. > > If by "this" you're referring to HTTP/2 or some other new post-WSGI > API, then I agree with you. But that's not what the PEP is about. > > > > By creating a PEP though that gets > > perceived by many as meaning the discussion is over. This is exactly what > > you did for PEP 3333 and which caused previous discussion about improving > > WSGI to get shutdown. > > That's an interesting perspective, but I don't see how it can be > reconciled with the facts. > > First off, I didn't write a new PEP; I wrote up some of *your* > proposed clarifications for Python 3 as WSGI 1.0.1, which was intended > to add new clarifying text to PEP 333, *not* to create a new PEP. It > was *Guido* who said it must be a new PEP, as you will see here: > > https://mail.python.org/pipermail/web-sig/2010-September/004691.html > > and here (where he even says, "Don't see this as a new spec. See it as > a procedural issue."): > > https://mail.python.org/pipermail/web-sig/2010-September/004694.html > > Second, I didn't make anybody stop discussing alternatives for moving > things forward. Nobody *ever* said to stop working on a version 2 or > even 1.1, certainly not me. 
See for example, this message, where I > agreed with Ian's POV that there was room for both PEP 333 fixes *and* > continued work on PEP 444: > > https://mail.python.org/pipermail/web-sig/2010-September/004662.html > > Third, and finally, as far as I can tell from the record of the > discussion back then, it was you -- and *only* you -- who suggested > that the acceptance of PEP 3333 meant the discussion was *over*. > Indeed, on your blog you actually pushed back at Alice for bringing up > more PEP 444 discussion! > > Nonetheless, discussion of PEP 444 and async APIs and such proceeded > well past the introduction of PEP 3333, even without its original > authors' participation. And, ironically enough, your posts show up in > that discussion, bemoaning that Alice (the new PEP 444 champion) was > creating confusion by calling that proposal WSGI 2.0! > > > > The result was that the only thing that really got > > addressed in PEP 3333 was Python 3 compatibility and a lot of the other > bits > > of the WSGI specification which are poorly defined, contradictory or > > restrictive and which cause WSGI server and application developers pain > > never got addressed. If that prior discussion hadn't been shutdown in > that > > way, we could have been using a better defined and improved WSGI years > ago > > already. > > Those things didn't get addressed because *you* didn't take up the > lead -- a lead which I more than once mentioned you should take up. > For example, as I said in > https://mail.python.org/pipermail/web-sig/2010-September/004693.html : > > > The full list of things Graham and others have asked for or > > recommended would indeed require a 1.1 version at minimum, and thus a > > new PEP. But I really don't want to start down that road right now, > > and therefore hope that I can talk Graham or some other poor soul > > into shepherding a 1.1 PEP instead. ;-) > > You didn't, and haven't, taken up that slack. 
What you've > consistently done is mutter and grumble on the sidelines about how > it's never going to happen and disclaim any responsibility for > actually writing a proposal because it's never going to go anywhere -- > thereby *ensuring* that it's never going to go anywhere. (And one key > reason I wrote the WSGI-RUB PEP is that I noticed I'd started doing > what *you* do: grumbling at Robert about his proposals, without taking > the time to write up my own, dumping the hard work on him instead of > getting my own hands dirty.) > > PEPs don't magically arise from some mysterious group consensus. They > happen because some poor shmuck does the work of *building* a > consensus around something they want to have happen, hammering out > agreements, and writing the thing up. The only way the improvements > you want are ever going to happen are if you either lead the process > yourself, or get somebody else to do it for you. If you want to ride > Robert's back and get him to do the dirty work, that's A-OK by me. I > don't have a horse in *that* race, and haven't for *ten years*. > > The PEP 444 discussion didn't stop because I did the dirty work of > turning some of your gripes into concrete specifications. It stopped > because the poor schmucks who initially volunteered to do the heavy > lifting on PEP 444 were only doing it to get Python 3 sorted out, and > were sufficiently happy with the 1.0.1 clarifications to be glad of > dropping the *workload* involved... a workload which you declined to > pick up, despite many people (myself included) *asking* for continued > discussion of PEP 444. > > *PEPs* don't stifle discussions. *Lack of volunteers* stifles > discussions. Without somebody driving the process, discussions about > multiple substantive issues with a PEP tend to die a natural death as > everybody voices their opinion *once*... and then shuts up. 
> > What keeps a PEP moving is somebody taking those raw opinions and > pushing something forward from them, asking "so what about this, then? > Will this work? What do you think of that?" Without that energy > being put *in*, nothing comes back out, except maybe the occasional > lengthy session of bikeshedding on the non-substantive parts of a > proposal. Indeed, the more substantive the discussion, the fewer the > participants, and the harder it is to actually get things moving... > and the more likely the champion is to give up in the face of what > seems like overwhelming opposition. > > So it's not surprising that Chris and Armin and Alice all gave up on > doing that: it's a lot of hard work, and *I'm* not volunteering for > it, either. > > If you want Robert to do the work of shepherding a new post-WSGI PEP > under your guidance I am *all* in favor of it. I've been trying to > get *you* off the sidelines on this thing for *years*. Indeed, if > that is the only outcome of the work I did on the new RUB proposal, > then I am as happy to drop out of it as Chris and Armin were to drop > PEP 444. (Frankly, some of my admonishments to Robert were based on > my expectation that you would continue to snipe from the sidelines and > avoid getting your hands dirty.) > > Which is unfortunate, because AFAIK and IMO, *you* are the only person > currently active in the community who's both *actually* qualified to > ride herd on WSGI 1.1 *and* is an absolute "must-have" contributor for > the success of any true post-WSGI specification. (Again, AFAIK and > IMO, not intended as a slight to anybody else whose qualifications and > contributions I'm presently unaware of.) > > So the PEP I've written *isn't* an attempt to make such a post-WSGI > specification. It's my attempt to build a bridge from the WSGI we > have, to whatever specification you and the rest of the Web-SIG come > up with *next*.
Indeed, it's intended as something of a "parting > gift", to address the development, deployment, and long-term migration > issues that *any* post-WSGI spec will have. > > I'm tired of dealing with WSGI's limitations and corner cases and > quirks, and I don't want to have to spend a lot of time reviewing > post-WSGI specs to check for breakage on those quirks. The point of > the RUB is to set the post-WSGI world entirely free of them, and it's > my gift to you and Robert and anybody else who wants to clean away the > mess and start over. > > With it, you can imagine completely new APIs (like the generator-based > ones Robert was sketching), without needing to figure out how to make > the bloody thing work with WSGI 1.0 middleware. > > Pre-empting that kind of free API design is the *last* thing I want to > have happen, which is precisely *why* I've put forward the RUB spec. > I don't *want* to spend a lot of time telling Robert all the things he > *can't* do, because of WSGI 1's limitations. > > So instead, I've proposed something that will let him "have" whatever > sort of post-WSGI cake he wants, while still letting WSGI 1 code "eat" > it too. > > > > Right now, that you have created your own separate space > > How is a Web-SIG thread a "separate space", let alone my "own" separate > space? > > > > for writing up a specification which you are now > > trying to rush into a PEP comes across as you not really wanting to > > co-ordinate with Robert on this as a community effort with it instead > > appearing that you think you know better than anyone else and nothing > anyone > > else says will be of value. In the face of that, it is hardly surprising > > that no one has really responded to what you have proposed. > > Well, if that's the case, it's certainly an unfortunate > misunderstanding. 
Because all I have *actually* done is described a > way that *everyone* can contribute to the development of new APIs, > without *first* needing everyone *else* to agree on those APIs and > modify their web frameworks to support them. I'm trying to make it > possible for there to be *more* participation, not less. Any J. > Random Developer with an idea or an itch to scratch should be able to > throw together an implementation of a post-WSGI API, and start using > it today from inside existing WSGI frameworks. It's from just such > experiments that de facto -- and later, de jure -- standards can and > should arise. > > Personally, I don't expect there to be much discussion of my proposal > right now because nobody is yet trying to *implement* any post-WSGI > specifications. The RUB spec is mostly a stake in the ground, to say, > "Don't worry about WSGI 1 compatibility or tunneling; you can use any > API paradigm you want, and the community will still be able to make an > orderly transition to using it. Here's how." > > At this point, AFAIK, there are precisely two APIs that could have > benefited from the prior existence of WSGI-RUB: nghttp2 and > uwsgi.websockets. (Which is why the examples in the spec are based on > them.) And I'm not especially aware of anybody else writing new ones, > who would therefore be interested in it. Frankly, I wrote the thing > to get it out of my head and to have a convenient place to point to > when anybody makes the same mistake I did, of trying to limit > post-WSGI API design to what can be safely shoehorned back through > WSGI 1 response middleware. > > I don't expect much *detailed* discussion of WSGI-RUB, in other words, > until there's at least a strawman post-WSGI API proposal. (Which, > IIUC from his last message, Robert is doing some experimenting > towards.) > > Perhaps it is not clear from your cursory review of the existing > discussion, but both Robert and I *learned* some things from our > interaction. 
And one of the things that *I* learned from that > interaction was that nitpicking from the sidelines about what Robert > couldn't do or how he should do it was not productive. > > Which is why I've put forth a proposal that eliminates the need for > post-WSGI APIs to be nitpicked for WSGI 1 middleware compatibility. > > And it's why I hope *you* will read that proposal, and have something > to say about its substance, because you are one of the people from > whom I have most eagerly awaited such feedback, be it good, bad, or > ugly. > > But I would *much* prefer some feedback about its *substance*, to a > bunch of insinuations about whether I should've proposed it at all. > _______________________________________________ > Web-SIG mailing list > Web-SIG at python.org > Web SIG: http://www.python.org/sigs/web-sig > Unsubscribe: > https://mail.python.org/mailman/options/web-sig/bchesneau%40gmail.com >

OK,

So I should probably know you, but I can't recollect right now what you do or write. Anyway, I find it really disturbing the way you're acting, trying to push your ideas based on private feedback coming from unknown people, or choosing who should be a reference. That is certainly not the right way to have all actors at the table. Because if we go for a new WSGI spec, you certainly want that. And I am speaking as one of these actors.

In my opinion, if we want to go further we should first define the problems we want to solve, and then get feedback from all the actors around:

- framework authors
- library authors
- server authors

If you don't have all the actors around and major ones are missing, there is probably no point in continuing. I do think the idea of having a repository to collect the issues, with people arbitrating the discussions of them on the mailing list, is a good way to go further. Now I think we are still missing a clear definition of the problem. That is where we should start, instead of starting by giving our philosophy on how the problem should be solved.
- benoit. -------------- next part -------------- An HTML attachment was scrubbed... URL: From robertc at robertcollins.net Mon Oct 13 11:17:09 2014 From: robertc at robertcollins.net (Robert Collins) Date: Mon, 13 Oct 2014 22:17:09 +1300 Subject: [Web-SIG] Draft 2: WSGI Response Upgrade Bridging In-Reply-To: References: <34363A54-F5F6-46E8-851A-82E5C8FAF650@gmail.com> Message-ID: On 13 October 2014 16:59, PJ Eby wrote: > On Sun, Oct 12, 2014 at 6:47 PM, Robert Collins > wrote: >> FWIW I'm totally fine with you bringing together that PEP - as you say >> its complementary to what I'm focused on (I believe I even suggested >> you might want to do that). > > Did you have any feedback on the proposal itself? I'm particularly > counting on you to tell me if I've horribly misunderstood something > important about the use cases or the requirements for the protocols > themselves. Not yet. Really just got back to stuff today. Rather than digging a hole for myself by commenting until I've absorbed it, let me do that and then I'll comment. :) > I think that the "upgrade" model I've presented will enable you to > happily design completely new API paradigms without having to figure > out how to tunnel them through a maze of WSGI middleware, with the > exception of having reasonable ways to present the incoming request as > a WSGI request. But if I've missed something there, please let me > know. Sure will. As I said earlier on in our thread, I'm not convinced that presenting new things as WSGI1 requests makes sense. I understand your arguments about adoption, but as I understand it WSGI itself started with nothing implementing it, and yet its now a very common lingua franca. So - I'd like to defer thinking too hard about the migration path, other than ensuring that its possible - and your draft may well be instrumental in some of the conversion paths needed. More once I've absorbed it. 
-Rob -- Robert Collins Distinguished Technologist HP Converged Cloud From robertc at robertcollins.net Mon Oct 13 11:52:32 2014 From: robertc at robertcollins.net (Robert Collins) Date: Mon, 13 Oct 2014 22:52:32 +1300 Subject: [Web-SIG] Draft 2: WSGI Response Upgrade Bridging In-Reply-To: References: <34363A54-F5F6-46E8-851A-82E5C8FAF650@gmail.com> Message-ID: On 13 October 2014 17:12, Benoit Chesneau wrote: > ... > > > OK, > > So I should probably know you, but I can't recollect right now what you do > or write. Its not clear to me who you were replying to. If Graham - Graham is the mod_wsgi maintainer, which I'm sure you've heard of - he, like you is one of the actors we need engaged and behind this effort. And PJE was the original WSGI maintainer :). > Anyway I find it really disturbing the way you're actually acting > and try to push your ideas based on private feedback coming from unknown or > choosing who should be a reference. That certainly not the right way to have > all actors on the table. Because if we go for a new WSGI spec, you certainly > want it. And I am speaking as one of these actors. As I said when folk talked about going private in the first thread on this, I'm willing to discuss anything publically or privately - I can't tell folk where they will be comfortable discussing things. But I'm going to do *my* work on this in public, because I think that is essential to get broad consensus. > In my opinion, if we want to go further we should first define what are the > problem we want to solve, and then get the feedback from all the actors > around: I think I've been fairly clear about the problem *I* want to solve. """ We want to create a clean common API for applications and middleware written in a post HTTP/2 world - where single servers may accept up to all three of HTTP/1.x, HTTP/2 and Websocket connections, and applications and middleware want to be able to take advantage of HTTP/2 and websockets when available, but also degrade gracefully. 
We also want to ensure that there is a graceful incremental path to adoption of the new API, including Python 2.7 support, and shims to enable existing WSGI apps/middleware/servers to respectively be contained, contain-or-be-contained and contain, things written to this new API. We want a clean, fast and approachable API, and we want to ensure that its no less friendly to work with than WSGI, for all that it will expose much more functionality. """ > - framweorks authors I reached out to a number of such authors directly. I encourage you to do the same. > - libraries author Ditto and > - server authors Ditto :). > If you don't have all actors around and majors are missing, there is > probably no point to continue. I do think the idea of having a repository to > collect it with people arbitrating the discussions on them on the mailing is > a good way to go further. Now I think we are still missing of a clear > definition of the problem. This is from what we should start instead of > starting by giving our philosophy on how the problem should be solved. Here's my definition of some of the problems: A - there is no common spec equivalent to WSGI that permits writing server side code that takes advantage of HTTP/2. There's *a* http/2 server out there which one can write code for, but that code is either specific to that servers plumbing, or plain WSGI and misses the HTTP/2 goodness. B - WSGI has some oddness and overheads due in large part to the situation it was aiming to fix (which it broadly did) that perhaps we can now come together to fix. C - Support for chunked uploads, comet, bosh and websockets is effectively impossible within WSGI - one ends up writing server specific code, and being tied to a single server - even though multiple servers support (some of) those things. This defeats the point of WSGI IMNSHO: its not that WSGI is broken or anything, its just that we're once again writing all our generic middleware in server-specific fashions. 
Because the world has moved on and we haven't. I think A and C are crucial if we want to re-instate a lingua franca for the current web, in Python. I'd like to address B, because we can. -Rob -- Robert Collins Distinguished Technologist HP Converged Cloud From robertc at robertcollins.net Mon Oct 13 11:58:57 2014 From: robertc at robertcollins.net (Robert Collins) Date: Mon, 13 Oct 2014 22:58:57 +1300 Subject: [Web-SIG] handling different network protocols Message-ID: One of the issues raised on the github repo was about upgrading to new protocols. Digging into that I think there is a splinter question: how *should* we represent the different network protocols in our python protocol. I've put some thoughts together about this in https://github.com/python-web-sig/wsgi-ng/issues/10 I rather suspect that any answer we have will make some folk unhappy, so I'd like to measure the answers against the baseline concerns: how much code will folk implementing the python protocol(s) have to write- lets minimise boilerplate and checks that the need to remember to put in. -Rob -- Robert Collins Distinguished Technologist HP Converged Cloud From bchesneau at gmail.com Mon Oct 13 14:18:09 2014 From: bchesneau at gmail.com (Benoit Chesneau) Date: Mon, 13 Oct 2014 14:18:09 +0200 Subject: [Web-SIG] Draft 2: WSGI Response Upgrade Bridging In-Reply-To: References: <34363A54-F5F6-46E8-851A-82E5C8FAF650@gmail.com> Message-ID: On Mon, Oct 13, 2014 at 11:52 AM, Robert Collins wrote: > On 13 October 2014 17:12, Benoit Chesneau wrote: > > > ... > > > > > > OK, > > > > So I should probably know you, but I can't recollect right now what you > do > > or write. > > Its not clear to me who you were replying to. > I answered at the bottom of the thread so to PJE. > > Anyway I find it really disturbing the way you're actually acting > > and try to push your ideas based on private feedback coming from unknown > or > > choosing who should be a reference. 
That certainly not the right way to > have > > all actors on the table. Because if we go for a new WSGI spec, you > certainly > > want it. And I am speaking as one of these actors. > > As I said when folk talked about going private in the first thread on > this, I'm willing to discuss anything publically or privately - I > can't tell folk where they will be comfortable discussing things. But > I'm going to do *my* work on this in public, because I think that is > essential to get broad consensus. > > > In my opinion, if we want to go further we should first define what are > the > > problem we want to solve, and then get the feedback from all the actors > > around: > > I think I've been fairly clear about the problem *I* want to solve. > > """ > We want to create a clean common API for applications and middleware > written in a post HTTP/2 world - where single servers may accept up to > all three of HTTP/1.x, HTTP/2 and Websocket connections, and > applications and middleware want to be able to take advantage of > HTTP/2 and websockets when available, but also degrade gracefully. We > also want to ensure that there is a graceful incremental path to > adoption of the new API, including Python 2.7 support, and shims to > enable existing WSGI apps/middleware/servers to respectively be > contained, contain-or-be-contained and contain, things written to this > new API. We want a clean, fast and approachable API, and we want to > ensure that its no less friendly to work with than WSGI, for all that > it will expose much more functionality. > """ > By which problem we need to solve, I mean we need to identify clearly what are the problem not solved by the current spec. And see why, and how it is actually solved in the python world. we need to clearly identify these issues and make sure we have a comprehensive view of them. > > > - framweorks authors > > I reached out to a number of such authors directly. I encourage you to > do the same. 
> I could do that eventually if we all agree on the process. > > > - libraries author > > Ditto and > > > - server authors > > Ditto :). > > > If you don't have all actors around and majors are missing, there is > > probably no point to continue. I do think the idea of having a > repository to > > collect it with people arbitrating the discussions on them on the > mailing is > > a good way to go further. Now I think we are still missing of a clear > > definition of the problem. This is from what we should start instead of > > starting by giving our philosophy on how the problem should be solved. > > Here's my definition of some of the problems: > A - there is no common spec equivalent to WSGI that permits writing > server side code that takes advantage of HTTP/2. There's *a* http/2 > server out there which one can write code for, but that code is either > specific to that servers plumbing, or plain WSGI and misses the HTTP/2 > goodness. > B - WSGI has some oddness and overheads due in large part to the > situation it was aiming to fix (which it broadly did) that perhaps we > can now come together to fix. > C - Support for chunked uploads, comet, bosh and websockets is > effectively impossible within WSGI - one ends up writing server > specific code, and being tied to a single server - even though > multiple servers support (some of) those things. This defeats the > point of WSGI IMNSHO: its not that WSGI is broken or anything, its > just that we're once again writing all our generic middleware in > server-specific fashions. Because the world has moved on and we > haven't. > Chunked upload is possible and already handled with Gunicorn. But there is no standard for that. For C I would separate it from the rest. This is a different discussion and imo not everything can be achieved at the same time. Maybe we should start first by fixing them, then go for the next step anyway.
So the transition could be incremental in servers and frameworks and actually fix the current spec. For A (And C), i think we should keep the new specification enough agnostic. Especially since HTTP 2 is not yet completely out. - benoit I -------------- next part -------------- An HTML attachment was scrubbed... URL: From bchesneau at gmail.com Mon Oct 13 14:26:48 2014 From: bchesneau at gmail.com (Benoit Chesneau) Date: Mon, 13 Oct 2014 14:26:48 +0200 Subject: [Web-SIG] REMOTE_ADDR and proxys In-Reply-To: References: Message-ID: On Sun, Oct 12, 2014 at 11:38 PM, Robert Collins wrote: > On 30 September 2014 11:47, Alan Kennedy wrote: > > > [Robert] > >> So it sounds like it should be the responsibility of a middleware to > >> renormalize the environment? > > > > In order for that to be the case, you have strictly define what > > "normalization" means. > > For a given deployment its well defined. I agree that in general its not. > > > I believe that it is not possible to fully specify "normalization", and > that > > any attempt to do so is futile. > > > > If you want to attempt it for the specific scenarios that your particular > > application has to deal with, then by all means code your version of > > "normalization" into your application. Or write some middleware to do it. > > > > But trying to make "normalization" a part of a WSGI-style specification > is > > impossible. > > I don't recall proposing that it should be in a WSGI-style spec. 
> > -Rob > > -- > Robert Collins > Distinguished Technologist > HP Converged Cloud >

This whole issue looks like the problem raised (and not yet solved) recently in Gunicorn, when REMOTE_ADDR started being handled more strictly and we removed all the X-Forwarded-* header handling:

https://github.com/benoitc/gunicorn/issues/797

There is another case to take into consideration: when your server is answering on unix sockets, you don't have any TCP address to present. For now we answer with an empty field.

Also, some application frameworks recently removed the middleware handling X-Forwarded-* headers. I wonder why.

There is an RFC for forwarding headers: http://tools.ietf.org/html/rfc7239 . Instead of trying to change the strict behaviour of REMOTE_ADDR, I wonder if we shouldn't rather add a new field to the environ. Thoughts?

- benoit

From tseaver at palladion.com Tue Oct 14 04:36:56 2014
From: tseaver at palladion.com (Tres Seaver)
Date: Mon, 13 Oct 2014 22:36:56 -0400
Subject: [Web-SIG] Draft 2: WSGI Response Upgrade Bridging
In-Reply-To: References: <34363A54-F5F6-46E8-851A-82E5C8FAF650@gmail.com>
Message-ID:

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1

On 10/13/2014 12:12 AM, Benoit Chesneau wrote:
> So I should probably know you, but I can't recollect right now what > you do or write.

Seriously? On *this* sig? PJE was the author of PEP 333, defining the WSGI 1.0 spec.

Tres.
- -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iEYEARECAAYFAlQ8jEgACgkQ+gerLs4ltQ6VGgCg3UUBwHClvJv3dtHP22Z1EVBn wpgAn0oA3pMAgLY0lqExyZ2zvsQfnteY =o+WJ -----END PGP SIGNATURE----- From robertc at robertcollins.net Tue Oct 14 04:59:22 2014 From: robertc at robertcollins.net (Robert Collins) Date: Tue, 14 Oct 2014 15:59:22 +1300 Subject: [Web-SIG] Draft 2: WSGI Response Upgrade Bridging In-Reply-To: References: <34363A54-F5F6-46E8-851A-82E5C8FAF650@gmail.com> Message-ID: On 14 October 2014 01:18, Benoit Chesneau wrote: >> C - Support for chunked uploads, comet, bosh and websockets is >> effectively impossible within WSGI - one ends up writing server >> specific code, and being tied to a single server - even though >> multiple servers support (some of) those things. This defeats the >> point of WSGI IMNSHO: its not that WSGI is broken or anything, its >> just that we're once again writing all our generic middleware in >> server-specific fashions. Because the world has moved on and we >> haven't. > > > Chunkedn upload is possible and already handled with Gunicorn. But there is > no standard for that. Right. Thus we need one. > For C I would separate it from the rest. This a different discussion and imo > not everything can be achieved at the same time. Maybe we should start first > by fixing them, then go for the next step anyway. So the transition could be > incremental in servers and frameworks and actually fix the current spec. What makes C a different discussion? > > For A (And C), i think we should keep the new specification enough agnostic. > Especially since HTTP 2 is not yet completely out. HTTP/2 is in last call stage: it will be entirely finished by the time we get through whatever process we have here. 
What do you want to see changed in the process I'm following? -Rob -- Robert Collins Distinguished Technologist HP Converged Cloud From graham.dumpleton at gmail.com Tue Oct 14 05:21:32 2014 From: graham.dumpleton at gmail.com (Graham Dumpleton) Date: Tue, 14 Oct 2014 14:21:32 +1100 Subject: [Web-SIG] REMOTE_ADDR and proxys In-Reply-To: References: Message-ID: <300C2539-08D2-4A5E-836F-A7AC8DCB0133@gmail.com> On 13/10/2014, at 11:26 PM, Benoit Chesneau wrote: > > > On Sun, Oct 12, 2014 at 11:38 PM, Robert Collins wrote: > On 30 September 2014 11:47, Alan Kennedy wrote: > > > [Robert] > >> So it sounds like it should be the responsibility of a middleware to > >> renormalize the environment? > > > > In order for that to be the case, you have strictly define what > > "normalization" means. > > For a given deployment its well defined. I agree that in general its not. > > > I believe that it is not possible to fully specify "normalization", and that > > any attempt to do so is futile. > > > > If you want to attempt it for the specific scenarios that your particular > > application has to deal with, then by all means code your version of > > "normalization" into your application. Or write some middleware to do it. > > > > But trying to make "normalization" a part of a WSGI-style specification is > > impossible. > > I don't recall proposing that it should be in a WSGI-style spec. 
> > -Rob > > -- > Robert Collins > Distinguished Technologist > HP Converged Cloud > > > All this issue looks like the problem raised (and not yet solved) recently in Gunicorn when the REMOTE_ADDR has been handled more strictly and we removed all the X-Forward-* headers handling: > > https://github.com/benoitc/gunicorn/issues/797 > > There is another case to take in consideration, when your server is answering on unix sockets, so you don't have any TCP address to present. For now we answer with an empty field. > > Also some application frameworks recently removed the middleware handling X-Forward-* headers. I wonder why. > > > There is an RFC for forward headers: http://tools.ietf.org/html/rfc7239 . For me instead of trying to change the strict behaviour of REMOTE_ADDR I wonder if we shouldn't rather add a new field to the environ. Thoughts?

My prior thinking on this was that REMOTE_ADDR should be left alone.

If front end proxies support RFC-7239 and pass its Forwarded header through, you are all good.

If you are in a situation where a front end proxy doesn't support RFC-7239 but uses the prior convention of X-Forwarded-* headers, then one could take the older headers, construct the new RFC-7239 header from them, and drop the old X-Forwarded-* headers.

In other words, converge on the new convention set by RFC-7239 by translating the old way of doing things to the new. This way a WSGI application can be coded up just to check for the new header and not have to deal with both.

The actual translation from old headers to new could be done by a WSGI middleware or an optionally enabled WSGI server feature. Either way it doesn't need to be part of the WSGI specification.
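As a rough illustration of the translation step described above, here is a minimal sketch of such a middleware. It is not taken from mod_wsgi or any other server: the names `translate_forwarded` and `ForwardedTranslator` are hypothetical, and two policy choices are assumptions made only for the example: the `proto` and `host` parameters are attached to the last hop, and an existing Forwarded header is left untouched because deciding which header to trust is the administrator's call.

```python
import re

# RFC 7230 "token" characters; any other value must be sent as a quoted-string.
_TOKEN = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


def _quote(value):
    """Return value unchanged if it is a token, else as a quoted-string."""
    if _TOKEN.fullmatch(value):
        return value
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def _for_node(addr):
    """Format one X-Forwarded-For entry as a Forwarded 'for' value."""
    if addr.count(":") >= 2 and not addr.startswith("["):
        addr = "[" + addr + "]"  # bare IPv6 addresses must be bracketed
    return _quote(addr)


def translate_forwarded(environ):
    """Fold legacy X-Forwarded-* values into HTTP_FORWARDED, in place.

    Assumption: if a Forwarded header is already present, nothing is
    changed, since precedence is a deployment-specific trust decision.
    """
    if "HTTP_FORWARDED" in environ:
        return environ
    xff = environ.pop("HTTP_X_FORWARDED_FOR", "")
    proto = environ.pop("HTTP_X_FORWARDED_PROTO", "").strip()
    host = environ.pop("HTTP_X_FORWARDED_HOST", "").strip()

    # One Forwarded element per hop listed in X-Forwarded-For.
    hops = [h.strip() for h in xff.split(",") if h.strip()]
    elements = ["for=" + _for_node(h) for h in hops]

    extras = []
    if proto:
        extras.append("proto=" + _quote(proto))
    if host:
        extras.append("host=" + _quote(host))
    if extras:
        if elements:
            # Assumption: proto/host describe the hop nearest this server.
            elements[-1] = ";".join([elements[-1]] + extras)
        else:
            elements.append(";".join(extras))

    if elements:
        environ["HTTP_FORWARDED"] = ", ".join(elements)
    return environ


class ForwardedTranslator:
    """WSGI middleware that applies translate_forwarded to each request."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        translate_forwarded(environ)
        return self.app(environ, start_response)
```

Wrapping an application as `ForwardedTranslator(app)` would let the application check only `HTTP_FORWARDED`. The sketch deliberately performs no trust checks; those would have to be layered on top.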
As noted by others, the issue though is how much you trust the information passed in by the headers, and whether it captures entirely the existence of multiple hops.

In the case of REMOTE_ADDR it is added by the web server based on actual socket information and so there is no way a client can supersede it.

The X-Forwarded-* and Forwarded headers have the problem that a client can set them itself. Now that there are multiple ways of denoting this, which takes precedence, and which do you trust? If your proxies use X-Forwarded-* but a HTTP client sets Forwarded, what do you do?

Ultimately, whether you use a WSGI middleware or a WSGI server which provides a built-in function for the typical case (optionally enabled), it has to be configurable to the point of an administrator being able to say what the trusted headers are. You may also want to be able to say what the IPs of the proxies are that you want to trust, if practical. This must be something an administrator can do and not be dependent on developers embedding it within an application, which is why a builtin mechanism with a WSGI server may be preferred.

Anyway, this way a system administrator can say whether it is expected that a proxy only sets X-Forwarded-* and not Forwarded, or vice versa, and who to trust. You likely can't just have a default strategy if you want to be safe.

Another issue to consider is header spoofing, which not all WSGI servers protect against at the moment.

The spoofing problem is because of the CGI rule around how header names are converted. That is:

Meta-variables with names beginning with "HTTP_" contain values read from the client request header fields, if the protocol used is HTTP. The HTTP header field name is converted to upper case, has all occurrences of "-" replaced with "_" and has "HTTP_" prepended to give the meta-variable name. The header data can be presented as sent by the client, or can be rewritten in ways which do not change its semantics.
If multiple header fields with the same field-name are received then the server MUST rewrite them as a single value having the same semantics. Similarly, a header field that spans multiple lines MUST be merged onto a single line. The server MUST, if necessary, change the representation of the data (for example, the character set) to be appropriate for a CGI meta-variable.

So this means that X-Forwarded-For is translated to HTTP_X_FORWARDED_FOR.

The problem is that if a client itself sends X_Forwarded_For, then it would also map to the same thing. By the rules above the two values would be concatenated if a proxy set one and the client sent the other, usually separating the values with a comma. If you are attempting to block certain clients based on this, then the header value could be poisoned and cause problems for such a scheme.

If using a WSGI middleware therefore, depending on the final usage, you may want to make sure the WSGI server deals with this form of header spoofing as well.

FWIW, latest versions of mod_wsgi will only accept headers and convert using the above rule where they only contain alphanumerics and '-'. If any other characters are used the header is thrown away.

This behaviour is by virtue of Apache 2.4 doing the blocking.

There was however a bug in mod_wsgi which meant that spoofed headers still got through in the environ passed to mod_wsgi specific access/authentication/authorization hook extensions for Apache. This has been fixed in a recent release. At the same time it was decided to apply the more strict rules about what was allowed back to older Apache 2.2 as well, since Apache 2.2 doesn't do the blocking that Apache 2.4 does.

Unfortunately because Linux distros ship out of date mod_wsgi versions, it can still be an issue there. Have been pondering turning the issue into a CERT just to force them to back port the fixes. :-)

Graham -------------- next part -------------- An HTML attachment was scrubbed...
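To make the name collision concrete, the following minimal sketch applies the CGI conversion rule quoted above and shows how a strict name filter of the kind described (alphanumerics and '-' only) avoids the poisoning. The function names are hypothetical and this is not mod_wsgi's or Apache's actual code; it only mirrors the behaviour described in the message.

```python
import re


def cgi_name(header_name):
    """Apply the CGI rule: upper-case, '-' becomes '_', prefix 'HTTP_'."""
    return "HTTP_" + header_name.upper().replace("-", "_")


# Header names made only of alphanumerics and '-' are considered safe.
_SAFE_NAME = re.compile(r"[A-Za-z0-9-]+")


def build_environ_headers(headers, strict=True):
    """Map (name, value) request headers to CGI-style environ keys.

    With strict=False, names that differ only in '-' versus '_' collapse
    onto one key and their values are joined with a comma. With
    strict=True, any name containing other characters is thrown away.
    """
    environ = {}
    for name, value in headers:
        if strict and not _SAFE_NAME.fullmatch(name):
            continue  # drop e.g. "X_Forwarded_For" sent by a client
        key = cgi_name(name)
        if key in environ:
            environ[key] += ", " + value
        else:
            environ[key] = value
    return environ


# A proxy adds the first header; the client smuggles in the second.
request_headers = [
    ("X-Forwarded-For", "203.0.113.7"),
    ("X_Forwarded_For", "10.0.0.1"),
]
lenient = build_environ_headers(request_headers, strict=False)
strict = build_environ_headers(request_headers, strict=True)
```

In the lenient case both headers land on `HTTP_X_FORWARDED_FOR` and the client-supplied value is merged into it; in the strict case only the proxy's value survives.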
From robertc at robertcollins.net Tue Oct 14 06:06:34 2014
From: robertc at robertcollins.net (Robert Collins)
Date: Tue, 14 Oct 2014 17:06:34 +1300
Subject: [Web-SIG] REMOTE_ADDR and proxys
In-Reply-To: <300C2539-08D2-4A5E-836F-A7AC8DCB0133@gmail.com> References: <300C2539-08D2-4A5E-836F-A7AC8DCB0133@gmail.com>
Message-ID:

On 14 October 2014 16:21, Graham Dumpleton wrote:
> > This behaviour is by virtue of Apache 2.4 doing the blocking.

Nice :).

> There was however a bug in mod_wsgi which means that spoofed headers still > got through in environ passed to mod_wsgi specific > access/authentication/authorization hook extensions for Apache. This has > been fixed in recent release. At the same time it was decided to apply the > more strict rules about what was allowed back to older Apache 2.2 as well, > since Apache 2.2 doesn't do the blocking that Apache 2.4 does. > > Unfortunately because Linux distros ship out of date mod_wsgi versions, it > can still be an issue there. Have been pondering turning the issue into a > CERT just to force them to back port the fixes. :-)

+1 on that, it's indeed an issue and many folk won't consider the issue there.

For WSGI I agree that the protocol doesn't need to make these deployer decisions etc - but we do need to clarify REMOTE_ADDR for unix sockets. I've filed https://github.com/python-web-sig/wsgi-ng/issues/11 to track this.

-Rob -- Robert Collins Distinguished Technologist HP Converged Cloud

From bchesneau at gmail.com Tue Oct 14 07:19:09 2014
From: bchesneau at gmail.com (Benoit Chesneau)
Date: Tue, 14 Oct 2014 07:19:09 +0200
Subject: [Web-SIG] Draft 2: WSGI Response Upgrade Bridging
In-Reply-To: References: <34363A54-F5F6-46E8-851A-82E5C8FAF650@gmail.com>
Message-ID:

On Tuesday, October 14, 2014, Tres Seaver wrote:
> -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On 10/13/2014 12:12 AM, Benoit Chesneau wrote: > > So I should probably know you, but I can't recollect right now what > > you do or write. > > Seriously?
On *this* sig? PJE was the author of PEP 333, defining the > WSGI 1.0 spec. > >

Actually I was more annoyed by the way the discussion was handled than by the rest. If we can release something better adapted to the current challenge, and not finish like PEP 3333 this time, that would be awesome.

-- Sent from my Mobile

From bchesneau at gmail.com Tue Oct 14 08:47:27 2014
From: bchesneau at gmail.com (Benoit Chesneau)
Date: Tue, 14 Oct 2014 08:47:27 +0200
Subject: [Web-SIG] Draft 2: WSGI Response Upgrade Bridging
In-Reply-To: References: <34363A54-F5F6-46E8-851A-82E5C8FAF650@gmail.com>
Message-ID:

On Tuesday, October 14, 2014, Robert Collins wrote:

On 14 October 2014 01:18, Benoit Chesneau wrote:
>> C - Support for chunked uploads, comet, bosh and websockets is >> effectively impossible within WSGI - one ends up writing server >> specific code, and being tied to a single server - even though >> multiple servers support (some of) those things. This defeats the >> point of WSGI IMNSHO: its not that WSGI is broken or anything, its >> just that we're once again writing all our generic middleware in >> server-specific fashions. Because the world has moved on and we >> haven't. > > > Chunked upload is possible and already handled with Gunicorn. But there is > no standard for that.

Right. Thus we need one.

> For C I would separate it from the rest. This is a different discussion and imo > not everything can be achieved at the same time. Maybe we should start first > by fixing them, then go for the next step anyway. So the transition could be > incremental in servers and frameworks and actually fix the current spec.

What makes C a different discussion?

> > For A (And C), i think we should keep the new specification enough agnostic. > Especially since HTTP 2 is not yet completely out.

HTTP/2 is in last call stage: it will be entirely finished by the time we get through whatever process we have here.
> What do you want to see changed in the process I'm following?
>
> -Rob

I meant there are 2 separate problems: fixing the current spec, and extending
it **if** needed to handle the new web patterns.

I am speaking more about patterns than protocols. Protocol is one thing to
take into consideration of course, but actually if we are redefining a spec to
build (server) and interact with (apps) a gateway for the web, it will be more
than simply handling HTTP 2 and soon HTTP 2.1 or 3 depending on the
discussions.

On the server side, I recently started a redesign of the core of Gunicorn to
prepare it for these patterns and to be able to handle the different
challenges they bring. I actually identified some common features and distinct
features. In terms of patterns we have the following:

- start a response: detect the protocol, possibly *upgrade* it to a new
  protocol (HTTP -> Websockets or HTTP2 -> HTTP 1.1)
- send/receive headers
- PUSH pattern: HTTP 2, new PUSH specifications from W3C, SSE
- ASYNC pattern: HTTP 2 and Websockets, receiving and sending can happen at
  different times and over the long term
- Continuous connections: keepalive, HTTP 2 channels, websockets, SSE (how do
  we keep/identify connection states)
- Streams: chunked encoding, HTTP 2 channels, ...

We should have a clear way to notify the application about it. Also the
spec/the server should have a standard way to handle the different async
frameworks not based on threads, like gevent and eventlet.

The application should also be able to tell the server that an operation will
take a long time and let the server take appropriate actions. Some kind of
reply/no_reply pattern allowing it to answer later and switch to an async
pattern on the fly if the server supports it. It would allow servers to handle
more gracefully some common issues like long queries when the application
knows about them.

Since there is a consensus on using the github tracker, should I open tickets
for these different things? Or one generic ticket?
Let me know.

- benoit

> --
> Robert Collins
> Distinguished Technologist
> HP Converged Cloud

From guido at python.org Tue Oct 14 18:47:35 2014
From: guido at python.org (Guido van Rossum)
Date: Tue, 14 Oct 2014 09:47:35 -0700
Subject: [Web-SIG] WSGI and asyncio (tulip)?
Message-ID:

I am fascinated by the new WSGI - HTTP/2 discussions. I don't have much to
contribute, because my own experience with web development is either very
old (when CGI was new and exciting) or uses corporate frameworks where
there's a huge set of layers between the app code and the external network
(e.g. Google, Dropbox).

I have strong emotional responses to some of the discussion topics (e.g. I
feel that REMOTE_ADDR should represent the public IP address from which the
request originated, not the internal address of a reverse proxy, since my
use cases for it are all about blocking or rate-limiting specific clients,
and I assume the network between the reverse proxy and the app server is
secure) but I am sure there are already enough voices here and I trust that
the sig will come up with the right answers (even if they override my gut
feelings!).

My most recent foray into web stuff was writing a small web crawler as an
example for asyncio (PEP 3156, a.k.a. tulip). The crawler is written in
Python 3 and the source code is here:
https://github.com/aosabook/500lines/blob/master/crawler/crawling.py and it
supports several advanced HTTP features: TLS, connection reuse, chunked
transfer encoding, redirects (but not compression -- I think it would be
straightforward to add, but the code would then exceed the 500 lines limit
imposed by the book).

Perhaps the main lesson I learned from writing this is how different yet
similar web code looks when you use an asynchronous framework. Which makes
me wonder -- can there be a future where you can write your web app as an
asyncio coroutine?
It looks like the WSGI protocol already supports asynchronously producing the response (using yield), and I don't think asyncio would have a problem with converting this convention to its own "yield from" convention. However, the situation is different for reading the request body. The app can read this in dribs and drabs if it wants to, by reading limited amounts of data from environ['wsgi.input'], but in an asyncio-driven world reading operations should really be marked with "yield from" (or perhaps just yield -- again, I'm sure an adaptor won't have a problem with this). I'm wondering if a small extension to the WSGI protocol might be sufficient to support this: the special environ variable "wsgi.async_input" could optionally be tied to a standard asyncio stream reader ( https://docs.python.org/3/library/asyncio-stream.html#streamreader), from which bytes can be read using "yield from stream.read([nbytes])" or "yield from stream.readline()". Thinking a little more about this, it might be better if an async app could be a regular asyncio coroutine. In this case it might be better if start_response() were to return an asyncio stream writer ( https://docs.python.org/3/library/asyncio-stream.html#streamwriter) and if it was expected to produce all its output by writing to this stream. Anyway, I think I'm getting ahead of myself, but I do think it would be nice if the next WSGI standard supported asyncio. For older Python versions it could then instead support Trollius ( https://pypi.python.org/pypi/trollius), a backport of asyncio that supports Python 2.7 and 3.2 (and newer). -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From justin at justinholmes.com Tue Oct 14 20:27:03 2014 From: justin at justinholmes.com (Justin Holmes) Date: Tue, 14 Oct 2014 11:27:03 -0700 Subject: [Web-SIG] WSGI and asyncio (tulip)? In-Reply-To: References: Message-ID: Fascinating and exciting. 
Up until now, my go-to tactic for containing WSGI inside async has been to use
the WSGI container in twisted.web (this is how hendrix works:
https://github.com/hangarunderground/hendrix).

However, if we're talking about an actual flag in WSGI (like
wsgi.async_input), this is probably the most significant game-changer of all
the proposals so far.

On Tue, Oct 14, 2014 at 9:47 AM, Guido van Rossum wrote:
> I am fascinated by the new WSGI - HTTP/2 discussions. I don't have much to
> contribute, because my own experience with web development is either very
> old (when CGI was new and exciting) or uses corporate frameworks where
> there's a huge set of layers between the app code and the external network
> (e.g. Google, Dropbox).
>
> I have strong emotional responses to some of the discussion topics (e.g. I
> feel that REMOTE_ADDR should represent the public IP address from which the
> request originated, not the internal address of a reverse proxy, since my
> use cases for it are all about blocking or rate-limiting specific clients,
> and I assume the network between the reverse proxy and the app server is
> secure) but I am sure there are already enough voices here and I trust that
> the sig will come up with the right answers (even if they override my gut
> feelings!).
>
> My most recent foray into web stuff was writing a small web crawler as an
> example for asyncio (PEP 3156, a.k.a. tulip). The crawler is written in
> Python 3 and the source code is here:
> https://github.com/aosabook/500lines/blob/master/crawler/crawling.py and it
> supports several advanced HTTP features: TLS, connection reuse, chunked
> transfer encoding, redirects (but not compression -- I think it would be
> straightforward to add, but the code would then exceed the 500 lines limit
> imposed by the book).
>
> Perhaps the main lesson I learned from writing this is how different yet
> similar web code looks when you use an asynchronous framework.
Which makes > me wonder -- can there be a future where you can write your web app as an > asyncio coroutine? > > It looks like the WSGI protocol already supports asynchronously producing > the response (using yield), and I don't think asyncio would have a problem > with converting this convention to its own "yield from" convention. However, > the situation is different for reading the request body. The app can read > this in dribs and drabs if it wants to, by reading limited amounts of data > from environ['wsgi.input'], but in an asyncio-driven world reading > operations should really be marked with "yield from" (or perhaps just yield > -- again, I'm sure an adaptor won't have a problem with this). > > I'm wondering if a small extension to the WSGI protocol might be sufficient > to support this: the special environ variable "wsgi.async_input" could > optionally be tied to a standard asyncio stream reader > (https://docs.python.org/3/library/asyncio-stream.html#streamreader), from > which bytes can be read using "yield from stream.read([nbytes])" or "yield > from stream.readline()". > > Thinking a little more about this, it might be better if an async app could > be a regular asyncio coroutine. In this case it might be better if > start_response() were to return an asyncio stream writer > (https://docs.python.org/3/library/asyncio-stream.html#streamwriter) and if > it was expected to produce all its output by writing to this stream. > > Anyway, I think I'm getting ahead of myself, but I do think it would be nice > if the next WSGI standard supported asyncio. For older Python versions it > could then instead support Trollius (https://pypi.python.org/pypi/trollius), > a backport of asyncio that supports Python 2.7 and 3.2 (and newer). 
> > -- > --Guido van Rossum (python.org/~guido) > > _______________________________________________ > Web-SIG mailing list > Web-SIG at python.org > Web SIG: http://www.python.org/sigs/web-sig > Unsubscribe: > https://mail.python.org/mailman/options/web-sig/justin%40justinholmes.com > -- jMyles Holmes Chief Chocobo Breeder, slashRoot slashRoot: Coffee House and Tech Dojo New Paltz, NY 12561 845.633.8330 From solipsis at pitrou.net Tue Oct 14 20:30:38 2014 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 14 Oct 2014 20:30:38 +0200 Subject: [Web-SIG] WSGI and asyncio (tulip)? References: Message-ID: <20141014203038.7e4ca1d5@fsol> On Tue, 14 Oct 2014 09:47:35 -0700 Guido van Rossum wrote: > > I'm wondering if a small extension to the WSGI protocol might be sufficient > to support this: the special environ variable "wsgi.async_input" could > optionally be tied to a standard asyncio stream reader ( > https://docs.python.org/3/library/asyncio-stream.html#streamreader), from > which bytes can be read using "yield from stream.read([nbytes])" or "yield > from stream.readline()". I think it would be frankly better to hook at the transport/protocol level, and let people wrap that inside an asyncio stream if that's their preference. It would also allow easier interoperability with other non-blocking frameworks, since the callback-oriented nature is a common characteristic, IMHO. Regards Antoine. From sh at defuze.org Tue Oct 14 20:54:39 2014 From: sh at defuze.org (Sylvain Hellegouarch) Date: Tue, 14 Oct 2014 20:54:39 +0200 Subject: [Web-SIG] WSGI and asyncio (tulip)? 
In-Reply-To:
References:
Message-ID:

Hi,

2014-10-14 18:47 GMT+02:00 Guido van Rossum:
>
>
> I'm wondering if a small extension to the WSGI protocol might be
> sufficient to support this: the special environ variable "wsgi.async_input"
> could optionally be tied to a standard asyncio stream reader (
> https://docs.python.org/3/library/asyncio-stream.html#streamreader), from
> which bytes can be read using "yield from stream.read([nbytes])" or "yield
> from stream.readline()".
>
>

If we support async backends by simply escaping WSGI, don't you feel it kind
of makes most of the whole discussion moot?

To me, asyncio already provides a de-facto standard API for asynchronous
backends and Tornado/Twisted provide a high level API on top of it. I have to
say, I don't precisely grasp what WSGI actually wishes to bring to the table.

As I said in a different thread, most frameworks seem eager to wrap the
environ dictionary and hide away all of the WSGI internals (wasting CPU cycles
in the process). Is there a rationale for continuing down that road?

--
- Sylvain
http://www.defuze.org
http://twitter.com/lawouach

From justin at justinholmes.com Tue Oct 14 20:57:01 2014
From: justin at justinholmes.com (Justin Holmes)
Date: Tue, 14 Oct 2014 11:57:01 -0700
Subject: [Web-SIG] WSGI and asyncio (tulip)?
In-Reply-To:
References:
Message-ID:

> To me, asyncio already provides a de-facto standard API for asynchronous
> backends and Tornado/Twisted provide a high level API on top of it. I have
> to say, I don't precisely grasp what WSGI actually wishes to bring to the
> table.

I guess if we're really talking about this, the lowest common denominator is
a set of cognizable abstractions for HttpRequest and HttpResponse, right?
On Tue, Oct 14, 2014 at 11:54 AM, Sylvain Hellegouarch wrote: > Hi, > > 2014-10-14 18:47 GMT+02:00 Guido van Rossum : >> >> >> >> I'm wondering if a small extension to the WSGI protocol might be >> sufficient to support this: the special environ variable "wsgi.async_input" >> could optionally be tied to a standard asyncio stream reader >> (https://docs.python.org/3/library/asyncio-stream.html#streamreader), from >> which bytes can be read using "yield from stream.read([nbytes])" or "yield >> from stream.readline()". >> > > If we support async backends by simply escaping WSGI, don't you feel it kind > of make most of the whole discussion moot? > > To me, asyncio already provides a de-facto standard API for asynchronous > backends and Tornado/Twisted provide a high level API on top of it. I have > to say, I don't precisely grasp what WSGI actually wishes to bring to the > table. > > As I said in a different thread, most frameworks seem eager to wrap the > environ dictionary and hide away all of the WSGI internals (wasting CPU > cycles in the process). Is there rationale for continuining down that road? > > -- > - Sylvain > http://www.defuze.org > http://twitter.com/lawouach > > _______________________________________________ > Web-SIG mailing list > Web-SIG at python.org > Web SIG: http://www.python.org/sigs/web-sig > Unsubscribe: > https://mail.python.org/mailman/options/web-sig/justin%40justinholmes.com > -- jMyles Holmes Chief Chocobo Breeder, slashRoot slashRoot: Coffee House and Tech Dojo New Paltz, NY 12561 845.633.8330 From sh at defuze.org Tue Oct 14 21:01:27 2014 From: sh at defuze.org (Sylvain Hellegouarch) Date: Tue, 14 Oct 2014 21:01:27 +0200 Subject: [Web-SIG] WSGI and asyncio (tulip)? In-Reply-To: References: Message-ID: 2014-10-14 20:57 GMT+02:00 Justin Holmes : > To me, asyncio already provides a de-facto standard API for > asynchronous backends and Tornado/Twisted provide a high level API on > top of it. 
I have to say, I don't precisely grasp what WSGI actually > wishes to bring to the table. > > I guess if we're really talking about this, the lowest common > denominator is a set of cognizable abstractions for HttpRequest and > HttpResponse, right? > > That's my feeling as well. But, considering the sheer amount of content produced in other threads, I guess I'm missing the point somehow ;) -- - Sylvain http://www.defuze.org http://twitter.com/lawouach -------------- next part -------------- An HTML attachment was scrubbed... URL: From robertc at robertcollins.net Tue Oct 14 21:33:51 2014 From: robertc at robertcollins.net (Robert Collins) Date: Wed, 15 Oct 2014 08:33:51 +1300 Subject: [Web-SIG] WSGI and asyncio (tulip)? In-Reply-To: References: Message-ID: On 15 October 2014 08:01, Sylvain Hellegouarch wrote: > > > 2014-10-14 20:57 GMT+02:00 Justin Holmes : >> >> To me, asyncio already provides a de-facto standard API for >> asynchronous backends and Tornado/Twisted provide a high level API on >> top of it. I have to say, I don't precisely grasp what WSGI actually >> wishes to bring to the table. >> >> I guess if we're really talking about this, the lowest common >> denominator is a set of cognizable abstractions for HttpRequest and >> HttpResponse, right? >> > > > That's my feeling as well. But, considering the sheer amount of content > produced in other threads, I guess I'm missing the point somehow ;) I've opened https://github.com/python-web-sig/wsgi-ng/issues/12 to ensure we track this. WSGI is intended to be a lowest common denominator; WSGI itself describes why it didn't have a request object, and a response object likewise. I agree about the wasted CPU in middleware in principle, but if you look closely at e.g. webob - https://github.com/Pylons/webob/blob/master/webob/descriptors.py - the Request object is just a shim onto the environ dict, it doesn't actually copy the data around or even inspect many keys. 
Creating a Request object there does 2 trivial checks and you're done. Not
enough to worry about in a pragmatic sense. As long as the WSGI protocol
continues to support such lightweight shims, I think we're fine. Supporting
such shims allows frameworks to have their own aesthetic (e.g. CamelCase vs
lower_case) methods, side effects and more.

That said, personally I'd be very open to defining some base classes nowadays,
but only if we have roaring support for that from the authors of existing
middleware and application frameworks.

-Rob

--
Robert Collins
Distinguished Technologist
HP Converged Cloud

From robertc at robertcollins.net Tue Oct 14 21:40:05 2014
From: robertc at robertcollins.net (Robert Collins)
Date: Wed, 15 Oct 2014 08:40:05 +1300
Subject: [Web-SIG] WSGI and asyncio (tulip)?
In-Reply-To: <20141014203038.7e4ca1d5@fsol>
References: <20141014203038.7e4ca1d5@fsol>
Message-ID:

On 15 October 2014 07:30, Antoine Pitrou wrote:
> On Tue, 14 Oct 2014 09:47:35 -0700
> Guido van Rossum wrote:
>>
>> I'm wondering if a small extension to the WSGI protocol might be sufficient
>> to support this: the special environ variable "wsgi.async_input" could
>> optionally be tied to a standard asyncio stream reader (
>> https://docs.python.org/3/library/asyncio-stream.html#streamreader), from
>> which bytes can be read using "yield from stream.read([nbytes])" or "yield
>> from stream.readline()".
>
> I think it would be frankly better to hook at the transport/protocol
> level, and let people wrap that inside an asyncio stream if that's their
> preference.

For things like mod_wsgi and uwsgi, we're not actually implementing the
transport or protocol inside of Python at all - it's all happening in C and
often in an entirely separate process. I think it's entirely reasonable to
want to write middleware/frameworks in that context using asyncio, and today
there isn't a defined protocol for doing that.
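
For reference, the "wsgi.async_input" idea quoted above might look roughly
like this from the application's side. This is only a minimal sketch under
stated assumptions: the environ key is the one floated in the thread and is
not part of any WSGI spec, `read_body` and `demo` are hypothetical names, and
it uses the later async/await spelling rather than the "yield from" form
mentioned in the quote. The `asyncio.StreamReader` here is fed by hand to
stand in for a server.

```python
import asyncio


async def read_body(environ, chunk_size=4):
    """Drain the (hypothetical) async input stream in small chunks."""
    # "wsgi.async_input" is the key proposed in the thread, not a standard one.
    reader = environ["wsgi.async_input"]
    chunks = []
    while True:
        chunk = await reader.read(chunk_size)
        if not chunk:  # b"" means the request body is finished
            break
        chunks.append(chunk)
    return b"".join(chunks)


async def demo():
    # Stand-in for a server: push a request body into a StreamReader by hand.
    reader = asyncio.StreamReader()
    reader.feed_data(b"name=web-sig")
    reader.feed_eof()
    return await read_body({"wsgi.async_input": reader})


body = asyncio.run(demo())
```

Whether a server living in another process (the mod_wsgi/uwsgi case raised
here) could supply such a reader is exactly the open question in this message;
the sketch only shows what the application would see.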
-Rob -- Robert Collins Distinguished Technologist HP Converged Cloud From solipsis at pitrou.net Tue Oct 14 21:41:49 2014 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 14 Oct 2014 21:41:49 +0200 Subject: [Web-SIG] WSGI and asyncio (tulip)? References: <20141014203038.7e4ca1d5@fsol> Message-ID: <20141014214149.36e03d31@fsol> On Wed, 15 Oct 2014 08:40:05 +1300 Robert Collins wrote: > On 15 October 2014 07:30, Antoine Pitrou wrote: > > On Tue, 14 Oct 2014 09:47:35 -0700 > > Guido van Rossum wrote: > >> > >> I'm wondering if a small extension to the WSGI protocol might be sufficient > >> to support this: the special environ variable "wsgi.async_input" could > >> optionally be tied to a standard asyncio stream reader ( > >> https://docs.python.org/3/library/asyncio-stream.html#streamreader), from > >> which bytes can be read using "yield from stream.read([nbytes])" or "yield > >> from stream.readline()". > > > > I think it would be frankly better to hook at the transport/protocol > > level, and let people wrap that inside an asyncio stream if that's their > > preference. > > For things like mod_wsgi and uwsgi, we're not actually implementing > the transport or protocol inside of Python at all - its all happening > in C and often in an entirely separate process. You may have misunderstood me. I am talking about the Transport and Protocol abstractions defined in PEP 3156. Regards Antoine. From robertc at robertcollins.net Tue Oct 14 22:22:28 2014 From: robertc at robertcollins.net (Robert Collins) Date: Wed, 15 Oct 2014 09:22:28 +1300 Subject: [Web-SIG] WSGI and asyncio (tulip)? 
In-Reply-To: <20141014214149.36e03d31@fsol>
References: <20141014203038.7e4ca1d5@fsol> <20141014214149.36e03d31@fsol>
Message-ID:

On 15 October 2014 08:41, Antoine Pitrou wrote:
> On Wed, 15 Oct 2014 08:40:05 +1300
> Robert Collins wrote:
>> On 15 October 2014 07:30, Antoine Pitrou wrote:
>> > On Tue, 14 Oct 2014 09:47:35 -0700
>> > Guido van Rossum wrote:
>> >>
>> >> I'm wondering if a small extension to the WSGI protocol might be sufficient
>> >> to support this: the special environ variable "wsgi.async_input" could
>> >> optionally be tied to a standard asyncio stream reader (
>> >> https://docs.python.org/3/library/asyncio-stream.html#streamreader), from
>> >> which bytes can be read using "yield from stream.read([nbytes])" or "yield
>> >> from stream.readline()".
>> >
>> > I think it would be frankly better to hook at the transport/protocol
>> > level, and let people wrap that inside an asyncio stream if that's their
>> > preference.
>>
>> For things like mod_wsgi and uwsgi, we're not actually implementing
>> the transport or protocol inside of Python at all - it's all happening
>> in C and often in an entirely separate process.
>
> You may have misunderstood me. I am talking about the Transport and
> Protocol abstractions defined in PEP 3156.

Let's assume I did. Given, say, nginx + uwsgi + asyncio, you're proposing that
there be a uwsgi-asyncio module that listens to the uwsgi socket and demuxes
packets from that into something that then exposes a ReadTransport +
WriteTransport pair and a Protocol on top of that.

That Protocol would have a 1:1 correspondence with a WSGI request, and would
*not* be HTTP itself but rather the subset that is exposed via uwsgi?

-Rob

--
Robert Collins
Distinguished Technologist
HP Converged Cloud

From solipsis at pitrou.net Tue Oct 14 23:04:32 2014
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 14 Oct 2014 23:04:32 +0200
Subject: [Web-SIG] WSGI and asyncio (tulip)?
References: <20141014203038.7e4ca1d5@fsol> <20141014214149.36e03d31@fsol>
Message-ID: <20141014230432.31899e3c@fsol>

On Wed, 15 Oct 2014 09:22:28 +1300
Robert Collins wrote:
> >
> > You may have misunderstood me. I am talking about the Transport and
> > Protocol abstractions defined in PEP 3156.
>
> Let's assume I did. Given say nginx + uwsgi + asyncio, you're proposing
> that there be a uwsgi-asyncio module that listens to the uwsgi socket
> and demuxes packets from that into something that then exposes a
> ReadTransport + WriteTransport pair and a Protocol on top of that.

Let's call it uwsgi-pep3156. It shouldn't be asyncio-specific.

[Note Guido's original remark:

"""However, the situation is different for reading the request body. The app
can read this in dribs and drabs if it wants to, by reading limited amounts of
data from environ['wsgi.input'], but in an asyncio-driven world reading
operations should really be marked with "yield from" (or perhaps just yield --
again, I'm sure an adaptor won't have a problem with this)."""

Ergo, this is about streaming the request and response bodies. Not
asynchronously receiving the headers, etc.]

The server would implement a Transport for the input body stream + output body
stream. It would accept a Protocol factory from the application/middleware.

Then when a request comes:

    protocol = protocol_factory()
    transport = Transport(...)
    protocol.connection_made(transport)

and when an input body chunk is available:

    protocol.data_received(chunk)

and when the input body is finished:

    protocol.eof_received()

The protocol would be able to call transport.get_extra_info(...) to get
HTTP-specific information, e.g. transport.get_extra_info('headers') perhaps.

(that's for the HTTP part; a websockets layer would probably implement a
separate transport and accept a separate protocol factory; actually, it could
be implemented as a protocol that would parse the websockets protocol and
provide its own transport on top of that...
there may already be such a thing on the Internet :-))

Regards

Antoine.

From robertc at robertcollins.net Tue Oct 14 23:48:37 2014
From: robertc at robertcollins.net (Robert Collins)
Date: Wed, 15 Oct 2014 10:48:37 +1300
Subject: [Web-SIG] WSGI and asyncio (tulip)?
In-Reply-To: <20141014230432.31899e3c@fsol>
References: <20141014203038.7e4ca1d5@fsol> <20141014214149.36e03d31@fsol> <20141014230432.31899e3c@fsol>
Message-ID:

On 15 October 2014 10:04, Antoine Pitrou wrote:
...
> (that's for the HTTP part; a websockets layer would probably implement
> a separate transport and accept a separate protocol factory; actually,
> it could be implemented as a protocol that would parse the websockets
> protocol and provide its own transport on top of that... there may
> already be such a thing on the Internet :-))

So that's the bit that I'm having conceptual trouble with - servers may well
implement the framing (and I rather think that they have to do so in some
deployments), so we need to make sure that we don't do that in this context -
we're behind an abstraction.

With the HTTP example you gave, it looks fine, and I'm sure that equivalents
can be made for websockets etc too.

The question in my mind is whether that is amenable to the same unification
and layering that WSGI brought to the HTTP synchronous case. It looks like
you'd make a protocol factory decorator. Will it be sufficiently flexible and
interoperable to be something that becomes a lingua franca? I don't know
pep-3156 well enough to judge that myself.

Seems to me that there are two broad directions here: we can have a WSGI-thing
where it looks just a little different to WSGI, or we can have a pep-3156
Protocol interface. We can share a bunch of logic either way - e.g.
CONTENT_LENGTH etc, but the mechanics of writing middleware might be quite
different.
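
As a rough illustration of the per-request call sequence discussed in this
exchange (factory call, connection_made, data_received, eof_received), here is
a minimal sketch. `BodyTransport`, `EchoUpper` and `serve_one` are
hypothetical names invented for the example, and the server loop is an
assumption about how a server might drive the protocol, not anything a spec
defines. Only `asyncio.Protocol`, `asyncio.Transport` and their callback names
come from PEP 3156 / asyncio.

```python
import asyncio


class BodyTransport(asyncio.Transport):
    """Hypothetical in-memory transport that carries only the response body."""

    def __init__(self):
        super().__init__()
        self.written = []

    def write(self, data):
        self.written.append(bytes(data))


class EchoUpper(asyncio.Protocol):
    """Toy application: upper-cases each request body chunk as it arrives."""

    def connection_made(self, transport):
        self.transport = transport

    def data_received(self, data):
        self.transport.write(data.upper())

    def eof_received(self):
        self.transport.write(b"!")
        return False  # let the transport close once the body is done


def serve_one(protocol_factory, body_chunks):
    """What a server might do for one request (assumed, not specified)."""
    protocol = protocol_factory()        # any zero-argument callable works
    transport = BodyTransport()
    protocol.connection_made(transport)
    for chunk in body_chunks:            # one call per input body chunk
        protocol.data_received(chunk)
    protocol.eof_received()              # input body finished
    return b"".join(transport.written)


result = serve_one(EchoUpper, [b"hel", b"lo"])
```

Because the factory is just a callable, a piece of middleware in this style
would be another callable that returns a wrapping protocol, which is one way
to read the "protocol factory decorator" remark.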
-Rob

--
Robert Collins
Distinguished Technologist
HP Converged Cloud

From solipsis at pitrou.net Wed Oct 15 00:10:34 2014
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 15 Oct 2014 00:10:34 +0200
Subject: [Web-SIG] WSGI and asyncio (tulip)?
References: <20141014203038.7e4ca1d5@fsol> <20141014214149.36e03d31@fsol> <20141014230432.31899e3c@fsol>
Message-ID: <20141015001034.5b049eb0@fsol>

On Wed, 15 Oct 2014 10:48:37 +1300
Robert Collins wrote:
> On 15 October 2014 10:04, Antoine Pitrou wrote:
> ...
> > (that's for the HTTP part; a websockets layer would probably implement
> > a separate transport and accept a separate protocol factory; actually,
> > it could be implemented as a protocol that would parse the websockets
> > protocol and provide its own transport on top of that... there may
> > already be such a thing on the Internet :-))
>
> So that's the bit that I'm having conceptual trouble with - servers may
> well implement the framing (and I rather think that they have to do so
> in some deployments), so we need to make sure that we don't do that in
> this context - we're behind an abstraction.

If the new WSGI specifies that the server implements websocket support, then
indeed the application shouldn't redo it.

> The question in mind is whether that is amenable to the same
> unification and layering that WSGI brought to the HTTP synchronous
> case. It looks like you'd make a protocol factory decorator.

I'm not sure why you would need a decorator. In the PEP 3156 mindset, a
protocol factory can be any 0-argument Python callable: a class, a closure, a
bound method, a global function, etc. Just pass that object to the server when
asking it to listen (or when it tells you it wants to listen, which is the
WSGI approach AFAIU).

> Will it
> be sufficiently flexible and interoperable to be something that
> becomes a lingua franca? I don't know pep-3156 well enough to judge
> that myself.

Yes, it should.
That's the formalism already adopted for serving stuff on listening
connections, e.g. (from PEP 3156):

"""create_server(protocol_factory, host, port, <options>). Enters a serving
loop that accepts connections. [...] Each time a connection is accepted,
protocol_factory is called without arguments(**) to create a Protocol, a
bidirectional stream Transport is created to represent the network side of the
connection, and the two are tied together by calling
protocol.connection_made(transport)."""

Regards

Antoine.

From robertc at robertcollins.net Wed Oct 15 00:28:42 2014
From: robertc at robertcollins.net (Robert Collins)
Date: Wed, 15 Oct 2014 11:28:42 +1300
Subject: [Web-SIG] WSGI and asyncio (tulip)?
In-Reply-To: <20141015001034.5b049eb0@fsol>
References: <20141014203038.7e4ca1d5@fsol> <20141014214149.36e03d31@fsol> <20141014230432.31899e3c@fsol> <20141015001034.5b049eb0@fsol>
Message-ID:

On 15 October 2014 11:10, Antoine Pitrou wrote:

> Each time a connection is accepted, protocol_factory is called without
> arguments(**) to create a Protocol, a bidirectional stream Transport is
> created to represent the network side of the connection, and the two
> are tied together by calling protocol.connection_made(transport)."""

So where would headers etc be supplied to the protocol for reads (and for
outputs)? Since the transport isn't the raw socket, it's the bodies only.

-Rob

--
Robert Collins
Distinguished Technologist
HP Converged Cloud

From solipsis at pitrou.net Wed Oct 15 00:58:52 2014
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 15 Oct 2014 00:58:52 +0200
Subject: [Web-SIG] WSGI and asyncio (tulip)?
References: <20141014203038.7e4ca1d5@fsol> <20141014214149.36e03d31@fsol> <20141014230432.31899e3c@fsol> <20141015001034.5b049eb0@fsol> Message-ID: <20141015005852.2f49ef20@fsol> On Wed, 15 Oct 2014 11:28:42 +1300 Robert Collins wrote: > On 15 October 2014 11:10, Antoine Pitrou wrote: > > > Each time a connection is accepted, protocol_factory is called without > > arguments(**) to create a Protocol, a bidirectional stream Transport is > > created to represent the network side of the connection, and the two > > are tied together by calling protocol.connection_made(transport).""" > > So where would headers etc be supplied to the protocol for reads (and > for outputs)? Since the transport isn't the raw socket, its the bodies > only. For reads, it could be provided on e.g. transport.get_extra_info('headers'). Or if you want a flat mapping API, transport.get_extra_info('http_content_type'), etc. As for out-bound headers, it's a good question. But I think it's ok for the transport to have HTTP-specific methods, so transport.write_header(name, value) could be ok too. Regards Antoine. From robertc at robertcollins.net Mon Oct 20 23:05:36 2014 From: robertc at robertcollins.net (Robert Collins) Date: Tue, 21 Oct 2014 10:05:36 +1300 Subject: [Web-SIG] WSGI and asyncio (tulip)? 
In-Reply-To: <20141015005852.2f49ef20@fsol>
References: <20141014203038.7e4ca1d5@fsol> <20141014214149.36e03d31@fsol> <20141014230432.31899e3c@fsol> <20141015001034.5b049eb0@fsol> <20141015005852.2f49ef20@fsol>
Message-ID:

On 15 October 2014 11:58, Antoine Pitrou wrote:
> On Wed, 15 Oct 2014 11:28:42 +1300
> Robert Collins wrote:
>> On 15 October 2014 11:10, Antoine Pitrou wrote:
>>
>> > Each time a connection is accepted, protocol_factory is called without
>> > arguments(**) to create a Protocol, a bidirectional stream Transport is
>> > created to represent the network side of the connection, and the two
>> > are tied together by calling protocol.connection_made(transport)."""
>>
>> So where would headers etc be supplied to the protocol for reads (and
>> for outputs)? Since the transport isn't the raw socket, it's the bodies
>> only.
>
> For reads, it could be provided on e.g.
> transport.get_extra_info('headers'). Or if you want a flat mapping API,
> transport.get_extra_info('http_content_type'), etc.
>
> As for out-bound headers, it's a good question. But I think it's ok for
> the transport to have HTTP-specific methods, so
> transport.write_header(name, value) could be ok too.

Ok so on balance I suspect that this is different enough to the needs for a
blocking API - even with generators in use - that we're better off keeping
them separate.

But, for all the variables, headers etc, we can write a single definition of
the semantics at play - e.g. see the REMOTE_ADDR one. Would that make sense as
separate specs, the way RFC2616 has split into separate RFCs in the 723x
update? Or one big spec with sections, and when a pep-3156 based protocol/api
is put together just add a section?

-Rob

--
Robert Collins
Distinguished Technologist
HP Converged Cloud
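
To make the header handling discussed at the end of this thread a little more
concrete, here is a minimal sketch. It assumes request headers are exposed
through `get_extra_info('headers')` (a real `asyncio.BaseTransport` method)
and that the transport grows an HTTP-specific `write_header(name, value)`,
which is hypothetical and only floated in the thread. `HttpBodyTransport` and
`LengthReporter` are invented names for illustration.

```python
import asyncio


class HttpBodyTransport(asyncio.Transport):
    """Hypothetical per-request transport: body bytes plus header access."""

    def __init__(self, request_headers):
        # BaseTransport stores this dict and serves it via get_extra_info().
        super().__init__(extra={"headers": dict(request_headers)})
        self.response_headers = []
        self.body = []

    def write_header(self, name, value):
        # Not part of asyncio; the HTTP-specific method suggested in the thread.
        self.response_headers.append((name, value))

    def write(self, data):
        self.body.append(bytes(data))


class LengthReporter(asyncio.Protocol):
    """Toy app: compares received body size with the declared Content-Length."""

    def connection_made(self, transport):
        self.transport = transport
        headers = transport.get_extra_info("headers", {})
        self.expected = int(headers.get("Content-Length", 0))
        self.received = 0

    def data_received(self, data):
        self.received += len(data)

    def eof_received(self):
        self.transport.write_header("Content-Type", "text/plain")
        message = "got %d of %d bytes" % (self.received, self.expected)
        self.transport.write(message.encode("ascii"))
        return False


# Drive one request by hand, the way a server might (an assumption).
transport = HttpBodyTransport({"Content-Length": "5"})
protocol = LengthReporter()
protocol.connection_made(transport)
protocol.data_received(b"hello")
protocol.eof_received()
```

A flat-key variant such as `get_extra_info('http_content_type')` would only
change the keys passed as `extra`; the protocol side stays the same.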