[Flask] HTTP/1.1 connection re-use broken in Flask/werkzeug development server?

Daniel Lenski dlenski at gmail.com
Mon May 24 20:36:42 EDT 2021


On Mon, May 24, 2021 at 3:27 PM Daniel Lenski <dlenski at gmail.com> wrote:
> I recently discovered that the Flask development server appears to be unable to correctly handle HTTP/1.1 connection reuse [2]. When a connection to the development server is reused for multiple requests, the *body* of one request will be misinterpreted as a new HTTP/1.1 request line.

In attempting to come up with a minimal example of this problem, I
both overstated the scope and sent a non-functional example.
Sorry to anyone who has already read it.

Here's one that I've just tested, which I believe clarifies the scope
of the problem:

     from flask import Flask, abort, request
     from werkzeug.serving import WSGIRequestHandler
     WSGIRequestHandler.protocol_version = "HTTP/1.1"

     app = Flask(__name__)

     # Accepts POST and consumes body
     @app.route('/consume_body', methods=('POST',))
     def consume_body():
         n = len(request.form)
         return 'Form contained %d fields\n' % n, 200

     # Accepts POST but doesn't consume body
     @app.route('/ignore_body', methods=('POST',))
     def ignore_body():
         return 'Ignored body', 200

     # Accepts POST but fails before consuming body
     @app.route('/fail_first', methods=('POST',))
     def fail_first():
         abort(500)
         n = len(request.form)
         return 'Form contained %d fields\n' % n, 200

     # Run it
     app.run(host='localhost', port=8080, debug=True)

If I reuse an HTTP/1.1 connection to issue multiple requests to a
handler where the request body is consumed, it WORKS FINE:
    curl -X POST http://localhost:8080/consume_body -d foo=bar -: -X POST http://localhost:8080/consume_body -d 'a=b&x=y'
    Form contained 1 fields
    Form contained 2 fields

    server log shows:
    127.0.0.1 - - [24/May/2021 15:59:26] "POST /consume_body HTTP/1.1" 200 -
    127.0.0.1 - - [24/May/2021 15:59:26] "POST /consume_body HTTP/1.1" 200 -

However, if I reuse an HTTP/1.1 connection to issue multiple requests
to a route where the request body ISN'T consumed, the body of earlier
requests is misinterpreted as the beginning of a subsequent request:
    curl -X POST http://localhost:8080/fail_first -d foo=bar -: -X POST http://localhost:8080/consume_body -d 'a=b&x=y'

    server log shows:
    127.0.0.1 - - [24/May/2021 16:01:00] "POST /fail_first HTTP/1.1" 500 -
    127.0.0.1 - - [24/May/2021 16:01:00] "foo=barPOST /consume_body HTTP/1.1" 405 -
    127.0.0.1 - - [24/May/2021 16:01:00] code 400, message Bad request syntax ('a=b&x=y')
    127.0.0.1 - - [24/May/2021 16:01:00] "None /consume_body HTTP/0.9" HTTPStatus.BAD_REQUEST -

What I now understand is going on here: the development server doesn't
consume the request body until it is explicitly referenced, which
happens in werkzeug.wrappers.Request.get_data():
https://github.com/pallets/werkzeug/blob/main/src/werkzeug/wrappers/request.py#L367-L373

This means that if the request body isn't consumed, then the request
body is never read and the stream is at the wrong place for the next
request. This could happen due to a mistake in the code, due to an
error in the handler, or due to the fact that a request body wasn't
expected (e.g. a Content-Length header and body added to a GET
request).
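
To illustrate the mechanism, here is a small stdlib-only simulation (not
werkzeug's actual code). The read_request() and make_wire() helpers are
hypothetical names of my own; an io.BytesIO stands in for the reused
socket. It shows how a body left unread ends up at the front of the next
request line:

```python
import io

def read_request(stream, consume_body):
    """Parse one request from `stream`; return its request line, or None at EOF."""
    request_line = stream.readline().decode("latin-1").rstrip("\r\n")
    if not request_line:
        return None
    content_length = 0
    while True:
        header = stream.readline().decode("latin-1").rstrip("\r\n")
        if not header:
            break  # a blank line (or EOF) ends the headers
        name, _, value = header.partition(":")
        if name.strip().lower() == "content-length":
            content_length = int(value.strip())
    if consume_body:
        # Only a handler that reads the body advances the stream past it.
        stream.read(content_length)
    return request_line

def make_wire():
    """Two pipelined POST requests, as they would arrive on one connection."""
    first = b"POST /fail_first HTTP/1.1\r\nContent-Length: 7\r\n\r\nfoo=bar"
    second = b"POST /consume_body HTTP/1.1\r\nContent-Length: 7\r\n\r\na=b&x=y"
    return io.BytesIO(first + second)

# Bodies consumed: both request lines are parsed correctly.
wire = make_wire()
good = [read_request(wire, True), read_request(wire, True)]

# Bodies never consumed: each leftover body is parsed as a request line.
wire = make_wire()
bad = [read_request(wire, False) for _ in range(3)]

print(good)
print(bad)
```

The second list has the same shape as the server log above: the first
body is glued onto the following request line, and the second body is
treated as a request of its own.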

One workaround is to wrap EVERY handler in 'try: ... finally:
request.data' to ensure that the request body is ALWAYS read before
returning to the server. However, that seems like a fairly large
burden to place on the user.
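
A minimal sketch of that workaround as a reusable decorator, to avoid
repeating the try/finally by hand. The name always_read_body is
hypothetical, and the body-reading callable is passed in so the sketch
does not depend on Flask; in a Flask app I would expect to pass
'lambda: request.get_data()':

```python
import functools

def always_read_body(read_body):
    """Build a decorator that calls read_body() after the view, even on error."""
    def decorator(view):
        @functools.wraps(view)
        def wrapper(*args, **kwargs):
            try:
                return view(*args, **kwargs)
            finally:
                # Runs whether the view returned normally or raised.
                read_body()
        return wrapper
    return decorator

# Assumed usage with Flask (requires `from flask import request`):
#
#   @app.route('/fail_first', methods=('POST',))
#   @always_read_body(lambda: request.get_data())
#   def fail_first():
#       abort(500)
```

Note that this only helps for requests that actually reach the view: a
request rejected during routing (such as the 405 in the log above) never
calls the wrapped function, so its body would still be left unread.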

Ideas I had to make this behavior less surprising:

1. Ensure that werkzeug's WSGIRequestHandler.run_wsgi forcibly closes
the connection if the input stream hasn't been fully read after the
WSGI handler runs
(https://github.com/pallets/werkzeug/blob/main/src/werkzeug/serving.py#L275-L280)
2. Ensure that werkzeug's WSGIRequestHandler.run_wsgi reads to the end
of the input stream if it hasn't already been read after the WSGI
handler runs.

Problems:
- Deciding that the input stream "has been fully read" is only easy in
the cases where there's an explicit 'Content-Length' header in the
*request*.
- "Fully reading" the stream might take a lot of time and memory if
the request 'Content-Length' is large.
- "Fully reading" the stream could take infinite time and memory with
a malicious/infinite chunked Transfer-Encoding.
- In the case of HTTP CONNECT method (which turns the HTTP connection
back into a standard 2-way TCP socket), the *only* solution is to
close the connection.

My preferred solution would be to read to the end of the input stream
if 'Content-Length' is set and not too large (perhaps following the
MAX_CONTENT_LENGTH configuration as in werkzeug.wrappers.Request;
https://github.com/pallets/flask/blob/main/src/flask/wrappers.py#L57)
and to simply close the connection otherwise.
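
Roughly, the decision I have in mind could look like the sketch below.
This is not existing werkzeug code: drain_or_close(), its parameters,
and the 64 KiB default limit are all hypothetical, and it assumes the
server can tell how many body bytes the application already read.

```python
import io

def drain_or_close(rfile, content_length, bytes_read, max_drain=64 * 1024):
    """Return True if the connection can be reused, False if it should be closed.

    rfile          -- the connection's input stream
    content_length -- the request's Content-Length as an int, or None if absent
    bytes_read     -- body bytes the application already consumed (assumed known)
    max_drain      -- hypothetical cap on how much leftover body we will discard
    """
    if content_length is None:
        # No Content-Length (e.g. chunked encoding): per the proposal, close.
        return False
    remaining = content_length - bytes_read
    if remaining <= 0:
        return True  # body fully consumed, the stream is already in position
    if remaining > max_drain:
        return False  # too much to discard, cheaper to close
    while remaining > 0:
        chunk = rfile.read(min(remaining, 8192))
        if not chunk:
            return False  # client sent less than it promised
        remaining -= len(chunk)
    return True

# Example: a 7-byte body left unread, followed by the next request.
wire = io.BytesIO(b"foo=barPOST /consume_body HTTP/1.1\r\n")
reusable = drain_or_close(wire, content_length=7, bytes_read=0)
next_line = wire.readline()
```

After the call, the stream is positioned at the start of the next
request line, so the connection can be reused safely; in every other
case the caller would close the connection instead.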

I'd be interested in hearing any input before I try to code this up…

Thanks,
Dan

