[Web-SIG] PEP 333 and gzipping of responses

Tue Aug 11 04:11:36 CEST 2009

Earlier today I posted an article on my blog following up on some
discussions of WSGI; one criticism presented was of language in PEP
333 regarding gzipping of responses by WSGI applications. Ian posted a
comment which stated that the criticism was not correct, but I'm at a
loss to figure out what *is* correct, so I'll bring up the question
here.

In a parenthetical at the end of the section entitled "Handling the
Content-Length Header", PEP 333 states:

> Note: applications and middleware must not apply any kind of
> Transfer-Encoding to their output, such as chunking or gzipping; as
> "hop-by-hop" operations, these encodings are the province of the
> actual web server/gateway. See Other HTTP Features below, for more
> details.

In the section "Other HTTP Features", PEP 333 states, in part:

> However, because WSGI servers and applications do not communicate
> via HTTP, what RFC 2616 calls "hop-by-hop" headers do not apply to
> WSGI internal communications. WSGI applications must not generate
> any "hop-by-hop" headers [4], attempt to use HTTP features that
> would require them to generate such headers, or rely on the content
> of any incoming "hop-by-hop" headers in the environ dictionary.

My criticism is that this language is at best ambiguous, and quite
possibly misleading to readers of the PEP.

The ambiguity here is that "gzip" is a valid value for the
Transfer-Encoding header in HTTP (RFC 2616, Sections 3.6 and 14.41),
but is also a valid value for the Content-Encoding header (RFC 2616,
Sections 3.5 and 14.11).
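
As a quick illustration of why the choice of header matters, the
standard library's wsgiref.util.is_hop_by_hop() can be asked about
each of the two. This is only a sketch of the distinction RFC 2616
draws, not anything taken from PEP 333 itself; the variable names are
my own.

```python
from wsgiref.util import is_hop_by_hop

# Transfer-Encoding is on RFC 2616's hop-by-hop list, so it falls under
# the PEP 333 prohibition on applications generating such headers.
transfer_is_hop = is_hop_by_hop("Transfer-Encoding")

# Content-Encoding is an end-to-end header, so the hop-by-hop
# prohibition, read literally, does not cover it.
content_is_hop = is_hop_by_hop("Content-Encoding")

print("Transfer-Encoding hop-by-hop?", transfer_is_hop)
print("Content-Encoding hop-by-hop?", content_is_hop)
```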

Web frameworks and libraries (in many languages, not just Python)
which support gzipping of responses all seem to opt for the latter
method. Additionally, Apache's mod_deflate -- which so far as I know
is overwhelmingly the most common mechanism for enabling gzipping at
the server level -- also opts for this method, and uses the
Content-Encoding header.
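
For concreteness, here is roughly what that approach looks like when
done as WSGI middleware. This is a minimal sketch in Python 3, assuming
it is acceptable to buffer the whole response in memory; the name
gzip_middleware is hypothetical and this is not the implementation of
any particular framework. Note that it only ever touches
Content-Encoding, never Transfer-Encoding.

```python
import gzip


def gzip_middleware(app):
    """Hypothetical middleware: gzip the body, signal it via Content-Encoding."""

    def wrapped(environ, start_response):
        # Pass the response through untouched if the client did not ask for gzip.
        if "gzip" not in environ.get("HTTP_ACCEPT_ENCODING", ""):
            return app(environ, start_response)

        captured = {}
        chunks = []

        def capture(status, headers, exc_info=None):
            # Hold the status and headers until the body has been compressed.
            captured["status"] = status
            captured["headers"] = list(headers)
            # PEP 333 requires start_response to return a write() callable.
            return chunks.append

        result = app(environ, capture)
        try:
            for chunk in result:
                chunks.append(chunk)
        finally:
            if hasattr(result, "close"):
                result.close()

        body = gzip.compress(b"".join(chunks))

        # Drop headers that described the uncompressed body, then add new ones.
        headers = [
            (name, value)
            for name, value in captured["headers"]
            if name.lower() not in ("content-length", "content-encoding")
        ]
        headers.append(("Content-Encoding", "gzip"))
        headers.append(("Content-Length", str(len(body))))
        headers.append(("Vary", "Accept-Encoding"))

        start_response(captured["status"], headers)
        return [body]

    return wrapped
```

Buffering keeps the sketch short and lets it set an accurate
Content-Length; a streaming version would have to compress
incrementally and leave Content-Length out.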

Given this, gzipping of responses seems to be rather universally
associated, in the minds of web developers, with the Content-Encoding
header, which is not a "hop-by-hop" header (RFC 2616, Section
13.5.1). As such, the immediate (and misleading) impression given to
readers of PEP 333 will likely be one of:

1. PEP 333 forbids applications using Content-Encoding to signal
   gzipped response bodies (since it mentions gzipping as something
   applications specifically must not do), or

2. PEP 333 is ambiguous or contradictory on account of mentioning
   Transfer-Encoding and "hop-by-hop" headers in a context in which
   no one uses Transfer-Encoding or a "hop-by-hop" header, or

3. This text in PEP 333 is based upon a misunderstanding of this
   feature of HTTP or of its use in the real world.

None of these seems particularly good, and this is why I took that
section of the spec to task (albeit in a much briefer and more cursory
fashion than here; this message is already starting to run a bit long).

If I'm misreading or misunderstanding either PEP 333 or RFC 2616, I'd
appreciate it if someone would explain where I've gone astray. But as
it stands, I believe the text of PEP 333 quoted above is problematic
and likely to lead to confusion, and should probably be revised to
address these concerns.

-- 
"Bureaucrat Conrad, you are technically correct -- the best kind of correct."