Ok, so in the sort term you are suggesting to change Request.URLPath (uppercased method? Hmm) to use Host header instead of getRequestHostname and to change Klein to use it instead of Request.getHost(), right?
Sounds wise and reasonable :)

But I'd like to add one more thing. In order to build correct external URL we need to know is it http or https. Currently URLPath is using Request.isSecure(), but it is not sufficient since Request.isSecure() just checks if backend connection is SSL while encryption is often terminated at a reverse proxy. Can we add "useXForwardedProto=False" argument to Request.URLPath() and check X-Forwarded-Proto header if it is true? And may be add "useXForwardedHost=False" too to simplify setting up a reverse proxy (with a bold red warning in docstring that it can be set to True only if reverse proxy is correctly configured to drop corresponding client-specified headers). What do you think?

-- ilya

ср, 15 мар. 2017 г. в 9:10, Glyph Lefkowitz <glyph@twistedmatrix.com>:
On Mar 14, 2017, at 3:00 PM, Ilya Skriblovsky <ilyaskriblovsky@gmail.com> wrote:

Tickets you have mentioned and forwarded-for-5807 branch are mostly about parsing X-Forwarded-For in order to obtain correct client IP. While it is valuable task, it is not what strikes me right now.

Sorry, I was pretty tired when I wrote that message, and I realize that I was getting server identification and client identification mixed up.

I'm now more concerned with an absence of API for getting user-visible server's name, not client's ip.

Yes.  My mistake.  (Although totally fix those other bugs too. They're also bad. :))

Look, I'm currently porting my app from Django to Klein and noticed strange behavior of Klein. For example:
@app.route('/alias', alias=True)
@app.route('/path')
def path(request): return b'42'

When /alias is requested werkzeug generates a redirect to /path. But Klein is passing Request.getHost() to Werkzeug, so redirect gets internal hostname and exposes backend's internal hostname and port to the user. Seems like Klein is passing incorrect hostname to Werkzeug. But how can we fix that?

There are two methods in Request:
• Request.getHost() — "Get my originally requesting transport's host" as doc says. Ok, seems like this method intentionally returns server's internal address.
• Request.getRequestHostname() —doc says:
>> "Get the hostname that the user passed in to the request. This will either use the Host: header (if it is available) or the host we are listening on if the header is unavailable."
Cool, but why does this method only returns a hostname without a port? It intentionally strips out the port number from Host header. What is the point of such implementation?

Request is one of the oldest parts of Twisted, so the likely reason is "it looked like a good idea at the time".  Request predates the requirement for test coverage, documentation coverage, and, in many cases, the author (me) having any idea what they were doing :).  If you find something that looks bad, it's probably just bad, there is unlikely any deeper reason.

Long term, we need to overhaul the API to have fewer methods and be generally less confusing. See for example the infamous https://twistedmatrix.com/trac/ticket/288 ticket.  However, before we do that, we should make all the stuff that is there already behave correctly and be documented even in its weird shape; then we can transition to a new good thing confident in the knowledge that no old applications will break and that users can move over to the new APIs without massive disruption.

This method is used only a couple of times inside Twisted itself, and in both places Twisted gets what getRequestHostname() returned and mixes it with request.getHost().port which is *definitely* incorrect, because the former is user-visible while latter is internal. So if my backend server is using different port than a fronend, it is impossible to use getRequestHostname() to build user-visible URL. I think current getRequestHostname() implementation is broken.

So I have two proposals:

Proposal #1 (fixing current behavior):
• Variant #1: Change Request.getRequestHostname() to return b"hostname:port". I think this is the correct thing to do, but this is a backward-incompatible change.
- or -
• Variant #2: Change Klein to use Request.getHeader(b'Host') with fallback to Request.getHost()

Proposal #2 (adding new feature if Variant #1 is choosed):
• Add useXForwardedHost=False argument to Request.getRequestHostname() and useXForwardedProto=False to Request.isSecure(). If True is passed, these methods will obey corresponding request headers that are de-facto standard for reverse proxies. Also add corresponding options to Klein app. This can simplify reverse proxy configuration a bit.

I have a third proposal.

Ideally if we want to know about the URL for the request, we could ask the request to just give us the URL.  And in fact the URL does have a method, URLPath(), which is both (A) unambiguously broken (the case could be made that getRequestHostname() is supposed to really just be a host, not for URL generation, and maybe there is even a case where that makes sense; origin comparisons perhaps) and (B) returning a data structure which could be fixed to be correct without concern for client compatibility.

In the long term, we should get rid of all these methods and have a single 'request.url()' method which cleanly and correctly returns a https://twistedmatrix.com/documents/17.1.0/api/twisted.python.url.URL.html object, which is better than a string or a URLPath (basically, it's what URLPath should have been if we had designed it carefully).  In the meanwhile, without adding a bunch of new API surface and abandoning existing methods, Request.URLPath() is the easiest place to put this fix.

getRequestHostname is, as you correctly called out, probably useless, but we should just adjust its docstring to direct users to the URLPath method instead.

Klein should then be changed to use Request.URLPath() to build any URLs.

What do you think of this proposal?  Does my reasoning make sense?

-glyph

_______________________________________________
Twisted-web mailing list
Twisted-web@twistedmatrix.com
http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-web