[Web-SIG] urlparse method behaviour when handling abs/rel urls

Robert Brewer fumanchu at aminus.org
Fri Jun 27 21:36:33 CEST 2008


Fred Drake wrote:
> On Fri, Jun 27, 2008 at 3:01 PM, O.R.Senthil Kumaran
> <orsenthil at gmail.com> wrote:
> > BTW, commonly when someone writes 'www.python.org', we tend to
> > understand that he is referring to net_loc. Is it not?
> > And also, when we type 'www.python.org' at Address Location in the
> > Browser, it automatically gets translated to http://www.python.org as
> > the full url and www.python.org becomes net_loc in this case.
> 
> There are two cases here:
> 
> 1. Relative URLs in a context that has a base URL (inside a resource
> loaded from a URL, or in an (X)HTML document that includes a <base>
> element).
> 
> 2. Abbreviated URLs in a user interface that implies no context with a
> base URL (like the browser's address bar).
> 
> I'd suggest that these are completely different.  urlsplit and
> urlparse support 1.  If we want the second, that should be a separate
> function.  It would be reasonable to add that to the urlparse module
> (urllib.parse in Python 3).
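
To illustrate the two cases, here is a minimal sketch using Python 3's
urllib.parse. The base URL and the helper expand_abbreviated_url are
hypothetical, made up for this example; nothing like the helper exists
in the standard library, and a real address bar applies far more
heuristics than this.

```python
from urllib.parse import urlsplit, urljoin

# Case 1: a relative reference. With no scheme and no leading "//",
# urlsplit treats the whole string as a path, not a net_loc.
parts = urlsplit("www.python.org")
print(parts.netloc)  # ''
print(parts.path)    # 'www.python.org'

# Resolved against a (hypothetical) base URL, it stays a path.
base = "http://example.com/docs/index.html"
print(urljoin(base, "www.python.org"))
# http://example.com/docs/www.python.org


# Case 2: an abbreviated URL typed with no base context.
# Hypothetical helper: prepend a default scheme when none is given.
def expand_abbreviated_url(text, default_scheme="http"):
    if "://" in text:
        return text
    return default_scheme + "://" + text


expanded = expand_abbreviated_url("www.python.org")
print(expanded)                   # http://www.python.org
print(urlsplit(expanded).netloc)  # www.python.org
```

Keeping the second behaviour in its own function leaves urlsplit and
urljoin free to follow the relative-reference rules without guessing.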

There's even a 3rd case: HTTP's Request-URI. For example, '//path' must
be treated as an abs_path consisting of two path_segments ['', 'path'],
not a net_loc, since the Request-URI must be one of ("*" | absoluteURI |
abs_path | authority).
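
A rough sketch of the difference, using Python 3's urllib.parse for the
generic behaviour. The function classify_request_uri is a hypothetical
helper written for this message, not the CherryPy code and not a
standard library API; it only distinguishes the four Request-URI forms
and does no validation.

```python
from urllib.parse import urlsplit

# Generic URL parsing: a leading "//" introduces a net_loc.
generic = urlsplit("//path")
print(generic.netloc)  # 'path'
print(generic.path)    # ''


def classify_request_uri(uri):
    """Hypothetical helper: pick one of the four Request-URI forms."""
    if uri == "*":
        return ("*", None)
    if uri.startswith("/"):
        # abs_path: everything before "?" is the path; drop the
        # leading "/" and split the rest into path segments.
        path, _, _query = uri.partition("?")
        return ("abs_path", path[1:].split("/"))
    if "://" in uri:
        return ("absoluteURI", urlsplit(uri))
    # Anything else is taken to be an authority (e.g. for CONNECT).
    return ("authority", uri)


request = classify_request_uri("//path")
print(request)  # ('abs_path', ['', 'path'])
```

The point is only that a server has to check for a leading "/" before
handing the string to a generic URL parser.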


Robert Brewer
fumanchu at aminus.org

See
http://www.cherrypy.org/browser/branches/815-urljoin/cherrypy/wsgiserver/__init__.py#L247
for an implementation.

