R. David Murray <rdmurray@bitdance.com> added the comment:

How about this:

-   If the scheme value is not specified, urlparse following the syntax
-   specifications from RFC 1808, expects the netloc value to start with '//',
-   Otherwise, it is not possible to distinguish between net_loc and path
-   component and would classify the indistinguishable component as path as in
-   a relative url.
+   Following the syntax specifications in RFC 1808, urlparse recognizes
+   a netloc only if it is properly introduced by '//'.  Otherwise the
+   input must be presumed to be a relative URL and thus to start with
+   a path component.

However, it seems to me there is a bug here:

  urlparse.urlparse('www.k.com:80/path')
    ParseResult(scheme='', netloc='', path='www.k.com:80/path', params='', query='', fragment='')

  urlparse.urlparse('www.k.com:path')
    ParseResult(scheme='www.k.com', netloc='', path='path', params='', query='', fragment='')
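
For comparison, here is a minimal sketch of the two rules in play. It is an illustration under stated assumptions, not the library's implementation. It assumes Python 3, where the urlparse module lives at urllib.parse, and it assumes the scheme rule from RFC 1808 section 2.4.2: a scheme is whatever precedes the first ':' when that colon comes after the first character and everything before it is alphanumeric, '+', '.', or '-'. The helper name split_scheme is hypothetical and exists only for this example.

```python
import string
from urllib.parse import urlparse  # Python 3 location of the urlparse module

# Characters assumed legal in a scheme name, per RFC 1808:
# alphanumerics plus "+", "." and "-".
SCHEME_CHARS = set(string.ascii_letters + string.digits + "+.-")


def split_scheme(url):
    """Hypothetical helper: split off a scheme the way RFC 1808 2.4.2 reads.

    Returns (scheme, rest). The scheme is empty when no colon appears after
    the first character, or when an illegal scheme character precedes it.
    """
    colon = url.find(":")
    if colon > 0 and all(c in SCHEME_CHARS for c in url[:colon]):
        return url[:colon], url[colon + 1:]
    return "", url


# Under this reading, both inputs from the report split the same way.
first = split_scheme("www.k.com:80/path")
second = split_scheme("www.k.com:path")

# The '//' rule from the proposed wording: a netloc is recognized only
# when it is introduced by '//'; otherwise the text is treated as a path.
with_slashes = urlparse("//www.k.com/path")
without_slashes = urlparse("www.k.com/path")
```

The helper only covers the scheme step. It says nothing about what any particular Python release does with the two colon-containing inputs, which is the open question in this issue.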

I think the second result (for 'www.k.com:path') is correct, and that the
first call (for 'www.k.com:80/path') should instead produce

    ParseResult(scheme='www.k.com', netloc='', path='80/path', params='', query='', fragment='')

I haven't read all the way through the RFC again, though.  But *one* of
the two results above is wrong.

----------
nosy: +r.david.murray

_______________________________________
Python tracker <report@bugs.python.org>
<http://bugs.python.org/issue10226>
_______________________________________