[docs] [issue10226] urlparse example is wrong

Sat Oct 30 16:51:22 CEST 2010

R. David Murray <rdmurray at bitdance.com> added the comment:

How about this:

-  If the scheme value is not specified, urlparse following the syntax
-  specifications from RFC 1808, expects the netloc value to start with '//',
-  Otherwise, it is not possible to distinguish between net_loc and path
-  component and would classify the indistinguishable component as path as in
-  a relative url.

+  Following the syntax specifications in RFC 1808, urlparse recognizes
+  a netloc only if it is properly introduced by '//'.  Otherwise the
+  input must be presumed to be a relative URL and thus to start with
+  a path component.
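
As a quick illustration of the proposed wording, here is a minimal sketch. It assumes Python 3, where the old urlparse module lives in urllib.parse; the example URLs are made up for the purpose. It only shows the two cases the wording covers: input introduced by '//' and input with neither a '//' nor a colon.

```python
from urllib.parse import urlparse  # assumption: Python 3 location of urlparse

# Introduced by '//': the host:port part is recognized as the netloc.
with_slashes = urlparse('//www.k.com:80/path')
print(with_slashes.scheme, with_slashes.netloc, with_slashes.path)

# No '//' and no colon: the whole input is treated as a relative path.
relative = urlparse('docs/index.html')
print(relative.scheme, relative.netloc, relative.path)
```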

However, it seems to me there is a bug here:

>>> urlparse.urlparse('www.k.com:80/path')
ParseResult(scheme='', netloc='', path='www.k.com:80/path', params='',
query='', fragment='')
>>> urlparse.urlparse('www.k.com:path')
ParseResult(scheme='www.k.com', netloc='', path='path', params='',
query='', fragment='')

I think the second one is correct and that the first one should produce

ParseResult(scheme='www.k.com', netloc='', path='80/path', params='',
query='', fragment='')

I haven't read all the way through the RFC again, though.  But *one*
of the above is wrong.
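
For comparison, the scheme rule from RFC 1808 section 2.4.2 (everything before the first colon is the scheme, provided it consists only of letters, digits, '+', '-' and '.') can be sketched on its own. This is not urlparse's implementation; split_scheme is a hypothetical helper written just for this comparison, and it lowercases the scheme the way urlparse does.

```python
import re

# Scheme characters per RFC 1808: alphanumerics, "+", "-", "."
# The colon must come after at least one such character.
_SCHEME_RE = re.compile(r'^([A-Za-z0-9+.\-]+):')

def split_scheme(url):
    """Hypothetical helper: split off a scheme using only the RFC 1808 rule."""
    match = _SCHEME_RE.match(url)
    if match is None:
        # No colon, or a disallowed character appears before it.
        return '', url
    return match.group(1).lower(), url[match.end():]

for url in ('www.k.com:80/path', 'www.k.com:path', '//www.k.com:80/path'):
    print(url, '->', split_scheme(url))
```

Under this rule alone, both colon-containing inputs above get 'www.k.com' as their scheme, while the '//' form has no scheme because '/' is not a scheme character.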

----------
nosy: +r.david.murray

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue10226>
_______________________________________