[Python-bugs-list] [ python-Bugs-478038 ] urlparse.urlparse semicolon bug
noreply@sourceforge.net
Mon, 05 Nov 2001 09:58:50 -0800
Bugs item #478038, was opened at 2001-11-04 09:19
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=478038&group_id=5470
Category: Python Library
Group: Python 2.1.1
Status: Open
Resolution: None
Priority: 5
Submitted By: Nobody/Anonymous (nobody)
>Assigned to: Fred L. Drake, Jr. (fdrake)
Summary: urlparse.urlparse semicolon bug
Initial Comment:
urlparse.urlparse uses obsolete parsing rules. It expects there to
be no more than one semicolon in a URL, as in:
http://127.0.0.1:8880/semitest/foo;presentation=edit?x=y
It splits the URL into parts, one of which is the part between
the semicolon and the question mark. This behavior is based
on an obsolete URL spec.
Recent specs, including the RFCs referenced in the urlparse
documentation, allow semicolons in each path segment, as in:
http://127.0.0.1:8880/semitest/foo;presentation=edit/form/spam;eggs=1/splat
urlparse.urlparse parses a URL of this form as follows:
[jim@c ZServer]$ python2.2
Python 2.2b1 (#1, Oct 22 2001, 17:42:33)
[GCC 2.95.3 19991030 (prerelease)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Py$ from urlparse import urlparse
Py$ urlparse("http://127.0.0.1:8880/semitest/foo%3Bbar;presentation=edit/form/spam;eggs=1/splat")
('http', '127.0.0.1:8880', '/semitest/foo%3Bbar', 'presentation=edit/form/spam;eggs=1/splat', '', '')
Py$
This is incorrect because much of the path is included in the
obsolete "params" part.
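For illustration only, here is a rough sketch of what per-segment
handling could look like. It is not a proposed patch. It assumes a
modern Python 3, where the module is urllib.parse, and
split_segment_params is a hypothetical helper name, not part of the
library. urlsplit is used because it leaves semicolons in the path
untouched.

```python
from urllib.parse import urlsplit


def split_segment_params(path):
    """Hypothetical helper: return (segment, params) pairs, one per path segment.

    Everything after the first ';' in a segment is treated as that
    segment's parameters. Percent-encoded semicolons (%3B) are left alone.
    """
    pairs = []
    for segment in path.split("/"):
        name, _, params = segment.partition(";")
        pairs.append((name, params))
    return pairs


url = ("http://127.0.0.1:8880/semitest/foo%3Bbar;presentation=edit"
       "/form/spam;eggs=1/splat")
parts = urlsplit(url)  # does not split off a single "params" field
segments = split_segment_params(parts.path)

for name, params in segments:
    print(repr(name), repr(params))
```

With this approach the whole path stays in the path component, and
the parameters remain attached to the segment they belong to.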
----------------------------------------------------------------------