[Python-bugs-list] [ python-Bugs-516299 ] urlparse can get fragments wrong

noreply@sourceforge.net
Thu, 14 Mar 2002 09:52:03 -0800


Bugs item #516299, was opened at 2002-02-11 23:10
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=516299&group_id=5470

Category: Python Library
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: A.M. Kuchling (akuchling)
>Assigned to: Nobody/Anonymous (nobody)
Summary: urlparse can get fragments wrong

Initial Comment:
urlparse.urlparse() goes wrong on a URL such as 'http://amk.ca#foo',
where there's a fragment identifier and the hostname isn't followed
by a slash.  It returns 'amk.ca#foo' as the hostname portion of the
URL.
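
For reference, a minimal sketch of the behaviour the report expects. It assumes a current Python 3, where the function lives in urllib.parse rather than in the urlparse module discussed in this report; the variable name `parts` is only illustrative.

```python
from urllib.parse import urlparse

# The URL from the report: a fragment, and no slash after the hostname.
parts = urlparse('http://amk.ca#foo')

# The hostname portion should stop at '#'; the rest is the fragment.
print(parts.netloc)    # amk.ca
print(parts.fragment)  # foo
```

With the bug described above, the hostname portion would instead come back as 'amk.ca#foo'.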

While looking at that, I realized that test_urlparse() only tests
urljoin(), not urlparse() or urlunparse().  The attached patch also
adds a minimal test suite for urlparse(), but it should be more
comprehensive still.  Unfortunately the RFC doesn't include test
cases, so I haven't done this yet.
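
A rough sketch of what such a test could look like. This is not the attached patch: the case table and the class name UrlParseTest are made up for illustration, and it assumes Python 3's urllib.parse and unittest. Each case checks the parsed 6-tuple and that urlunparse() reproduces the original URL.

```python
import unittest
from urllib.parse import urlparse, urlunparse

# Hypothetical cases: (url, (scheme, netloc, path, params, query, fragment))
CASES = [
    ('http://amk.ca#foo',
     ('http', 'amk.ca', '', '', '', 'foo')),
    ('http://amk.ca/#foo',
     ('http', 'amk.ca', '/', '', '', 'foo')),
    ('http://www.python.org/doc/?q=1#top',
     ('http', 'www.python.org', '/doc/', '', 'q=1', 'top')),
]


class UrlParseTest(unittest.TestCase):
    def test_parse_and_roundtrip(self):
        for url, expected in CASES:
            parsed = urlparse(url)
            # The parse result behaves like a 6-tuple.
            self.assertEqual(tuple(parsed), expected)
            # Unparsing should give back the URL we started with.
            self.assertEqual(urlunparse(parsed), url)
```

Saved to a file, this can be run with `python -m unittest`. The first case is the one from this report.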

(Assigned to you at random, Michael; feel free to unassign it if you
lack the time.)


----------------------------------------------------------------------

>Comment By: A.M. Kuchling (akuchling)
Date: 2002-03-14 12:52

Message:
Logged In: YES 
user_id=11375

Unassigning -- anyone want to review my bug fix so I can check it in?

(leogah's idea of using the regex from RFC2396 is a good one, but
that large a change should probably go into 2.3, not a .1 release.)


----------------------------------------------------------------------

Comment By: Richard Brodie (leogah)
Date: 2002-02-20 08:56

Message:
Logged In: YES 
user_id=356893

The current version of the URI specification (RFC2396) 
includes a regexp for parsing URIs. For evil edge cases, I 
usually cut and paste directly into re.

Would it be an idea just to incorporate it rather than 
hammer the kinks out of the ad-hoc parser? If so, I'll hack 
on it.
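
A minimal sketch of that idea, assuming the regular expression given in Appendix B of RFC2396 pasted into re as-is. The names URI_RE and split_uri are hypothetical, not an existing API, and this does not split out the ;params component that urlparse() returns.

```python
import re

# Regular expression from RFC2396, Appendix B.
URI_RE = re.compile(
    r'^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?')


def split_uri(uri):
    """Return (scheme, authority, path, query, fragment).

    Components that are absent come back as None; the path is
    always a string, possibly empty.
    """
    m = URI_RE.match(uri)
    # Group numbers follow the RFC: 2, 4, 5, 7 and 9.
    return (m.group(2), m.group(4), m.group(5), m.group(7), m.group(9))


print(split_uri('http://amk.ca#foo'))
# ('http', 'amk.ca', '', None, 'foo')
```

The authority group excludes '#', so the URL from the initial comment splits at the fragment even without a slash after the hostname.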

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2002-02-13 05:45

Message:
Logged In: YES 
user_id=6656

Sorry, don't know *anything* about URLs and don't really
have the time to learn now...

----------------------------------------------------------------------
