[ python-Bugs-1313119 ] urlparse "caches" parses regardless of encoding

Tue Oct 4 19:57:40 CEST 2005

Bugs item #1313119, was opened at 2005-10-04 17:57
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1313119&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Unicode
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: Ken Kinder (kkinder)
Assigned to: M.-A. Lemburg (lemburg)
Summary: urlparse "caches" parses regardless of encoding

Initial Comment:
The issue can be summarized with this code:

>>> import urlparse
>>> urlparse.urlparse(u'http://www.python.org/doc')
(u'http', u'www.python.org', u'/doc', '', '', '')
>>> urlparse.urlparse('http://www.python.org/doc')
(u'http', u'www.python.org', u'/doc', '', '', '')

Once urlparse has cached the result of parsing a URL, it
returns that cached result for any later URL that compares
equal, regardless of the argument's type. Notice that in
the second call to urlparse, I passed it a str and got
back unicode components.

This is quite confusing when, as a developer, you think
you have already encoded all your objects, you call
urlparse, and all of a sudden you have unicode objects
again where you expected str.
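
The behaviour described above can be reproduced with a toy cache. This is only a sketch under stated assumptions, not urlparse's actual code: it assumes the parse cache is a dict keyed on the URL value, and it uses a hypothetical str subclass, UnicodeLike, as a stand-in for Python 2's unicode type (an instance compares and hashes equal to the plain str with the same characters, just as u'abc' == 'abc' in Python 2). The names parse_naive and parse_typed are likewise illustrative. Including the argument's type in the cache key is one possible way to keep the two cases apart.

```python
class UnicodeLike(str):
    """Hypothetical stand-in for Python 2 unicode: equal to, and hashes like, str."""


_naive_cache = {}
_typed_cache = {}


def _split(url):
    # Toy "parser": split off the scheme and keep the components in the
    # same type as the argument.
    scheme, _, rest = url.partition('://')
    kind = type(url)
    return (kind(scheme), kind(rest))


def parse_naive(url):
    # Cache keyed on the URL value alone: equal keys of different types collide.
    if url not in _naive_cache:
        _naive_cache[url] = _split(url)
    return _naive_cache[url]


def parse_typed(url):
    # Cache keyed on the value and its type: each type gets its own entry.
    key = (url, type(url))
    if key not in _typed_cache:
        _typed_cache[key] = _split(url)
    return _typed_cache[key]


# First call with the unicode-like object, second call with a plain str.
parse_naive(UnicodeLike('http://www.python.org/doc'))
naive_second = parse_naive('http://www.python.org/doc')

parse_typed(UnicodeLike('http://www.python.org/doc'))
typed_second = parse_typed('http://www.python.org/doc')
```

With the value-only key, the second call is served from the entry created by the first call, so its components are UnicodeLike even though a plain str was passed in. With the type-aware key, the second call gets components of type str.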

----------------------------------------------------------------------
