[issue10343] urllib.parse problems with bytes vs str

Sat Nov 6 19:52:07 CET 2010

New submission from Hallvard B Furuseth <h.b.furuseth at usit.uio.no>:

When the url or params component passed to urlunparse() is a bytes
object, the result contains the repr of that bytes object:

urllib.parse.urlunparse(['http', 'host', '/dir', b'params', '', ''])
--> "http://host/dir;b'params'"

That's confusing, since urllib/parse.py goes to a lot of trouble to
support both bytes and str.  The simplest fix is to accept only str:

Index: Lib/urllib/parse.py
@@ -219,5 +219,5 @@ def urlunparse(components):
     scheme, netloc, url, params, query, fragment = components
     if params:
-        url = "%s;%s" % (url, params)
+        url = ';'.join((url, params))
     return urlunsplit((scheme, netloc, url, query, fragment))
 
Some people on comp.lang.python tell me that code shouldn't call str()
"just in case" the way urllib does anyway, though I can't make much
sense of that discussion.  (Subject: harmful str(bytes)).
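
To illustrate the difference between the two lines in the diff above,
here is a minimal sketch outside urllib.  The variable names are only
for illustration, and it assumes Python 3 run without the -b/-bb
options (which turn str(bytes) into a warning or an error):

```python
# Standalone sketch of the two formatting styles; not the urllib code itself.
url, params = "/dir", b"params"

# Old line: %-formatting calls str() on the bytes object, so its repr leaks in.
old_style = "%s;%s" % (url, params)
print(old_style)  # /dir;b'params'

# Proposed line: str.join() refuses non-str items instead of converting them.
try:
    new_style = ";".join((url, params))
except TypeError as exc:
    new_style = None
    print("TypeError:", exc)

# With str on both sides the two forms give the same result.
assert "%s;%s" % (url, "params") == ";".join((url, "params")) == "/dir;params"
```

So with the join() form a stray bytes object fails loudly at the point
of the mistake rather than producing a URL with b'...' embedded in it.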

BTW, the str vs bytes code doesn't have to be quite as painful as it is
in urllib.parse.  I enclose a patch which just rearranges and factors
out some code.

----------
components: Library (Lib)
files: parse.diff
keywords: patch
messages: 120647
nosy: hfuru
priority: normal
severity: normal
status: open
title: urllib.parse problems with bytes vs str
type: behavior
versions: Python 3.2
Added file: http://bugs.python.org/file19525/parse.diff

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue10343>
_______________________________________