[New-bugs-announce] [issue35100] urllib.parse.unquote_to_bytes: needs "escape plus" option

Henry Zhu report at bugs.python.org
Sun Oct 28 23:55:54 EDT 2018


New submission from Henry Zhu <zhuhe212 at 163.com>:

`urllib.parse.unquote_to_bytes` should have an "escape plus" option that decodes "+" as a space, the way `urllib.parse.unquote_plus` does.

This is needed in cases like the following:

```
# Say I have a URL string: 'a+%2b%c0'.
# In Python 2, I can parse it into b'a +\xc0' with urllib.unquote_plus.
# Note that the first "+" was converted into a space, and the second "+" was decoded from "%2b".
# In Python 3 this can't be done with urllib.parse.unquote, urllib.parse.unquote_plus or urllib.parse.unquote_to_bytes.
# Here is an example:

>>> from urllib import parse
>>> s = 'a+%2b%c0'
>>> parse.unquote(s)
'a++�'
>>> parse.unquote_plus(s)
'a +�'
>>> parse.unquote_to_bytes(s)
b'a++\xc0'
```

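PS: the character "�" is U+FFFD, the replacement character. The byte 0xC0 would be "À" in Latin-1, but it is not valid UTF-8 on its own, so the default decoding cannot show it.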

The result of `urllib.parse.unquote_to_bytes` is almost what I want, except that it doesn't convert the first "+" into a space.
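
Until such an option exists, one possible workaround is to replace "+" with a space before calling `urllib.parse.unquote_to_bytes`. The following is only a minimal sketch: the helper name `unquote_plus_to_bytes` is hypothetical and not part of `urllib.parse`.

```python
from urllib.parse import unquote_to_bytes


def unquote_plus_to_bytes(s):
    """Hypothetical helper: like unquote_to_bytes, but treats '+' as a space."""
    # Replace literal plus signs first, so that "%2b" still decodes to "+".
    if isinstance(s, bytes):
        s = s.replace(b'+', b' ')
    else:
        s = s.replace('+', ' ')
    return unquote_to_bytes(s)


print(unquote_plus_to_bytes('a+%2b%c0'))  # b'a +\xc0'
```

The replacement has to happen before percent-decoding. Otherwise the "+" produced by "%2b" would also be turned into a space.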

----------
components: Library (Lib)
messages: 328786
nosy: Henry Zhu
priority: normal
severity: normal
status: open
title: urllib.parse.unquote_to_bytes: needs "escape plus" option
type: enhancement
versions: Python 3.5, Python 3.6, Python 3.7

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue35100>
_______________________________________