More on Py3K urllib -- urlencode()
Hi. I've been using Py3K successfully for a while now, and have some questions about urlencode().

1) The docs mention that items sent to urlencode() are quoted using quote_plus(). However, instances of type "bytes" are not handled like they are with quote_plus() because urlencode() converts the parameters to strings first (which then puts a small "b" and single quotes around a textual representation of the bytes). It just seems to me that instances of type "bytes" should be passed directly to quote_plus(). That would complicate the code just a bit, but would end up being much more intuitive and useful.

2) If urlencode() relies so heavily on quote_plus(), then why doesn't it include the extra encoding-related parameters that quote_plus() takes?

3) Regarding the following code fragment in urlencode():

    k = quote_plus(str(k))
    if isinstance(v, str):
        v = quote_plus(v)
        l.append(k + '=' + v)
    elif isinstance(v, str):
        # is there a reasonable way to convert to ASCII?
        # encode generates a string, but "replace" or "ignore"
        # lose information and "strict" can raise UnicodeError
        v = quote_plus(v.encode("ASCII","replace"))
        l.append(k + '=' + v)

I don't understand how the "elif" section is invoked, as it uses the same condition as the "if" section.

Thanks in advance for any thoughts on this issue. I could submit a patch for urlencode() to better explain my ideas if that would be useful.

- Dan
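To make point 1 concrete, here is a small sketch of the difference between quoting the str() of a bytes object and quoting the bytes directly. The helper names quote_via_str and quote_directly are invented for this illustration; they are not part of urllib, and the sketch does not call urlencode() itself.

```python
from urllib.parse import quote_plus

raw = b'\xa0\x24'


def quote_via_str(value):
    # Hypothetical helper: converts the value with str() before quoting,
    # which is the behaviour described in point 1.
    return quote_plus(str(value))


def quote_directly(value):
    # Hypothetical helper: hands str and bytes to quote_plus() untouched
    # and only falls back to str() for other types.
    return quote_plus(value if isinstance(value, (str, bytes)) else str(value))


print(quote_via_str(raw))   # b%27%5Cxa0%24%27
print(quote_directly(raw))  # %A0%24
```

The first result quotes the characters of the repr (the "b", the quotes and the backslash escape), while the second quotes the two actual byte values.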
Dan Mahn <dan.mahn@digidescorp.com> wrote:
This looks like a 2->3 bug; clearly only the second branch should be used in Py3K. And that "replace" is also a bug; it should signal an error on encoding failures. It should probably catch UnicodeError and explain the problem, which is that only Latin-1 values can be passed in the query string. So the encode() to "ASCII" is also a mistake; it should be "ISO-8859-1", and the "replace" should be a "strict", I think.

Bill
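One way to read Bill's suggestion is sketched below: encode with "ISO-8859-1" in "strict" mode, catch UnicodeError, and re-raise with an explanation. The function name quote_latin1 and the choice of ValueError are assumptions made for this sketch only; nothing in the thread specifies them.

```python
from urllib.parse import quote_plus


def quote_latin1(value):
    # Hypothetical helper, not part of urllib: strict Latin-1 quoting.
    try:
        encoded = value.encode("ISO-8859-1", "strict")
    except UnicodeError as exc:
        # Explain the failure instead of silently replacing characters.
        raise ValueError(
            "only Latin-1 characters can be quoted here: %r" % (value,)
        ) from exc
    return quote_plus(encoded)


print(quote_latin1("\u00c1"))  # %C1
```

A character outside Latin-1, such as "\u20ac", raises ValueError here rather than being turned into "?".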
After a harder look, I concluded there was a bit more work to be done, but still very basic modifications. Attached is a version of urlencode() which seems to make the most sense to me.

I wonder how I could officially propose at least some of these modifications.

- Dan
def urlencode(query, doseq=0, safe='', encoding=None, errors=None):
    """Encode a sequence of two-element tuples or dictionary into a URL query string.

    If any values in the query arg are sequences and doseq is true, each
    sequence element is converted to a separate parameter.

    If the query arg is a sequence of two-element tuples, the order of the
    parameters in the output will match the order of parameters in the
    input.
    """

    if hasattr(query,"items"):
        # mapping objects
        query = query.items()
    else:
        # it's a bother at times that strings and string-like objects are
        # sequences...
        try:
            # non-sequence items should not work with len()
            # non-empty strings will fail this
            if len(query) and not isinstance(query[0], tuple):
                raise TypeError
            # zero-length sequences of all types will get here and succeed,
            # but that's a minor nit - since the original implementation
            # allowed empty dicts that type of behavior probably should be
            # preserved for consistency
        except TypeError:
            ty,va,tb = sys.exc_info()
            raise TypeError("not a valid non-string sequence or mapping object").with_traceback(tb)

    l = []
    if not doseq:
        # preserve old behavior
        for k, v in query:
            k = quote_plus(k if isinstance(k, (str,bytes)) else str(k), safe, encoding, errors)
            v = quote_plus(v if isinstance(v, (str,bytes)) else str(v), safe, encoding, errors)
            l.append(k + '=' + v)
    else:
        for k, v in query:
            k = quote_plus(k if isinstance(k, (str,bytes)) else str(k), safe, encoding, errors)
            if isinstance(v, str):
                v = quote_plus(v if isinstance(v, (str,bytes)) else str(v), safe, encoding, errors)
                l.append(k + '=' + v)
            else:
                try:
                    # is this a sufficient test for sequence-ness?
                    x = len(v)
                except TypeError:
                    # not a sequence
                    v = quote_plus(str(v))
                    l.append(k + '=' + v)
                else:
                    # loop over the sequence
                    for elt in v:
                        elt = quote_plus(elt if isinstance(elt, (str,bytes)) else str(elt), safe, encoding, errors)
                        l.append(k + '=' + elt)
    return '&'.join(l)
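One behaviour of the doseq branch in this draft is worth a quick look. This is an observation about the code as posted, not something stated in the thread: a bytes value is not a str, so it falls through to the len() test and is then looped over, and iterating bytes in Python 3 yields integers. The helper expand_first_draft below is a hypothetical, stripped-down mirror of that branch only.

```python
from urllib.parse import quote_plus


def expand_first_draft(key, value):
    # Hypothetical helper mirroring just the doseq branch of the draft:
    # only str is special-cased before the sequence test.
    k = quote_plus(key)
    if isinstance(value, str):
        return [k + '=' + quote_plus(value)]
    try:
        len(value)
    except TypeError:
        # not a sequence
        return [k + '=' + quote_plus(str(value))]
    # bytes ends up here and is iterated one integer at a time
    return [
        k + '=' + quote_plus(e if isinstance(e, (str, bytes)) else str(e))
        for e in value
    ]


print(expand_first_draft("k", b'\xc1$'))  # ['k=193', 'k=36']
print(expand_first_draft("k", "b$"))      # ['k=b%24']
```

Checking for bytes before the sequence test avoids this.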
On Sat, Mar 07, 2009, Dan Mahn wrote:
Submit a patch to bugs.python.org
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"All problems in computer science can be solved by another level of indirection."  --Butler Lampson
Yes, that was a good idea. I found some problems, and attached a new version. It looks more complicated than I wanted, but it is a very regular repetition, so I hope it is generally readable.

I used "doctest" to include the test scenarios. I was not familiar with it before, but it seems to work quite well. The main snag I hit was that I had to jazz around with the escape sequences (backslashes) in order to get the doc string to go in properly. That is, the lines in the string are not the lines I typed at the command prompt, as Python is interpreting the escapes in the strings when the file is imported.

In an effort to make fewer tests, the lines of the test strings grew pretty long. I'm not sure if I should try to cut the lengths down or not.

Any suggestions would be welcome before I try to submit this as a patch.

- Dan
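The backslash snag can be sketched in isolation. In an ordinary docstring every backslash in a doctest line has to be doubled; a raw docstring (the r prefix) is an alternative that keeps the lines exactly as typed at the prompt. The function names doubled, raw and run_examples are made up for this sketch, and the raw-docstring approach is offered as an option, not as what the attached file does.

```python
import doctest
from urllib.parse import quote_plus


def doubled():
    """
    >>> quote_plus("\\u00a0")
    '%C2%A0'
    """


def raw():
    r"""
    >>> quote_plus("\u00a0")
    '%C2%A0'
    """


def run_examples(func):
    # Parse the docstring of func and run the examples found in it.
    test = doctest.DocTestParser().get_doctest(
        func.__doc__, {"quote_plus": quote_plus}, func.__name__, None, 0)
    return doctest.DocTestRunner(verbose=False).run(test)


# Both spellings produce the same docstring text, so both doctests pass.
print(doubled.__doc__ == raw.__doc__)                        # True
print(run_examples(doubled).failed, run_examples(raw).failed)  # 0 0
```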
from urllib.parse import quote_plus
import sys

def urlencode(query, doseq=0, safe='', encoding=None, errors=None):
    """Encode a sequence of two-element tuples or dictionary into a URL query string.

    If any values in the query arg are sequences and doseq is true, each
    sequence element is converted to a separate parameter.

    If the query arg is a sequence of two-element tuples, the order of the
    parameters in the output will match the order of parameters in the
    input.

    >>> urlencode((("\\u00a0","\\u00c1"), (b'\\xa0\\x24', b'\\xc1\\x24'), (1, 2), ("a:", "b$")))
    '%C2%A0=%C3%81&%A0%24=%C1%24&1=2&a%3A=b%24'
    >>> urlencode((("\\u00a0","\\u00c1"), (b'\\xa0\\x24', b'\\xc1\\x24'), (1, 2), ("a:", "b$")), safe=":$")
    '%C2%A0=%C3%81&%A0$=%C1$&1=2&a:=b$'
    >>> urlencode((("\\u00a0","\\u00c1"), (b'\\xa0\\x24', b'\\xc1\\x24'), (1, 2), ("a:", "b$")), encoding="latin-1")
    '%A0=%C1&%A0%24=%C1%24&1=2&a%3A=b%24'
    >>> urlencode((("\\u00a0","\\u00c1"), (b'\\xa0\\x24', b'\\xc1\\x24'), (1, 2), ("a:", "b$")), safe="$:", encoding="latin-1")
    '%A0=%C1&%A0$=%C1$&1=2&a:=b$'
    >>> urlencode((("\\u00a0","\\u00c1"), (b'\\xa0\\x24', b'\\xc1\\x24'), ("d:", 0xe), (1, ("b", b'\\x0c\\x24', 0xd, "e$"))), 1)
    '%C2%A0=%C3%81&%A0%24=%C1%24&d%3A=14&1=b&1=%0C%24&1=13&1=e%24'
    >>> urlencode((("\\u00a0","\\u00c1"), (b'\\xa0\\x24', b'\\xc1\\x24'), ("d:", 0xe), (1, ("b", b'\\x0c\\x24', 0xd, "e$"))), 1, safe=":$")
    '%C2%A0=%C3%81&%A0$=%C1$&d:=14&1=b&1=%0C$&1=13&1=e$'
    >>> urlencode((("\\u00a0","\\u00c1"), (b'\\xa0\\x24', b'\\xc1\\x24'), ("d:", 0xe), (1, ("b", b'\\x0c\\x24', 0xd, "e$"))), 1, encoding="latin-1")
    '%A0=%C1&%A0%24=%C1%24&d%3A=14&1=b&1=%0C%24&1=13&1=e%24'
    >>> urlencode((("\\u00a0","\\u00c1"), (b'\\xa0\\x24', b'\\xc1\\x24'), ("d:", 0xe), (1, ("b", b'\\x0c\\x24', 0xd, "e$"))), 1, safe=":$", encoding="latin-1")
    '%A0=%C1&%A0$=%C1$&d:=14&1=b&1=%0C$&1=13&1=e$'
    >>> urlencode((("\\u00a0", "\\u00c1"),), encoding="ASCII", errors="replace")
    '%3F=%3F'
    >>> urlencode((("\\u00a0", (1, "\\u00c1")),), 1, encoding="ASCII", errors="replace")
    '%3F=1&%3F=%3F'
    """

    if hasattr(query,"items"):
        # mapping objects
        query = query.items()
    else:
        # it's a bother at times that strings and string-like objects are
        # sequences...
        try:
            # non-sequence items should not work with len()
            # non-empty strings will fail this
            if len(query) and not isinstance(query[0], tuple):
                raise TypeError
            # zero-length sequences of all types will get here and succeed,
            # but that's a minor nit - since the original implementation
            # allowed empty dicts that type of behavior probably should be
            # preserved for consistency
        except TypeError:
            ty,va,tb = sys.exc_info()
            raise TypeError("not a valid non-string sequence or mapping object").with_traceback(tb)

    l = []
    if not doseq:
        # preserve old behavior
        for k, v in query:
            if isinstance(k, bytes):
                k = quote_plus(k, safe)
            else:
                k = quote_plus(str(k), safe, encoding, errors)

            if isinstance(v, bytes):
                v = quote_plus(v, safe)
            else:
                v = quote_plus(str(v), safe, encoding, errors)
            l.append(k + '=' + v)
    else:
        for k, v in query:
            if isinstance(k, bytes):
                k = quote_plus(k, safe)
            else:
                k = quote_plus(str(k), safe, encoding, errors)

            if isinstance(v, str):
                v = quote_plus(v, safe, encoding, errors)
                l.append(k + '=' + v)
            elif isinstance(v, bytes):
                v = quote_plus(v, safe)
                l.append(k + '=' + v)
            else:
                try:
                    # is this a sufficient test for sequence-ness?
                    x = len(v)
                except TypeError:
                    # not a sequence
                    v = quote_plus(str(v), safe, encoding, errors)
                    l.append(k + '=' + v)
                else:
                    # loop over the sequence
                    for elt in v:
                        if isinstance(elt, bytes):
                            elt = quote_plus(elt, safe)
                        else:
                            elt = quote_plus(str(elt), safe, encoding, errors)
                        l.append(k + '=' + elt)
    return '&'.join(l)

if __name__ == "__main__":
    import doctest
    doctest.testmod()
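The cases covered by the doctests can also be tried against the standard library directly. The sketch below assumes a reasonably recent Python 3 in which urllib.parse.urlencode() accepts the safe and encoding keyword arguments and handles bytes without calling str() on them; on an interpreter as old as the one discussed in this thread the calls would behave differently.

```python
from urllib.parse import urlencode

# One str pair and one bytes pair, as in the doctests.
pairs = (("\u00a0", "\u00c1"), (b'\xa0\x24', b'\xc1\x24'))

# str items are encoded (UTF-8 by default); bytes items are quoted as-is.
print(urlencode(pairs))                      # %C2%A0=%C3%81&%A0%24=%C1%24
print(urlencode(pairs, encoding="latin-1"))  # %A0=%C1&%A0%24=%C1%24
print(urlencode(pairs, safe="$"))            # %C2%A0=%C3%81&%A0$=%C1$

# With doseq true, each element of a sequence value becomes its own parameter.
print(urlencode((("k", (1, "b")),), doseq=True))  # k=1&k=b
```

The encoding argument only affects str items; the bytes pair comes out the same in every call apart from the effect of safe.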
On Mon, Mar 09, 2009, Dan Mahn wrote:
> Any suggestions would be welcome before I try to submit this as a patch.
Just go ahead and submit it now; it's easier to review patches when they're in the system, and it also makes sure that it won't get lost.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"All problems in computer science can be solved by another level of indirection."  --Butler Lampson
Ahh ... I see. I should have done a bit more digging to find where the standard tests were.

I created a few new tests that could be included in that test suite -- see the attached file. Do you think that this would be sufficient?

- Dan
    def test_encoding(self):
        # Test for special character encoding
        given = (("\u00a0", "\u00c1"),)
        expect = '%3F=%3F'
        result = urllib.parse.urlencode(given, encoding="ASCII", errors="replace")
        self.assertEqual(expect, result)
        result = urllib.parse.urlencode(given, True, encoding="ASCII", errors="replace")
        self.assertEqual(expect, result)
        given = (("\u00a0", (1, "\u00c1")),)
        # ... now with default utf-8 ...
        given = (("\u00a0", "\u00c1"),)
        expect = '%C2%A0=%C3%81'
        result = urllib.parse.urlencode(given)
        self.assertEqual(expect, result)
        # ... now with latin-1 ...
        expect = '%A0=%C1'
        result = urllib.parse.urlencode(given, encoding="latin-1")
        self.assertEqual(expect, result)
        # ... now with sequence ...
        given = (("\u00a0", (1, "\u00c1")),)
        expect = '%3F=1&%3F=%3F'
        result = urllib.parse.urlencode(given, True, encoding="ASCII", errors="replace")
        self.assertEqual(expect, result)
        # ... again with default utf-8 ...
        given = (("\u00a0", "\u00c1"),)
        expect = '%C2%A0=%C3%81'
        result = urllib.parse.urlencode(given, True)
        self.assertEqual(expect, result)
        # ... again with latin-1 ...
        expect = '%A0=%C1'
        result = urllib.parse.urlencode(given, True, encoding="latin-1")
        self.assertEqual(expect, result)

    def test_bytes(self):
        # Test for encoding bytes
        given = ((b'\xa0\x24', b'\xc1\x24'),)
        expect = '%A0%24=%C1%24'
        result = urllib.parse.urlencode(given)
        self.assertEqual(expect, result)
        # ... now with sequence ...
        result = urllib.parse.urlencode(given, True)
        self.assertEqual(expect, result)
        # ... now with safe and encoding ...
        expect = '%A0$=%C1$'
        result = urllib.parse.urlencode(given, safe=":$")
        self.assertEqual(expect, result)
        result = urllib.parse.urlencode(given, safe=":$", encoding="latin-1")
        self.assertEqual(expect, result)
        # ... again with sequence ...
        result = urllib.parse.urlencode(given, True, safe=":$")
        self.assertEqual(expect, result)
        result = urllib.parse.urlencode(given, True, safe=":$", encoding="latin-1")
        self.assertEqual(expect, result)
        # ... now with an actual sequence ...
        given = ((b'\xa0\x24', (b'\xc1\x24', 0xd)),)
        result = urllib.parse.urlencode(given, True, safe=":$")
        self.assert_(expect in result,
                     "%s not found in %s" % (expect, result))
        expect2 = '%A0$=1'
        self.assert_(expect2 in result,
                     "%s not found in %s" % (expect2, result))
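The methods above are fragments meant to be dropped into an existing test class. A minimal self-contained harness for the last case might look like the sketch below. The class name UrlencodeBytesSketch and the helper run_sketch are invented for illustration, assertIn is used where the fragment uses assert_, and the sketch assumes a Python 3 whose urlencode() accepts safe and passes bytes through to quote_plus().

```python
import io
import unittest
from urllib.parse import urlencode


class UrlencodeBytesSketch(unittest.TestCase):
    # Hypothetical test case; it is not part of the standard test suite.
    def test_bytes_in_sequence(self):
        given = ((b'\xa0\x24', (b'\xc1\x24', 0xd)),)
        result = urlencode(given, True, safe=":$")
        # Each element of the sequence value becomes its own parameter.
        self.assertIn('%A0$=%C1$', result)
        self.assertIn('%A0$=13', result)


def run_sketch():
    # Run the single test case and return the unittest result object.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(UrlencodeBytesSketch)
    runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
    return runner.run(suite)


print(run_sketch().wasSuccessful())
```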
On Tue, Mar 10, 2009, Dan Mahn wrote:
First of all, please notice from the list traffic that except for Guido (who gets special dispensation because he's BDFL), most messages do not use top-posting:

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet?

Second, please follow the advice to put ALL patches on the tracker.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"All problems in computer science can be solved by another level of indirection."  --Butler Lampson
I submitted an explanation of this and my proposed modification as issue 5468.

http://bugs.python.org/issue5468

- Dan
participants (3): Aahz, Bill Janssen, Dan Mahn