[New-bugs-announce] [issue26045] Improve error message for http.client when posting unicode string

Emil Stenström report at bugs.python.org
Thu Jan 7 17:27:17 EST 2016


New submission from Emil Stenström:

This issue is in response to this thread on python-ideas: https://mail.python.org/pipermail/python-ideas/2016-January/037678.html

Note that Cory did a lot of encoding background work here:
https://mail.python.org/pipermail/python-ideas/2016-January/037680.html

---
Bug description:

When posting an unencoded unicode string directly with python-requests you get the following stacktrace:

import requests
r = requests.post("http://example.com", data="Celebrate 🎉") 
...
  File "../lib/python3.4/http/client.py", line 1127, in _send_request
    body = body.encode('iso-8859-1')
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 14-15: ordinal not in range(256) 

This is because requests uses http.client, and http.client assumes the encoding to be latin-1 if given a unicode string. This is a very common source of bugs for beginners who assume sending in unicode would automatically encode it in utf-8, like in the libraries of many other languages.

The simplest fix here is to catch the UnicodeEncodeError and improve the error message to something that points beginners in the right direction.

Another option would be to:
- Keep encoding in latin-1 first, and if that fails try utf-8

Other possible solutions (that would be backwards incompatible) includes:
- Changing the default encoding to utf-8 instead of latin-1
- Detect an unencoded unicode string and fail without encoding it with a descriptive error message

---

Just to show that this is a problem that exists in the wild, here are a few examples that all crashes on the same line in http.client (not all going through the requests library:

- https://github.com/kennethreitz/requests/issues/2838
- https://github.com/kennethreitz/requests/issues/1822
- http://stackoverflow.com/questions/34618149/post-unicode-string-to-web-service-using-python-requests-library
- https://www.reddit.com/r/learnpython/comments/3violw/unicodeencodeerror_when_searching_ebay_with/
- https://github.com/codecov/codecov-python/issues/35
- https://github.com/google/google-api-python-client/issues/145
- https://bugs.launchpad.net/ubuntu/+source/lazr.restfulclient/+bug/1414063

----------
components: Unicode
messages: 257721
nosy: Emil Stenström, ezio.melotti, haypo
priority: normal
severity: normal
status: open
title: Improve error message for http.client when posting unicode string
type: enhancement
versions: Python 3.2, Python 3.3, Python 3.4, Python 3.5, Python 3.6

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue26045>
_______________________________________


More information about the New-bugs-announce mailing list